Build inference that’s measurable, explainable, and boring in production.

AtlasInference publishes small, composable open-source tools for inference pipelines, evaluation harnesses, and deployment patterns. The goal is fewer surprises, not more magic.

docs-first • reproducible runs • schema-driven interfaces • observability-ready

System map

Inference: batching • routing • caching
Evaluation: tasks • metrics • scorecards
Registry: manifests • versions • constraints
Deployment: blueprints • traces • runbooks


Overview

AtlasInference is structured like a map: clear boundaries, named interfaces, and documented routes between components.

Why it exists

Because “works in the demo” is not a reliability strategy

Inference systems fail at the seams: untracked changes, silent regressions, and missing observability. We focus on the seams.

failure modes • budgeted latency • diffable behavior
How it’s built

Small repos with stable contracts

Adopt one component at a time. Every repo should make it obvious what it does and what it does not do.

schemas • versioning • docs-first

What we build

Infrastructure components that can be inspected and reasoned about. The emphasis is on clarity.

Inference

Adapters and controls

Request contracts, batching, caching, routing rules, and tracing context that survives across services.

OpenTelemetry • cost controls • queues
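As a rough sketch of the idea, assuming a Python surface: the request contract carries its trace context as plain data, so every layer can forward it verbatim. The `InferenceRequest` fields and the W3C-style `traceparent` handling below are illustrative, not a published AtlasInference schema.

```python
# Illustrative request contract: trace context travels as plain data,
# so batching, caching, and routing layers can forward it unchanged.
from dataclasses import dataclass, replace
from typing import Optional
import uuid

@dataclass(frozen=True)
class InferenceRequest:
    model: str                         # resolved against the registry
    prompt: str
    max_tokens: int = 256
    traceparent: Optional[str] = None  # W3C trace context, forwarded as-is

    def with_trace(self) -> "InferenceRequest":
        """Attach a traceparent only if the caller did not send one."""
        if self.traceparent:
            return self
        trace_id = uuid.uuid4().hex      # 32 hex characters
        span_id = uuid.uuid4().hex[:16]  # 16 hex characters
        return replace(self, traceparent=f"00-{trace_id}-{span_id}-01")

req = InferenceRequest(model="demo-model", prompt="hello").with_trace()
print(req.traceparent)  # the same value shows up in logs at every hop
```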
Evaluation

Harnesses and scorecards

Repeatable test suites for models and prompts that make regressions visible before users find them.

benchmarks • fixtures • reports
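A minimal sketch, using nothing beyond the standard library: fixed fixtures go in, a diffable scorecard comes out, and a metric that drops below its threshold fails loudly. The fixtures, the `exact_match` metric, and the threshold are placeholder assumptions.

```python
# Illustrative harness: the same fixtures every run, a scorecard you can diff,
# and an assertion that turns a regression into a CI failure.
FIXTURES = [
    {"prompt": "2 + 2 =", "expected": "4"},
    {"prompt": "Capital of France?", "expected": "Paris"},
]

def exact_match(output: str, expected: str) -> float:
    return 1.0 if output.strip() == expected else 0.0

def run_suite(generate, threshold: float = 0.9) -> dict:
    """`generate` is whatever callable wraps the model or prompt under test."""
    scores = [exact_match(generate(f["prompt"]), f["expected"]) for f in FIXTURES]
    scorecard = {"fixtures": len(scores), "exact_match": sum(scores) / len(scores)}
    assert scorecard["exact_match"] >= threshold, f"regression detected: {scorecard}"
    return scorecard

print(run_suite(lambda p: "4" if "2 + 2" in p else "Paris"))
```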
Registry

Model manifests

Metadata and compatibility constraints that travel with your model, not with tribal knowledge.

schemas • diffs • constraints
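One way to picture it, as a sketch: the manifest is data checked at load time rather than knowledge held in someone's head. The field names and version scheme below are assumptions, not AtlasInference's published manifest format.

```python
# Illustrative manifest: metadata and compatibility constraints that ship
# with the model artifact and are enforced before it serves traffic.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelManifest:
    name: str
    version: str         # version of the weights/artifacts
    tokenizer: str       # tokenizer the weights expect
    min_runtime: tuple   # oldest runtime known to work, e.g. (1, 4)
    max_context: int     # hard limit the server must enforce

    def check_runtime(self, runtime_version: tuple) -> None:
        if runtime_version < self.min_runtime:
            raise RuntimeError(
                f"{self.name}@{self.version} requires runtime >= {self.min_runtime}"
            )

manifest = ModelManifest("demo-model", "1.2.0", "demo-bpe", (1, 4), 8192)
manifest.check_runtime((1, 5))  # ok; (1, 3) would raise before serving starts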
Deployment

Blueprints and runbooks

Reference deployments, operational checklists, and observability patterns that scale across teams.

runbooks • SLOs • alerts
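For example, a runbook's latency check can be a few lines over whatever metrics you already collect; the 250 ms budget and the percentile arithmetic here are illustrative assumptions, not a prescribed SLO.

```python
# Illustrative SLO check: compare observed p95 latency against a budget and
# return an alertable verdict a runbook can act on.
def p95(samples_ms: list[float]) -> float:
    ordered = sorted(samples_ms)
    return ordered[int(0.95 * (len(ordered) - 1))]

def latency_slo_ok(samples_ms: list[float], budget_ms: float = 250.0) -> bool:
    observed = p95(samples_ms)
    if observed > budget_ms:
        print(f"ALERT: p95 {observed:.0f} ms exceeds budget {budget_ms:.0f} ms")
        return False
    return True

latency_slo_ok([120, 180, 90, 400, 150, 210, 95, 130])  # within budget here
```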

Starter repos

These are “credible defaults” you can publish first, then evolve. Keep the surface area small.

Principles

We bias toward interfaces that are inspectable, testable, and stable over time.

01

Make behavior testable

If a change matters, you should be able to detect it. Prefer harnesses over opinions.

02

Make tradeoffs explicit

Latency and cost budgets are product decisions. Tooling should make those choices visible and deliberate.

03

Favor boring interfaces

Stability is adoption. Simple contracts make integration obvious and maintenance cheaper.

04

Observe what matters

If you cannot explain performance, you cannot improve it. Traces and metrics come first.

Contribute

Start with docs, examples, and tests. Reliability improves fastest when the seams are documented.

Good first contributions

Make one path clearer

Add an example, tighten a schema, or document a failure mode. Small improvements compound.

docs • examples • tests
Community

Open an issue with context

Describe the environment, the expected behavior, and what you observed. Repro beats rhetoric.