Majordomo sits between your application and your LLM providers. Every request is logged with model, tokens, cost, and latency. You get a dashboard, a query layer, and tools to test model changes before they ship — without touching your application code.

What you get

Cost & Usage Tracking

Every request logged. Break down spend by team, feature, environment, or user with custom metadata headers.
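Metadata rides along as HTTP headers on each request. A minimal sketch using only the standard library; the X-Majordomo-Team, X-Majordomo-Feature, and X-Majordomo-Env header names are illustrative, not the documented scheme:

```python
import urllib.request

# Hypothetical metadata header names; check the docs for the exact scheme.
headers = {
    "X-Majordomo-Key": "mdm_sk_...",
    "X-Majordomo-Team": "growth",
    "X-Majordomo-Feature": "onboarding-chat",
    "X-Majordomo-Env": "production",
}

# Build (but do not send) a request carrying the metadata.
req = urllib.request.Request(
    "https://gateway.gomajordomo.com/v1/chat/completions",
    headers=headers,
    method="POST",
)
```

Every request tagged this way can then be grouped by team, feature, or environment in the dashboard.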

Replay

Run real production traffic against a candidate model. Get actual cost, latency, and quality numbers on your workload before you switch.

Evals

Build test suites from logged requests. Define scoring criteria. Run scored evaluations against any model before a change ships.

Multi-provider Routing

OpenAI, Anthropic, and Gemini from a single endpoint. Switch models in the dashboard without touching application code.
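Because the gateway accepts one OpenAI-compatible request shape for every provider, switching providers comes down to a different model string. A sketch, with illustrative model identifiers:

```python
import json

GATEWAY_URL = "https://gateway.gomajordomo.com/v1/chat/completions"

def chat_payload(model: str, prompt: str) -> bytes:
    # Same request body regardless of which provider serves the model.
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

# Hypothetical model names; the dashboard lists the exact identifiers.
openai_body = chat_payload("gpt-4o-mini", "Summarize this ticket.")
anthropic_body = chat_payload("claude-sonnet-4", "Summarize this ticket.")
```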

Two deployment modes

Managed — Majordomo runs Steward and the dashboard. Point your SDK at the gateway endpoint, create an API key, and you’re logging requests within minutes. No infrastructure to operate.

Self-hosted Steward — You run Steward inside your own VPC. Your prompts and completions never leave your infrastructure. Majordomo receives only metadata: token counts, cost, latency, model name. The dashboard works identically. The right choice for teams with data residency requirements or enterprise customers who need to control where AI data is processed.

How it works →
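In both modes the client-side integration is the same; only the base URL the SDK points at changes. A sketch of selecting the endpoint (the in-VPC hostname and the STEWARD_URL variable are hypothetical; substitute your deployment's values):

```python
import os

MANAGED_GATEWAY = "https://gateway.gomajordomo.com/v1"
# Hypothetical in-VPC Steward endpoint; configure via env in a real deployment.
SELF_HOSTED = os.environ.get("STEWARD_URL", "https://steward.internal.example.com/v1")

def base_url(self_hosted: bool) -> str:
    # Dashboard, logging, and request format are identical either way.
    return SELF_HOSTED if self_hosted else MANAGED_GATEWAY
```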

How integration works

One config change. Everything else stays the same.
# Before
from openai import OpenAI

client = OpenAI(api_key="sk-...")

# After
client = OpenAI(
    base_url="https://gateway.gomajordomo.com/v1",
    api_key="sk-...",
    default_headers={"X-Majordomo-Key": "mdm_sk_..."},
)

Open source

The gateway is open source. The code that processes your requests is public and auditable. GitHub →