Providers & models

aigw puts every model behind one OpenAI-compatible endpoint. You either let us broker the providers (one contract, one bill) or bring your own provider keys — or mix both.

The model catalog

GET /v1/models returns the models your virtual key may use. Each model maps to a provider and a price. To call a model, just set the model string on a normal OpenAI-style request:

client.chat.completions.create(model="claude-sonnet-4-6", messages=msgs)
client.chat.completions.create(model="gpt", messages=msgs)
client.chat.completions.create(model="mistral-large", messages=msgs)

Adapters: one front door, many providers

Requests dispatch to an adapter keyed by the provider:

OpenAI and any OpenAI-compatible provider (Groq, Together, Mistral, Azure, DeepSeek, vLLM, …) work through the universal path — onboard a new one by config alone, no code.
Anthropic is translated both ways: your OpenAI-shaped request becomes a Messages call, and the response (including streaming) comes back as OpenAI chunks. Your code doesn’t change.

Bring your own keys, or let us broker

Brokered — aigw holds the provider relationship. You sign one contract with us and use any model; we meter usage at cost. This is what kills the per-provider procurement cycle.
BYO — supply your own provider key. It’s sealed at rest; a gateway can spend it but cannot decrypt any other tenant secret. Good as a lock-in escape hatch or for private upstreams.

Failover & retries

ProvidersFor(tenant, kind) returns your enabled providers in priority order (name them 1-primary, 2-backup, …). On a transient upstream error or 429/5xx, aigw retries on the same provider, then fails over to the next — before any bytes are relayed, so streaming is never retried mid-flight. Unsupported operations short-circuit without failover.

Self-hosted & private upstreams

Run the stateless gateway in your own VPC and point it at a private vLLM or Ollama endpoint. Private upstreams are gated behind an explicit opt-in so the gateway’s SSRF guard stays on by default.