OpenAI-compatible LLM gateway
One API key for Codex and Claude.
LLMApi puts your local AI engines behind a single OpenAI-compatible endpoint: issue scoped keys, enforce quotas and concurrency limits, stream responses, generate images, and meter every token. It works with any OpenAI SDK.
Open the console
One gateway. Every engine.
Your apps speak the OpenAI API. LLMApi sits in the middle — it authenticates the key, enforces scopes and quotas, meters every call, then routes it to the right local engine through its adapter.
Your apps
OpenAI-compatible clients
Change the base URL + key. Nothing else.
LLMApi gateway
the door between your apps and engines
Local engines · via adapters
Codex
GPT-5 family
Claude
Opus · Sonnet · Haiku
Everything a gateway needs
OpenAI-compatible
Point any OpenAI SDK at /v1 — chat completions, token-level streaming and image generation work unchanged.
Codex + Claude
Two AI engines behind one unified interface, with live per-account model discovery — no unsupported-model surprises.
Sensible API keys
SHA-256-hashed keys with scopes, daily quotas, concurrency limits and instant revocation.
Usage metering
Every request is metered — requests, tokens and cost, broken down by model and by individual key.
Locked-down by default
Each request runs read-only, with no network, in a throwaway workspace. One fresh run per call.
Streaming & images
Server-sent token streaming plus aspect-ratio image generation, all behind the same key.
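For clients that read the stream without an SDK, here is a minimal sketch of decoding server-sent token events. It assumes the gateway emits the usual OpenAI streaming shape (`data: {json}` lines closed by `data: [DONE]`); the function name `iter_deltas` and the sample lines are illustrative and not taken from LLMApi.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel that closes an OpenAI-style stream
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample of what a stream might look like on the wire.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))
```

In practice the `lines` argument would be the decoded lines of the HTTP response body rather than a hard-coded list.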
Drop-in OpenAI compatibility
Point any OpenAI SDK at your gateway — change the base URL and the key, nothing else.
curl https://your-host/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{ "role": "user", "content": "Hello!" }],
    "stream": true
  }'
/v1
OpenAI-compatible surface
2
AI engines: Codex + Claude
SHA-256
Hashed API keys
0
Plaintext secrets stored
Frequently asked
Is it really OpenAI-compatible?
Yes. /v1/chat/completions, /v1/models and /v1/images/generations follow the OpenAI wire format, so existing SDKs work by changing only the base URL and key.
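As a rough illustration of "only the base URL and key change", the sketch below builds a chat completion request with Python's standard library. The helper name `build_chat_request` is hypothetical, and `https://your-host` and `sk-...` are placeholders; the request is constructed but not sent.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt, stream=False):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and key; substitute your own gateway values.
req = build_chat_request("https://your-host", "sk-...", "claude-sonnet-4-6", "Hello!")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Pointing the same code at a different OpenAI-compatible server should only require different `base_url` and `api_key` arguments.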
Which models can I call?
Whatever your Codex and Claude accounts expose. The model list is discovered live from each CLI, so every model the gateway lists is one your account can actually call.
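A client can check the discovered list before sending a request. The sketch below assumes `/v1/models` returns the standard OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`); the sample response, the id `example-model-b`, and both helper names are made up for illustration.

```python
import json


def model_ids(models_response: str) -> list:
    """Extract model ids from an OpenAI-style /v1/models JSON body."""
    payload = json.loads(models_response)
    return sorted(item["id"] for item in payload.get("data", []))


def pick_model(preferred, available):
    """Return the first preferred model that the gateway lists, else None."""
    for name in preferred:
        if name in available:
            return name
    return None


# Hypothetical response body; a real one comes from GET /v1/models.
sample = json.dumps({
    "object": "list",
    "data": [
        {"id": "claude-sonnet-4-6", "object": "model"},
        {"id": "example-model-b", "object": "model"},
    ],
})
available = model_ids(sample)
print(pick_model(["not-listed", "claude-sonnet-4-6"], available))
```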
How are API keys secured?
Only a SHA-256 hash is stored; the secret is shown once at creation. Keys carry scopes, daily quotas and concurrency limits, and can be revoked instantly.
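The hash-only storage described above can be sketched as follows. This is an assumption about how such a scheme typically looks, not LLMApi's actual code: the function names, the `sk-` prefix, and the 32-byte secret length are all illustrative.

```python
import hashlib
import hmac
import secrets


def issue_key():
    """Create a new secret and the SHA-256 digest that would be stored."""
    secret = "sk-" + secrets.token_urlsafe(32)  # shown to the user once
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    return secret, digest  # only the digest is persisted


def verify_key(presented, stored_digest):
    """Hash the presented key and compare it to the stored digest."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking match position through timing
    return hmac.compare_digest(candidate, stored_digest)


secret, digest = issue_key()
print(verify_key(secret, digest), verify_key("sk-wrong", digest))
```

Because the secret is long and random, a plain fast hash is a reasonable choice here; password-style slow hashing is aimed at low-entropy, human-chosen secrets.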
Can I see how much each key uses?
The console shows requests, tokens and cost over time, broken down by model and by individual API key.

