One API · every model · your infrastructure

Route every model.
Keep every byte.

The Cudator Platform is the routing layer for everything you build: one OpenAI-compatible endpoint that picks the best model per request, keeps sovereign workloads on ground you control, and meters spend to a single wallet.

Get an API key · Read the docs

OpenAI-compatible · Zero data retention · One wallet, any currency

POST /v1/chat/completions

Powering AI at teams that can't afford to get infrastructure wrong

StrategicSyntax EclipseDC CloudMas

What the Platform does

One gateway between your app and every model you'll ever use.

Stop wiring keys, quotas, and failover logic per vendor. Point your SDK at Cudator once and let the platform handle routing, residency, and money.

Model routing

Send one request; Cudator picks the right model on price, latency, and quality — then load-balances and fails over across a pool of credentials.

Policy-based routing by cost, latency or task
Weighted load-balancing & automatic failover
One OpenAI-compatible endpoint for all vendors
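
To make the routing rules above concrete, here is a minimal sketch of single-criterion policy selection and weighted load-balancing. It is not Cudator's implementation: the `Candidate` fields, the policy names, the model names, and every number are hypothetical and exist only to illustrate the idea.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A routable model with made-up metrics (all values hypothetical)."""
    name: str
    usd_per_1k_tokens: float
    p50_latency_ms: float
    quality: float  # 0..1, higher is better


CANDIDATES = [
    Candidate("model-a", usd_per_1k_tokens=0.0100, p50_latency_ms=900, quality=0.92),
    Candidate("model-b", usd_per_1k_tokens=0.0004, p50_latency_ms=400, quality=0.71),
    Candidate("model-c", usd_per_1k_tokens=0.0020, p50_latency_ms=150, quality=0.80),
]


def pick_model(candidates, policy):
    """Return the candidate that wins under a single-criterion policy."""
    if policy == "cheapest":
        return min(candidates, key=lambda c: c.usd_per_1k_tokens)
    if policy == "fastest":
        return min(candidates, key=lambda c: c.p50_latency_ms)
    if policy == "highest-quality":
        return max(candidates, key=lambda c: c.quality)
    raise ValueError(f"unknown policy: {policy!r}")


def pick_credential(weights, rng):
    """Weighted random choice over a credential pool ({credential_id: weight})."""
    ids = list(weights)
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
```

With these sample numbers, `pick_model(CANDIDATES, "cheapest")` selects `model-b`, and `pick_credential({"k1": 3, "k2": 1}, random.Random())` returns `k1` roughly three times as often as `k2`.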

Sovereign private routing

Pin sensitive traffic to providers and regions you approve — or to your own self-hosted models. Data never leaves the ground you control.

Route to on-prem & VPC-hosted models (vLLM)
Region & residency pinning, per workspace
Zero retention, full request-level audit trail

Wallet & payments

Every provider's spend, in every country, rolls into one wallet. Set caps per key, team, and legal entity — then settle in the currency you choose.

One wallet across subsidiaries & currencies
Spend limits per key, user & workspace
Itemised usage settlement & export
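
As an illustration of spend limits that apply at several levels at once, the sketch below checks a charge against every scope it belongs to and records it only if all of them have room. The `SpendCaps` class and the `"key:..."` / `"workspace:..."` scope ids are hypothetical, not part of any documented Cudator API.

```python
from decimal import Decimal


class CapExceeded(Exception):
    """Raised when a charge would push any scope past its cap."""


class SpendCaps:
    def __init__(self, caps):
        # caps maps a scope id such as "key:k1" to a Decimal limit.
        self.caps = dict(caps)
        self.spent = {scope: Decimal("0") for scope in self.caps}

    def charge(self, scopes, amount):
        """Record `amount` against every capped scope, or raise and record nothing."""
        amount = Decimal(amount)
        # First pass: verify every capped scope has room.
        for scope in scopes:
            if scope in self.caps and self.spent[scope] + amount > self.caps[scope]:
                raise CapExceeded(scope)
        # Second pass: all checks passed, so commit the charge.
        for scope in scopes:
            if scope in self.caps:
                self.spent[scope] += amount
```

Checking all scopes before committing means a rejected charge leaves every counter untouched. Scopes without a configured cap are treated as unlimited in this sketch.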

How it works

From request to settled invoice, in one hop.

Point your SDK

Swap your base URL and key. Cudator speaks the OpenAI API, so the rest of your code keeps working unchanged.

Apply a routing policy

Choose the cheapest, fastest, or highest-quality path — and pin sensitive workloads to approved regions or self-hosted models.

Cudator routes & fails over

Requests load-balance across a credential pool. If a provider errors or rate-limits, traffic shifts automatically.
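
The failover step can be sketched as a loop over the credential pool. This is a simplified illustration, not the platform's logic: `ProviderError`, `call_with_failover`, and the `send` callback are hypothetical names, and a real gateway would likely add retries, backoff, and health tracking.

```python
class ProviderError(Exception):
    """Stand-in for an upstream failure such as a rate limit or server error."""


def call_with_failover(pool, send):
    """Try each credential in order; return (credential, result) from the first success."""
    failures = []
    for credential in pool:
        try:
            return credential, send(credential)
        except ProviderError as exc:
            # Remember why this credential failed, then move on to the next one.
            failures.append(f"{credential}: {exc}")
    raise RuntimeError("all credentials failed: " + "; ".join(failures))
```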

Spend settles to the wallet

Every token is metered, attributed, and deducted from one wallet — with caps enforced before the call ever leaves.
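
Metering, attribution, and deduction can be pictured as a small ledger. The sketch below is an assumption-laden illustration: the `Wallet` and `UsageItem` names, the per-1k-token price model, and the sample prices are all hypothetical, and cap enforcement is left out to keep it short.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsageItem:
    """One itemised line: who spent what, on which model."""
    api_key: str
    model: str
    tokens: int
    cost: Decimal


class Wallet:
    def __init__(self, balance):
        self.balance = Decimal(balance)
        self.items = []

    def meter(self, api_key, model, tokens, usd_per_1k_tokens):
        """Price the tokens, deduct the cost, and keep an itemised record."""
        cost = Decimal(tokens) * Decimal(usd_per_1k_tokens) / Decimal(1000)
        self.balance -= cost
        item = UsageItem(api_key, model, tokens, cost)
        self.items.append(item)
        return item

    def spend_by_key(self):
        """Attribute total spend to each API key."""
        totals = defaultdict(Decimal)
        for item in self.items:
            totals[item.api_key] += item.cost
        return dict(totals)
```

`Decimal` is used instead of `float` so that many tiny per-request costs add up without rounding drift.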

Drop-in API

Already built? Change two lines.

Cudator speaks the OpenAI API. Point the base URL at the gateway, use a Cudator key, and routing, residency, and global billing come with it.

Drop-in OpenAI SDK compatibility — streaming, tools & embeddings
One policy header to pin region or self-hosted models
See the full reference in the Platform docs

route.py

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cudator.ai/v1",
    api_key="cud_live_••••••••",
)

# route by policy — Cudator picks the model
resp = client.chat.completions.create(
    model="auto",
    extra_headers={"X-Cudator-Policy": "sovereign-eu"},
    messages=[{"role": "user",
               "content": "Summarise this record."}],
)

# → served by self-hosted vLLM · eu-west-1
# → $0.0004 metered to wallet · 38ms
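
For clients that do not use the OpenAI SDK, the same call can be sketched as a plain HTTP request. The base URL, the `/v1/chat/completions` path, the `"auto"` model value, and the `X-Cudator-Policy` header come from this page; the `Authorization: Bearer` scheme is an assumption based on OpenAI API convention, and `CUDATOR_API_KEY` is a hypothetical environment variable name. The function only builds the request and does not send it.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.cudator.ai/v1"


def build_chat_request(prompt, policy, api_key=None):
    """Build (but do not send) a chat completion request with a routing policy."""
    # Hypothetical env var name; fall back to it when no key is passed in.
    api_key = api_key or os.environ.get("CUDATOR_API_KEY", "")
    body = {"model": "auto", "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            "X-Cudator-Policy": policy,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; reading the key from the environment keeps it out of source control.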

Sovereign by design

Pick the provider, region, and terms — and prove it on every call.

For teams in finance, health, and the public sector, "where does this data go — and who can compel it?" isn't a footnote. Cudator turns provider choice, region, and data terms into a routing rule: enforced before the request leaves, proven after it lands.

Approved-pool routing
Route only to the providers, regions, and data-processing terms you've cleared. Out-of-policy credentials aren't in the pool, so sensitive traffic can't slip out the wrong door by accident.
Region & jurisdiction aware
Pin each workspace to approved regions, and watch for silent out-of-region fallback. Cudator surfaces the legal jurisdiction behind every provider — because EU-region hosting isn't the same as EU jurisdiction.
Zero retention, request-level proof
Payloads are never stored, and routing respects each provider's no-retention terms. Every call is logged with model, provider, region, and cost — export the trail for your auditors.
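
One way to picture a request-level trail that holds no payload is a record with metadata fields only. The sketch below uses the four fields named above (model, provider, region, cost); the `request_id` field, the `AuditRecord` name, and the JSON Lines export format are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Metadata about one call. There is deliberately no prompt or response field."""
    request_id: str  # hypothetical identifier
    model: str
    provider: str
    region: str
    cost_usd: str  # kept as a string to avoid float rounding in exports


def export_jsonl(records):
    """Serialise records as JSON Lines, one call per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)
```

Because the record type has no field for message content, a payload cannot end up in the export by accident.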

workspace · meridian-health

residency policy · active

Allowed providers: Azure OpenAI (EU) · Mistral

Allowed region: eu-west-1

Out-of-region fallback: blocked

Unapproved providers: blocked

Payload retention: none
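
The policy card above can be read as a filter over the endpoint pool. The following sketch mirrors its provider and region values; the `Endpoint`, `ResidencyPolicy`, and `approved_pool` names are hypothetical, and raising an error when nothing qualifies is one assumed way to model "fallback blocked".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    provider: str
    region: str


@dataclass(frozen=True)
class ResidencyPolicy:
    allowed_providers: frozenset
    allowed_regions: frozenset

    def permits(self, endpoint):
        """An endpoint qualifies only if both provider and region are approved."""
        return (endpoint.provider in self.allowed_providers
                and endpoint.region in self.allowed_regions)


def approved_pool(policy, endpoints):
    """Keep only in-policy endpoints; raise rather than fall back out of policy."""
    pool = [e for e in endpoints if policy.permits(e)]
    if not pool:
        raise LookupError("no in-policy endpoint available; fallback is blocked")
    return pool


MERIDIAN_HEALTH = ResidencyPolicy(
    allowed_providers=frozenset({"Azure OpenAI (EU)", "Mistral"}),
    allowed_regions=frozenset({"eu-west-1"}),
)
```

Filtering before selection means out-of-policy endpoints are never candidates, so no later routing step can choose one.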

HIPAA · GDPR · ISO 27001 · Data residency

Models

Bring the frontier, the open, and your own.

Hosted frontier labs, embedding specialists, and OpenAI-compatible self-hosted clusters — managed as one routable pool.

OpenAI: chat · embeddings

Anthropic: chat

Google: chat

Mistral: chat

Cohere: embeddings

Voyage AI: embeddings

Llama · vLLM: self-hosted

DeepSeek: self-hosted

Compare every model on context, price, and sovereignty in the full model catalog.

Browse the catalog

"We moved seven AI products onto Cudator in a quarter. Our auditors got a request-level trail, our finance team got one invoice, and engineering stopped babysitting provider keys."

Mara Köhler

VP Platform Engineering · EclipseDC

providers, one pool

99.98%

routing uptime

<40ms

routing overhead

payloads retained

Start building

Ship AI on ground you control.

Create a workspace, drop in your base URL, and route your first request in minutes. No card required to start.

Get an API key · See pricing

Prepaid or invoiced · multi-currency · cancel anytime
