The gateway exposes an OpenAI-compatible REST API. All authenticated endpoints require a tenant API key as a Bearer token.
Base URL: https://your-host/api/<tenant-slug>/v1
Authentication: Authorization: Bearer omp-<your-api-key>
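As a minimal sketch of how a client might assemble an authenticated call, the Python helper below builds (but does not send) a request from the base URL and Bearer scheme described above. The function name `build_gateway_request`, the example host and key, and the `/chat/completions` path are assumptions for illustration; the path follows the usual OpenAI convention and is not confirmed by this reference.

```python
import json
import urllib.request


def build_gateway_request(host: str, tenant_slug: str, api_key: str,
                          path: str, payload: dict) -> urllib.request.Request:
    """Build (without sending) an authenticated POST request for the gateway."""
    # Base URL pattern from this reference: https://your-host/api/<tenant-slug>/v1
    url = f"https://{host}/api/{tenant_slug}/v1{path}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values; "/chat/completions" is assumed from OpenAI conventions.
request = build_gateway_request(
    "your-host", "team-alpha", "omp-example-key", "/chat/completions",
    {"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]},
)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running gateway.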
Chat completions. Supports streaming and non-streaming.
Request:
{
"model": "gpt-4o",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Hello!" }
],
"temperature": 0.7,
"max_tokens": 1024,
"stream": false
}
Set "model": "auto" to enable automatic routing. Open Model Prism will classify the request and select the best model.
max_tokens clamping: If max_tokens exceeds the selected model's maximum output limit, the server automatically clamps it to that limit rather than returning an error.
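The clamping rule can be illustrated with a short Python sketch. It is not the gateway's implementation; the helper name `clamp_max_tokens` and the limit of 16384 are hypothetical and only show the intended behavior of capping rather than rejecting.

```python
def clamp_max_tokens(payload: dict, model_output_limit: int) -> dict:
    """Return a copy of the request with max_tokens capped at the model's limit."""
    clamped = dict(payload)
    requested = clamped.get("max_tokens")
    # Only lower the value; requests at or below the limit pass through unchanged.
    if requested is not None and requested > model_output_limit:
        clamped["max_tokens"] = model_output_limit
    return clamped


# 16384 is a made-up output limit used purely for illustration.
oversized = {"model": "gpt-4o", "max_tokens": 100_000}
print(clamp_max_tokens(oversized, 16_384)["max_tokens"])  # prints 16384
```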
Model policy enforcement: The tenant's model access configuration (whitelist or blacklist) is enforced server-side. If the requested model is not permitted, the gateway returns a 403 error (see Error Responses).
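A rough sketch of the whitelist/blacklist check follows, assuming the policy has the same `mode` and `list` shape as the model-config request body shown later in this reference. The function names are hypothetical, and the server's actual logic may differ (for example in how "auto" is treated).

```python
def is_model_allowed(model: str, mode: str, model_list: list) -> bool:
    """Evaluate a tenant model policy for a single model ID."""
    if mode == "whitelist":
        return model in model_list       # only listed models are permitted
    if mode == "blacklist":
        return model not in model_list   # everything except listed models
    raise ValueError(f"unknown policy mode: {mode!r}")


def policy_denied_body(model: str) -> dict:
    """Build a 403 error body in the documented OpenAI-style error format."""
    return {
        "error": {
            "message": f"Model '{model}' is not allowed by tenant model policy",
            "type": "access_denied",
        }
    }


policy = {"mode": "whitelist", "list": ["gpt-4o", "gpt-4o-mini"]}
allowed = is_model_allowed("gpt-4o", policy["mode"], policy["list"])
```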
Response (non-streaming):
{
"id": "chatcmpl-...",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": { "role": "assistant", "content": "Hello! How can I help?" },
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 9,
"total_tokens": 33
},
"cost_info": {
"actual_cost": 0.0000087,
"baseline_cost": 0.0000087,
"saved": 0,
"input_tokens": 24,
"output_tokens": 9
}
}
When model=auto, an additional auto_routing field is included:
{
"auto_routing": {
"category": "smalltalk_simple",
"confidence": 0.94,
"complexity": "simple",
"cost_tier": "minimal",
"model_id": "gpt-4o-mini",
"override_applied": "",
"analysis_time_ms": 8,
"domain": "general",
"reasoning": "Short greeting, no technical content"
}
}
When a context overflow fallback occurs, an additional context_fallback field is included:
{
"context_fallback": {
"original_model": "claude-sonnet-4-6",
"fallback_model": "claude-opus-4-6",
"reason": "context_overflow"
}
}
Streaming: Set "stream": true to receive Server-Sent Events. The stream follows the standard OpenAI SSE format with data: {...} lines and a final data: [DONE].
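The stream format described above can be consumed with a small line-based parser. The sketch below assumes each chunk carries text under `choices[0].delta.content`, as in the standard OpenAI streaming format; the function names and the sample lines are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the content deltas of a streamed completion."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_text(sample))  # prints Hello
```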
Text embeddings. The tenant's model whitelist/blacklist policy is enforced; requests for disallowed models return 403.
Request:
{
"model": "text-embedding-3-small",
"input": "The quick brown fox"
}
Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [0.0023064255, -0.009327292, ...]
}
],
"model": "text-embedding-3-small",
"usage": { "prompt_tokens": 5, "total_tokens": 5 }
}
List models available to this tenant. Respects the tenant's model access config (whitelist/blacklist).
Auth: Required (Bearer token)
Response:
{
"object": "list",
"data": [
{
"id": "gpt-4o",
"object": "model",
"created": 1715367049,
"owned_by": "openai"
}
]
}
Same as /v1/models, but no authentication is required. Useful for client setup tools that probe available models before auth is configured.
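A setup tool that probes the model list might reduce the response to a plain list of IDs. The sketch below works on an already-decoded response of the shape shown above; the helper name `model_ids` is hypothetical.

```python
def model_ids(response: dict) -> list:
    """Extract model IDs from an OpenAI-style model list response."""
    return [
        entry["id"]
        for entry in response.get("data", [])
        if entry.get("object") == "model" and "id" in entry
    ]


listing = {
    "object": "list",
    "data": [
        {"id": "gpt-4o", "object": "model", "created": 1715367049, "owned_by": "openai"}
    ],
}
print(model_ids(listing))  # prints ['gpt-4o']
```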
Tenant health check. No authentication required.
Response (200 — healthy):
{
"status": "ok",
"tenant": "team-alpha",
"providers": [
{ "name": "OpenAI", "status": "ok", "models": 47 }
],
"timestamp": "2026-04-01T12:00:00.000Z"
}
Response (503 — degraded):
{
"status": "degraded",
"tenant": "team-alpha",
"providers": [
{ "name": "OpenAI", "status": "error", "error": "Connection refused" }
]
}
All gateway errors follow the OpenAI error format:
{
"error": {
"message": "API key expired",
"type": "authentication_error",
"code": "key_expired"
}
}
| HTTP | Type | Code | Cause |
|---|---|---|---|
| 401 | authentication_error | missing_api_key | No Bearer token |
| 401 | authentication_error | invalid_api_key | Key not found |
| 401 | authentication_error | key_disabled | Key disabled |
| 401 | authentication_error | key_expired | Key past expiry date |
| 429 | rate_limit_error | rate_limit_exceeded | Per-tenant rate limit hit |
| 400 | invalid_request_error | — | Malformed request body |
| 403 | access_denied | — | Model '...' is not allowed by tenant model policy |
| 503 | provider_error | — | Upstream provider unavailable |
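On the client side, the error format and status codes listed above can be folded into a single exception type. This is a sketch only: the class and function names are hypothetical, and treating 429 and 503 as retryable is an assumption about sensible client behavior, not something this reference specifies.

```python
from typing import Optional


class GatewayError(Exception):
    """Client-side representation of an OpenAI-style gateway error."""

    def __init__(self, status: int, error_type: str,
                 code: Optional[str], message: str) -> None:
        super().__init__(message)
        self.status = status
        self.error_type = error_type
        self.code = code  # may be None; some error rows define no code


def parse_gateway_error(status: int, body: dict) -> GatewayError:
    """Convert an HTTP status and decoded JSON body into a GatewayError."""
    error = body.get("error", {})
    return GatewayError(
        status,
        error.get("type", "unknown"),
        error.get("code"),
        error.get("message", ""),
    )


def is_retryable(err: GatewayError) -> bool:
    # Assumption: rate limits and upstream outages are transient.
    return err.status in (429, 503)


expired = parse_gateway_error(401, {
    "error": {"message": "API key expired",
              "type": "authentication_error",
              "code": "key_expired"}
})
```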
All admin endpoints require a valid JWT Bearer token. Role requirements are noted per endpoint.
Base URL: https://your-host/api/prism/admin
Authentication: Authorization: Bearer <jwt-token>
Login: POST /api/prism/auth/login → { token: "eyJ..." }
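Once a login response is in hand, attaching the JWT to admin calls is mechanical. The sketch below starts from the decoded login response because the login request body is not described in this reference; the helper name `admin_headers` is hypothetical.

```python
def admin_headers(login_response: dict) -> dict:
    """Build the Authorization header for admin endpoints from a login response."""
    token = login_response.get("token")
    if not token:
        raise ValueError("login response did not contain a token")
    return {"Authorization": f"Bearer {token}"}


# "eyJ..." stands in for a real JWT, exactly as elided in the text above.
headers = admin_headers({"token": "eyJ..."})
```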
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /providers | maintainer+ | List all providers |
| POST | /providers | maintainer+ | Create provider |
| PUT | /providers/:id | maintainer+ | Update provider |
| DELETE | /providers/:id | maintainer+ | Delete provider |
| POST | /providers/:id/check | maintainer+ | Test connection with detailed log |
| POST | /providers/:id/discover | maintainer+ | Discover and save models |
| POST | /providers/:id/chat | maintainer+ | Test chat request (setup wizard) |
| GET | /providers/models/all | maintainer+ | Flat list of all models across providers |
| GET | /providers/models/suggest | maintainer+ | Auto-suggest metadata for a model ID |
| PATCH | /providers/:id/models/:modelId | maintainer+ | Update model metadata |
| POST | /providers/models/reorder-tier | maintainer+ | Bulk reorder priorities within a tier |
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /tenants | maintainer+ | List all tenants |
| POST | /tenants | maintainer+ | Create tenant |
| PUT | /tenants/:id | maintainer+ | Update tenant |
| DELETE | /tenants/:id | maintainer+ | Delete tenant |
| POST | /tenants/:id/rotate-key | maintainer+ | Rotate API key |
| POST | /tenants/:id/set-key | maintainer+ | Set custom API key |
| PUT | /tenants/:id/model-config | maintainer+ | Update model access config |
set-key request body:
{
"apiKey": "omp-custom-key-value",
"keyLifetimeDays": 90
}
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /categories | any user | List all routing categories |
| POST | /categories | maintainer+ | Create category |
| PUT | /categories/:id | maintainer+ | Update category |
| DELETE | /categories/:id | maintainer+ | Delete category |
| POST | /categories/reset-defaults | maintainer+ | Re-seed deleted built-in categories |
| GET | /categories/presets | any user | List available preset profiles |
| POST | /categories/apply-preset | maintainer+ | Apply preset profiles (assigns default models) |
apply-preset request body:
{
"profileIds": ["software_development", "data_operations"],
"providerId": "<optional-provider-id>"
}
apply-preset response:
{
"profiles": ["software_development", "data_operations"],
"categoriesConsidered": 14,
"updated": 11,
"skipped": 3,
"assignments": [
{ "category": "code_generation", "model": "deepseek-coder-v2", "tier": "medium", "score": 91 }
]
}
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /dashboard/summary | any user | KPI summary (cost, tokens, requests, savings) |
| GET | /dashboard/daily | any user | Daily time-series (cost + tokens per day) |
| GET | /dashboard/models | any user | Model usage breakdown |
| GET | /dashboard/categories | any user | Category usage breakdown |
| GET | /dashboard/users | any user | Per-user usage breakdown |
| GET | /dashboard/requests | any user | Paginated request log |
All dashboard endpoints accept ?days=7|30|90 and ?tenantId=<id>. Tenant-viewer and tenant-admin roles are automatically scoped to their assigned tenants.
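The query parameters above can be assembled as in the following sketch. The helper name `dashboard_url` is hypothetical, and rejecting `days` values other than 7, 30, or 90 on the client is an assumption; this reference does not say how the server handles other values.

```python
from typing import Optional
from urllib.parse import urlencode

ADMIN_BASE = "https://your-host/api/prism/admin"
ALLOWED_DAYS = (7, 30, 90)


def dashboard_url(endpoint: str, days: Optional[int] = None,
                  tenant_id: Optional[str] = None,
                  base: str = ADMIN_BASE) -> str:
    """Build a dashboard URL with the optional days and tenantId filters."""
    if days is not None and days not in ALLOWED_DAYS:
        raise ValueError(f"days must be one of {ALLOWED_DAYS}")
    params = {}
    if days is not None:
        params["days"] = days
    if tenant_id is not None:
        params["tenantId"] = tenant_id
    query = urlencode(params)  # also percent-encodes unusual tenant IDs
    url = f"{base}/dashboard/{endpoint}"
    return f"{url}?{query}" if query else url


summary_url = dashboard_url("summary", days=30, tenant_id="abc123")
```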
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /users | admin | List all users |
| POST | /users | admin | Create user |
| PUT | /users/:id | admin | Update user (role, tenants, password, active) |
| DELETE | /users/:id | admin | Delete user (cannot delete self or last admin) |
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /ldap | admin | Get LDAP configuration |
| PUT | /ldap | admin | Update LDAP configuration |
| POST | /ldap/test | admin | Test LDAP connection and group mapping |
| Method | Endpoint | Role | Description |
|---|---|---|---|
| POST | /routing/rule-sets | maintainer+ | Create, update, or delete routing rule sets |
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /system/overview | admin | System overview (pods, counters) |
| GET | /system/log-config | admin | Get file logging configuration |
| PUT | /system/log-config | admin | Update file logging configuration |
| DELETE | /system/pods/:podId | admin | Evict pod |
| Method | Endpoint | Role | Description |
|---|---|---|---|
| GET | /tokenize | any user | Estimate tokens for a text string (?text=...) |
| POST | /tokenize | any user | Estimate tokens for a messages array |
POST body:
{
"messages": [
{ "role": "user", "content": "Hello, how are you?" }
]
}
Response:
{
"estimated_tokens": 12,
"method": "heuristic",
"chars": 20
}
Self-service API for the tenant-admin role. Also accessible to the admin and maintainer roles.
Base URL: https://your-host/api/tenant-portal
| Method | Endpoint | Description |
|---|---|---|
| GET | /mine | List own tenants |
| GET | /:id | Get tenant config |
| PUT | /:id/model-config | Update model access (mode + list) |
| GET | /:id/models | List accessible models |
model-config request body:
{
"mode": "whitelist",
"list": ["gpt-4o", "gpt-4o-mini", "claude-sonnet-4-6"]
}
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /health | None | Server health + DB status |
| GET | /metrics | None | Prometheus metrics |
| GET | /api/prism/setup/status | None | Whether first-run setup is complete |
| POST | /api/prism/setup/admin | None | Create initial admin account (setup only) |
| POST | /api/prism/setup/complete | JWT | Mark setup as complete |
| POST | /api/prism/auth/login | None | Login → JWT |
| GET | /api/prism/auth/me | JWT | Current user info |