Open Model Prism

Idea & Architecture by weisser-dev — Developed 100% by Claude Sonnet & Opus

A multi-tenant, OpenAI-API-compatible LLM gateway with intelligent model routing, cost tracking, and a full admin UI.

Connect any LLM provider (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Ollama, vLLM, OpenRouter, and more) through a single unified endpoint. Automatically classify incoming requests and route them to the optimal model based on task type, content signals, cost tier, and capability benchmarks.

Key Features

Multi-provider gateway — OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Ollama, vLLM, OpenRouter, and more through a single OpenAI-compatible API
Intelligent auto-routing — LLM classifier + keyword rules + system prompt role detection route each request to the optimal model and tier
Four-step force-route strictness (v2.1) — off / fim_only / smart / all. The smart level keeps the user's deliberate model choice whenever the classified category is substantial (coding, reasoning, system design, analysis …) and re-routes only trivial categories such as smalltalk or chat title generation. A configurable per-request opt-out token (--model-prism-accept-model by default) lets senior developers bypass force-routing for a single prompt
Category-specific target system prompts (v2.1.1) — Each routing category can carry an optional system prompt that is injected into the forwarded request, priming the target model for the specific task type (e.g. coding categories → "You are an expert software engineer…"; legal → "You are a careful legal analyst…"). Configurable per category in the admin UI; 10 built-in categories ship with sensible defaults
8-tier cost hierarchy — micro, minimal, low, medium, advanced, high, ultra, critical — with configurable cost mode (economy/balanced/quality) and explicit tier boost
Test Route — dry-run any prompt through the full routing pipeline and see a step-by-step trace of every decision (signal extraction, rule matching, classifier output, overrides, model selection)
Synthetic Tests (AI) — generate test prompts using AI, run them against the current routing config, and evaluate results with AI-powered analysis including quality and cost optimization suggestions
Routing Debug Panel — every auto-routed request in the log shows an expandable debug view with extracted signals, pre-routing status, classifier confidence, applied overrides, and final model selection
Configurable override rules — vision upgrade, tool call minimum tier, frustration detection, conversation turn escalation, domain gate, confidence fallback — each with description tooltips
Multi-tenant isolation — per-tenant API keys, model whitelists, rate limits, quotas, and independent routing configuration
Cost tracking & savings — real-time dashboard with spending, savings vs baseline, model distribution, and daily trends
Guided tour — interactive walkthrough highlighting key UI areas including the new routing debug and synthetic test features
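
The force-route strictness levels in the feature list can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the project's implementation: `Strictness`, `should_force_route`, `TRIVIAL_CATEGORIES`, and the category strings are hypothetical names, and reading fim_only as "re-route fill-in-the-middle requests only" is an assumption. How the smart level treats fill-in-the-middle requests is not stated above, so the sketch leaves that out.

```python
from enum import Enum

# Hypothetical category names; the real list lives in the routing config.
TRIVIAL_CATEGORIES = {"smalltalk", "chat_title_generation"}

# Default opt-out token named in the feature list.
OPT_OUT_TOKEN = "--model-prism-accept-model"


class Strictness(Enum):
    OFF = "off"
    FIM_ONLY = "fim_only"
    SMART = "smart"
    ALL = "all"


def should_force_route(strictness, category, prompt, is_fim=False):
    """Return True when the client's explicit model choice would be overridden."""
    # A per-request opt-out always wins, whatever the strictness level.
    if OPT_OUT_TOKEN in prompt:
        return False
    if strictness is Strictness.OFF:
        return False
    if strictness is Strictness.FIM_ONLY:
        # Assumption: only fill-in-the-middle requests are re-routed.
        return is_fim
    if strictness is Strictness.SMART:
        # Substantial categories keep the user's model; trivial ones are re-routed.
        return category in TRIVIAL_CATEGORIES
    return True  # Strictness.ALL
```

With these definitions, a coding prompt under smart keeps the requested model, while a smalltalk prompt is re-routed unless it contains the opt-out token.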

Requirements

Dependency	Version	Notes
Node.js	22+	Backend + frontend build
Docker	24+	MongoDB in dev, full stack in prod
Docker Compose	v2 (`docker compose`)	Not `docker-compose` v1
just	any	Optional but recommended command runner
MongoDB	7	Provided via Docker Compose

Install just on macOS:

brew install just

On Linux:

curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to /usr/local/bin

Local Development

1. Clone and enter the repo

git clone https://github.com/weisser-dev/open-model-prism
cd open-model-prism

2. Copy environment file

cp .env.example .env

Edit .env — at minimum set JWT_SECRET and ENCRYPTION_KEY:

JWT_SECRET=your-random-secret-at-least-32-chars
ENCRYPTION_KEY=0123456789abcdef0123456789abcdef

Generate secure values:

# JWT_SECRET
openssl rand -hex 32

# ENCRYPTION_KEY (must be exactly 32 characters)
openssl rand -hex 16
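
If openssl is not available, the same values can be produced with Python's standard library. This is a small sketch, not part of the project; the function name `generate_secrets` is hypothetical. `secrets.token_hex(n)` returns 2n hexadecimal characters, so 16 bytes yield the exactly-32-character ENCRYPTION_KEY required above.

```python
import secrets


def generate_secrets():
    """Return fresh values for the two required settings."""
    return {
        "JWT_SECRET": secrets.token_hex(32),      # 64 hex characters
        "ENCRYPTION_KEY": secrets.token_hex(16),  # exactly 32 hex characters
    }


if __name__ == "__main__":
    # Print lines that can be pasted into .env.
    for name, value in generate_secrets().items():
        print(f"{name}={value}")
```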

3. Start everything

just dev

This single command:

Starts MongoDB in Docker
Waits for MongoDB to be healthy
Starts the backend on http://localhost:3000 (hot reload via nodemon)
Starts the frontend on http://localhost:5173 (hot reload via Vite HMR)

Open http://localhost:5173 and follow the 4-step setup wizard.

The setup wizard creates your first admin user, connects your first provider, and sets up your first tenant. It takes about 2 minutes.

4. Available dev commands

just dev          # Start MongoDB + backend + frontend with hot reload
just dev-clean    # Wipe DB, fresh start (setup wizard reappears)
just logs         # Tail backend logs
just mongo        # Open MongoDB shell
just build        # Build production Docker image
just up           # Start via Docker Compose (production-like)
just down         # Stop Docker Compose
just clean        # Remove all containers, volumes, node_modules

5. Manual setup (without just)

# Terminal 1 — MongoDB
docker compose up -d mongodb

# Terminal 2 — Backend
cd server && npm install && npm run dev

# Terminal 3 — Frontend
cd frontend && npm install && npm run dev

Environment Variables

All configuration after initial setup lives in the admin UI. Only infrastructure-level settings need env vars:

Variable	Required	Default	Description
`JWT_SECRET`	yes	—	JWT signing secret. Min 32 chars. Rotating it invalidates all active sessions.
`ENCRYPTION_KEY`	yes	—	AES-256-GCM key for encrypting provider API keys at rest. Exactly 32 chars. Changing this breaks all saved provider credentials.
`MONGODB_URI`	no	`mongodb://mongodb:27017/openmodelprism`	MongoDB connection string
`PORT`	no	`3000`	Backend HTTP port
`NODE_ENV`	no	`development`	`production` enables structured JSON logs
`NODE_ROLE`	no	`full`	`full` / `control` / `worker` — see Horizontal Scaling
`CORS_ORIGINS`	no	`*`	Comma-separated allowed origins
`LOG_LEVEL`	no	`info`	`debug` / `info` / `warn` / `error`
`OFFLINE`	no	`false`	`true` disables all outbound internet (air-gapped mode)
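
The constraints in the table can be checked before the first start. The following is a hedged sketch of such a check, assuming the settings have already been read into a dictionary; `validate_env` is a hypothetical helper and not something the project ships. It only encodes rules stated in the table (lengths, allowed values, defaults).

```python
def validate_env(env):
    """Return a list of problems found in the infrastructure-level settings."""
    problems = []

    if len(env.get("JWT_SECRET", "")) < 32:
        problems.append("JWT_SECRET must be at least 32 characters")

    if len(env.get("ENCRYPTION_KEY", "")) != 32:
        problems.append("ENCRYPTION_KEY must be exactly 32 characters")

    # Optional settings fall back to the defaults listed in the table.
    if env.get("NODE_ROLE", "full") not in {"full", "control", "worker"}:
        problems.append("NODE_ROLE must be full, control, or worker")

    if env.get("LOG_LEVEL", "info") not in {"debug", "info", "warn", "error"}:
        problems.append("LOG_LEVEL must be debug, info, warn, or error")

    port = env.get("PORT", "3000")
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        problems.append("PORT must be a number between 1 and 65535")

    return problems
```

Calling `validate_env(dict(os.environ))` after loading .env returns an empty list when all rules hold.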

Architecture

Single-pod (default) — everything in one container:

Clients  →  Model Prism (NODE_ROLE=full)  →  MongoDB
                 Admin UI + Gateway + API

Scaled deployment — control plane + worker pods:

Clients (Continue · Cursor · Claude Code · Open WebUI · SDK …)
  │
  ├─ /api/:tenant/v1/*  ───────────────────► Worker pods  (NODE_ROLE=worker, scale freely)
  ├─ /api/v1/*  (default tenant shorthand) ┘
  ├─ /v1/*      (no-prefix shorthand)      ┘
  │
  └─ /* (admin UI, /api/prism/admin/*, auth) ────► Control plane  (NODE_ROLE=control, 1–2 pods)

All pods share one MongoDB — config changes propagate via Change Streams (<500ms)
or 15s polling fallback on standalone MongoDB.
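
The path split in the scaled deployment diagram can be expressed as a routing function for an ingress or load balancer. This is an illustrative sketch only: `target_role` is a hypothetical name, and the prefix matching is an assumption derived from the three gateway path shapes shown above; a real ingress configuration may match differently.

```python
def target_role(path):
    """Map a request path to the pod role that serves it in the scaled layout."""
    # Admin API stays on the control plane even though it lives under /api/.
    if path.startswith("/api/prism/admin/"):
        return "control"

    # No-prefix shorthand: /v1/*
    if path.startswith("/v1/"):
        return "worker"

    parts = path.split("/")  # "/api/x/v1/..." -> ["", "api", "x", "v1", ...]
    if len(parts) > 2 and parts[1] == "api":
        # Default tenant shorthand: /api/v1/*
        if parts[2] == "v1":
            return "worker"
        # Tenant-scoped gateway: /api/:tenant/v1/*
        if len(parts) > 3 and parts[3] == "v1":
            return "worker"

    # Everything else (admin UI, auth) goes to the control plane.
    return "control"
```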

Gateway request pipeline:

POST /api/{tenant}/v1/chat/completions
  ├─ Tenant Auth (API key → SHA-256 hash, expiry, enabled check)
  ├─ Per-Tenant Rate Limiting (sliding window, in-memory)
  ├─ Model Policy (whitelist/blacklist gate — auto-prism always passes)
  ├─ Signal Extraction (token count, keywords, system prompt, code language)
  ├─ Override Rules (vision, domain gate, security escalation, budget cap, …)
  ├─ [LLM Classifier — only called when pre-routing confidence < threshold]
  ├─ Model Selection (category → benchmark-weighted price-performance matching)
  ├─ Context Pre-flight (token estimate vs context window → auto-upgrade)
  ├─ max_tokens Clamping (auto-clamp to model output limit)
  ├─ Budget Guard (auto-economy mode when threshold reached)
  ├─ Provider Adapter (OpenAI-compat · Bedrock · Azure · Ollama)
  ├─ Response Enrichment (cost_info, auto_routing, context_fallback)
  └─ Async Analytics (RequestLog, DailyStat — fire-and-forget)
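
The "Per-Tenant Rate Limiting (sliding window, in-memory)" step can be illustrated with a short example. This is a generic sliding-window sketch and not the gateway's actual code: the class name `SlidingWindowLimiter`, its parameters, and the injectable clock are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per tenant within any `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # tenant id -> timestamps of accepted requests

    def allow(self, tenant_id):
        now = self.clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because the state lives in process memory, each pod counts independently in this sketch; with several worker pods the effective limit would be per pod rather than global.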

Gateway Usage

# Standard OpenAI-compatible request
curl https://your-host/api/my-team/v1/chat/completions \
  -H "Authorization: Bearer omp-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# Auto-routing — Model Prism classifies and routes to the optimal model
curl ... -d '{"model": "auto", "messages": [...]}'

Auto-routing responses include:

{
  "auto_routing": { "category": "code_generation", "model_id": "deepseek-coder-v2", "confidence": 0.91 },
  "cost_info": { "actual_cost": 0.0004, "baseline_cost": 0.0031, "saved": 0.0027 }
}
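
A client can read these extra fields alongside the normal completion. The sketch below uses only the Python standard library; `build_request` and `summarize_routing` are hypothetical helper names, and the host, tenant, and key are the placeholders from the curl example. The field names are taken from the sample response above.

```python
import json
import urllib.request


def build_request(base_url, tenant, api_key, prompt, model="auto"):
    """Build a chat completion request for the tenant-scoped gateway endpoint."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/{tenant}/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def summarize_routing(response):
    """Extract the routing decision and relative savings from a parsed response."""
    routing = response.get("auto_routing", {})
    cost = response.get("cost_info", {})
    saved = cost.get("saved")
    baseline = cost.get("baseline_cost")
    # Savings as a percentage of the baseline cost, when both values are present.
    percent = round(100 * saved / baseline, 1) if saved is not None and baseline else None
    return {
        "model": routing.get("model_id"),
        "category": routing.get("category"),
        "saved_percent": percent,
    }
```

To send the request, pass the result of `build_request` to `urllib.request.urlopen` and feed the decoded JSON body to `summarize_routing`.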

Client tool integration — use the per-tenant Generate Config button in the admin UI for ready-to-paste snippets for Continue, OpenCode, Cursor, Claude Code, Open WebUI, Python/Node.js SDKs.

Production Deployment

Docker Compose (single pod)

git clone https://github.com/weisser-dev/open-model-prism
cd open-model-prism

# Create .env with real secrets
cat > .env <<EOF
JWT_SECRET=$(openssl rand -hex 32)
ENCRYPTION_KEY=$(openssl rand -hex 16)
NODE_ENV=production
EOF

docker compose up -d

The app is now running on port 3000. Put a reverse proxy (Caddy, nginx, Traefik) in front.

Caddy example:

prism.yourdomain.com {
    reverse_proxy localhost:3000
}

Kubernetes (Helm)

A Helm chart is included in the helm/ directory:

helm install model-prism ./helm \
  --namespace model-prism \
  --create-namespace

kubectl port-forward svc/model-prism 3000:80 -n model-prism
# → http://localhost:3000 — setup wizard

See helm/README.md for the full configuration reference.

First-run checklist

JWT_SECRET set to a random 32+ char string
ENCRYPTION_KEY set to exactly 32 characters — write this down, cannot be changed later without re-entering all provider API keys
MongoDB data directory persisted (bind mount or named volume)
Reverse proxy configured with TLS
Setup wizard completed (admin user, first provider, first tenant)

Operations & Maintenance

Horizontal Scaling

# Scale worker pods (gateway only)
docker compose -f docker-compose.yml -f docker-compose.scaled.yml up --scale worker=3 -d

Worker pods (NODE_ROLE=worker) handle gateway traffic. The control pod (NODE_ROLE=control) runs the admin UI and analytics. All pods share one MongoDB.

Logs

# Docker
docker logs open-model-prism -f

# Just
just logs

# Adjust log level without restart — via admin UI: System → Log Level

MongoDB Backup & Restore

Manual backup:

docker exec open-model-prism-mongodb \
  sh -lc 'mongodump --archive --gzip' > backup_$(date +%Y%m%d).archive.gz

Restore:

cat backup_20260101.archive.gz | docker exec -i open-model-prism-mongodb \
  sh -lc 'mongorestore --archive --gzip --drop'

Upgrading

git pull
docker compose build --no-cache
docker compose up -d
docker image prune -f

The GitHub Actions workflow does this automatically on push to main.

Rotating JWT_SECRET

All active admin sessions will be invalidated immediately. Provider API keys are not affected.

Stop the app
Update JWT_SECRET in .env
Start the app — all users must log in again

Rotating ENCRYPTION_KEY

⚠️ This is destructive. All saved provider API keys will become undecryptable.

Export all provider configs before rotating
Update ENCRYPTION_KEY
Restart — re-enter all provider API keys via the admin UI

Troubleshooting

Setup wizard doesn't appear → The app detects an existing admin user. If starting fresh, run just dev-clean to wipe the DB.

Provider connection fails ("Could not reach provider") → Use the "Check Connection" button — it runs a detailed probe and shows exactly which URL/path failed. Common causes: wrong base URL (should not include /v1), wrong API key, SSL certificate issues (enable "Skip SSL" for self-signed certs).

"Cannot read properties of undefined (reading 'toLowerCase')" on Models page → A provider is missing the providerName field. Re-save the provider in the admin UI to trigger model re-discovery.

Charts show no data on dashboard → Analytics are written asynchronously. Wait for a few requests to complete, then refresh. If the charts are still empty, set LOG_LEVEL=debug and check the backend logs for analytics errors.

Context overflow / model escalation not working → Ensure the provider's models have contextWindow values in the model registry. Models with no context window set will not trigger auto-upgrade.

LDAP login fails → Check the LDAP config in Settings → LDAP. The bindDN user needs read access to the user search base. Enable debug log level to see the full LDAP bind attempt.

Documentation

Document	Contents
docs/getting-started.md	Installation, setup wizard, first tenant, production deployment
docs/architecture.md	System overview, request flow, directory structure, security, RBAC
docs/routing.md	Full routing pipeline: signal extraction, override rules, LLM classifier, categories, presets, model selection
docs/providers.md	Provider types, connection testing, model discovery, adapters, model registry
docs/tenants.md	Tenant config, API key lifecycle, model access control, routing config, self-service portal
docs/analytics.md	Cost tracking, token stats, dashboard metrics, Prometheus
docs/api-reference.md	All gateway and admin API endpoints with request/response examples
docs/operations.md	Production deployment: scaling, capacity planning, nginx config, security hardening, backup, upgrades
CHANGELOG.md	Version history

License

Licensed under the Apache License 2.0.

See LICENSE for details.
