Self-Hosted LLM Infrastructure

Multi-tenant, OpenAI-compatible LLM gateway. Route requests to cloud GPU pods (RunPod, Vast.ai, Lambda Labs), a local GPU, or commercial APIs — all behind one API endpoint.

Use cases:

Self-host large models (7B–70B) on rented GPUs; pay only when running
Drop-in replacement for OpenAI API — works with Open WebUI, Cursor, any OpenAI client
Mix providers: local GPU for dev, cloud GPU for heavy loads, OpenAI as fallback
Per-user billing, quotas, and tier restrictions for team deployments

Index

Quickstart
Endpoints
Tiers & Models
Providers — RunPod, local GPU, OpenAI, Groq, DeepSeek, etc.
Claude Code Router — use Claude Code with this bridge via ccr
CLI Reference — user management, API keys, billing, observability
Embeddings
Preprocessing Pipeline
Development & Architecture
Deployment Modes — solo, hosted multi-user, gateway client, full self-hosted
Workspace Tools — Open WebUI file I/O + code execution tool; includes recommended system prompt
Website & Wiki Deployment — GitHub Pages, Netlify, Vercel, GitHub Wiki

Quickstart

# 1. Configure (run from the root of your cloned checkout)
cp .env.example .env
$EDITOR .env   # set provider keys, DB password, JWT secret

# 2. One-shot setup (builds images, starts stack, runs migrations, bootstraps admin)
bash scripts/setup.sh

# 3. Smoke test
curl -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"model":"llm-simple","messages":[{"role":"user","content":"hello"}]}' \
     http://localhost:8000/v1/chat/completions
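
For scripted checks, the same smoke test can be sketched in Python with only the standard library. This is a minimal sketch, not part of the project: the helper names (`build_chat_request`, `send`, `main`) are hypothetical, and it assumes the bridge returns the standard OpenAI chat-completion response shape.

```python
import json
import os
import urllib.request

# Bridge API URL used by the curl smoke test above.
BRIDGE_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str,
                       url: str = BRIDGE_URL) -> urllib.request.Request:
    """Build the same POST request that the curl smoke test sends."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def main() -> None:
    request = build_chat_request(os.environ["API_KEY"], "llm-simple", "hello")
    reply = send(request)
    # Assumption: the reply follows the OpenAI chat-completion schema.
    print(reply["choices"][0]["message"]["content"])
```

With the stack running and `API_KEY` exported, calling `main()` should print the model's reply; any HTTP error surfaces as a `urllib.error.HTTPError`.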

Endpoints

Service	Port	URL
Bridge API	8000	http://localhost:8000
Dashboard	8501	http://localhost:8501
Open WebUI	3000	http://localhost:3000
CCR (claude-code-router)	3456	http://localhost:3456
Gateway (optional)	8080	http://localhost:8080
Postgres	5432	(internal)
Redis	6379	(internal)
Ollama	11434	(internal)
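
To see which of the host-exposed services are actually up after setup, a small port probe can help. The sketch below is an illustration only: `SERVICES`, `is_listening`, and `report` are hypothetical names, and the ports are copied from the table above. Postgres, Redis, and Ollama are listed as internal, so they are left out.

```python
import socket

# Host-exposed services and ports, copied from the Endpoints table.
SERVICES = {
    "Bridge API": 8000,
    "Dashboard": 8501,
    "Open WebUI": 3000,
    "CCR (claude-code-router)": 3456,
    "Gateway (optional)": 8080,
}


def is_listening(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def report(services: dict = SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in services.items()}
```

A successful TCP connect only shows that something is listening on the port; it does not confirm that the service behind it is healthy.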

Basic Usage

Send requests using the OpenAI API format:

# Use a specific tier
curl -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"model":"llm-architecture","messages":[{"role":"user","content":"review this code"}]}' \
     http://localhost:8000/v1/chat/completions

# Let the router pick (based on complexity)
curl ... -d '{"model":"llm-auto","messages":[...]}'

# Force a tier per-request
curl -H "X-Tier: simple" ...

Available models: llm-simple, llm-mid, llm-architecture, llm-maximum, llm-ultra, llm-vision, llm-auto

See Tiers & Models for full details and tier locking options.
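
The three ways of choosing a tier shown above (a named model, `llm-auto`, or the `X-Tier` header) can be wrapped in one helper. This is a sketch under assumptions: `chat_request_parts` is a hypothetical name, and only the `X-Tier: simple` value appears in this README, so any other header value is a guess to confirm in Tiers & Models.

```python
import json

# Model aliases listed under "Available models".
MODELS = (
    "llm-simple", "llm-mid", "llm-architecture", "llm-maximum",
    "llm-ultra", "llm-vision", "llm-auto",
)


def chat_request_parts(api_key, prompt, model="llm-auto", force_tier=None):
    """Return (headers, body_bytes) for POST /v1/chat/completions."""
    if model not in MODELS:
        raise ValueError(f"unknown model alias: {model!r}")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if force_tier is not None:
        # Only the value "simple" is documented here; others are assumptions.
        headers["X-Tier"] = force_tier
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body.encode("utf-8")
```

Defaulting to `llm-auto` leaves the choice to the router, while `force_tier="simple"` mirrors the `X-Tier: simple` curl example.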

Deployment Modes

Mode	Bridge	Gateway	Notes
Solo	Local	—	`OPENWEBUI_BRIDGE_API_KEY` = your key
Hosted multi-user	Shared server	—	Per-user keys via gpu-relay Pipeline
Gateway client	Remote (host's)	Local :8080	Point Open WebUI at gateway
Full self-hosted	Own server	Optional	Complete control

Gateway (local proxy to remote bridge)

# Standalone — no main stack required
GATEWAY_BRIDGE_URL=https://your-bridge.example.com docker compose -f docker/docker-compose.gateway.yml up -d

Point any OpenAI client at http://localhost:8080 with your sk-llm- key. The gateway forwards requests to the upstream bridge with auth unchanged.
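
Because the gateway forwards auth unchanged, switching a client between the bridge and the gateway should only require changing the base URL. The sketch below illustrates that idea; the helper names are hypothetical, and it assumes the gateway exposes the same `/v1/chat/completions` path as the bridge.

```python
from urllib.parse import urljoin

BRIDGE_BASE = "http://localhost:8000"   # direct Bridge API (Endpoints table)
GATEWAY_BASE = "http://localhost:8080"  # local gateway proxy


def chat_completions_url(base_url: str) -> str:
    """Build the chat-completions URL for a given base URL."""
    # Assumption: the gateway mirrors the bridge's /v1/chat/completions path.
    return urljoin(base_url.rstrip("/") + "/", "v1/chat/completions")


def has_bridge_key_prefix(api_key: str) -> bool:
    """Check that a key carries the sk-llm- prefix mentioned above."""
    return api_key.startswith("sk-llm-")
```

The prefix check is a cheap guard against sending a commercial provider key to the gateway by mistake; it does not validate the key itself.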

Hosted multi-user (admin)

# .env — configure sync
OPENWEBUI_ADMIN_EMAIL=admin@example.com
OPENWEBUI_ADMIN_PASSWORD=<password>
PIPELINES_API_KEY=<from Open WebUI Admin → Pipelines>

# Create the user with an Open WebUI account, then add an API key with its pipeline mapping
llmctl users add alice@example.com --sync-openwebui
llmctl users keys-add alice@example.com --sync-pipeline
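
When onboarding several users at once, the two `llmctl` commands above can be driven from a short script. This is a sketch, not a project tool: `onboarding_commands` and `onboard` are hypothetical helpers, and it uses only the two invocations shown above. It defaults to a dry run so nothing is executed until you opt in.

```python
import subprocess


def onboarding_commands(email: str) -> list:
    """Return the two llmctl invocations from the README for one user."""
    return [
        ["llmctl", "users", "add", email, "--sync-openwebui"],
        ["llmctl", "users", "keys-add", email, "--sync-pipeline"],
    ]


def onboard(emails, dry_run: bool = True) -> list:
    """Run (or, in dry-run mode, just collect) the commands for each user."""
    planned = []
    for email in emails:
        for command in onboarding_commands(email):
            if not dry_run:
                # check=True stops at the first failing command.
                subprocess.run(command, check=True)
            planned.append(command)
    return planned
```

Calling `onboard(["alice@example.com"])` returns the planned commands for review; pass `dry_run=False` to execute them.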

Testing Without a GPU

MOCK_PROVIDERS=1 ./scripts/smoke_test.sh

With MOCK_PROVIDERS=1, all requests are routed to the local Ollama service, so no cloud account or billing is involved. See Development & Architecture.
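
To run the mocked smoke test from a Python test runner or CI job, the environment variable can be set for the child process only. A minimal sketch follows, with hypothetical helper names; it assumes it is run from the repository root so the script path resolves.

```python
import os
import subprocess


def mock_env(base=None) -> dict:
    """Copy an environment mapping and enable mock providers in the copy."""
    env = dict(os.environ if base is None else base)
    env["MOCK_PROVIDERS"] = "1"
    return env


def run_smoke_test(script: str = "./scripts/smoke_test.sh") -> int:
    """Run the smoke test with MOCK_PROVIDERS=1 and return its exit code."""
    completed = subprocess.run([script], env=mock_env(), check=False)
    return completed.returncode
```

Building a copy of the environment keeps `MOCK_PROVIDERS` from leaking into the parent process, so later steps in the same job still see the real provider configuration.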
