An open-source AI Gateway built for stability, extensibility, and operability.
Multi-provider / multi-model access with first-class observability, dynamic configuration, and graceful operations.
TiyGate is an independent AI Gateway product written in Rust. It sits between your applications and upstream LLM providers (OpenAI, Anthropic, Bedrock, and any OpenAI-compatible service) and gives you a single, stable control point for routing, observability, and policy.
The two things it does best:
- Multi-backend / multi-model access: one canonical entry, many providers. Cross-protocol translation (e.g. OpenAI chat_completions → Anthropic messages) is a first-class capability, not a hack.
- Logs and analytics: every request is captured, structured, and routed to a hot-path-safe async pipeline. No blocking the request path. No silent drops.
Most gateways optimize for one dimension. TiyGate is engineered to hold three at once.
| Quality goal | What carries it |
|---|---|
| Stability | Per-instance circuit breaker + fine-grained FallbackPolicy (error classification, retry vs. failover separated, global attempt/time budget, idempotency gate), respect for upstream Retry-After, ingress body/slow-read/concurrency limits, SIGTERM graceful drain, telemetry off the hot path |
| Extensibility | Trait + inventory decentralized registration (adding a provider = new file + one submit!); hook pipeline; Executor escape hatch for SDK-style providers; three-segment protocol identity; pluggable strategies, cache, and log sinks |
| Maintainability | core has zero dependencies on concrete providers/protocols/DB; canonical IR collapses N×N protocol translation to N; field-level capability matrix makes lossiness explicit; heavy dependencies isolated in dedicated crates |
The field-level lossiness matrix used by lossy_default_reject lives in docs/protocol-capability-matrix.md.
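To make the "N×N collapses to N" claim concrete, here is a minimal sketch of the idea. Every name in it (`CanonicalRequest`, `Codec`, `ChatWire`, `MessagesWire`, `translate`) is hypothetical and heavily simplified; it is not TiyGate's actual IR or trait surface. Each protocol gets one codec to and from a shared shape, and any pair is translated by composing two codecs.

```rust
// Hypothetical sketch: names and fields are illustrative, not TiyGate's real API.

/// Protocol-neutral request shape that every codec converts to and from.
#[derive(Debug, Clone, PartialEq)]
pub struct CanonicalRequest {
    pub model: String,
    pub system: Option<String>,
    pub user_messages: Vec<String>,
}

/// One codec per protocol: N protocols need N codecs, not N×N translators.
pub trait Codec {
    type Wire;
    fn decode(&self, wire: &Self::Wire) -> Result<CanonicalRequest, String>;
    fn encode(&self, req: &CanonicalRequest) -> Result<Self::Wire, String>;
}

/// Toy chat_completions-style shape: the system prompt is just another message.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatWire {
    pub model: String,
    pub messages: Vec<(String, String)>, // (role, content)
}

/// Toy messages-style shape: the system prompt is a top-level field.
#[derive(Debug, Clone, PartialEq)]
pub struct MessagesWire {
    pub model: String,
    pub system: Option<String>,
    pub messages: Vec<String>,
}

pub struct ChatCodec;
pub struct MessagesCodec;

impl Codec for ChatCodec {
    type Wire = ChatWire;
    fn decode(&self, wire: &ChatWire) -> Result<CanonicalRequest, String> {
        let mut system = None;
        let mut user_messages = Vec::new();
        for (role, content) in &wire.messages {
            match role.as_str() {
                "system" => system = Some(content.clone()),
                "user" => user_messages.push(content.clone()),
                // Reject what the toy IR cannot represent instead of dropping it.
                other => return Err(format!("unsupported role: {other}")),
            }
        }
        Ok(CanonicalRequest { model: wire.model.clone(), system, user_messages })
    }
    fn encode(&self, req: &CanonicalRequest) -> Result<ChatWire, String> {
        let mut messages = Vec::new();
        if let Some(s) = &req.system {
            messages.push(("system".to_string(), s.clone()));
        }
        for m in &req.user_messages {
            messages.push(("user".to_string(), m.clone()));
        }
        Ok(ChatWire { model: req.model.clone(), messages })
    }
}

impl Codec for MessagesCodec {
    type Wire = MessagesWire;
    fn decode(&self, wire: &MessagesWire) -> Result<CanonicalRequest, String> {
        Ok(CanonicalRequest {
            model: wire.model.clone(),
            system: wire.system.clone(),
            user_messages: wire.messages.clone(),
        })
    }
    fn encode(&self, req: &CanonicalRequest) -> Result<MessagesWire, String> {
        Ok(MessagesWire {
            model: req.model.clone(),
            system: req.system.clone(),
            messages: req.user_messages.clone(),
        })
    }
}

/// Any-to-any translation is decode-then-encode through the canonical shape.
pub fn translate<A: Codec, B: Codec>(from: &A, to: &B, wire: &A::Wire) -> Result<B::Wire, String> {
    to.encode(&from.decode(wire)?)
}
```

In this sketch, adding a third protocol means writing one more `Codec` implementation; `translate` then covers every pairing with the existing ones. Returning an error for an unrepresentable role loosely mirrors the reject-by-default stance on lossy conversions, though the real capability matrix works at field level.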
tiygate/
├── crates/
│ ├── core/ # Canonical IR, traits, pipeline. Zero I/O, zero concrete deps.
│ ├── protocols/ # Protocol codecs (chat_completions, messages, responses, gemini, embeddings)
│ ├── providers/ # Built-in provider metadata + auth
│ ├── provider-bedrock/ # SDK-shape provider (Executor escape hatch), heavy deps isolated
│ ├── store/ # Config OLTP (SQLite/Postgres) + pluggable log sinks
│ ├── cache/ # Embedding cache (deterministic, LLM chat/completion are NOT cached)
│ ├── admin/ # Admin REST API + OAuth flows
│ └── server/ # Ingress, data/control plane assembly, deployment modes
├── webui/ # Embedded admin console (React + TS + Vite, served at /admin/ui)
├── docs/ # Architecture design + protocol capability matrix
└── scripts/ # Operational scripts
- Rust 1.88+ (rustup update stable)
- Node.js 20+ (for building the embedded WebUI)
- No upstream provider key needed to start — providers are configured in the Admin Console after launch
git clone https://github.com/tiylabs/tiygate.git
cd tiygate

Configure environment variables by copying the template, then fill in the required values:

cp .env.example .env

Edit .env. These are the variables that matter for a working WebUI (the first two are required, the third is optional but recommended):
# SQLite is the easiest local backend (file is created on first run)
TIYGATE_DATABASE_URL=sqlite://./tiygate.db?mode=rwc
# Admin API token — the WebUI login screen asks for this exact value
TIYGATE_ADMIN_TOKEN=dev-admin-token-change-me
# (Optional but recommended) AES-GCM master key to encrypt provider keys
# / OAuth tokens / S3 credentials at rest. See the Security section below.
# TIYGATE_MASTER_KEY=4f1a2b3c4d5e6f708192a3b4c5d6e7f8091a2b3c4d5e6f708192a3b4c5d6e7f8

Everything else (listen address, deployment mode, logging level) is covered in .env.example. Runtime-tunable parameters (routing strategy, ingress limits, upstream streaming timeouts, connection-pool tuning, header-forwarding deny-lists, payload-archive to S3, background-task intervals, etc.) are managed through the Admin Console at /admin/ui/settings. On first start the env values are seeded into the settings table as initial defaults; after that the settings table is the single source of truth and changes apply without a restart. The server loads .env automatically at startup when the dotenv feature is on.
Start the gateway with the embedded WebUI:
make dev

make dev builds the frontend first (so rust-embed can embed it), then runs the server with the webui feature. The default listen address is 0.0.0.0:3000.
Once the server is running, open http://localhost:3000/admin/ui in your browser. Paste your TIYGATE_ADMIN_TOKEN on the login screen to enter the console. From there you can manage providers, routes, API keys, runtime settings, and view analytics.
curl -sS http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hi in one short sentence."}]
  }'

For streaming, add "stream": true. The server speaks Server-Sent Events end-to-end.
The same gateway will accept chat_completions and translate it to messages (Anthropic) when you route to that provider — the field-level capability matrix decides what's lossless and rejects combinations that aren't.
The tiygate binary supports three modes (selected via --mode / env / config):
| Mode | What it runs | When to use |
|---|---|---|
| all | Data plane + control plane + DB in one process | Local dev, single-node, small teams |
| proxy | Data plane only (stateless, horizontally scalable) | Production data plane |
| admin | Control plane only (Admin API + WebUI) | Production control plane |
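As a rough illustration of the table above, the sketch below maps a mode string to the planes it runs. The `Mode` enum, its methods, and `resolve_mode` are hypothetical names, and the trimming and case-insensitive matching are assumptions; the precedence between --mode, env, and config is not modeled here.

```rust
use std::str::FromStr;

/// Hypothetical representation of the three deployment modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    All,
    Proxy,
    Admin,
}

impl Mode {
    /// `all` and `proxy` serve request traffic.
    pub fn runs_data_plane(self) -> bool {
        matches!(self, Mode::All | Mode::Proxy)
    }

    /// `all` and `admin` serve the Admin API and WebUI.
    pub fn runs_control_plane(self) -> bool {
        matches!(self, Mode::All | Mode::Admin)
    }
}

impl FromStr for Mode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: whitespace and letter case are ignored.
        match s.trim().to_ascii_lowercase().as_str() {
            "all" => Ok(Mode::All),
            "proxy" => Ok(Mode::Proxy),
            "admin" => Ok(Mode::Admin),
            other => Err(format!("unknown mode: {other}")),
        }
    }
}

/// Resolve the mode from an optional configured value, defaulting to `all`.
pub fn resolve_mode(configured: Option<&str>) -> Result<Mode, String> {
    configured.map_or(Ok(Mode::All), Mode::from_str)
}
```

Failing on an unknown value, rather than silently falling back to `all`, keeps a typo in a production manifest from starting a control plane where only a proxy was intended.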
Health probes are wired by default:
- GET /healthz: liveness, returns 200 even while draining (so you don't get killed mid-roll)
- GET /readyz: readiness, returns 503 once the pod enters draining (so the load balancer stops sending traffic)
In all / admin modes the binary serves an embedded React console at /admin/ui (e.g. http://localhost:3000/admin/ui with the default listen address). It covers the full control plane: providers, routes, API keys (with one-time secret + quota editing and live usage), the OAuth authorization-code flow, and runtime settings (routing, ingress, upstream, header forwarding, payload archive, background tasks). It also provides analytics: per-model / provider / API-key stats, circuit-breaker status, request-log drill-down with replay, and the audit trail. It is bilingual (English / 简体中文).
Authentication reuses the single TIYGATE_ADMIN_TOKEN: paste it on the login screen (validated against the Admin API, stored in the browser). The UI is compiled into the binary via rust-embed (the opt-in webui feature), so the frontend must be built before the Rust crate — run scripts/build-with-webui.sh, or cd webui && npm install && npm run build followed by cargo build -p tiygate-server --features webui. See webui/README.md for development details.
Send SIGTERM (or K8s preStop) and the gateway:
- Flips /readyz to 503 so the load balancer removes it from the pool
- Refuses new requests with 503 + Retry-After
- Lets in-flight requests (including long SSE streams) finish naturally
- On drain_timeout (default 30s, must be ≥ single-request deadline), sends a protocol-native error frame to any still-open streams and runs UsageAccumulator to prevent billing drift. The streaming path is implemented in crates/server/src/ingress.rs::drive_upstream_stream; it also adds a 120s idle timer (tunable via the Admin Console's Upstream settings), an opt-in total wall-clock budget (default disabled), and a 30s SSE keepalive (SseKeepaliveStream) so middleboxes do not silently drop long-quiet streams
- Flushes the telemetry channel, releases resources, exits
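The drain sequence above can be sketched with plain standard-library primitives. This is a simplified, blocking illustration, not the gateway's implementation (which is presumably async); `DrainState`, `InFlight`, and every method name are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical shared state consulted by the probes and the ingress path.
#[derive(Default)]
pub struct DrainState {
    draining: AtomicBool,
    in_flight: AtomicUsize,
}

/// RAII guard: decrements the in-flight counter when a request finishes.
pub struct InFlight(Arc<DrainState>);

impl Drop for InFlight {
    fn drop(&mut self) {
        self.0.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

impl DrainState {
    /// Liveness stays 200 while draining so the process is not killed mid-roll.
    pub fn healthz(&self) -> u16 {
        200
    }

    /// Readiness flips to 503 once draining starts.
    pub fn readyz(&self) -> u16 {
        if self.draining.load(Ordering::SeqCst) { 503 } else { 200 }
    }

    /// Admit a new request, or refuse it with 503 while draining.
    pub fn admit(self: &Arc<Self>) -> Result<InFlight, u16> {
        // Count first, then check the flag, so a request racing with
        // begin_drain is always visible to wait_drained.
        self.in_flight.fetch_add(1, Ordering::SeqCst);
        let guard = InFlight(Arc::clone(self));
        if self.draining.load(Ordering::SeqCst) {
            return Err(503); // the guard drops here and undoes the increment
        }
        Ok(guard)
    }

    /// Called from the SIGTERM handler.
    pub fn begin_drain(&self) {
        self.draining.store(true, Ordering::SeqCst);
    }

    /// Wait for in-flight requests to finish. Returns false if `drain_timeout`
    /// elapsed first, which is where still-open streams would be cut off.
    pub fn wait_drained(&self, drain_timeout: Duration) -> bool {
        let deadline = Instant::now() + drain_timeout;
        while self.in_flight.load(Ordering::SeqCst) > 0 {
            if Instant::now() >= deadline {
                return false;
            }
            thread::sleep(Duration::from_millis(5));
        }
        true
    }
}
```

A request handler would hold the `InFlight` guard for the lifetime of the response, including a long SSE stream, so the counter only reaches zero once every stream has closed.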
TiyGate configuration is split into two layers:
1. Startup-only environment variables — read once at process start, require a restart to change:
| Variable | Default | Purpose |
|---|---|---|
| TIYGATE_LISTEN_ADDR | 0.0.0.0:3000 | Listen address for the HTTP server. |
| TIYGATE_MODE | all | Deployment mode. all (data + control in one process), proxy (data plane only), admin (control plane only). |
| TIYGATE_DATABASE_URL | unset | Database connection string (SQLite or Postgres). When unset, the server falls back to a legacy in-memory config store with no Admin API. |
| TIYGATE_ADMIN_TOKEN | unset | Bearer token required by the Admin API. When unset, Admin API requests are rejected. |
| TIYGATE_MASTER_KEY | unset | AES-256-GCM master key used to encrypt provider keys, OAuth tokens, and S3 credentials at rest. Accepts 64 hex chars or standard base64. When unset, secrets are stored in cleartext (the server logs a warning; acceptable for local dev only). |
| TIYGATE_REDIS_URL | unset | When set (and built with the redis-quota feature), quota counters are shared across replicas via Redis instead of per-replica in-memory. |
| RUST_LOG | info | tracing / tracing-subscriber filter. Examples: info, tiygate=debug, tiygate_server::ingress=trace. |
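For the hex form of TIYGATE_MASTER_KEY, 64 hex characters decode to the 32 bytes an AES-256 key needs. The sketch below shows that decoding step only; `parse_hex_master_key` and `hex_val` are hypothetical helper names, the base64 form is deliberately left out, and the gateway's real validation may differ.

```rust
/// Decode a 64-character hex string into a 32-byte key.
/// Hypothetical helper: covers only the hex form of the master key.
pub fn parse_hex_master_key(s: &str) -> Result<[u8; 32], String> {
    let s = s.trim();
    if s.len() != 64 {
        return Err(format!("expected 64 hex chars, got {}", s.len()));
    }
    let mut key = [0u8; 32];
    // Each pair of hex digits becomes one byte.
    for (i, pair) in s.as_bytes().chunks(2).enumerate() {
        let hi = hex_val(pair[0])?;
        let lo = hex_val(pair[1])?;
        key[i] = (hi << 4) | lo;
    }
    Ok(key)
}

/// Map one ASCII hex digit to its 4-bit value.
fn hex_val(b: u8) -> Result<u8, String> {
    match b {
        b'0'..=b'9' => Ok(b - b'0'),
        b'a'..=b'f' => Ok(b - b'a' + 10),
        b'A'..=b'F' => Ok(b - b'A' + 10),
        _ => Err(format!("invalid hex byte: 0x{b:02x}")),
    }
}
```

Rejecting a malformed key at startup is preferable to silently falling back to cleartext storage, since the operator clearly intended encryption.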
2. Runtime-tunable settings — managed through the Admin Console at /admin/ui/settings (backed by the settings table, exposed via GET/PUT /admin/v1/settings). These are hot-reloaded: the data plane polls for changes and atomically switches to the new snapshot without a restart.
On first start, the env values below are seeded into the settings table as initial defaults; after that, the settings table is the single source of truth — editing .env again has no effect unless the settings table is cleared.
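The seed-once behavior can be pictured as "insert only what is missing". The sketch below uses an in-memory map as a stand-in for the settings table; `seed_from_env`, the key name, and the per-key granularity are assumptions for illustration, not the actual schema or seeding logic.

```rust
use std::collections::HashMap;

/// Copy env-derived defaults into the settings table, but only for keys that
/// are not already present. Returns how many keys were seeded.
/// Hypothetical sketch: the real table is a database, not a HashMap.
pub fn seed_from_env(
    table: &mut HashMap<String, String>,
    env: &HashMap<String, String>,
) -> usize {
    let mut seeded = 0;
    for (key, value) in env {
        if !table.contains_key(key) {
            table.insert(key.clone(), value.clone());
            seeded += 1;
        }
    }
    seeded
}
```

On a first start the table is empty, so every env value lands in it. On later starts the keys already exist, so an edited .env changes nothing, which matches the rule that the table must be cleared before env values are picked up again.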
The Settings page is organized into five cards:
| Card | What it controls | Seeded from env |
|---|---|---|
| Routing & Ingress | Default routing strategy, max body bytes, max in-flight, max queue depth, acquire timeout, raw-envelope capture media types | TIYGATE_ROUTING_STRATEGY, TIYGATE_MAX_BODY_BYTES, TIYGATE_MAX_INFLIGHT, TIYGATE_RAW_ENVELOPE_CAPTURE_MEDIA |
| Upstream | Stream idle / total timeouts, TCP keepalive, pool idle timeout, TCP nodelay | TIYGATE_UPSTREAM_STREAM_IDLE_TIMEOUT_SECS, TIYGATE_UPSTREAM_STREAM_TOTAL_TIMEOUT_SECS, TIYGATE_UPSTREAM_TCP_KEEPALIVE_SECS, TIYGATE_UPSTREAM_POOL_IDLE_TIMEOUT_SECS, TIYGATE_UPSTREAM_TCP_NODELAY |
| Header Forwarding | Request / response header deny-lists (comma-separated) | TIYGATE_FORWARD_REQUEST_HEADER_DENY, TIYGATE_FORWARD_RESPONSE_HEADER_DENY |
| Payload Archive | S3-compatible object-storage archiving of full request/response payloads (enabled flag, endpoint, region, bucket, credentials, prefix, force-path-style, scan interval, batch size, concurrency, timeout, max retries) | TIYGATE_PAYLOAD_ARCHIVE_* family |
| Background Tasks | Log retention interval & days, epoch poll interval, token-stats interval & lookback days | TIYGATE_LOG_RETENTION_*, TIYGATE_EPOCH_POLL_INTERVAL_SECS, TIYGATE_TOKEN_STATS_* |
- Epoch versioning: the data plane polls for config changes and atomically switches to the new snapshot; in-flight requests keep the old epoch until they finish — no half-old, half-new state mid-request.
- Secret encryption: provider keys / OAuth tokens / encrypted S3 settings are AES-GCM encrypted at rest using TIYGATE_MASTER_KEY. Encrypted settings are redacted on GET /admin/v1/settings.
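Epoch versioning is commonly built on an atomically swapped, reference-counted snapshot. The sketch below shows that general pattern with `std::sync` only; `Snapshot`, `ConfigStore`, `load`, and `publish` are hypothetical names, and the gateway may well use a different mechanism.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical immutable config snapshot tagged with its epoch.
#[derive(Debug)]
pub struct Snapshot {
    pub epoch: u64,
    pub routing_strategy: String,
}

/// Holds the current snapshot; readers clone the Arc, the poller swaps it.
pub struct ConfigStore {
    current: RwLock<Arc<Snapshot>>,
}

impl ConfigStore {
    pub fn new(initial: Snapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Called once when a request starts. The request keeps this Arc until it
    /// finishes, so it never sees a mix of old and new settings.
    pub fn load(&self) -> Arc<Snapshot> {
        let guard = self.current.read().unwrap_or_else(|e| e.into_inner());
        Arc::clone(&*guard)
    }

    /// Called by the config poller. Swaps in `next` only if its epoch is newer.
    pub fn publish(&self, next: Snapshot) -> bool {
        let mut slot = self.current.write().unwrap_or_else(|e| e.into_inner());
        if next.epoch <= slot.epoch {
            return false;
        }
        *slot = Arc::new(next);
        true
    }
}
```

The old snapshot is freed automatically once the last in-flight request holding it completes, so no explicit bookkeeping of "which requests use which epoch" is needed in this design.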
Only embedding requests are cached. LLM chat/completion is not cached — by design (non-determinism makes response caching value-low and risk-high). The cache is pluggable: process-local LRU by default, Redis shared backend for multi-replica deployments.
When enabled, a background worker gzip-compresses the full request/response payload detail of each request (8 objects per request — raw body + parsed metadata for each of the 4 hops: client→gateway, gateway→provider, provider→gateway, gateway→client), uploads them to S3-compatible object storage, verifies sha256/size, and then clears the payload text from the database in the same transaction. This keeps the DB lean for high-volume deployments while preserving full replay fidelity.
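The "8 objects per request" figure is 4 hops × 2 parts. The sketch below enumerates them; the enum names, slugs, and especially the object-key layout (`<prefix>/<request_id>/<hop>.<part>.gz`) are invented for illustration and should not be read as the worker's real naming scheme.

```rust
/// The four hops a payload crosses, in order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hop {
    ClientToGateway,
    GatewayToProvider,
    ProviderToGateway,
    GatewayToClient,
}

/// The two artifacts stored per hop.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Part {
    RawBody,
    ParsedMetadata,
}

impl Hop {
    pub const ALL: [Hop; 4] = [
        Hop::ClientToGateway,
        Hop::GatewayToProvider,
        Hop::ProviderToGateway,
        Hop::GatewayToClient,
    ];

    fn slug(self) -> &'static str {
        match self {
            Hop::ClientToGateway => "client_to_gateway",
            Hop::GatewayToProvider => "gateway_to_provider",
            Hop::ProviderToGateway => "provider_to_gateway",
            Hop::GatewayToClient => "gateway_to_client",
        }
    }
}

impl Part {
    pub const ALL: [Part; 2] = [Part::RawBody, Part::ParsedMetadata];

    fn slug(self) -> &'static str {
        match self {
            Part::RawBody => "raw_body",
            Part::ParsedMetadata => "parsed_metadata",
        }
    }
}

/// List the object keys for one request (hypothetical key layout).
pub fn object_keys(prefix: &str, request_id: &str) -> Vec<String> {
    let prefix = prefix.trim_end_matches('/');
    let mut keys = Vec::with_capacity(Hop::ALL.len() * Part::ALL.len());
    for hop in Hop::ALL {
        for part in Part::ALL {
            keys.push(format!("{prefix}/{request_id}/{}.{}.gz", hop.slug(), part.slug()));
        }
    }
    keys
}
```

A deterministic key per (request, hop, part) is one way replay could locate the right object without a separate index, though the actual lookup strategy is not described in this document.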
The Admin Console's request replay feature transparently hydrates archived objects back from S3 on demand (verify → decompress → return), so the user experience is unchanged whether a request's payloads live in the DB or in object storage.
Object lifecycle is decoupled from DB retention — the worker never deletes from S3; use bucket lifecycle policies for expiry.
Enable and configure payload archiving in the Admin Console under Settings → Payload Archive. The env variables (TIYGATE_PAYLOAD_ARCHIVE_*) only seed the initial defaults on first start; after that the settings table is authoritative and changes apply without a restart. See .env.example for the full variable list.
W3C traceparent / tracestate are extracted from the inbound request and re-injected on the upstream call. The gateway span attaches to the caller's trace, with the caller's span as its parent. Logs and traces are cross-linkable by trace_id.
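A minimal sketch of the extract-and-re-inject step for the traceparent header, following the version-00 layout from the W3C Trace Context specification (`00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>`). The type and function names are hypothetical; a production gateway would more likely rely on an OpenTelemetry propagator, and this sketch simply rejects any version other than 00.

```rust
/// Parsed fields of a version-00 traceparent header.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: String,
    pub parent_id: String,
    pub flags: String,
}

/// True if `s` is exactly `len` lowercase hex digits.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parse an inbound traceparent value; returns None if it is malformed.
pub fn parse_traceparent(header: &str) -> Option<TraceParent> {
    let mut parts = header.trim().split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_id = parts.next()?;
    let flags = parts.next()?;
    // Simplification: only version 00, which has exactly four fields.
    if version != "00" || parts.next().is_some() {
        return None;
    }
    if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
        return None;
    }
    // All-zero ids are invalid per the specification.
    if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
        return None;
    }
    Some(TraceParent {
        trace_id: trace_id.to_string(),
        parent_id: parent_id.to_string(),
        flags: flags.to_string(),
    })
}

/// Build the header for the upstream call: same trace id and flags, with the
/// gateway's own span id in the parent-id slot.
pub fn upstream_traceparent(inbound: &TraceParent, gateway_span_id: &str) -> Option<String> {
    if !is_lower_hex(gateway_span_id, 16) {
        return None;
    }
    Some(format!("00-{}-{}-{}", inbound.trace_id, gateway_span_id, inbound.flags))
}
```

Because the trace id is carried through unchanged, a log line tagged with it can be matched to spans recorded by both the caller and the upstream provider.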
# Run the full test suite
cargo test --all-features
# Lint (workspace lints forbid unsafe_code and deny unwrap/expect/panic in libs)
cargo clippy --all-features -- -D warnings
# Format check
cargo fmt --all -- --check
# Workspace-wide dependency tree
cargo tree --workspace
# Verify a heavy-dep crate is isolated (e.g. AWS SDK stays out of core)
cargo tree -p tiygate-core | grep -i aws # should be empty
cargo tree -p tiygate-provider-bedrock | head # AWS SDK lives here only

The CI baseline is strict: no #[allow(...)] workarounds, no unwrap/expect/panic! in library code, no dead code.
The tiygate binary is feature-gated. Pick the smallest set that matches your deployment so you don't pay compile time or binary size for components you don't ship.
| Feature | What it pulls in | When you need it |
|---|---|---|
| admin | tiygate-admin (control plane, Admin API, OAuth) | admin / all deploy mode |
| cache | tiygate-cache (in-memory embedding cache) | Anywhere that benefits from caching |
| providers | tiygate-providers (OpenAI / Anthropic / generic OpenAI-compatible) | Any non-Bedrock LLM traffic |
| bedrock | tiygate-provider-bedrock (AWS SDK) | Routes that target AWS Bedrock |
| tracing | tracing-subscriber with JSON formatter | The default tiygate binary |
| dotenv | dotenvy (auto-loads .env at startup) | Local development |
| webui | rust-embed (embeds webui/dist and serves the admin console at /admin/ui) | admin / all deploy mode with a UI |
Defaults: admin, cache, providers, tracing, dotenv — the common case. bedrock is opt-in (it pulls the heavy AWS SDK) — add it explicitly if you need AWS Bedrock routes. webui is also opt-in: it embeds webui/dist at compile time, so build the frontend first (cd webui && npm install && npm run build) and then build with --features webui, or just run scripts/build-with-webui.sh which does both in order.
# Default build (everything except Bedrock — that's now opt-in)
cargo build -p tiygate-server --release
# Add Bedrock back when you need it
cargo build -p tiygate-server --release --features bedrock
# Minimal data-plane proxy — drop admin / cache / bedrock
cargo build -p tiygate-server --release \
--no-default-features --features "providers,tracing,dotenv"
# Bedrock-only — skip OpenAI / Anthropic to keep the binary lean
cargo build -p tiygate-server --release \
--no-default-features --features "bedrock,tracing,dotenv"
# Control-plane only — for the `admin` deploy mode
cargo build -p tiygate-server --release \
--no-default-features --features "admin,tracing,dotenv"
# Inspect what's actually compiled in
cargo tree -p tiygate-server -e features --depth 1
bedrock is opt-in by design. Compiling the AWS SDK is the single biggest hit to your cold-build time, so we keep it out of the default. If you route to Bedrock, opt in explicitly:

cargo build -p tiygate-server --release --features bedrock

CI smoke matrix: bash scripts/verify-deps.sh will still pass under any feature combination, because dependency isolation lives in core / providers and is enforced separately from the server build matrix.
Issues and pull requests are welcome. The design is opinionated, and contributions that fight the layering (e.g. adding a concrete provider dependency to core, or introducing allow_lossy) will be declined.