ccproxy reads a single configuration file: ccproxy.yaml.
Discovery order (highest to lowest precedence):
- $CCPROXY_CONFIG_DIR/ccproxy.yaml
- ~/.config/ccproxy/ccproxy.yaml
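The precedence above can be sketched in a few lines of Python. This is only an illustration: `discover_config` is a hypothetical helper, not a ccproxy function, and it assumes the lookup does not check whether the file exists.

```python
from __future__ import annotations

import os
from pathlib import Path


def discover_config(env: dict[str, str] | None = None) -> Path:
    """Return the ccproxy.yaml path following the documented precedence."""
    env = dict(os.environ) if env is None else env
    override = env.get("CCPROXY_CONFIG_DIR")
    if override:
        # Highest precedence: explicit config directory from the environment.
        return Path(override).expanduser() / "ccproxy.yaml"
    # Lowest precedence: the default per-user config directory.
    return Path("~/.config/ccproxy").expanduser() / "ccproxy.yaml"
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real environment.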
Install ccproxy via uv:
uv tool install ai-ccproxy
Initialize the config file:
ccproxy init
This writes ~/.config/ccproxy/ccproxy.yaml with defaults. Use --force to overwrite an existing file.
ccproxy:
  host: 127.0.0.1 # Listen address
  port: 4000 # Reverse proxy listener port
  log_level: INFO # Root logger level: DEBUG, INFO, WARNING, ERROR, CRITICAL
  # Daemon log file path. Relative to config dir, or absolute.
  # Set to null to disable file logging. Only `ccproxy start` writes here.
  # log_file: ccproxy.log
  providers: # Provider entries keyed by sentinel suffix
    anthropic:
      auth:
        type: command
        command: "jq -r '.claudeAiOauth.accessToken' ~/.claude/.credentials.json"
      host: api.anthropic.com
      path: /v1/messages
      type: anthropic # adapter-family name (drives wire-format dispatch)
  hooks:
    inbound:
      - ccproxy.hooks.inject_auth
      - ccproxy.hooks.extract_session_id
    outbound:
      - ccproxy.hooks.gemini_cli
      - ccproxy.hooks.pplx_stamp_headers
      - ccproxy.hooks.inject_mcp_notifications
      - ccproxy.hooks.verbose_mode
      - ccproxy.hooks.commitbee_compat
      - ccproxy.hooks.shape
  gemini_capacity:
    enabled: true
    fallback_models:
      - gemini-3-flash-preview
      - gemini-2.5-pro
      - gemini-2.5-flash
  lightllm:
    transforms: [] # Transform override rules (see Transform Rules)
  inspector:
    port: 8083 # mitmweb UI port
    provider_map: # Hostname → OTel gen_ai.system tag
      api.anthropic.com: anthropic
      api.openai.com: openai
  otel:
    enabled: false
    endpoint: "http://localhost:4317"
| Field | Type | Default | Description |
|---|---|---|---|
| host | string | 127.0.0.1 | Reverse proxy listen address |
| port | int | 4000 | Reverse proxy listen port |
| log_level | string | INFO | Root logger level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| log_file | path | ccproxy.log | Daemon log file path. Relative to config dir, or absolute. null disables. |
| use_journal | bool | false | Route daemon logging to systemd journal (requires journal extra) |
| journal_identifier | string | — | SYSLOG_IDENTIFIER for journal handler. Derived from config-dir basename when unset. |
| provider_timeout | float | — | Timeout budget (seconds) for upstream httpx calls. null disables the timeout. |
| providers | map | {} | Provider entries keyed by sentinel suffix (auth + destination + format) |
| hooks | object | — | Two-stage hook pipeline (inbound/outbound) |
| gemini_capacity | object | — | Sticky-retry + fallback chain for Gemini RESOURCE_EXHAUSTED (see below) |
| lightllm | object | — | Cross-format routing and transform overrides |
| inspector | object | — | mitmweb listener, UI, and capture settings |
| otel | object | — | OpenTelemetry export settings |
| shaping | object | — | Request shaping configuration (see shaping.md) |
| flows | object | — | Flow CLI defaults (see below) |
ccproxy writes to three potential sinks simultaneously: stderr (always), a log file (daemon mode), and the systemd journal (optional).
ccproxy:
  log_level: INFO
  log_file: ccproxy.log
  use_journal: false
  journal_identifier: null
log_level sets the root Python logger level, applied uniformly to all loggers (ccproxy, mitmproxy, httpx, httpcore). One of DEBUG, INFO, WARNING, ERROR, CRITICAL. DEBUG emits library internals, which is noisy but useful for tracing request/response cycles through the pipeline.
log_file sets the daemon log file path. Relative paths resolve against the config file's directory (ccproxy.yaml's parent); absolute paths pass through. Set to null to disable file logging entirely. Only ccproxy start writes here; one-shot CLI commands (run, status, flows) always write to stderr. The file is truncated on each daemon restart. Access the resolved path via ccproxy logs.
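The path rules for log_file can be illustrated with a small sketch. `resolve_log_file` is a hypothetical name used for illustration only; ccproxy's own resolution code may differ in detail.

```python
from __future__ import annotations

from pathlib import Path


def resolve_log_file(log_file: str | None, config_dir: Path) -> Path | None:
    """Apply the documented rules: null disables, absolute passes through,
    relative resolves against the directory that holds ccproxy.yaml."""
    if log_file is None:
        return None  # file logging disabled
    path = Path(log_file)
    return path if path.is_absolute() else config_dir / path
```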
When use_journal: true, ccproxy attaches a systemd.journal.JournalHandler to the root logger so daemon output is routed to the systemd journal. Requires the journal optional extra (pip install ai-ccproxy[journal]). Falls back to stderr with a warning when systemd-python is unavailable or the host lacks systemd. Only applies to ccproxy start.
journal_identifier sets the SYSLOG_IDENTIFIER field in journal entries. When unset (default), it derives from the config-dir basename:
| Config dir | Derived identifier |
|---|---|
| ~/.config/ccproxy/ | ccproxy |
| ~/dev/projects/foo/.ccproxy/ | ccproxy-foo |
| ~/.config/myapp/ | ccproxy-myapp |
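One rule that reproduces the three rows of this table is sketched below. The rule is inferred from the examples only, so treat it as an assumption rather than ccproxy's actual derivation; `derive_journal_identifier` is a hypothetical name.

```python
from pathlib import PurePosixPath


def derive_journal_identifier(config_dir: str) -> str:
    """Derive a SYSLOG_IDENTIFIER from the config-dir basename (inferred rule)."""
    path = PurePosixPath(config_dir)
    name = path.name  # a trailing slash is ignored by PurePosixPath
    if name == "ccproxy":
        return "ccproxy"
    if name == ".ccproxy":
        # Project-local config dir: use the enclosing project directory name.
        return f"ccproxy-{path.parent.name}"
    return f"ccproxy-{name}"
```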
Override via this field or the CCPROXY_JOURNAL_IDENTIFIER env var. View journal output with:
journalctl --user -t ccproxy # default
journalctl --user -t ccproxy-myproject # custom identifier
ccproxy:
  provider_timeout: null
provider_timeout sets a timeout budget (seconds) for httpx-based upstream HTTP calls inside ccproxy, specifically auth token refresh and the 401-retry path. It applies uniformly across connect, read, write, and pool phases.
When null (default), there is no enforced timeout. This matches mitmproxy's default main-forward path and Portkey AI's upstream behavior — requests can take as long as the upstream needs (important for long-running streaming inference). Set to a positive float to opt into a bounded timeout for internal calls.
This does NOT affect the main request/response forwarding path (mitmproxy handles that independently). It only gates ccproxy's own outbound HTTP calls.
providers maps a sentinel suffix to a Provider entry: an auth source, a single destination (host + path), and an adapter-family type identifier that names the wire format the destination speaks (one of anthropic, openai, google / gemini / vertex_ai / vertex_ai_beta, perplexity_pro; Anthropic-compatible forks like deepseek and zai use type: anthropic). When ccproxy sees a sentinel key matching sk-ant-oat-ccproxy-{name}, the matching Provider drives both auth injection (inject_auth) and routing (auto-redirect or cross-format transform via lightllm).
Simple form — auth dispatched as a bare shell command:
ccproxy:
  providers:
    anthropic:
      auth: "jq -r '.claudeAiOauth.accessToken' ~/.claude/.credentials.json"
      host: api.anthropic.com
      path: /v1/messages
      type: anthropic
Full form (explicit auth discriminator and per-provider auth header):
ccproxy:
  providers:
    anthropic:
      auth:
        type: command
        command: "jq -r '.claudeAiOauth.accessToken' ~/.claude/.credentials.json"
      host: api.anthropic.com
      path: /v1/messages
      type: anthropic
    gemini:
      auth:
        type: command
        command: "jq -r '.access_token' ~/.gemini/oauth_creds.json"
      host: cloudcode-pa.googleapis.com
      path: "/v1internal:{action}"
      type: gemini
    deepseek:
      auth:
        type: command
        command: "printenv DEEPSEEK_API_KEY"
        header: x-api-key # send token as `x-api-key: <token>` (not `Authorization: Bearer …`)
      host: api.deepseek.com
      path: /anthropic/v1/messages
      type: anthropic # DeepSeek's anthropic-compat endpoint speaks the anthropic format
Provider entry fields:
| Field | Description |
|---|---|
| auth | Discriminated auth source. Bare strings coerce to {type: command, command: <str>}. |
| host | Single destination hostname (e.g. api.anthropic.com). |
| path | Destination path. Supports {model} and {action} templating substituted from the body / URL at routing time. Defaults to /. |
| type | Wire-format identifier (anthropic, gemini, openai, openai_responses, perplexity_pro, …). When the incoming format matches type, the routing handler just rewrites the destination; when they differ, the body is rewritten via lightllm. |
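The bare-string coercion described for auth can be pictured as follows. This is a minimal sketch with a hypothetical `normalize_auth` helper; the source says ccproxy models auth sources as Pydantic classes, so the real coercion presumably happens during validation.

```python
def normalize_auth(auth):
    """Coerce the simple form (a bare shell command string) into the full form."""
    if isinstance(auth, str):
        return {"type": "command", "command": auth}
    # Already a mapping with an explicit `type:` discriminator: copy it as-is.
    return dict(auth)
```

Both forms therefore end up with the same `{type, command}` shape before the auth source is resolved.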
Auth source types (the type: discriminator inside auth:):
| type | Required keys | Behavior |
|---|---|---|
| command | command | Shell command whose stdout is the token. Bare strings under auth: coerce to this. |
| file | file | File path; contents stripped of whitespace are the token. |
| anthropic_oauth | file_path (default ~/.config/ccproxy/oauth/anthropic.json) | Refreshes Anthropic OAuth tokens in-process via claude.ai/v1/oauth/token. Atomically writes refreshed tokens back to file_path. |
| google_oauth | client_id, client_secret, file_path (default ~/.gemini/oauth_creds.json) | Refreshes Google/Gemini OAuth tokens in-process via oauth2.googleapis.com. Preserves on-disk refresh_token when the refresh response omits it (gemini-cli #21691). |
| codex_oauth | file_path (default ~/.codex/auth.json) | Refreshes Codex ChatGPT OAuth tokens in-process via auth.openai.com, atomically writes the updated auth envelope, and stamps account-routing companion headers. |
The auth.header field (inside any auth: block) overrides the default Authorization: Bearer {token} injection. Set it to a custom header name (e.g. x-api-key) when the destination expects the raw token in a non-Bearer header.
The packaged codex provider uses type: openai_responses and routes
same-format Responses requests to ChatGPT's Codex backend
(/backend-api/codex/responses). That backend follows Codex CLI semantics, not
the full public OpenAI Responses API surface: use streaming requests, leave
public-only fields such as max_output_tokens unset, and let the
ccproxy.shaping.codex shape hook normalize string input and enforce
store: false.
Configuration values dispatch through a small Pydantic class hierarchy:
AuthFields                      # base: only `header`
├── CommandAuthSource           type: command → run a shell command, return stdout
├── FileAuthSource              type: file → read a file, return contents
├── CodexAuthSource             type: codex_oauth → refresh Codex ChatGPT JWTs + account headers
└── AuthSource                  # OAuth refresh-capable base
    ├── AnthropicAuthSource     type: anthropic_oauth
    └── GoogleAuthSource        type: google_oauth
AuthFields carries only the optional target-header override. CommandAuthSource and FileAuthSource extend it as static credential value loaders — they have no expiry awareness and never POST to a refresh endpoint. They suit any long-lived API key (DeepSeek, Z.AI, OpenRouter) wired through opnix/SOPS, printenv, or a managed secret file; rotation happens out-of-band through whichever secret manager produced the value.
AuthSource is the OAuth refresh-capable base. It owns the read → check expiry → refresh-if-near-expiry → atomic write-back template method. Subclasses provide only:
- defaults for type (the Literal discriminator), file_path, endpoint, client_id, optional client_secret, and default_expires_in_seconds;
- a _build_refresh_body(refresh_token) -> dict[str, str] that returns the per-provider POST body (Anthropic uses grant_type=refresh_token + client_id; Google adds client_secret).
CodexAuthSource is separate because Codex's auth file stores JWTs and
ChatGPT account metadata under a different envelope, and the refresh endpoint
expects JSON rather than the form body used by Anthropic and Google. The
discriminator literal mirrors the distinction in YAML: bare command / file
for the static loaders, *_oauth for the refresh sources. Pick the right one
for the credential's lifecycle, not for the brand of the destination —
pointing a Gemini destination at type: command is legal, but ccproxy will
not refresh anything in that case (see "Why Gemini wants google_oauth"
below).
Iteration order is load-bearing. inject_auth walks providers in insertion order to pick a fallback when no sentinel key is present on the request — the first provider with a cached token wins. Keep the highest-priority provider (typically anthropic) first.
SDK clients can use a sentinel API key to trigger token substitution without modifying request logic:
from anthropic import Anthropic
client = Anthropic(api_key="sk-ant-oat-ccproxy-anthropic")
When ccproxy sees a key matching sk-ant-oat-ccproxy-{name}, it substitutes the actual token from providers[name].auth, sets the auth header (Authorization: Bearer … by default, or providers[name].auth.header when set), and routes the request to providers[name].host / providers[name].path. If the incoming wire format doesn't match providers[name].type, lightllm rewrites the body too.
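The sentinel lookup and the insertion-order fallback can be sketched together. `pick_provider` is a hypothetical helper; returning None for an unknown sentinel name is an assumption, since the source does not say what happens in that case.

```python
from __future__ import annotations

SENTINEL_PREFIX = "sk-ant-oat-ccproxy-"


def pick_provider(api_key: str | None, cached_tokens: dict[str, str]) -> str | None:
    """Return the provider name whose cached token should be injected."""
    if api_key and api_key.startswith(SENTINEL_PREFIX):
        name = api_key[len(SENTINEL_PREFIX):]
        # Assumption: an unknown sentinel name resolves to no provider.
        return name if name in cached_tokens else None
    # No sentinel on the request: dicts keep insertion order, so the first
    # provider with a non-empty cached token wins.
    for name, token in cached_tokens.items():
        if token:
            return name
    return None
```

This is why the order of entries under providers matters: with `{"anthropic": ..., "gemini": ...}`, a request without a sentinel key falls back to anthropic.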
Tokens are loaded at startup via _load_credentials() and cached in memory.
For refresh-capable providers (anthropic_oauth, google_oauth,
codex_oauth), resolve() rotates the cached access token in-process
whenever it is near expiry and atomically writes the updated credentials back
to file_path.
On a 401 response from upstream, AuthAddon.response() calls config.resolve_auth_token(provider) to re-resolve the credential source — for OAuth sources this triggers another refresh attempt; for static command / file loaders it just re-reads. The request is then replayed with whatever token the resolver returns; if the resolver returns nothing (empty token, refresh failed), the 401 propagates to the client.
AuthSource.resolve() implements the in-process refresh template method shared by anthropic_oauth and google_oauth:
- Read. Open file_path, parse JSON, pull (access_token, refresh_token, expiry) via the configured glom paths (access_path, refresh_path, expiry_path).
- Check expiry. A 60-second headroom applies (_REFRESH_HEADROOM_MS = 60_000): if the cached access token is more than 60 seconds away from expiry, return it unchanged.
- Refresh. Otherwise POST _build_refresh_body(refresh_token) to endpoint (form-encoded). On HTTP error or non-JSON response, give up and return None.
- Merge. copy.deepcopy(creds) so the original dict is untouched, then glom.assign(merged, access_path, new_access, missing=dict) for each of the three paths. missing=dict creates intermediate dicts when the credential file uses a nested envelope like claudeAiOauth.accessToken. Sibling fields the host CLI maintains (scopes, subscriptionType, anything else under that envelope or at the top level) survive verbatim.
- Write back atomically. atomic_write_back(path, merged): tempfile.NamedTemporaryFile in the same directory, tf.flush(), os.fsync(tf.fileno()), tmp.chmod(0o600), tmp.replace(path). The rename is atomic on the same filesystem, so a concurrent reader (the host CLI, another ccproxy instance) sees either the old file or the new file, never a partial write.
The gemini-cli #21691 workaround lives at the merge step: new_refresh = payload.get("refresh_token") or refresh. Google's OAuth response sometimes omits refresh_token; the fallback keeps the on-disk value so the next refresh still has a valid grant.
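The expiry check and merge steps can be approximated with the standard library alone. In this sketch `assign_path` stands in for `glom.assign(..., missing=dict)`, the `paths` dict and the `expires_in` handling are assumptions made for illustration, and none of the function names are ccproxy's own.

```python
from __future__ import annotations

import copy

REFRESH_HEADROOM_MS = 60_000  # mirrors the documented 60-second headroom


def needs_refresh(expiry_ms: int, now_ms: int) -> bool:
    """True when the token is within the headroom of its expiry (or past it)."""
    return expiry_ms - now_ms <= REFRESH_HEADROOM_MS


def assign_path(doc: dict, dotted: str, value) -> None:
    """Set doc["a"]["b"] for "a.b", creating intermediate dicts as needed."""
    *parents, leaf = dotted.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def merge_refreshed(creds: dict, payload: dict, old_refresh: str,
                    paths: dict[str, str], now_ms: int) -> dict:
    """Merge a refresh response into a copy of the on-disk credentials."""
    merged = copy.deepcopy(creds)  # leave the caller's dict untouched
    # Keep the on-disk refresh token when the response omits it.
    new_refresh = payload.get("refresh_token") or old_refresh
    # Assumption: the response carries a standard OAuth `expires_in` (seconds).
    new_expiry = now_ms + int(payload["expires_in"]) * 1000
    assign_path(merged, paths["access"], payload["access_token"])
    assign_path(merged, paths["refresh"], new_refresh)
    assign_path(merged, paths["expiry"], new_expiry)
    return merged
```

Only the three addressed values change; everything else in the copied dict is carried over untouched.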
from_yaml() calls _load_credentials() before the inspector listeners come up. _load_credentials() iterates every providers[name] whose auth is set and calls auth.resolve(label=name), populating _cached_auth_tokens[name]. For anthropic_oauth, google_oauth, and codex_oauth entries, that single call performs the full read → expiry-check → refresh → write-back dance, so the cached token is fresh by the time mitmweb starts accepting traffic.
This ordering matters most for Gemini. The prewarm_project() hook in ccproxy.hooks.gemini_cli runs once after readiness, POSTs to https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist with the cached gemini token, and stashes the resulting cloudaicompanionProject for the process lifetime:
from_yaml()
└── _load_credentials() # iterates providers, calls auth.resolve() for each
└── GoogleAuthSource.resolve() # refresh-if-near-expiry, atomic write-back
└── _cached_auth_tokens["gemini"] = <fresh token>
[mitmweb starts, addons register, ready signal]
prewarm_project()
└── token = config.resolve_auth_token("gemini") # reads or refreshes the configured token
└── POST cloudcode-pa.../v1internal:loadCodeAssist with Bearer <fresh>
└── _cached_project = response["cloudaicompanionProject"]
prewarm_project() requires a valid bearer token. With type: google_oauth, _load_credentials() rotates an expired Gemini token before prewarm_project() runs, so the loadCodeAssist POST succeeds and the cloudaicompanionProject is cached for every subsequent Gemini request.
With type: command (e.g. jq -r '.access_token' ~/.gemini/oauth_creds.json), CommandAuthSource.resolve() just runs jq and returns whatever's in the file — no refresh. If the file holds an expired token at startup, prewarm_project() silently fails (loadCodeAssist returned 401; project field will be omitted) and every subsequent Gemini request lacks the project field.
For Gemini the recommended setup is therefore type: google_oauth with file_path: ~/.gemini/oauth_creds.json and gemini-cli's installed-app credentials. The client_id and client_secret are public installed-app values embedded in the gemini-cli npm distribution — ccproxy does not vendor them; supply them in your config:
ccproxy:
  providers:
    gemini:
      auth:
        type: google_oauth
        file_path: ~/.gemini/oauth_creds.json
        client_id: <gemini-cli installed-app client_id>
        client_secret: <gemini-cli installed-app client_secret>
        header: authorization
      host: cloudcode-pa.googleapis.com
      path: "/v1internal:{action}"
      type: gemini
When you run both ccproxy and the Claude Code CLI on the same machine, the recommended setup is to point the anthropic provider at the CLI's own credential file (~/.claude/.credentials.json). Both tools then read and write the same JSON, so a refresh performed by either side is visible to the other on the next read, which eliminates token desync.
ccproxy:
  providers:
    anthropic:
      auth:
        type: anthropic_oauth
        file_path: ~/.claude/.credentials.json
        access_path: claudeAiOauth.accessToken
        refresh_path: claudeAiOauth.refreshToken
        expiry_path: claudeAiOauth.expiresAt
        header: authorization
      host: api.anthropic.com
      path: /v1/messages
      type: anthropic
The Claude Code CLI stores its OAuth state under a claudeAiOauth envelope:
{
  "claudeAiOauth": {
    "accessToken": "...",
    "refreshToken": "...",
    "expiresAt": 1735689600000,
    "scopes": ["org:create_api_key", "user:profile"],
    "subscriptionType": "max"
  }
}
Four fields declare where the credential file is and where each credential lives inside it (file_path plus three glom paths):
| Field | Purpose | Example |
|---|---|---|
| file_path | Path to the credential file on disk. ~ is expanded. | ~/.claude/.credentials.json |
| access_path | Glom dot-path to the access token (read on every request, written after refresh). | claudeAiOauth.accessToken |
| refresh_path | Glom dot-path to the refresh token (used to mint a new access token). | claudeAiOauth.refreshToken |
| expiry_path | Glom dot-path to the expiry timestamp (millis since epoch; ccproxy refreshes once the token is within 60 seconds of expiry). | claudeAiOauth.expiresAt |
Write-back is atomic (tmpfile → fsync → chmod 0600 → rename), and only the three values addressed by the glom paths are mutated. Sibling fields the CLI maintains (scopes, subscriptionType, anything else under claudeAiOauth or at the top level) are preserved verbatim, so the CLI keeps working without re-authentication after ccproxy refreshes the token.
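The atomic write-back follows a common POSIX pattern that can be reproduced with the standard library. The sketch below follows the call sequence listed for atomic_write_back earlier in this section (flush, fsync, chmod, replace); the function name `atomic_write_json` and the JSON formatting are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write `data` to `path` so readers see the old or the new file, never a mix."""
    path = Path(path)
    # The temp file lives in the target directory so the final rename stays on
    # one filesystem, which is what makes it atomic.
    with tempfile.NamedTemporaryFile("w", dir=path.parent, delete=False) as tf:
        json.dump(data, tf, indent=2)
        tf.flush()             # push Python's buffer to the OS
        os.fsync(tf.fileno())  # push the OS buffer to disk
    tmp = Path(tf.name)
    tmp.chmod(0o600)           # credentials stay owner-readable only
    tmp.replace(path)          # atomic rename over the old file
```

Because the permissions are set before the rename, the credential file is never visible under its final name with looser permissions.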
Hooks run in two stages: inbound (before the request reaches the provider) and outbound (before the response reaches the client).
Simple form — module path string:
ccproxy:
  hooks:
    inbound:
      - ccproxy.hooks.inject_auth
      - ccproxy.hooks.extract_session_id
    outbound:
      - ccproxy.hooks.gemini_cli
      - ccproxy.hooks.pplx_stamp_headers
      - ccproxy.hooks.inject_mcp_notifications
Parameterized form (dict with hook and params keys):
ccproxy:
  hooks:
    outbound:
      - hook: ccproxy.hooks.some_hook
        params:
          key: value
| Hook | Stage | Purpose |
|---|---|---|
| ccproxy.hooks.inject_auth | inbound | Substitutes sentinel keys (sk-ant-oat-ccproxy-{name}) with the cached auth token from providers[name].auth; injects Authorization: Bearer … (or the custom auth.header when set) and stamps ctx.metadata.auth_provider / ctx.metadata.auth_injected for downstream routing and retry logic |
| ccproxy.hooks.extract_session_id | inbound | Reads metadata.user_id via glom(ctx._body, 'metadata.user_id') and stores session_id on ctx.metadata.session_id for downstream use |
| ccproxy.hooks.gemini_cli | outbound | Single hook for all Gemini sentinel-key traffic. Wraps standard Gemini bodies in the v1internal envelope, conditionally masquerades google-genai-sdk/* UAs as Gemini CLI, rewrites paths to cloudcode-pa, and unwraps the {response: {...}} envelope on the way back. |
| ccproxy.hooks.pplx_stamp_headers | outbound | Converts Perplexity Pro's injected bearer placeholder into the cookie-auth browser header bundle expected by the WebUI endpoint. |
| ccproxy.hooks.inject_mcp_notifications | outbound | Injects buffered MCP terminal events as synthetic tool_use/tool_result blocks |
| ccproxy.hooks.verbose_mode | outbound | Strips redact-thinking-* flags from the anthropic-beta header |
| ccproxy.hooks.shape | outbound | Picks a per-provider packaged or local shape, injects content fields from the incoming request, applies it to the outbound flow. The shape carries the native client identity envelope; no separate identity-injection hook is needed. |
| ccproxy.hooks.commitbee_compat | outbound | Last-mile compatibility shim for the commitbee tool. |
Use the @hook decorator with reads/writes for DAG ordering. Declarations support glom dot-paths (e.g. "metadata.user_id") — the DAG extracts root fields for dependency resolution:
from glom import assign, glom
from ccproxy.pipeline.context import Context
from ccproxy.pipeline.hook import hook
@hook(reads=["metadata.user_id"], writes=["metadata.tracking_id"])
def my_hook(ctx: Context, params: dict) -> Context:
    # Typed layer: ctx.messages, ctx.system, ctx.tools (Pydantic AI objects)
    # Raw body layer: glom/assign/delete over ctx._body (standard primitive)
    user_id = glom(ctx._body, "metadata.user_id", default="")
    if user_id:
        assign(ctx._body, "metadata.tracking_id", f"track-{user_id}")
    return ctx
Register in config:
hooks:
  outbound:
    - mypackage.my_hook
Force-run or force-skip hooks via header:
x-ccproxy-hooks: +inject_mcp_notifications,-verbose_mode
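The header value reads as a comma-separated list where `+name` forces a hook to run and `-name` forces it to be skipped. A minimal parsing sketch follows; `parse_hook_overrides` is a hypothetical name, and ignoring entries without a prefix is an assumption.

```python
def parse_hook_overrides(header_value: str) -> tuple[set[str], set[str]]:
    """Split an x-ccproxy-hooks value into (force_run, force_skip) name sets."""
    force_run: set[str] = set()
    force_skip: set[str] = set()
    for item in header_value.split(","):
        item = item.strip()
        if item.startswith("+") and len(item) > 1:
            force_run.add(item[1:])
        elif item.startswith("-") and len(item) > 1:
            force_skip.add(item[1:])
        # Assumption: entries without a +/- prefix are ignored.
    return force_run, force_skip
```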
The gemini_capacity block configures sticky-retry + fallback chain behavior for Gemini RESOURCE_EXHAUSTED (429/503) responses. This is managed by GeminiAddon internally — there is no separate hook to configure.
ccproxy:
  gemini_capacity:
    enabled: true
    fallback_models:
      - gemini-3-flash-preview
      - gemini-2.5-pro
      - gemini-2.5-flash
    sticky_retry_attempts: 3
    sticky_retry_max_delay_seconds: 60
    terminal_delay_threshold_seconds: 300
    total_retry_budget_seconds: 120
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Master switch. When false, capacity errors pass through unchanged. |
| fallback_models | list | [] | Models tried in order after sticky retries on the original are exhausted. |
| sticky_retry_attempts | int | 3 | Same-model retries on the original model before falling through. Range 0–10. |
| sticky_retry_max_delay_seconds | float | 60.0 | Per-attempt cap on retryDelay. If the server asks for longer, skip remaining sticky attempts and move to next candidate. |
| terminal_delay_threshold_seconds | float | 300.0 | Hard ceiling. retryDelay above this halts the entire chain, because the server is signaling sustained outage. |
| total_retry_budget_seconds | float | 120.0 | Wall-clock budget for the entire retry chain across all candidates. |
- Sticky phase: On 429/503, retry the same model up to sticky_retry_attempts times, honoring RetryInfo.retryDelay (capped by sticky_retry_max_delay_seconds).
- Fallback phase: If sticky retries are exhausted, walk fallback_models in order, trying each once.
- Terminal: If any retryDelay exceeds terminal_delay_threshold_seconds, or the wall clock exceeds total_retry_budget_seconds, stop and return the error to the client.
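The three phases can be simulated with a small pure function. This is a simplified model, not GeminiAddon's code: `send(model)` is a hypothetical callable that returns None on success or the server's retryDelay (seconds) on a capacity error, and the wall clock is approximated by the sum of waits.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CapacityConfig:
    fallback_models: list[str] = field(default_factory=list)
    sticky_retry_attempts: int = 3
    sticky_retry_max_delay_seconds: float = 60.0
    terminal_delay_threshold_seconds: float = 300.0
    total_retry_budget_seconds: float = 120.0


def run_chain(original_model: str, cfg: CapacityConfig,
              send: Callable[[str], float | None]) -> tuple[str | None, list[str]]:
    """Return (model that succeeded or None, list of models attempted)."""
    attempts: list[str] = []
    waited = 0.0  # simulated wall clock: only retry waits are counted

    def attempt(model: str) -> float | None:
        attempts.append(model)
        return send(model)

    delay = attempt(original_model)
    sticky_left = cfg.sticky_retry_attempts
    # Sticky phase: retry the same model while the requested delay is tolerable.
    while delay is not None and sticky_left > 0:
        if delay > cfg.terminal_delay_threshold_seconds:
            return None, attempts  # sustained outage: halt the whole chain
        if delay > cfg.sticky_retry_max_delay_seconds:
            break                  # too long for a sticky retry: fall through
        if waited + delay > cfg.total_retry_budget_seconds:
            return None, attempts  # retry budget exhausted
        waited += delay            # a real implementation would sleep here
        sticky_left -= 1
        delay = attempt(original_model)
    if delay is None:
        return original_model, attempts
    # Fallback phase: try each fallback model once, in order.
    for model in cfg.fallback_models:
        if delay > cfg.terminal_delay_threshold_seconds:
            return None, attempts
        delay = attempt(model)
        if delay is None:
            return model, attempts
    return None, attempts
```

With the defaults, a model that keeps answering with a 1-second retryDelay is tried four times (one initial attempt plus three sticky retries) before the first fallback is attempted.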
The default lightllm.transforms list is empty: routing comes from sentinel-key resolution against the providers map. When a sentinel key arrives, ccproxy resolves the matching Provider, sets ctx.metadata.auth_provider, and either redirects (incoming format matches the provider type) or cross-transforms via lightllm (formats differ). Most users never need a TransformOverride.
lightllm.transforms is an ordered list of TransformOverride entries layered on top of Provider auto-routing. The first regex match wins. Use overrides for edge cases — bypassing auth for a specific host, forcing a particular destination for a path/model combo, etc.
ccproxy:
  lightllm:
    transforms:
      # Bypass interception for a host: forward unchanged to its original destination.
      - action: passthrough
        match_host: cloudcode-pa\.googleapis\.com
      # Force a specific provider for a path. dest_provider resolves to providers["anthropic"]
      # for host/path/auth — no separate api-key reference is required.
      - match_path: ^/v1/messages$
        action: redirect
        dest_provider: anthropic
      # Cross-format transform: OpenAI-shape requests for gpt-4o get rewritten to Anthropic's
      # /v1/messages format and routed through providers["anthropic"].
      - match_path: ^/v1/chat/completions$
        match_model: ^gpt-4o
        action: transform
        dest_provider: anthropic
        dest_model: claude-haiku-4-5-20251001
| Field | Type | Default | Description |
|---|---|---|---|
| action | string | redirect | redirect: rewrite destination, preserve body (same-format). transform: rewrite both destination and body via lightllm (cross-format). passthrough: forward unchanged. |
| match_host | regex | — | Optional. Matched against pretty_host, the Host header, and X-Forwarded-Host. |
| match_path | regex | .* | Matched against the request path. |
| match_model | regex | — | Matched against glom(body, "model"). |
| dest_provider | string | — | ccproxy provider name. Resolves to a providers entry for host/path/auth/format. The provider's auth is applied automatically; no separate api-key field is required. |
| dest_model | string | — | Rewrites body['model']. Only used in transform mode. |
| dest_host | string | — | Raw host override. Bypasses Provider lookup. |
| dest_path | string | — | Raw path override. Bypasses Provider lookup. |
| dest_vertex_project | string | — | GCP project ID for Vertex AI transforms. Required for context caching with vertex_ai/vertex_ai_beta providers. |
| dest_vertex_location | string | — | GCP region for Vertex AI transforms (e.g. us-central1). |
match_* fields are full regex (compiled with re.compile). All match fields are optional and ANDed together. A rule with no match fields matches every request — use as a catch-all at the end of the list. Auth resolves via dest_provider lookup; there is no separate api-key reference field.
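First-match-wins evaluation with ANDed match fields can be sketched as below. `Rule`, `rule_matches`, and `first_match` are hypothetical names, and using `re.search` (rather than `re.match` or `re.fullmatch`) is an assumption the source does not confirm.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Rule:
    action: str = "redirect"
    match_host: str | None = None
    match_path: str = ".*"
    match_model: str | None = None
    dest_provider: str | None = None


def rule_matches(rule: Rule, hosts: list[str], path: str, model: str | None) -> bool:
    """All configured match fields must agree (logical AND)."""
    # `hosts` holds every candidate: pretty_host, Host, X-Forwarded-Host.
    if rule.match_host is not None and not any(re.search(rule.match_host, h) for h in hosts):
        return False
    if not re.search(rule.match_path, path):
        return False
    if rule.match_model is not None and not re.search(rule.match_model, model or ""):
        return False
    return True


def first_match(rules: list[Rule], hosts: list[str], path: str,
                model: str | None) -> Rule | None:
    """Return the first rule in list order that matches, or None."""
    for rule in rules:
        if rule_matches(rule, hosts, path, model):
            return rule
    return None
```

A rule with no match fields set matches every request here as well, because `match_path` defaults to `.*`.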
ccproxy:
  inspector:
    port: 8083
    cert_dir: ~/.config/ccproxy
    provider_map:
      api.anthropic.com: anthropic
      api.openai.com: openai
      generativelanguage.googleapis.com: google
      openrouter.ai: openrouter
    mitmproxy:
      ssl_insecure: true
      web_host: 127.0.0.1
      web_password: null
      web_open_browser: false
      ignore_hosts: []
      allow_hosts: []
      stream_large_bodies: null
      body_size_limit: null
      termlog_verbosity: warn
      flow_detail: 0
| Field | Type | Default | Description |
|---|---|---|---|
| port | int | 8083 | mitmweb UI listen port |
| cert_dir | path | — | mitmproxy CA certificate store directory. Populates mitmproxy.confdir. |
| provider_map | map | — | Hostname → gen_ai.system value for OTel span attributes |
The inspector.mitmproxy block passes options directly to mitmproxy's OptManager via --set flags:
| Field | Type | Default | Description |
|---|---|---|---|
| ssl_insecure | bool | true | Skip upstream TLS certificate verification |
| web_host | string | 127.0.0.1 | mitmweb browser UI bind address |
| web_password | string | — | mitmweb UI password. Plain string, or a file/command credential source dict. null generates a random token on each startup. |
| web_open_browser | bool | false | Auto-open browser when mitmweb starts |
| ignore_hosts | list | [] | Regex patterns for hosts to bypass (no TLS interception) |
| allow_hosts | list | [] | Regex patterns for hosts to intercept (exclusive allowlist) |
| stream_large_bodies | string | — | Stream bodies larger than this threshold. null disables streaming so the transform handler can inspect and rewrite all bodies. |
| body_size_limit | string | — | Hard limit on buffered body size. Bodies exceeding this are dropped. null means unlimited. |
| termlog_verbosity | string | warn | mitmproxy terminal log level: debug, info, warn, error |
| flow_detail | int | 0 | Flow output verbosity: 0=none, 1=url+status, 2=headers, 3=truncated body, 4=full body |
Before ccproxy accepts traffic, it verifies it can reach the open internet. This catches broken routes, DNS failures, missing CA bundles, or namespace egress problems at startup — before any real requests are accepted. The probe is configured by three top-level fields (siblings of host/port, not under inspector):
ccproxy:
  verify_readiness_on_startup: true # false to skip (e.g. air-gapped environments)
  readiness_probe_url: "https://1.1.1.1/"
  readiness_probe_timeout_seconds: 5.0
At startup, ccproxy issues HEAD <url> via httpx. Any HTTP response (200, 301, 404) proves the full network stack works. Any exception is a hard failure: ccproxy refuses to start.
| Field | Type | Default | Description |
|---|---|---|---|
| verify_readiness_on_startup | bool | true | Master switch. false skips the probe entirely. |
| readiness_probe_url | string | https://1.1.1.1/ | Canary URL. Defaults to Cloudflare's 1.1.1.1 DNS (direct IP, no DNS resolution required, globally reliable). |
| readiness_probe_timeout_seconds | float | 5.0 | Total timeout budget. Short by design, since the probe is trivial. |
The daemon hosts a FastMCP streamable-HTTP server for flow inspection, shape capture, and Perplexity library tools. It binds its own internal port, and the proxy listener also forwards /mcp to it — MCP clients can use either http://{mcp.http.host}:{mcp.http.port}/mcp directly or http://{host}:{port}/mcp on the proxy socket. POST /mcp/notify on the proxy socket ingests fire-and-forget MCP terminal events into the notification buffer (drained by the inject_mcp_notifications hook; see docs/mcp.md).
ccproxy:
  mcp:
    http:
      enabled: true
      host: 127.0.0.1
      port: 4030
      auth: null
    buffer:
      max_events_per_task: 65536
      ttl_seconds: 600
| Field | Type | Default | Description |
|---|---|---|---|
| http.enabled | bool | true | Host the MCP server (and the /mcp forward on the proxy listener). |
| http.host | string | 127.0.0.1 | Internal bind address for the MCP uvicorn server. |
| http.port | int | 4030 | Internal bind port. |
| http.auth | auth source | null | Bearer token for MCP clients: plain string, or a command/file credential source. null disables auth. |
| buffer.max_events_per_task | int | 65536 | Notification buffer capacity per task; oldest events drop first on overflow. |
| buffer.ttl_seconds | int | 600 | Buffered events expire after this many seconds. |
POST /mcp/notify requires no auth (fire-and-forget contract): it always answers 200, with {"status": "ok"} on ingestion or {"status": "error"} for malformed payloads.
Request shaping stamps packaged or local compliance envelopes onto proxied requests. See shaping.md for the full reference.
ccproxy:
  shaping:
    enabled: true
    shapes_dir: ~/.config/ccproxy/shapes
    providers:
      anthropic:
        billing:
          salt: "${CCPROXY_BILLING_SALT}"
          seed: "${CCPROXY_BILLING_SEED}"
        content_fields:
          - model
          - messages
          - tools
          - tool_choice
          - system
          - thinking
          - context_management
          - stream
          - max_tokens
          - temperature
          - top_p
          - top_k
          - stop_sequences
        merge_strategies:
          system: "prepend_shape:2"
        shape_hooks:
          - ccproxy.shaping.regenerate
          - hook: ccproxy.shaping.caching.strip
            params:
              paths: ["system.*.cache_control"]
          - hook: ccproxy.shaping.caching.insert
            params:
              path: "system.-1.cache_control"
              value: {type: ephemeral}
        preserve_headers:
          - authorization
          - x-api-key
          - x-goog-api-key
          - host
        strip_headers:
          - authorization
          - x-api-key
          - x-goog-api-key
          - content-length
          - host
          - transfer-encoding
          - connection
        capture:
          path_pattern: "^/v1/messages"
shape_hooks entries are either bare module path strings or {hook, params} dicts for parameterized hooks. See shaping.md for the full shape hooks reference including the cache breakpoint hooks.
The Anthropic shaping profile includes a billing sub-block for the regenerate_billing_header shape hook. Both fields accept either literal values or ${VAR} environment variable references. When either resolves to None, the billing header regeneration silently no-ops.
shaping:
  providers:
    anthropic:
      billing:
        salt: "${CCPROXY_BILLING_SALT}" # Hex salt for SHA-256 cc_version suffix
        seed: "${CCPROXY_BILLING_SEED}" # xxhash64 seed for the 5-hex cch field
| Field | Type | Description |
|---|---|---|
| billing.salt | string | Hex salt for the SHA-256 cc_version 3-hex suffix. Supports ${VAR} expansion. |
| billing.seed | string | xxhash64 seed for the 5-hex cch field (hex, with or without 0x prefix). Supports ${VAR} expansion. |
The salt is a static reverse-engineered constant (it does not rotate per release). It is never committed — supply via ccproxy.yaml or the CCPROXY_BILLING_SALT / CCPROXY_BILLING_SEED environment variables.
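The "literal value or ${VAR} reference" behavior, and the no-op when either value is missing, might look like the sketch below. The helper names are hypothetical, and treating only whole-string `${VAR}` references as expandable is an assumption.

```python
from __future__ import annotations

import os
import re

_VAR_REF = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def resolve_setting(value: str | None, env: dict[str, str] | None = None) -> str | None:
    """Return a literal as-is, or look up a ${VAR} reference in the environment."""
    env = dict(os.environ) if env is None else env
    if value is None:
        return None
    match = _VAR_REF.match(value)
    if match:
        return env.get(match.group(1))  # None when the variable is unset
    return value


def billing_configured(salt: str | None, seed: str | None,
                       env: dict[str, str] | None = None) -> bool:
    """Billing header regeneration only runs when both values resolve."""
    return resolve_setting(salt, env) is not None and resolve_setting(seed, env) is not None
```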
| Field | Type | Description |
|---|---|---|
| enabled | bool | Enable/disable shaping globally (default true) |
| shapes_dir | string | Directory for .mflow overrides and provider patch queues |
| providers | map | Per-provider shaping profiles (see shaping.md) |
ccproxy:
  flows:
    default_jq_filters:
      - 'map(select(.request.path | startswith("/v1/messages")))'
| Field | Type | Description |
|---|---|---|
| default_jq_filters | list | jq flow-set selectors applied before CLI --jq filters. Each must consume a JSON array and produce one JSON array of flow objects. |
All CCPROXY_ prefixed environment variables override their corresponding YAML field. For example, CCPROXY_PORT=4001 overrides ccproxy.port.
| Variable | Description |
|---|---|
| CCPROXY_CONFIG_DIR | Override the config directory (takes precedence over ~/.config/ccproxy) |
| CCPROXY_HOST | Override the listen address |
| CCPROXY_PORT | Override the listen port |
| CCPROXY_LOG_LEVEL | Override log_level |
| CCPROXY_LOG_FILE | Override log_file |
| CCPROXY_JOURNAL_IDENTIFIER | Override journal_identifier |
| CCPROXY_BILLING_SALT | Hex salt for Anthropic billing header cc_version suffix |
| CCPROXY_BILLING_SEED | xxhash64 seed for Anthropic billing header cch field |
| MITMPROXY_SSLKEYLOGFILE | Path for TLS keylog (auto-exported by ccproxy start to {config_dir}/tls.keylog) |
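How the scalar overrides in this table layer on top of the YAML values can be sketched as follows. `apply_env_overrides` is a hypothetical helper; the variable-to-field mapping follows the table, while the type coercion (for example parsing CCPROXY_PORT as an int) is an assumption.

```python
def apply_env_overrides(config: dict, env: dict) -> dict:
    """Return a copy of `config` with CCPROXY_* overrides applied."""
    fields = {
        "CCPROXY_HOST": ("host", str),
        "CCPROXY_PORT": ("port", int),
        "CCPROXY_LOG_LEVEL": ("log_level", str),
        "CCPROXY_LOG_FILE": ("log_file", str),
        "CCPROXY_JOURNAL_IDENTIFIER": ("journal_identifier", str),
    }
    merged = dict(config)
    for var, (key, cast) in fields.items():
        if var in env:
            merged[key] = cast(env[var])  # environment wins over YAML
    return merged
```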