Expose xAI Grok through Hermes OAuth as an OpenAI-compatible local API.
A lightweight, local reverse proxy that exposes the xAI Grok API through your existing Hermes Agent OAuth session. No API key required — it reuses the browser-based OAuth tokens already stored in ~/.hermes/auth.json.
Built with FastAPI and designed to drop into any OpenAI-compatible client (e.g., LiteLLM, OpenAI Python SDK, curl).
| Area | Summary |
|---|---|
| Client surface | OpenAI-compatible /v1 routes for SDKs, LiteLLM, curl, and agents. |
| Credential source | Imports xAI OAuth from Hermes; no xAI API key is bundled or required. |
| Runtime safety | Fail-closed startup, loopback-first binding, PROXY_API_KEY required off-host. |
| Token lifecycle | Independent refresh loop, prewarm, Hermes auth watcher, sanitized diagnostics. |
| Operations | Desktop quickstart, headless export/import flow, systemd and LaunchAgent docs. |
The hero image summarizes the trust boundary: OpenAI-compatible clients call the local proxy, the proxy injects a short-lived xAI OAuth Bearer token upstream, and local auth state stays permission-locked and separate from public docs.
A healthy local run should prove both the local proxy and upstream OAuth path in one copy-paste block:
PROXY=http://127.0.0.1:9996; \
curl -fsS "$PROXY/health" && echo && \
curl -fsS "$PROXY/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-4.3","messages":[{"role":"user","content":"Reply with OK only."}]}'

Expected proof: /health returns status, provider, and token-expiry diagnostics; the chat request returns an OpenAI-compatible response from Grok.
- Zero-config OAuth: Automatically copies and manages xAI OAuth tokens from Hermes Agent.
- Independent token lifecycle: Runs its own token refresh loop so it never races with Hermes.
- Token prewarm: Refreshes the access token in the background before it expires.
- Hermes auth.json watcher: Detects re-authentication in Hermes and re-imports tokens automatically.
- Streaming support: Full SSE streaming for /v1/chat/completions.
- Upstream retry: Retries idempotent requests on 502/503/429 and transient connection failures; avoids duplicate-generating POST retries.
- Prometheus metrics: Built-in /metrics endpoint for request counts, durations, token expiry, OAuth refresh diagnostics, and 401 refresh-retry counters.
- Deep health checks: /health?deep=1 performs an actual upstream ping to verify end-to-end connectivity.
- Secure file permissions: Local token copy is written with 0o600 permissions.
Mobile-readable flow:
- OpenAI-compatible clients call the local /v1/* proxy.
- The proxy reads permission-locked Hermes-derived OAuth state and injects a short-lived xAI Bearer token.
- api.x.ai sees only the proxy's upstream request; client credentials and hop-by-hop headers are stripped.
┌─────────────────┐ HTTP ┌──────────────────────┐ HTTPS + Bearer ┌─────────────┐
│ Your Client │ ─────────────>│ Grok OAuth Proxy │ ─────────────────────>│ api.x.ai │
│ (LiteLLM, etc) │ OpenAI fmt │ (127.0.0.1:9996) │ OAuth token │ (xAI) │
└─────────────────┘ └──────────────────────┘ └─────────────┘
│
│ reads / refreshes
▼
┌──────────────┐
│ auth_state │
│ .json │ (copied from Hermes, 0o600)
└──────────────┘
- On startup, the proxy first verifies that the Hermes CLI is installed.
- It then verifies that Hermes has xai-oauth credentials in ~/.hermes/auth.json.
- It copies the OAuth tokens and public client_id claim from Hermes into a local auth_state.json.
- All subsequent token refreshes are performed independently against https://auth.x.ai/oauth2/token using that imported client id.
- Incoming requests are forwarded to https://api.x.ai/v1/* with the current Bearer token injected.
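The client_id import in the steps above can be sketched as follows. This is a minimal illustration, assuming the Hermes access token is a standard three-part JWT whose payload carries a client_id or aud claim. The helper names jwt_claims and client_id_from_token are hypothetical, and the payload is decoded without signature verification, which is only reasonable for reading a public claim from a token you already trust.

```python
import base64
import json
from typing import Any, Dict, Optional


def jwt_claims(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    payload = parts[1]
    # base64url payloads usually omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def client_id_from_token(token: str) -> Optional[str]:
    """Return the public client id from client_id, falling back to aud."""
    claims = jwt_claims(token)
    value = claims.get("client_id") or claims.get("aud")
    # aud may be a list of audiences; taking the first entry is an assumption.
    if isinstance(value, list):
        value = value[0] if value else None
    return value if isinstance(value, str) else None
```

Returning None rather than a default keeps the caller free to fail closed when no usable claim is present.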
- Python 3.9+
- An active Hermes Agent installation on the machine where the proxy will run, or a headless import created from a desktop Hermes install
- xAI Grok OAuth already configured in Hermes (xai-oauth in ~/.hermes/auth.json)
The proxy starts fail-closed: if Hermes CLI, Hermes auth, xai-oauth, or a runtime-importable OAuth client_id claim is missing, startup fails instead of falling back to anonymous or hard-coded credentials.
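A fail-closed check of this kind might look like the sketch below. It is not the project's actual startup code: the function names are hypothetical, the CLI is assumed to be on PATH as hermes, and the auth store is assumed to keep providers under a top-level providers object (the layout the verification one-liner in this README relies on).

```python
import json
import shutil
from pathlib import Path
from typing import Callable, List, Optional


def startup_problems(
    auth_path: Path,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Collect every reason startup should be refused; empty means OK."""
    problems = []
    # Assumption: the Hermes CLI binary is named "hermes".
    if which("hermes") is None:
        problems.append("Hermes CLI not found on PATH")
    if not auth_path.is_file():
        problems.append(f"Hermes auth store missing: {auth_path}")
        return problems
    try:
        data = json.loads(auth_path.read_text())
    except (OSError, ValueError):
        problems.append("Hermes auth store is unreadable or not valid JSON")
        return problems
    # Assumption: providers are keyed by name under "providers".
    providers = data.get("providers", {}) if isinstance(data, dict) else {}
    if "xai-oauth" not in providers:
        problems.append("xai-oauth provider not present in Hermes auth store")
    return problems


def require_prerequisites(auth_path: Path) -> None:
    """Fail closed: exit instead of falling back to other credentials."""
    problems = startup_problems(auth_path)
    if problems:
        raise SystemExit("startup refused: " + "; ".join(problems))
```

Collecting all problems before exiting gives one actionable message instead of a fix-and-retry loop.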
Use this path when the proxy runs on the same Mac where Hermes completed browser OAuth login.
git clone https://github.com/yelixir-dev/grok-oauth-proxy.git
cd grok-oauth-proxy
./install.sh
source .venv/bin/activate
python main.py

The proxy will start on http://127.0.0.1:9996 (it scans upward if the port is taken). Leave PROXY_HOST unset for local-only use.
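The upward port scan can be pictured with a short sketch. This is an illustration only: find_free_port is a hypothetical name, and the limit of 20 steps above the base port is taken from the PROXY_PORT description in this README.

```python
import socket


def find_free_port(host: str, base_port: int, max_steps: int = 20) -> int:
    """Return the first bindable port in base_port .. base_port + max_steps."""
    last_port = min(base_port + max_steps, 65535)
    for port in range(base_port, last_port + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken or not permitted; try the next one
            return port
    raise RuntimeError(f"no free port in {base_port}..{last_port}")
```

A probe like this is inherently racy, because another process can claim the port between the probe and the real bind, so a server would normally still handle a bind failure at startup.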
One-command smoke test:
PROXY=http://127.0.0.1:9996; \
curl -fsS "$PROXY/health" && echo && \
curl -fsS "$PROXY/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-4.3","messages":[{"role":"user","content":"Reply with OK only."}]}'

The first response proves the local server is healthy; the second proves the proxy can inject OAuth and receive a Grok response.
If you want the proxy to survive logouts/reboots on macOS, use the LaunchAgent recipe in services/README.md.
git clone https://github.com/yelixir-dev/grok-oauth-proxy.git
cd grok-oauth-proxy
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
python main.py

Use this path for a VPS, cloud instance, or server where browser OAuth login is not available. Export only the Hermes xai-oauth provider from a desktop machine with scripts/export_xai_oauth.py; do not copy your full ~/.hermes/auth.json. The export contains only the provider name, OAuth client claim, and token fields needed by scripts/import_xai_oauth.py.
# Desktop/browser machine after Hermes xAI OAuth login
git clone https://github.com/yelixir-dev/grok-oauth-proxy.git
cd grok-oauth-proxy
python scripts/export_xai_oauth.py > ~/xai-oauth.json
chmod 600 ~/xai-oauth.json
scp ~/xai-oauth.json user@your-server:/tmp/xai-oauth.json
# Headless server
git clone https://github.com/yelixir-dev/grok-oauth-proxy.git
cd grok-oauth-proxy
./install.sh --headless --enable-service
python scripts/import_xai_oauth.py /tmp/xai-oauth.json
rm -f /tmp/xai-oauth.json
curl -fsS "http://127.0.0.1:9996/health?deep=1"

For a remote bind, set a strong PROXY_API_KEY before starting the service; startup refuses non-loopback PROXY_HOST values without it. Prefer private networking (for example Tailscale/WireGuard/VPC) or a TLS reverse proxy in front of any non-loopback listener.
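The non-loopback refusal can be sketched with the standard ipaddress module. The function names are hypothetical, and treating unresolved hostnames as non-loopback is a conservative assumption rather than documented behavior.

```python
import ipaddress
from typing import Optional


def is_loopback_host(host: str) -> bool:
    """True for localhost and literal loopback addresses (127.0.0.1, ::1)."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: treat as non-loopback (conservative assumption).
        return False


def check_bind(host: str, proxy_api_key: Optional[str]) -> None:
    """Refuse a non-loopback bind unless a proxy key is configured."""
    if not is_loopback_host(host) and not proxy_api_key:
        raise SystemExit(f"refusing to bind {host} without PROXY_API_KEY")
```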
This project is designed to work on headless servers (VPS, cloud instances, containers, etc.) where you cannot open a browser for OAuth login. The safe pattern is: desktop OAuth login, minimal xai-oauth export, server import, stale proxy state reset, service restart, and desktop re-authentication if you want separate refresh-token chains.
For the most reliable long-running setup, give the proxy its own refresh-token chain instead of making Hermes and the proxy share one live chain:
Hermes local OAuth login
→ transfer the resulting xAI OAuth refresh-token chain to grok-oauth-proxy
(local proxy or headless server)
→ re-authenticate Hermes locally
→ Hermes and grok-oauth-proxy now refresh independently
Why: xAI/Grok access tokens are short-lived, and live testing showed refresh
tokens rotate when refreshed. A transferred chain can keep the proxy alive, while
a second Hermes login gives your desktop Hermes a separate chain. If xAI changes
its session policy later, fall back to one active owner and rerun
refresh_remote_xai_oauth.py whenever you re-authenticate Hermes.
- On a machine with a browser (your laptop or desktop):
  - Install Hermes
  - Run hermes model and complete xAI Grok OAuth login
  - Verify the token exists:

    python -c 'import json, pathlib; data=json.load(open(pathlib.Path.home()/".hermes/auth.json")); print("xai-oauth present:", "xai-oauth" in data.get("providers", {}))'

- Copy only the xAI OAuth credentials to the server (Recommended)

  On the machine with a browser, run:

    cd grok-oauth-proxy
    python scripts/export_xai_oauth.py > ~/xai-oauth.json
    chmod 600 ~/xai-oauth.json

  Copy the exported file to the server:

    scp ~/xai-oauth.json user@your-server:/tmp/xai-oauth.json

  On the headless server, import it:

    cd grok-oauth-proxy
    python scripts/import_xai_oauth.py /tmp/xai-oauth.json
    rm -f /tmp/xai-oauth.json
    chmod 700 ~/.hermes
    chmod 600 ~/.hermes/auth.json
    sudo systemctl restart grok-oauth-proxy
    curl -fsS "http://127.0.0.1:9996/health?deep=1"

  After the server is healthy, delete the desktop export too:

    rm -f ~/xai-oauth.json

  By default, import_xai_oauth.py also removes the proxy's stale local auth_state.json, so the next restart rehydrates from the newly imported Hermes credentials. Use --no-reset-proxy-state only if you intentionally want to leave the running proxy token state untouched.

  Or refresh a remote headless server in one step from your browser machine:

    python scripts/refresh_remote_xai_oauth.py \
      --host ubuntu@100.113.251.30 \
      --identity ~/.ssh/commandcode-bridge-debug/emil_commandcode_debug_ed25519 \
      --print-reauth-command

  The one-step helper exports only xai-oauth, copies it over SSH, imports it, resets stale proxy token state, restarts grok-oauth-proxy, and runs a deep health check. With --print-reauth-command, it also prints the final Hermes re-auth command for the recommended split-chain flow.

  This approach only exports the xai-oauth section, which is much safer than copying the entire ~/.hermes/auth.json.

  If the proxy later reports invalid_grant, repeated 401 refresh failures, or a stale refresh chain, re-run the transfer from a freshly authenticated desktop Hermes session. The default import path intentionally removes ~/.local/state/grok-oauth-proxy/auth_state.json (or GROK_PROXY_AUTH_STATE) so the next proxy restart uses the imported Hermes credentials. Do not pass --no-reset-proxy-state unless you are deliberately preserving the proxy's current token chain.

- On the headless server (Recommended)

    git clone https://github.com/yelixir-dev/grok-oauth-proxy.git
    cd grok-oauth-proxy
    # Basic headless install
    ./install.sh --headless
    # Or install + enable systemd service at once
    ./install.sh --headless --enable-service

  The install.sh --headless script will:
  - Check for the exported xai-oauth.json file
  - Create a virtual environment
  - Install dependencies
  - Import your xAI OAuth credentials

  On first start the proxy will:
  - Detect Hermes CLI
  - Read ~/.hermes/auth.json
  - Extract xai-oauth tokens + client_id from JWT claims
  - Create ~/.local/state/grok-oauth-proxy/auth_state.json (0o600)
systemd (Linux) example:
# /etc/systemd/system/grok-oauth-proxy.service
[Unit]
Description=Grok OAuth Proxy for Hermes
After=network.target
[Service]
Type=simple
User=youruser
WorkingDirectory=/home/youruser/grok-oauth-proxy
Environment=HOME=/home/youruser
Environment=HERMES_AUTH_PATH=/home/youruser/.hermes/auth.json
Environment=PATH=/home/youruser/grok-oauth-proxy/.venv/bin:/home/youruser/.local/bin:/usr/local/bin:/usr/bin:/bin
ExecStart=/home/youruser/grok-oauth-proxy/.venv/bin/python main.py
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable --now grok-oauth-proxy

macOS LaunchAgent example is documented in services/README.md.
Note: systemd services do not always inherit the interactive shell environment. The installer writes HOME, HERMES_AUTH_PATH, and a PATH that includes both the project virtualenv and ~/.local/bin so the service can find Hermes CLI on headless hosts.
For remote access, edit the rendered systemd service or a drop-in to include both a non-loopback host and a strong proxy key:
Environment=PROXY_HOST=0.0.0.0
Environment=PROXY_API_KEY=replace-with-a-long-random-value

Then call the proxy with either Authorization: Bearer PROXY_API_KEY_VALUE or X-Proxy-Api-Key: PROXY_API_KEY_VALUE. The proxy strips that client credential before forwarding and injects its own xAI OAuth bearer token upstream.
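Accepting the key from either header could be implemented roughly as below. This is a sketch under assumptions: the function names are hypothetical, and checking Authorization before X-Proxy-Api-Key is an arbitrary precedence chosen for the example.

```python
import hmac
from typing import Mapping, Optional


def presented_key(headers: Mapping[str, str]) -> Optional[str]:
    """Extract the client-supplied proxy key from either accepted header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, credential = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credential.strip():
        return credential.strip()
    return lowered.get("x-proxy-api-key") or None


def is_authorized(headers: Mapping[str, str], proxy_api_key: Optional[str]) -> bool:
    """Allow all requests when no key is configured; otherwise require a match."""
    if not proxy_api_key:
        return True
    candidate = presented_key(headers)
    if candidate is None:
        return False
    # Constant-time comparison avoids leaking the key through timing.
    return hmac.compare_digest(candidate.encode(), proxy_api_key.encode())
```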
All settings are optional and read from environment variables.
| Variable | Default | Description |
|---|---|---|
| PROXY_HOST | 127.0.0.1 | Bind address. Non-loopback binds require PROXY_API_KEY. |
| PROXY_PORT | 9996 | Base port. If occupied, scans +1 up to 20 times. |
| PROXY_API_KEY | unset | Optional local proxy auth key. Required when binding outside loopback. Accepted as Authorization: Bearer PROXY_API_KEY_VALUE or X-Proxy-Api-Key: PROXY_API_KEY_VALUE. |
| GROK_PROXY_AUTH_STATE | ~/.local/state/grok-oauth-proxy/auth_state.json | Local proxy-owned token state path. |
| LOG_LEVEL | INFO | Logging level: DEBUG, INFO, WARNING, ERROR |
| HERMES_AUTH_PATH | ~/.hermes/auth.json | Path to Hermes auth store |
| TOKEN_REFRESH_WINDOW | 300 | Seconds before expiry to trigger a background refresh |
| HERMES_POLL_INTERVAL | 60 | Seconds between Hermes auth.json change checks |
| UPSTREAM_RETRY_ATTEMPTS | 2 | Max attempts for idempotent upstream requests (GET, HEAD, OPTIONS, TRACE) and transient connection errors. Non-idempotent requests such as model-generating POST calls are not retried on 502/503/429 to avoid duplicate billing/side effects. A 401 token-refresh retry is still performed once. |
| UPSTREAM_RETRY_DELAY | 1.0 | Base delay in seconds between retries |
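The UPSTREAM_RETRY_* rows can be read as a small decision function. The sketch below covers only the status-based part of the policy (connection errors and the one-time 401 refresh retry are left out). The names are hypothetical, and the exponential growth of the delay is an assumption, since the table states only a base delay.

```python
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})
RETRYABLE_STATUSES = frozenset({429, 502, 503})


def should_retry(method: str, status: int, attempt: int, max_attempts: int = 2) -> bool:
    """Retry only idempotent requests with a retryable status, within budget.

    attempt is the number of attempts already made (1 after the first try).
    """
    if attempt >= max_attempts:
        return False
    return method.upper() in IDEMPOTENT_METHODS and status in RETRYABLE_STATUSES


def retry_delay(attempt: int, base_delay: float = 1.0) -> float:
    """Backoff before the next try; doubling per attempt is an assumption."""
    return base_delay * (2 ** (attempt - 1))
```

With the default of two attempts, a GET that receives a 503 is retried once, while a POST to /v1/chat/completions is never retried on that status.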
PROXY_PORT=8080 LOG_LEVEL=DEBUG python main.py

This proxy is path-transparent: anything under /{path:path} is forwarded to https://api.x.ai/{path} with the current Hermes xai-oauth bearer token injected. That makes it usable for the same direct-to-xAI surfaces described in the Hermes xAI Grok OAuth guide, not only chat.
Common xAI surfaces:
| Surface | Example path | Notes |
|---|---|---|
| Chat / responses-compatible clients | /v1/chat/completions, /v1/responses | Supports normal and streaming requests; client-supplied Authorization is stripped and replaced. |
| Models | /v1/models | Used by deep health checks and model discovery. |
| TTS | /v1/tts | Reuses the same OAuth bearer token when the upstream endpoint is available to the account. |
| Image generation | /v1/images/generations or xAI image endpoints | Path-transparent forwarding keeps non-chat xAI features available. |
| Video generation | xAI Grok Imagine video endpoints | Forwarded unchanged; large/streaming responses are streamed back. |
| Transcription / audio | xAI audio endpoints | Forwarded unchanged. |
| X Search via Responses | /v1/responses with xAI search tools | Works as a normal Responses API request when the account/provider supports it. |
Local management endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /{path:path} | Any | Proxies to https://api.x.ai/{path} |
| /health | GET | Proxy status, token expiry, and sanitized OAuth refresh diagnostics |
| /health?deep=1 | GET | Deep health: actually pings api.x.ai/v1/models |
| /metrics | GET | Prometheus-compatible metrics |
{
"status": "ok",
"provider": "xai-oauth",
"api_base": "https://api.x.ai",
"token_expires_at": "2026-05-17T11:46:33Z",
"token_seconds_to_expiry": 20891.2,
"token_endpoint": "https://auth.x.ai/oauth2/token",
"oauth_refresh": {
"last_refresh_at": "2026-05-17T05:46:33Z",
"last_refresh_status": "success",
"last_refresh_error_class": null,
"refresh_token_rotated": true,
"refresh_success_count": 4,
"refresh_failure_count": 0
}
}

Refresh diagnostics are intentionally sanitized: they expose timestamps, counts,
status, exception class, and whether the most recent successful refresh rotated
the refresh token, but never access tokens, refresh tokens, client credentials,
or upstream OAuth response bodies. /metrics exposes matching Prometheus series
including proxy_token_seconds_to_expiry, proxy_oauth_refresh_success_total,
proxy_oauth_refresh_failure_total, proxy_oauth_refresh_token_rotated, and
proxy_oauth_401_refresh_*_total counters for requests that received an upstream
401 and triggered the proxy's one-time forced-refresh retry.
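A monitoring script could turn the /health payload into warnings along these lines. This is a sketch: health_warnings is a hypothetical helper, the field names come from the sample response above, and treating a missing last_refresh_status as "no refresh yet" is an assumption.

```python
from typing import Any, Dict, List


def health_warnings(payload: Dict[str, Any], refresh_window: float = 300.0) -> List[str]:
    """Turn a /health payload into a list of human-readable warnings."""
    warnings = []
    if payload.get("status") != "ok":
        warnings.append(f"status is {payload.get('status')!r}")
    seconds_left = payload.get("token_seconds_to_expiry")
    if not isinstance(seconds_left, (int, float)) or seconds_left <= 0:
        warnings.append("token is expired or expiry is unknown")
    elif seconds_left < refresh_window:
        warnings.append("token is inside the refresh window")
    refresh = payload.get("oauth_refresh") or {}
    # Assumption: a missing status means no refresh has been attempted yet.
    if refresh.get("last_refresh_status") not in (None, "success"):
        warnings.append(f"last refresh failed: {refresh.get('last_refresh_error_class')}")
    return warnings
```

The payload can come from any HTTP client, for example by JSON-decoding the body of a GET to http://127.0.0.1:9996/health.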
The proxy is path-transparent, so the correct base_url depends on whether the client appends /v1 for you. Always use a dummy client API key unless PROXY_API_KEY is enabled; the proxy strips client credentials and injects the real xAI OAuth bearer token upstream.
| Client | Base URL to configure | API key value | Why |
|---|---|---|---|
| raw curl | http://127.0.0.1:9996 and include the full /v1/... path | omit for loopback, or use proxy key header when enabled | curl does not append paths automatically. |
| OpenAI Python SDK | http://127.0.0.1:9996/v1 | dummy for loopback, PROXY_API_KEY if proxy auth is enabled | the SDK appends endpoint paths such as /chat/completions. |
| LiteLLM | http://127.0.0.1:9996/v1 for OpenAI-compatible models | dummy for loopback, PROXY_API_KEY if proxy auth is enabled | LiteLLM's openai/ provider expects an OpenAI-style /v1 base. |
| Hermes OpenAI-compatible provider | http://127.0.0.1:9996/v1 | dummy for loopback, or a non-secret env var containing PROXY_API_KEY | Hermes will send normal OpenAI-compatible requests; the proxy handles xAI OAuth. |
OpenAI Python SDK:

from openai import OpenAI

# Point the SDK at the local proxy; the SDK appends /chat/completions itself.
client = OpenAI(
    base_url="http://127.0.0.1:9996/v1",
    api_key="dummy",  # proxy injects the real OAuth bearer token
)
response = client.chat.completions.create(
    model="grok-4.3",
    messages=[{"role": "user", "content": "Hello from the proxy"}],
)
print(response.choices[0].message.content)

LiteLLM config:

model_list:
  - model_name: grok-4.3
    litellm_params:
      model: openai/grok-4.3
      api_base: http://127.0.0.1:9996/v1
      api_key: "dummy"  # use PROXY_API_KEY here only if proxy auth is enabled

Use this proxy when you want Hermes to call xAI/Grok through an OpenAI-compatible provider without storing an xAI API key. Configure Hermes with an OpenAI-compatible provider whose base URL points at the proxy, and keep the provider API key as dummy for loopback-only use. If you expose the proxy beyond loopback, store the proxy key in an environment variable and reference that variable from Hermes config instead of putting the value in docs or source.
Illustrative Hermes provider shape (field names may differ by Hermes version):
providers:
  grok-oauth-proxy:
    type: openai_compatible
    base_url: http://127.0.0.1:9996/v1
    api_key_env: GROK_PROXY_CLIENT_KEY  # set to dummy locally, or to PROXY_API_KEY for remote proxy auth
    models:
      - grok-4.3

Hermes' native x_search tool is separate from ordinary OpenAI-compatible chat routing. For X Search through the xAI Responses API, send a normal /v1/responses request through this proxy only when the upstream account and model support xAI search tools. Do not assume every OAuth account has every Responses, image, audio, or search feature enabled; failures should be treated as upstream capability or account-policy issues, not token leakage.
Hermes and the proxy share the same xAI account and OAuth client identity, but not the same token state file:
- Hermes owns ~/.hermes/auth.json
- The proxy owns ~/.local/state/grok-oauth-proxy/auth_state.json by default (created on first start, chmod 600; override with GROK_PROXY_AUTH_STATE)
- The proxy does not ship an XAI_CLIENT_ID constant. It imports the public client id from Hermes token claims (client_id / aud) during first start and after Hermes re-authentication.
This means:
- Hermes can refresh its token without invalidating the proxy's session.
- The proxy can refresh its token without racing Hermes.
- If Hermes re-authenticates (new login), the background watcher detects the change and re-imports.
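Because the proxy keeps its own state file with owner-only permissions, each refresh has to rewrite that file safely. One common way to do this on POSIX systems is a temp-file-plus-rename write, sketched below. write_state is a hypothetical helper, not the project's token_manager code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_state(path: Path, state: Dict[str, Any]) -> None:
    """Write token state atomically with owner-only permissions."""
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # mkstemp creates the temp file with 0o600 in the target directory,
    # so the rename below stays on one filesystem and is atomic on POSIX.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A reader that opens the file mid-refresh sees either the old state or the new one, never a half-written token.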
Two asyncio tasks run continuously while the proxy is up:
- Token Prewarm Watcher: Checks token expiry every TOKEN_REFRESH_WINDOW / 2 seconds. If the token is about to expire, it refreshes proactively so real API calls never hit a stale token.
- Hermes File Watcher: Polls ~/.hermes/auth.json mtime every HERMES_POLL_INTERVAL seconds. On change, re-imports the latest xai-oauth credentials.
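The two watchers described above reduce to an expiry check and an mtime poll. The following is a minimal asyncio sketch with hypothetical names (needs_refresh, watch_mtime); the real tasks in main.py may be structured differently.

```python
import asyncio
import os
import time
from typing import Awaitable, Callable, Optional


def needs_refresh(expires_at: float, refresh_window: float, now: Optional[float] = None) -> bool:
    """True once the token is within refresh_window seconds of expiry."""
    current = time.time() if now is None else now
    return expires_at - current <= refresh_window


async def watch_mtime(
    path: str,
    interval: float,
    on_change: Callable[[], Awaitable[None]],
) -> None:
    """Poll a file's mtime and await on_change whenever it changes."""
    last_seen = os.stat(path).st_mtime_ns if os.path.exists(path) else None
    while True:
        await asyncio.sleep(interval)
        try:
            current = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            continue  # file temporarily missing; keep the last seen value
        if current != last_seen:
            last_seen = current
            await on_change()
```

A prewarm loop would call needs_refresh every TOKEN_REFRESH_WINDOW / 2 seconds, and watch_mtime would run with HERMES_POLL_INTERVAL as its interval and a re-import coroutine as on_change.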
See SECURITY.md for the private reporting route and supported security boundary.
- The proxy listens on 127.0.0.1 by default. If PROXY_HOST is set to a non-loopback address such as 0.0.0.0, startup is refused unless PROXY_API_KEY is configured.
- When PROXY_API_KEY is set, proxy requests must include either Authorization: Bearer PROXY_API_KEY_VALUE or X-Proxy-Api-Key: PROXY_API_KEY_VALUE. The client credential is stripped before forwarding; the proxy always injects its own xAI OAuth bearer token upstream.
- Hop-by-hop headers, incoming client credentials (Authorization, Proxy-Authorization, Connection, TE, etc.), cookies, and spoofable forwarding headers (Forwarded, X-Forwarded-*, X-Real-IP) are stripped before forwarding to api.x.ai.
- The local token state directory is created with 0o700 permissions when the proxy creates it, and auth_state.json is written atomically with 0o600 permissions. Existing token-state files are permission-repaired before reads when possible.
- Uvicorn access logs are disabled by default to avoid logging query strings; the app log records method/path/status only.
- The proxy uses the same OAuth client_id that Hermes obtained during xAI Grok OAuth login. The client id is imported from the local Hermes auth state at runtime, not hard-coded into the distributable source. This is technically a third-party client reuse; use at your own discretion with respect to xAI's Terms of Service.
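The header-stripping rule in the list above can be sketched as a pure function. The exact header set the proxy uses is not shown in this README, so STRIPPED below is an assumption built from the standard hop-by-hop headers plus the names listed above. Dropping Host is also an assumption, made so the HTTP client can set it for api.x.ai.

```python
from typing import Dict, Mapping

# Hop-by-hop headers, client credentials, cookies, and forwarding headers.
# The precise set is an assumption for illustration.
STRIPPED = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
    "authorization", "x-proxy-api-key", "cookie",
    "forwarded", "x-real-ip",
    "host",  # assumption: let the upstream client set Host itself
})


def upstream_headers(incoming: Mapping[str, str], bearer_token: str) -> Dict[str, str]:
    """Drop unsafe client headers and inject the proxy's own bearer token."""
    # Headers named in Connection are hop-by-hop too and must be dropped.
    named = {
        token.strip().lower()
        for key, value in incoming.items() if key.lower() == "connection"
        for token in value.split(",") if token.strip()
    }
    kept = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in STRIPPED
        and name.lower() not in named
        and not name.lower().startswith("x-forwarded-")
    }
    kept["Authorization"] = f"Bearer {bearer_token}"
    return kept
```

Building a fresh dictionary, rather than deleting from the incoming one, makes it harder for a new client header to slip through unnoticed.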
| Symptom | What to check |
|---|---|
| Startup says Hermes or xai-oauth is missing | Run Hermes xAI OAuth login on a desktop, or import a fresh xai-oauth export on the headless server. The proxy intentionally fails closed. |
| Startup refuses PROXY_HOST=0.0.0.0 | Set a strong PROXY_API_KEY and send it to the proxy as Authorization: Bearer PROXY_API_KEY_VALUE or X-Proxy-Api-Key: PROXY_API_KEY_VALUE. |
| OpenAI SDK gets 404 or double /v1/v1 paths | Use base_url=http://127.0.0.1:9996/v1 for SDK/LiteLLM/Hermes providers; use http://127.0.0.1:9996 only when you manually include the full /v1/... path. |
| Headless server keeps using old credentials | Restart the service after import and make sure stale proxy state was removed. Default import_xai_oauth.py removes ~/.local/state/grok-oauth-proxy/auth_state.json; avoid --no-reset-proxy-state for normal refreshes. |
| invalid_grant or repeated 401 refresh failures | xAI refresh tokens may have rotated or been invalidated. Re-authenticate Hermes on a browser machine, transfer a fresh xai-oauth export, restart the proxy, then re-authenticate desktop Hermes if using the split-chain flow. |
| /health?deep=1 fails but /health is ok | Local token state exists, but upstream xAI reachability, account capability, proxy auth, network egress, or service DNS/TLS may be failing. Check service logs without printing token files. |
Never paste ~/.hermes/auth.json, exported xai-oauth.json, auth_state.json, refresh tokens, or PROXY_API_KEY values into issues or logs.
Run the proxy in the foreground:

source .venv/bin/activate
python main.py

Run the test suite:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
pytest -q

Run in the background:

nohup python main.py > proxy.log 2>&1 &

Project layout:

grok-oauth-proxy/
├── main.py # FastAPI app, proxy logic, background watchers
├── token_manager.py # Async-safe OAuth token read / refresh
├── config.py # Environment variable configuration
├── requirements.txt
└── README.md
Contributions are welcome when they preserve the proxy's fail-closed security model. Before opening a change, run:
python -m pytest
python -m ruff check .

Security-sensitive changes should include tests for header stripping, non-loopback auth enforcement, token/credential redaction, and idempotent retry behavior. For private security reports, use the GitHub Security Advisory form at https://github.com/yelixir-dev/grok-oauth-proxy/security/advisories/new or email yelixir.dev@gmail.com.
MIT
