Use Claude Code with powerful NVIDIA-hosted models, for free. Skip the $100–$200/month Anthropic subscription: point Claude Code at this proxy and get access to large frontier models (70B, 405B+) using NVIDIA's free NIM preview tier.

Claude Code is Anthropic's official AI coding agent and one of the most capable on the market. By default it requires an Anthropic API subscription that can cost $100–$200+ per month for heavy usage.

This proxy changes that. It sits between Claude Code and the internet, translating Anthropic API calls into OpenAI-compatible requests and forwarding them to NVIDIA NIM, which offers a free preview tier with access to large open-weight models (Llama 3.1 405B, Mistral, GLM, and more).
```
Claude Code / VS Code Extension
            │ Anthropic API format
            ▼
  ┌───────────────────────┐
  │  nvidia-claude-proxy  │   ← this repo (runs locally on port 8089)
  └───────────────────────┘
            │ OpenAI API format
            ▼
  NVIDIA NIM API (free tier)
```
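The request translation in the middle of that diagram can be sketched in a few lines. This is an illustrative simplification, not the repo's actual `translation.py`: it maps only string system prompts and text messages, and ignores tools and images.

```python
def anthropic_to_openai(body: dict) -> dict:
    """Map a minimal Anthropic Messages request to OpenAI chat format.

    Sketch only: the real proxy also translates tool definitions,
    tool results, and image blocks.
    """
    messages = []
    system = body.get("system")
    if isinstance(system, str):
        # Anthropic carries the system prompt as a top-level field;
        # OpenAI expects it as the first chat message.
        messages.append({"role": "system", "content": system})
    for msg in body.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            # Flatten Anthropic content blocks down to plain text.
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": body["model"],
        "max_tokens": body.get("max_tokens", 1024),
        "messages": messages,
        "stream": body.get("stream", False),
    }
```

The same shape works in reverse for responses, where the proxy wraps OpenAI `choices` back into Anthropic `content` blocks.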
| Without this proxy | With this proxy |
|---|---|
| Pay $100–$200/month for Claude API | Free NVIDIA NIM preview models |
| Limited to Anthropic's model lineup | Access to 70B–405B+ open-weight models |
| Single provider dependency | Any OpenAI-compatible backend |
| No request visibility | Real-time dashboard with logs & analytics |
- Full Anthropic → OpenAI request translation
  - System prompts (string or block array)
  - User / assistant messages (plain text, multi-part, images via base64 or URL)
  - Tool definitions (`tools`, `tool_choice`)
  - Assistant messages with tool calls
  - Tool result messages
- Full OpenAI → Anthropic response translation
  - Non-streaming JSON responses
  - Streaming SSE (`message_start`, `content_block_start/delta/stop`, `message_delta`, `message_stop`)
  - Tool-call streaming with `input_json_delta` events
- Configuration via `.env` (no JSON config file needed)
- Optional inbound auth: protect the proxy with a `SERVER_API_KEY`
- Structured logging: request summaries, forwarded headers (key redacted), body previews
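The streaming translation above can be pictured as a small event pipeline: each OpenAI delta chunk becomes one or more Anthropic SSE events. A hedged sketch of the text-only path (the real proxy also emits tool-call `input_json_delta` and usage events):

```python
def openai_chunks_to_anthropic_events(chunks):
    """Yield (event_name, payload) pairs in Anthropic SSE order.

    Sketch of the text-only path: message_start, a single text content
    block, then message_delta/message_stop. Tool-call deltas omitted.
    """
    yield ("message_start", {"type": "message_start"})
    yield ("content_block_start",
           {"type": "content_block_start", "index": 0,
            "content_block": {"type": "text", "text": ""}})
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            # One OpenAI content delta maps to one text_delta event.
            yield ("content_block_delta",
                   {"type": "content_block_delta", "index": 0,
                    "delta": {"type": "text_delta", "text": text}})
    yield ("content_block_stop", {"type": "content_block_stop", "index": 0})
    yield ("message_delta",
           {"type": "message_delta", "delta": {"stop_reason": "end_turn"}})
    yield ("message_stop", {"type": "message_stop"})
```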
```
python-proxy/
├── main.py            # FastAPI app & HTTP handlers
├── config.py          # .env loader & ServerConfig dataclass
├── models.py          # Pydantic models (Anthropic + OpenAI schemas)
├── translation.py     # Conversion logic (Anthropic ↔ OpenAI)
├── requirements.txt   # Python dependencies
├── .env               # Your local config (gitignored)
└── .env.example       # Template: copy to .env and fill in values
```
- Python 3.11+
- `pip`
- An NVIDIA NIM API key (free tier available)

Free models: NVIDIA offers a generous set of free preview models (no billing required). Browse the full list at build.nvidia.com/models → Free Preview and use any model ID (e.g. `z-ai/glm4.7`) as the value for `YOUR_MODEL_HERE` in your settings.
```shell
git clone https://github.com/Jeanrooy/nvidia-claude-proxy.git
cd nvidia-claude-proxy
pip install -r requirements.txt
```

Copy the example and fill in your values:

```shell
cp .env.example .env
```

Edit `.env`:
```
# Address to listen on
ADDR=127.0.0.1:8089

# NVIDIA NIM endpoint (or any OpenAI-compatible URL)
UPSTREAM_URL=https://integrate.api.nvidia.com/v1/chat/completions

# Your NVIDIA API key
PROVIDER_API_KEY=nvapi-your-key-here

# Optional: protect this proxy with a Bearer / x-api-key token
SERVER_API_KEY=

# Upstream request timeout in seconds (non-streaming)
UPSTREAM_TIMEOUT_SECONDS=300

# Max chars to log for request/response bodies (0 = disabled)
LOG_BODY_MAX_CHARS=4096

# Max chars of streamed text to preview in logs (0 = disabled)
LOG_STREAM_TEXT_PREVIEW_CHARS=256
```

Run the proxy:

```shell
python main.py
```

Or via uvicorn directly (with auto-reload for development):

```shell
uvicorn main:app --host 127.0.0.1 --port 8089 --reload
```

The proxy exposes the same endpoint as the Anthropic API:
| Method | Path | Description |
|---|---|---|
| `GET` | `/` | Health check |
| `POST` | `/v1/messages` | Anthropic Messages API proxy |
Streaming request:

```shell
curl -N http://127.0.0.1:8089/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "z-ai/glm4.7",
    "max_tokens": 256,
    "stream": true,
    "messages": [{"role": "user", "content": "hello"}]
  }'
```

Non-streaming request:

```shell
curl http://127.0.0.1:8089/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "z-ai/glm4.7",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'
```

If `SERVER_API_KEY` is set in `.env`, every request must include a matching key:

```shell
curl http://127.0.0.1:8089/v1/messages \
  -H "Authorization: Bearer your-server-api-key" \
  -H "Content-Type: application/json" \
  -d '{ ... }'
```

| Variable | Default | Description |
|---|---|---|
| `ADDR` | `127.0.0.1:8089` | host:port the server listens on |
| `UPSTREAM_URL` | (required) | OpenAI-compatible upstream endpoint |
| `PROVIDER_API_KEY` | (required) | API key forwarded to the upstream (as Bearer token) |
| `SERVER_API_KEY` | (empty = disabled) | Protect this proxy with inbound auth |
| `UPSTREAM_TIMEOUT_SECONDS` | `300` | Timeout for non-streaming upstream requests |
| `LOG_BODY_MAX_CHARS` | `4096` | Max chars logged for bodies (0 = disabled) |
| `LOG_STREAM_TEXT_PREVIEW_CHARS` | `256` | Max chars of streamed text previewed in logs |
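Loading these variables amounts to reading `.env` line by line and letting real environment variables win. A rough sketch of what a loader like `config.py` might do; the names mirror the table above, but the repo's actual dataclass may differ:

```python
import os
from dataclasses import dataclass


@dataclass
class ServerConfig:
    addr: str
    upstream_url: str
    provider_api_key: str
    server_api_key: str           # empty string = inbound auth disabled
    upstream_timeout_seconds: int
    log_body_max_chars: int


def load_config(env_path: str = ".env") -> ServerConfig:
    """Read KEY=VALUE lines from .env; process environment overrides the file."""
    values: dict[str, str] = {}
    if os.path.exists(env_path):
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, val = line.partition("=")
                    values[key.strip()] = val.strip()
    values.update(os.environ)  # env vars take precedence over the file
    return ServerConfig(
        addr=values.get("ADDR", "127.0.0.1:8089"),
        upstream_url=values.get("UPSTREAM_URL", ""),
        provider_api_key=values.get("PROVIDER_API_KEY", ""),
        server_api_key=values.get("SERVER_API_KEY", ""),
        upstream_timeout_seconds=int(values.get("UPSTREAM_TIMEOUT_SECONDS", "300")),
        log_body_max_chars=int(values.get("LOG_BODY_MAX_CHARS", "4096")),
    )
```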
There are two ways to configure Claude Code to use this proxy; choose whichever fits your workflow.

This is the cleanest approach and works for both the Claude Code CLI and the Claude Code VS Code extension. Each project that should use the proxy gets its own `.claude/` folder.

```shell
# From your target project root
cp -r /path/to/nvidia-claude-proxy/.claude .claude
```

Or manually create `.claude/settings.json` in your project with the following content (a ready-to-copy template is also shipped in this repo at `.claude/settings.json`):
```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8089",
    "ANTHROPIC_API_KEY": "YOUR_NVIDIA_API_KEY_HERE",
    "ANTHROPIC_AUTH_TOKEN": "YOUR_NVIDIA_API_KEY_HERE",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "YOUR_MODEL_HERE",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "YOUR_MODEL_HERE",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "YOUR_MODEL_HERE"
  },
  "model": "YOUR_MODEL_HERE"
}
```

Replace every `YOUR_*` placeholder with your real values, e.g.:
| Placeholder | Example value |
|---|---|
| `YOUR_NVIDIA_API_KEY_HERE` | `nvapi-xxxxxxxxxxxxxxxxxxxx` |
| `YOUR_MODEL_HERE` | `z-ai/glm4.7` |
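Filling the placeholders can also be scripted. A small hypothetical helper (the repo ships no such script) that stamps real values into the template dictionary before writing it out:

```python
import json


def fill_settings(template: dict, api_key: str, model: str) -> dict:
    """Replace YOUR_* placeholder strings in a settings template.

    Hypothetical helper for illustration; walks nested dicts and swaps
    placeholder values for the real key and model ID.
    """
    replacements = {
        "YOUR_NVIDIA_API_KEY_HERE": api_key,
        "YOUR_MODEL_HERE": model,
    }

    def subst(value):
        if isinstance(value, str):
            return replacements.get(value, value)
        if isinstance(value, dict):
            return {k: subst(v) for k, v in value.items()}
        return value

    return subst(template)


template = {
    "env": {
        "ANTHROPIC_BASE_URL": "http://localhost:8089",
        "ANTHROPIC_API_KEY": "YOUR_NVIDIA_API_KEY_HERE",
        "ANTHROPIC_DEFAULT_SONNET_MODEL": "YOUR_MODEL_HERE",
    },
    "model": "YOUR_MODEL_HERE",
}
print(json.dumps(fill_settings(template, "nvapi-xxxx", "z-ai/glm4.7"), indent=2))
```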
`settings.json` is safe to commit (it contains no real secrets when you use the template).
`settings.local.json` must never be committed; add it to your project's `.gitignore`:

```
# Claude Code local overrides: contains real API keys
.claude/settings.local.json
```

Then start the proxy and Claude Code in two terminals:

```shell
# Terminal 1: run the proxy
cd /path/to/nvidia-claude-proxy
python main.py

# Terminal 2: use Claude Code CLI in your project
cd /path/to/your-project
claude
```

Or simply open your project in VS Code; the Claude Code extension automatically reads `.claude/settings.json` from the workspace root.
Tip: You only need to run the proxy once per machine session. Any number of projects can point at it simultaneously.
If you just want to test the proxy without modifying any project:
```shell
# Windows (PowerShell)
$env:ANTHROPIC_BASE_URL = "http://127.0.0.1:8089"
$env:ANTHROPIC_API_KEY = "nvapi-your-key"
claude

# Linux / macOS
ANTHROPIC_BASE_URL=http://127.0.0.1:8089 ANTHROPIC_API_KEY=nvapi-your-key claude
```

- The `PROVIDER_API_KEY` is never logged; only `<redacted>` appears in logs.
- Inbound auth uses `hmac.compare_digest` to prevent timing attacks.
- Base64 image payloads are validated before forwarding.
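The constant-time comparison mentioned above is a one-liner with the standard library. A sketch of how such an inbound auth check typically looks; the function name is illustrative, not the proxy's actual code:

```python
import hmac


def is_authorized(provided_key: str, server_api_key: str) -> bool:
    """Check an inbound key in constant time to avoid timing side channels."""
    if not server_api_key:
        # Empty SERVER_API_KEY means inbound auth is disabled.
        return True
    # hmac.compare_digest takes the same time regardless of where the
    # first mismatching byte is, unlike a plain == comparison.
    return hmac.compare_digest(
        provided_key.encode("utf-8"),
        server_api_key.encode("utf-8"),
    )
```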
