[Feat][Router] Support multiple API keys for multi-tenant authentication (#759) by prashansapkota · Pull Request #937 · vllm-project/production-stack

prashansapkota · 2026-04-29T04:57:06Z

Summary

Closes #759

Previously, vllmApiKey only accepted a single string value. Because vLLM's --api-key flag natively supports a comma-separated list of keys, users tried passing "key1,key2,key3" or a YAML list — but the Helm chart silently discarded everything after the first key, and the router had no enforcement layer at all. This blocked multi-tenant deployments where different teams need independent credentials to share the same vLLM instance.

This PR fixes the full stack, from Helm values through to the router's request validation.

What Changed

Helm (infrastructure layer)

helm/values.yaml

Extended the vllmApiKey schema annotation from type:[string, object] to type:[string, array, object]

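Added inline documentation with examples for each accepted form (a string holding one key or a comma-separated list of keys, a YAML list, or a reference to an existing Secret):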

vllmApiKey: "single-key"
vllmApiKey: "key1,key2,key3"
vllmApiKey: ["key1", "key2", "key3"]
vllmApiKey: {secretName: my-secret, secretKey: api-key}

helm/templates/secrets.yaml

Added a kindIs "slice" branch: when vllmApiKey is a YAML list, entries are joined with commas (join ",") before being base64-encoded into the Kubernetes Secret. The Secret always stores a single comma-separated string, the format vLLM natively expects for --api-key.
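The join-then-encode step can be modeled outside Helm. The following is a Python sketch of the behavior described above, not the template itself; `secret_data_value` is a hypothetical helper name introduced only for illustration.

```python
import base64


def secret_data_value(vllm_api_key):
    """Model of the secrets.yaml branches: return the base64 payload for the Secret.

    `secret_data_value` is a hypothetical name, not part of the chart.
    """
    if isinstance(vllm_api_key, (list, tuple)):
        # Counterpart of the kindIs "slice" branch: join entries with commas.
        raw = ",".join(vllm_api_key)
    elif isinstance(vllm_api_key, str):
        # Counterpart of the kindIs "string" branch: one key, or an
        # already comma-separated string, is stored unchanged.
        raw = vllm_api_key
    else:
        raise TypeError("expected a string or a list of strings")
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


encoded = secret_data_value(["team-a-key", "team-b-key"])
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # team-a-key,team-b-key
```

Under this model a YAML list and the equivalent comma-separated string produce the same Secret payload, which is why downstream consumers only ever see one format.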

helm/templates/deployment-vllm-multi.yaml and helm/templates/deployment-router.yaml

Extended the kindIs "string" guard to or (kindIs "string") (kindIs "slice") so that list-typed keys also correctly resolve the VLLM_API_KEY env var from the generated Secret.
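As a rough model of that guard, the Python sketch below picks where VLLM_API_KEY would come from for each value kind. The function name, the generated Secret name, and the key inside it are placeholders chosen for this example, and the behavior for an unset value (no env var at all) is an assumption; none of these are taken from the chart.

```python
def api_key_env_source(vllm_api_key, generated_secret="example-generated-secret"):
    """Return a secretKeyRef-like dict for VLLM_API_KEY, or None when unset.

    All names here are hypothetical placeholders for illustration.
    """
    if not vllm_api_key:
        # Assumption: nothing configured means no env var is injected.
        return None
    if isinstance(vllm_api_key, (str, list)):
        # String and list values both resolve to the chart-generated Secret.
        return {"name": generated_secret, "key": "example-key"}
    if isinstance(vllm_api_key, dict):
        # {secretName, secretKey} points at a Secret the user already manages.
        return {"name": vllm_api_key["secretName"], "key": vllm_api_key["secretKey"]}
    raise TypeError("unsupported vllmApiKey value")


print(api_key_env_source(["key1", "key2"]))
print(api_key_env_source({"secretName": "my-secret", "secretKey": "api-key"}))
```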

Router (application layer)

src/vllm_router/auth.py (new)

_parse_api_keys(raw): splits a comma-separated string into a frozenset, stripping whitespace and ignoring empty segments
get_allowed_api_keys(): reads VLLM_API_KEY from the environment at request time
verify_api_key(request): FastAPI dependency that:
- Is a no-op when VLLM_API_KEY is unset (preserves backward compatibility for unauthenticated deployments)
- Returns HTTP 401 if the Authorization header is missing or not a Bearer token
- Returns HTTP 401 if the token is not in the allowed set
- Accepts any of the configured keys, enabling true multi-tenant access

src/vllm_router/app.py

Wired verify_api_key as a router-level Depends on main_router: every inference endpoint (/v1/chat/completions, /v1/completions, /v1/embeddings, etc.) is protected automatically without touching individual route handlers.
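To show why a router-level dependency covers every endpoint, here is a framework-free toy model in Python. `MiniRouter` and `require_bearer` are hypothetical names and this is not FastAPI code; in FastAPI the same effect is typically obtained by passing a `dependencies=[Depends(...)]` list when creating or including an `APIRouter`.

```python
class MiniRouter:
    """Toy router whose shared dependencies run before every handler."""

    def __init__(self, dependencies=()):
        self.dependencies = list(dependencies)
        self.routes = {}

    def post(self, path):
        def register(handler):
            def guarded(request):
                # Each shared dependency may raise to reject the request.
                for dependency in self.dependencies:
                    dependency(request)
                return handler(request)

            self.routes[path] = guarded
            return handler

        return register


def require_bearer(request):
    # Simplified check used only to demonstrate the wiring.
    if not request.get("Authorization", "").startswith("Bearer "):
        raise PermissionError("401 Unauthorized")


main_router = MiniRouter(dependencies=[require_bearer])


@main_router.post("/v1/completions")
def completions(request):
    return "ok"


@main_router.post("/v1/embeddings")
def embeddings(request):
    return "ok"
```

Neither handler mentions authentication, yet both are guarded because the check lives on the router.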

Tests

src/tests/test_multi_api_key_auth.py (new, 24 tests)

| Group | Coverage |
| --- | --- |
| `_parse_api_keys` | single key, comma-separated, whitespace stripping, empty segments, empty string, whitespace-only |
| `get_allowed_api_keys` | no env var, single key, multiple keys, space trimming |
| `verify_api_key` | no auth configured → allow all, valid single key, valid key among multiple, invalid key → 401, missing header → 401, wrong scheme (Basic) → 401, whitespace in env keys |
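For a sense of how the environment-variable cases might be exercised, here is a small sketch. It uses `unittest` and `unittest.mock.patch.dict` so that it runs with the standard library only, whereas the PR's suite runs under pytest, and it re-declares a stand-in for `get_allowed_api_keys` instead of importing the real module.

```python
import os
import unittest
from unittest import mock


def get_allowed_api_keys():
    # Stand-in for the function under test, re-declared so the sketch runs alone.
    raw = os.environ.get("VLLM_API_KEY", "")
    return frozenset(key.strip() for key in raw.split(",") if key.strip())


class GetAllowedApiKeysTest(unittest.TestCase):
    def test_no_env_var(self):
        # clear=True empties os.environ inside the block and restores it after.
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertEqual(get_allowed_api_keys(), frozenset())

    def test_multiple_keys_with_spaces(self):
        with mock.patch.dict(os.environ, {"VLLM_API_KEY": " a , b ,, "}):
            self.assertEqual(get_allowed_api_keys(), frozenset({"a", "b"}))

    def test_value_is_read_on_each_call(self):
        with mock.patch.dict(os.environ, {"VLLM_API_KEY": "old"}):
            self.assertEqual(get_allowed_api_keys(), frozenset({"old"}))
            os.environ["VLLM_API_KEY"] = "new"
            self.assertEqual(get_allowed_api_keys(), frozenset({"new"}))
```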

How It Works End-to-End

helm install ... --set-json 'servingEngineSpec.vllmApiKey=["team-a-key","team-b-key"]'
        │
        ▼
secrets.yaml joins list → "team-a-key,team-b-key" → base64 → K8s Secret
        │
        ▼
deployment-vllm-multi.yaml injects VLLM_API_KEY="team-a-key,team-b-key" into vLLM container
deployment-router.yaml     injects VLLM_API_KEY="team-a-key,team-b-key" into router container
        │
        ▼
Router: verify_api_key dependency checks incoming Bearer token against frozenset({"team-a-key", "team-b-key"})
vLLM:   natively validates the same comma-separated key list via --api-key
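From the caller's side, the flow above means each team sends its own key as a Bearer token to the same endpoint. The sketch below only builds the requests and does not send them; the base URL, model name, and helper name are placeholders, not values from this PR.

```python
import json
import urllib.request


def build_completion_request(base_url, api_key, prompt):
    """Build (but do not send) a POST to /v1/completions with a Bearer token."""
    body = json.dumps({"model": "example-model", "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Two tenants, two different credentials, one shared router endpoint.
req_a = build_completion_request("http://router.example", "team-a-key", "hello")
req_b = build_completion_request("http://router.example", "team-b-key", "hello")
print(req_a.get_header("Authorization"))
print(req_b.get_header("Authorization"))
```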

Backward Compatibility

Deployments with no vllmApiKey set: no change (verify_api_key is a no-op)
Deployments with a single string key: no change (the existing code path is untouched)
Deployments referencing an existing Secret via {secretName, secretKey}: no change

Test Plan

pytest src/tests/test_multi_api_key_auth.py: all 24 tests pass
pytest src/tests/test_stale_metrics.py src/tests/test_static_service_discovery.py src/tests/test_utils.py: all existing tests pass (51 total)
black, isort: all modified files pass linting

…roject#888) Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>

…on (vllm-project#759)

Previously, vllmApiKey only accepted a single string, silently discarding any keys after the first comma. This prevented teams from sharing a vLLM deployment with independent credentials.

Changes:
- helm/values.yaml: extend vllmApiKey schema to accept string, list, or existing-secret object. List entries are joined as a comma-separated string before being stored in the Kubernetes Secret, which matches the format vLLM's --api-key flag natively expects.
- helm/templates/secrets.yaml: add kindIs "slice" branch to join a list of keys with commas before base64-encoding into the Secret.
- helm/templates/deployment-vllm-multi.yaml, helm/templates/deployment-router.yaml: extend the kindIs "string" guard to also match "slice" so list-typed keys are correctly mapped to the VLLM_API_KEY env var from the generated Secret.
- src/vllm_router/auth.py (new): self-contained auth module with _parse_api_keys(), get_allowed_api_keys(), and verify_api_key() FastAPI dependency. Reads VLLM_API_KEY, splits on commas, strips whitespace, and returns HTTP 401 for missing or invalid Bearer tokens. No-ops when VLLM_API_KEY is unset.
- src/vllm_router/app.py: wire verify_api_key as a router-level dependency on main_router so every inference endpoint is protected without modifying individual route handlers.
- src/tests/test_multi_api_key_auth.py (new): 24 tests covering key parsing, env var reading, and the FastAPI dependency (valid key, multi-key, invalid key → 401, missing header → 401, wrong scheme → 401, whitespace tolerance, no-auth passthrough).

Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>

gemini-code-assist

Code Review

This pull request introduces support for multiple API keys to enable multi-tenant access and addresses the issue of stale Prometheus metrics. Key changes include updating Helm templates to handle vllmApiKey as either a string or a list, implementing a new authentication module that parses multiple keys from environment variables, and adding a mechanism to clear label-based gauges in the metrics router. Additionally, a healthy field was added to EndpointInfo for better status tracking. Feedback was provided to improve the robustness of the Authorization header check in the authentication dependency.

gemini-code-assist · 2026-04-29T04:59:55Z

+    if not allowed_keys:
+        return
+
+    auth_header = request.headers.get("Authorization", "")


The request.headers.get call defaults to an empty string, but auth_header.startswith will fail if the header is missing or empty. While the logic handles this, it is cleaner to check for the presence of the header explicitly.

Suggested change

-    auth_header = request.headers.get("Authorization", "")
+    auth_header = request.headers.get("Authorization")
+    if not auth_header or not auth_header.startswith("Bearer "):

prashansapkota · 2026-04-29T05:13:36Z

Addressed the review feedback:

auth.py: Changed request.headers.get("Authorization", "") to request.headers.get("Authorization") and combined the missing-header check into if not auth_header or not auth_header.startswith("Bearer "). This makes the absence of the header explicit rather than relying on an empty string falling through startswith.
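For reference, the two forms can be compared directly. The sketch below assumes the earlier code tested `not auth_header.startswith("Bearer ")` on the defaulted value; both helper names are hypothetical. Since `"".startswith("Bearer ")` evaluates to `False` rather than raising, the two forms reject the same inputs, and the revision's benefit is that the missing-header case is spelled out.

```python
def rejects_with_default(header_value):
    # Models headers.get("Authorization", ""): a missing header becomes "".
    auth_header = header_value if header_value is not None else ""
    return not auth_header.startswith("Bearer ")


def rejects_with_explicit_check(header_value):
    # Models headers.get("Authorization"): a missing header stays None.
    auth_header = header_value
    return not auth_header or not auth_header.startswith("Bearer ")


for value in (None, "", "Basic abc", "Bearer team-a-key"):
    # Both variants agree on every case, including the missing header.
    assert rejects_with_default(value) == rejects_with_explicit_check(value)
    print(repr(value), rejects_with_explicit_check(value))
```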

…tartswith Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>

jeremych1000 · 2026-04-30T17:59:14Z

This is great, thanks. Would love to see this.

prashansapkota added 2 commits April 28, 2026 22:51

fix(router): clear stale gauge labels on metrics scrape (fixes vllm-p…

a8a1743


gemini-code-assist Bot reviewed Apr 29, 2026


fix(auth): explicitly check for missing Authorization header before s…

174b606


prashansapkota force-pushed the feat-multi-api-key-759 branch from 9ac7ddd to 174b606 on April 29, 2026 05:41

prashansapkota wants to merge 3 commits into vllm-project:main from prashansapkota:feat-multi-api-key-759
