Skip to content

[Feat][Router] Support multiple API keys for multi-tenant authentication (#759)#937

Open
prashansapkota wants to merge 3 commits intovllm-project:mainfrom
prashansapkota:feat-multi-api-key-759
Open

[Feat][Router] Support multiple API keys for multi-tenant authentication (#759)#937
prashansapkota wants to merge 3 commits intovllm-project:mainfrom
prashansapkota:feat-multi-api-key-759

Conversation

@prashansapkota
Copy link
Copy Markdown

@prashansapkota prashansapkota commented Apr 29, 2026

Summary

Closes #759

Previously, vllmApiKey only accepted a single string value. Because vLLM's --api-key flag natively supports a comma-separated list of keys, users tried passing "key1,key2,key3" or a YAML list — but the Helm chart silently discarded everything after the first key, and the router had no enforcement layer at all. This blocked multi-tenant deployments where different teams need independent credentials to share the same vLLM instance.

This PR fixes the full stack. From Helm values through to the router's request validation.

What Changed

Helm (infrastructure layer)

helm/values.yaml

  • Extended the vllmApiKey schema annotation from type:[string, object] to type:[string, array, object]
  • Added inline documentation with examples for all three accepted forms:
    vllmApiKey: "single-key"
    vllmApiKey: "key1,key2,key3"
    vllmApiKey: ["key1", "key2", "key3"]
    vllmApiKey: {secretName: my-secret, secretKey: api-key}

helm/templates/secrets.yaml

  • Added a kindIs "slice" branch: when vllmApiKey is a YAML list, entries are joined with commas (join ",") before being base64-encoded into the Kubernetes Secret. The Secret always stores a single comma-separated string, the format vLLM natively expects for --api-key.

helm/templates/deployment-vllm-multi.yaml and helm/templates/deployment-router.yaml

  • Extended the kindIs "string" guard to or (kindIs "string") (kindIs "slice") so that list-typed keys also correctly resolve the VLLM_API_KEY env var from the generated Secret.

Router (application layer)

src/vllm_router/auth.py (new)

  • _parse_api_keys(raw): splits a comma-separated string into a frozenset, stripping whitespace and ignoring empty segments
  • get_allowed_api_keys(): reads VLLM_API_KEY from the environment at request time
  • verify_api_key(request): FastAPI dependency that:
    • Is a no-op when VLLM_API_KEY is unset (preserves backward compatibility for unauthenticated deployments)
    • Returns HTTP 401 if the Authorization header is missing or not a Bearer token
    • Returns HTTP 401 if the token is not in the allowed set
    • Accepts any of the configured keys, enabling true multi-tenant access

src/vllm_router/app.py

  • Wired verify_api_key as a router-level Depends on main_router: every inference endpoint (/v1/chat/completions, /v1/completions, /v1/embeddings, etc.) is protected automatically without touching individual route handlers.

Tests

src/tests/test_multi_api_key_auth.py (new, 24 tests)

Group Coverage
_parse_api_keys single key, comma-separated, whitespace stripping, empty segments, empty string, whitespace-only
get_allowed_api_keys no env var, single key, multiple keys, space trimming
verify_api_key no auth configured → allow all, valid single key, valid key among multiple, invalid key → 401, missing header → 401, wrong scheme (Basic) → 401, whitespace in env keys

How It Works End-to-End

helm install ... --set 'servingEngineSpec.vllmApiKey=["team-a-key","team-b-key"]'
        │
        ▼
secrets.yaml joins list → "team-a-key,team-b-key" → base64 → K8s Secret
        │
        ▼
deployment-vllm-multi.yaml injects VLLM_API_KEY="team-a-key,team-b-key" into vLLM container
deployment-router.yaml     injects VLLM_API_KEY="team-a-key,team-b-key" into router container
        │
        ▼
Router: verify_api_key dependency checks incoming Bearer token against frozenset{"team-a-key","team-b-key"}
vLLM:   natively validates the same comma-separated key list via --api-key

Backward Compatibility

  • Deployments with no vllmApiKey set: no change: verify_api_key no-ops
  • Deployments with a single string key: no change: Existing code path unchanged
  • Deployments referencing an existing Secret via {secretName, secretKey}: no change

Test Plan

  • pytest src/tests/test_multi_api_key_auth.py (24 tests pass)
  • pytest src/tests/test_stale_metrics.py src/tests/test_static_service_discovery.py src/tests/test_utils.py. All existing tests pass (51 total)
  • black, isort. All modified files pass linting

…roject#888)

Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>
…on (vllm-project#759)

Previously, vllmApiKey only accepted a single string, silently
discarding any keys after the first comma. This prevented teams from
sharing a vLLM deployment with independent credentials.

Changes:
- helm/values.yaml: extend vllmApiKey schema to accept string, list,
  or existing-secret object. List entries are joined as a
  comma-separated string before being stored in the Kubernetes Secret,
  which matches the format vLLM's --api-key flag natively expects.
- helm/templates/secrets.yaml: add kindIs "slice" branch to join a
  list of keys with commas before base64-encoding into the Secret.
- helm/templates/deployment-vllm-multi.yaml,
  helm/templates/deployment-router.yaml: extend the kindIs "string"
  guard to also match "slice" so list-typed keys are correctly mapped
  to the VLLM_API_KEY env var from the generated Secret.
- src/vllm_router/auth.py (new): self-contained auth module with
  _parse_api_keys(), get_allowed_api_keys(), and verify_api_key()
  FastAPI dependency. Reads VLLM_API_KEY, splits on commas, strips
  whitespace, and returns HTTP 401 for missing or invalid Bearer
  tokens. No-ops when VLLM_API_KEY is unset.
- src/vllm_router/app.py: wire verify_api_key as a router-level
  dependency on main_router so every inference endpoint is protected
  without modifying individual route handlers.
- src/tests/test_multi_api_key_auth.py (new): 24 tests covering key
  parsing, env var reading, and the FastAPI dependency (valid key,
  multi-key, invalid key → 401, missing header → 401, wrong scheme
  → 401, whitespace tolerance, no-auth passthrough).

Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for multiple API keys to enable multi-tenant access and addresses the issue of stale Prometheus metrics. Key changes include updating Helm templates to handle vllmApiKey as either a string or a list, implementing a new authentication module that parses multiple keys from environment variables, and adding a mechanism to clear label-based gauges in the metrics router. Additionally, a healthy field was added to EndpointInfo for better status tracking. Feedback was provided to improve the robustness of the Authorization header check in the authentication dependency.

Comment thread src/vllm_router/auth.py Outdated
if not allowed_keys:
return

auth_header = request.headers.get("Authorization", "")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The request.headers.get call defaults to an empty string, but auth_header.startswith will fail if the header is missing or empty. While the logic handles this, it is cleaner to check for the presence of the header explicitly.

Suggested change
auth_header = request.headers.get("Authorization", "")
auth_header = request.headers.get("Authorization")
if not auth_header or not auth_header.startswith("Bearer "):

@prashansapkota
Copy link
Copy Markdown
Author

Addressed the review feedback:

  • auth.py: Changed request.headers.get("Authorization", "") to request.headers.get("Authorization") and combined the missing-header check into if not auth_header or not auth_header.startswith("Bearer "). This makes the absence of the header explicit rather than relying on an empty string falling through startswith.

…tartswith

Signed-off-by: prashansapkota <prashan.sapkota3456@gmail.com>
@prashansapkota prashansapkota force-pushed the feat-multi-api-key-759 branch from 9ac7ddd to 174b606 Compare April 29, 2026 05:41
@jeremych1000
Copy link
Copy Markdown

This is great, thanks. Would love to see this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feature: allow multiple api keys in servingEngineSpec.vllmApiKey

2 participants