Conversation
* feat(auth): add downstream api key support
* feat(auth): harden api key security
* fix(auth): refine api key parsing
* chore(responses): align review feedback
…odels and transport (bf89914 to 8634f30)
Pull request overview
This PR adds upstream-parity behavior for request/response forwarding (headers + error envelopes), introduces OpenAI responses passthrough support, and adds optional downstream API key protection.
Changes:
- Added `/responses` + `/v1/responses` route and Copilot `responses` client with payload normalization.
- Introduced API-key middleware (with basic rate limiting on auth failures) and standardized upstream header passthrough (`request-id` aliases, processing headers, etc.).
- Expanded OpenAI↔Anthropic translation to include "thinking"/reasoning fields in both non-streaming and streaming flows, plus additional tests.
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/api-key.test.ts | Adds coverage for API key enforcement and auth-failure blocking. |
| tests/chat-completions-handler.test.ts | Adds coverage for request normalization + header passthrough + streaming DONE sentinel behavior. |
| tests/chat-completions-route.test.ts | Adds coverage for upstream-style error envelope passthrough. |
| tests/create-chat-completions.test.ts | Extends unit tests for header selection and payload normalization. |
| tests/create-responses.test.ts | Adds unit tests for Copilot responses client normalization and streaming passthrough. |
| tests/messages-route.test.ts | Adds coverage for Anthropic route header passthrough and OpenAI→Anthropic error envelope conversion. |
| tests/models-route.test.ts | Adds coverage for enriched /v1/models output. |
| tests/responses-route.test.ts | Adds coverage for /v1/responses header + error envelope passthrough and model name forwarding. |
| tests/anthropic-request.test.ts | Extends request translation coverage for tool_result blocks and reasoning field expectations. |
| tests/anthropic-response.test.ts | Extends response translation coverage for reasoning/thinking blocks (non-stream + stream). |
| src/start.ts | Adds --api-key CLI arg and env-var fallback wiring into runtime state. |
| src/server.ts | Swaps request logger, adds API key middleware, and mounts responses routes. |
| src/services/copilot/create-chat-completions.ts | Normalizes payload (developer→system, max tokens mapping, stream options), returns headers alongside body/stream. |
| src/services/copilot/create-responses.ts | Adds Copilot responses client with input normalization and initiator header logic. |
| src/services/copilot/get-models.ts | Extends model supports typing (vision). |
| src/routes/chat-completions/types.ts | Introduces a shared ChatCompletionResult union for streaming vs JSON results with headers. |
| src/routes/chat-completions/handler.ts | Updates handler to use new result type, passthrough headers, DONE sentinel enforcement, and model resolution. |
| src/routes/messages/handler.ts | Updates Anthropic handler to use new result type, passthrough headers, streaming header wiring, and model resolution. |
| src/routes/messages/non-stream-translation.ts | Adds reasoning/thinking translation and tool_result content handling improvements. |
| src/routes/messages/stream-translation.ts | Adds streaming thinking block support and refactors content-block management. |
| src/routes/messages/anthropic-types.ts | Updates types for thinking signature + tool_result richer content, stream state improvements. |
| src/routes/models/route.ts | Switches /models output to getPublicModels() enriched entries. |
| src/routes/responses/route.ts | Adds /responses route that forwards to Copilot and mirrors upstream headers/error envelopes. |
| src/lib/api-key.ts | Adds API key enforcement + auth-failure throttling and a custom request logger. |
| src/lib/error.ts | Preserves upstream error envelopes and converts OpenAI envelopes to Anthropic envelopes on Anthropic routes; passes through headers. |
| src/lib/models.ts | Adds model resolution + public model formatting helpers. |
| src/lib/state.ts | Adds apiKey and authFailures to global state. |
| src/lib/transport.ts | Adds header passthrough + request-id aliasing helpers. |
| README.md | Updates docs to reflect fork scope, API key protection, and responses support. |
| bun.lock | Lockfile metadata update (configVersion). |
```ts
const normalizedPayload: ChatCompletionsPayload = {
  ...payload,
  max_tokens: payload.max_tokens ?? payload.max_completion_tokens ?? null,
  stream_options:
    payload.stream ? { include_usage: true } : payload.stream_options,
  messages: payload.messages.map((message) => ({
    ...message,
    role: message.role === "developer" ? "system" : message.role,
  })),
}
```
normalizedPayload maps max_completion_tokens into max_tokens, but it still forwards the original max_completion_tokens field because the spread (...payload) includes it. If Copilot upstream only understands max_tokens (which this mapping implies), forwarding both can cause upstream validation errors or ambiguous behavior. Consider omitting max_completion_tokens from the forwarded payload after mapping (e.g., destructure it out) so only max_tokens is sent.
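A minimal sketch of the destructure-it-out approach, assuming a simplified `ChatCompletionsPayload` shape (field names mirror the snippet above; `stream_options` handling is omitted to keep the focus on the token field):

```ts
interface ChatCompletionsPayload {
  max_tokens?: number | null
  max_completion_tokens?: number | null
  messages: Array<{ role: string; content: string }>
  [key: string]: unknown
}

function normalizePayload(payload: ChatCompletionsPayload): ChatCompletionsPayload {
  // Pull max_completion_tokens out of the spread so it never reaches upstream.
  const { max_completion_tokens, ...rest } = payload
  return {
    ...rest,
    // Map the OpenAI-style field into the single limit field upstream expects.
    max_tokens: payload.max_tokens ?? max_completion_tokens ?? null,
    messages: payload.messages.map((message) => ({
      ...message,
      role: message.role === "developer" ? "system" : message.role,
    })),
  }
}
```

With this shape, `normalizePayload({ max_completion_tokens: 77, messages: [...] })` yields `max_tokens: 77` and no `max_completion_tokens` key at all, so upstream sees exactly one token-limit field.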
```ts
return {
  ...payload,
  input: normalizeInput(payload.input),
  max_output_tokens:
    payload.max_output_tokens ?? payload.max_tokens ?? undefined,
  store: payload.store ?? false,
  truncation: payload.truncation ?? "disabled",
  include: [...include],
}
```
normalizeResponsesPayload derives max_output_tokens from max_tokens, but it keeps max_tokens on the payload. If Copilot’s /responses endpoint expects max_output_tokens (as the normalization suggests), forwarding max_tokens as an extra field may be rejected by upstream or lead to conflicting values. Consider removing max_tokens from the forwarded payload when you map it into max_output_tokens.
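The same destructuring trick applies here. A sketch under simplified types (input normalization and the `include` list are omitted; this is illustrative, not the PR's actual implementation):

```ts
interface ResponsesPayload {
  max_output_tokens?: number
  max_tokens?: number
  store?: boolean
  truncation?: string
  [key: string]: unknown
}

function normalizeResponsesPayload(payload: ResponsesPayload): ResponsesPayload {
  // Remove max_tokens from the spread once it has been mapped.
  const { max_tokens, ...rest } = payload
  return {
    ...rest,
    // Prefer an explicit max_output_tokens; fall back to the OpenAI-style field.
    max_output_tokens: payload.max_output_tokens ?? max_tokens ?? undefined,
    store: payload.store ?? false,
    truncation: payload.truncation ?? "disabled",
  }
}
```

The forwarded payload then carries a single limit field (`max_output_tokens`), so upstream cannot see conflicting values.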
```ts
return {
  requestedModel: modelId,
  resolvedModel: modelId,
  canonicalModel: undefined as never,
```
resolveModel returns canonicalModel: undefined as never when no match is found. This is type-unsafe (it hides an undefined behind a never cast) and makes the return shape misleading. Prefer omitting canonicalModel entirely in the non-matching branch (or setting it to undefined without a cast) so callers can reliably treat it as optional.
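A sketch of the optional-property approach, with hypothetical types (the PR's real `resolveModel` signature may differ):

```ts
interface ResolvedModel {
  requestedModel: string
  resolvedModel: string
  canonicalModel?: string // genuinely optional: absent when no model matches
}

function resolveModel(modelId: string, known: ReadonlyArray<string>): ResolvedModel {
  if (known.includes(modelId)) {
    return { requestedModel: modelId, resolvedModel: modelId, canonicalModel: modelId }
  }
  // Omit canonicalModel entirely instead of hiding undefined behind `as never`;
  // callers are then forced by the type system to handle the missing case.
  return { requestedModel: modelId, resolvedModel: modelId }
}
```

With `canonicalModel?: string`, every caller that reads the field gets `string | undefined` and must narrow it, which is exactly the contract the non-matching branch needs.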
```ts
await next()
consola.info(`--> ${target} ${c.res.status} ${Date.now() - startedAt}ms`)
```
safeRequestLogger doesn’t use a try/finally, so if downstream middleware/handlers throw, the outbound log line (--> ...) will never be emitted. Wrapping await next() in try/finally (and logging in finally) will ensure consistent request timing logs even on errors.
Suggested change:

```ts
try {
  await next()
} finally {
  consola.info(`--> ${target} ${c.res.status} ${Date.now() - startedAt}ms`)
}
```
```ts
  globalThis.fetch = originalFetch
})

test("returns upstream-style error envelopes for locally rejected chat completion models", async () => {
```
This test name says the model is rejected “locally”, but the assertions depend on an upstream fetch response (status/header/body) being returned. Renaming the test to reflect that it’s verifying upstream error envelope/header passthrough would avoid confusion for future maintainers.
Suggested change:

```ts
test("passes through upstream error envelopes and headers for missing chat completion models", async () => {
```
```ts
  expect(forwardedBody?.max_tokens).toBe(77)
})

test("rejects unknown chat completion models before forwarding", async () => {
```
The test title says “rejects unknown ... before forwarding”, but it explicitly captures and asserts on the forwarded request body (and sets up fetch to return an upstream 404). Consider renaming the test to reflect the actual behavior under test (forwarding unknown models and preserving upstream error responses).
Suggested change:

```ts
test("forwards unknown chat completion models and preserves upstream 404 responses", async () => {
```