Capability-aware reasoning and extended thinking translation#238

Open
lyzgeorge wants to merge 9 commits into ericc-ch:master from lyzgeorge:master

@lyzgeorge

Summary

  • Adds a capability-gated reasoning/thinking translation layer so both the OpenAI (/v1/chat/completions) and Anthropic (/v1/messages) surfaces route reasoning_effort, thinking_budget, and Anthropic thinking config to whichever upstream knob each Copilot model advertises under capabilities.supports.
  • Propagates reasoning_text / reasoning_opaque through non-stream responses and emits native content_block_start / thinking_delta / signature_delta / content_block_stop events on the streaming path, so Claude Code and similar clients see real thinking UIs.
  • Adds a new src/routes/reasoning-context.ts helper shared by both handlers, plus handler-level tests for /v1/chat/completions and /v1/messages that verify adaptive-thinking, reasoning-effort-only, disabled-thinking, and unsupported-model paths.
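To make the non-stream propagation in the second bullet concrete, here is a minimal sketch of how a Copilot response message carrying reasoning_text could be turned into Anthropic content blocks. The names `UpstreamMessageSketch`, `AnthropicBlockSketch`, and `toAnthropicContentSketch` are hypothetical, and carrying reasoning_opaque as the thinking block's `signature` is an assumption; the PR itself only states that the field is preserved on the response.

```typescript
// Hypothetical shapes for illustration; the real types live in the PR's source.
interface UpstreamMessageSketch {
  content: string | null
  reasoning_text?: string | null
  reasoning_opaque?: string | null
}

type AnthropicBlockSketch =
  | { type: "thinking"; thinking: string; signature: string }
  | { type: "text"; text: string }

// Emit the thinking block first, then the visible text, so clients render
// reasoning ahead of the answer.
function toAnthropicContentSketch(
  message: UpstreamMessageSketch,
): Array<AnthropicBlockSketch> {
  const blocks: Array<AnthropicBlockSketch> = []
  if (message.reasoning_text) {
    blocks.push({
      type: "thinking",
      thinking: message.reasoning_text,
      // Assumption: the opaque blob rides along as the block signature.
      signature: message.reasoning_opaque ?? "",
    })
  }
  if (message.content) {
    blocks.push({ type: "text", text: message.content })
  }
  return blocks
}
```

A message without reasoning_text yields only a text block, so responses from models that do not reason keep their existing shape.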

Why

Copilot advertises reasoning capabilities under capabilities.supports.adaptive_thinking and capabilities.supports.reasoning_effort, but neither surface read those fields. As a result, thinking: enabled on /v1/messages was silently dropped, and /v1/chat/completions forwarded thinking_budget even to models that don't accept it. The new helper reads the advertised capabilities and translates requests accordingly: for example, Anthropic thinking: enabled becomes reasoning_effort: high on gpt-5-family models, and reasoning_effort: high plus thinking_budget on Claude Sonnet 4.6.
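The translation described above can be sketched roughly as follows. This is not the PR's actual buildAnthropicReasoningContext: the function name, the context fields (including `stripped`), and the choice to map Anthropic's `budget_tokens` onto thinking_budget only when adaptive_thinking is advertised are all assumptions made for illustration.

```typescript
// Capability flags as described in the PR; everything else here is hypothetical.
interface ModelSupportsSketch {
  adaptive_thinking?: boolean
  reasoning_effort?: boolean
}

interface AnthropicThinkingSketch {
  type: "enabled" | "disabled"
  budget_tokens?: number
}

interface ReasoningContextSketch {
  reasoning_effort?: "high"
  thinking_budget?: number
  // True when thinking was requested but the model advertises no usable knob.
  stripped: boolean
}

function buildAnthropicReasoningContextSketch(
  supports: ModelSupportsSketch,
  thinking?: AnthropicThinkingSketch,
): ReasoningContextSketch {
  // Nothing requested, or explicitly disabled: send no reasoning knobs.
  if (!thinking || thinking.type === "disabled") {
    return { stripped: false }
  }
  const context: ReasoningContextSketch = { stripped: false }
  if (supports.reasoning_effort) {
    context.reasoning_effort = "high"
  }
  // Assumption: the budget is only forwarded to adaptive-thinking models.
  if (supports.adaptive_thinking && thinking.budget_tokens !== undefined) {
    context.thinking_budget = thinking.budget_tokens
  }
  context.stripped =
    context.reasoning_effort === undefined
    && context.thinking_budget === undefined
  return context
}
```

A handler could check `stripped` to decide whether to emit the debug log for unsupported models.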

What changed

  • src/routes/reasoning-context.ts (new): buildAnthropicReasoningContext and buildOpenAIReasoningContext, gated on supports.reasoning_effort and supports.adaptive_thinking.
  • src/routes/messages/handler.ts: resolves the selected model, builds the reasoning context, passes it into translateToOpenAI, and debug-logs when an unsupported model causes thinking to be stripped.
  • src/routes/messages/non-stream-translation.ts: translateToOpenAI now takes a ReasoningContext and emits reasoning_effort / thinking_budget; translateToAnthropic preserves reasoning_text as a thinking block and reasoning_opaque on the response, maintaining per-choice ordering.
  • src/routes/messages/stream-translation.ts: tracks currentBlockType explicitly and emits Anthropic-native thinking/text/tool-use transitions, including signature_delta scoped to each thinking block. Forwards cached prompt tokens as cache_read_input_tokens on message_delta.
  • src/routes/chat-completions/handler.ts: builds an OpenAI reasoning context, drops unsupported thinking_budget with a debug log, and normalizes the payload before sending to Copilot.
  • src/routes/messages/count-tokens-handler.ts: reuses the same reasoning-context helper so token counts match what we actually send upstream.
  • src/services/copilot/create-chat-completions.ts: types reasoning_effort, thinking_budget, stream_options, and reasoning fields on Delta / ResponseMessage.
  • src/services/copilot/get-models.ts: extends ModelSupports with adaptive_thinking and reasoning_effort.
  • Tests: new tests/chat-completions-handler.test.ts and tests/messages-handler.test.ts (integration-style using fetch mocking so no module-level mocks leak across suites), plus reasoning-context helper coverage in tests/anthropic-request.test.ts and streaming/non-stream reasoning cases in tests/anthropic-response.test.ts.
  • README: new hook in the intro, a bullet in Features, and a dedicated Reasoning & Extended Thinking section with curl examples for both surfaces.
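As an illustration of the stream-translation change listed above, the sketch below tracks a current block type and emits Anthropic-style block transitions when the upstream delta switches from reasoning to text. The function and type names are hypothetical, tool-use blocks are omitted, and the upstream delta field names are taken from the PR description rather than from the actual implementation.

```typescript
type BlockTypeSketch = "thinking" | "text"

interface UpstreamDeltaSketch {
  content?: string
  reasoning_text?: string
  reasoning_opaque?: string
}

interface StreamStateSketch {
  currentBlockType: BlockTypeSketch | null
  index: number
}

interface AnthropicEventSketch {
  type: string
  index: number
  [key: string]: unknown
}

function translateDeltaSketch(
  delta: UpstreamDeltaSketch,
  state: StreamStateSketch,
): Array<AnthropicEventSketch> {
  const events: Array<AnthropicEventSketch> = []

  // Close the open block (if it is a different type) and start a new one.
  const ensureBlock = (type: BlockTypeSketch) => {
    if (state.currentBlockType === type) return
    if (state.currentBlockType !== null) {
      events.push({ type: "content_block_stop", index: state.index })
      state.index++
    }
    state.currentBlockType = type
    const content_block =
      type === "thinking"
        ? { type: "thinking", thinking: "" }
        : { type: "text", text: "" }
    events.push({ type: "content_block_start", index: state.index, content_block })
  }

  if (delta.reasoning_text) {
    ensureBlock("thinking")
    events.push({
      type: "content_block_delta",
      index: state.index,
      delta: { type: "thinking_delta", thinking: delta.reasoning_text },
    })
  }
  // The signature is only emitted while a thinking block is open.
  if (delta.reasoning_opaque && state.currentBlockType === "thinking") {
    events.push({
      type: "content_block_delta",
      index: state.index,
      delta: { type: "signature_delta", signature: delta.reasoning_opaque },
    })
  }
  if (delta.content) {
    ensureBlock("text")
    events.push({
      type: "content_block_delta",
      index: state.index,
      delta: { type: "text_delta", text: delta.content },
    })
  }
  return events
}
```

Keeping the block type in explicit state means a signature can never be attached to a text block, which is the scoping the bullet on stream-translation.ts describes.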

Test plan

  • bun test tests/ — 48 tests, all passing
  • bun run typecheck
  • bun run lint
  • Live /v1/chat/completions against gpt-5-mini and claude-sonnet-4.6 with reasoning_effort in minimal / low / medium / high
  • Live /v1/messages against gpt-5-mini and claude-sonnet-4.6 with thinking: enabled and thinking: disabled
  • Verified reasoning_text / reasoning_opaque flow through non-stream responses
  • Verified thinking-block stream events for Claude Sonnet 4.6
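The /v1/chat/completions checks in this plan rely on the handler dropping thinking_budget for models that do not accept it. A minimal sketch of that normalization is shown below; the function name, the payload shape, and the assumption that thinking_budget is gated on supports.adaptive_thinking are illustrative and not confirmed by the PR text.

```typescript
interface ChatPayloadSketch {
  model: string
  reasoning_effort?: string
  thinking_budget?: number
  [key: string]: unknown
}

interface SupportsSketch {
  adaptive_thinking?: boolean
  reasoning_effort?: boolean
}

// Returns a copy of the payload that is safe to forward upstream.
function normalizeChatPayloadSketch(
  payload: ChatPayloadSketch,
  supports: SupportsSketch,
  log: (message: string) => void = console.debug,
): ChatPayloadSketch {
  // Assumption: only adaptive-thinking models accept thinking_budget.
  if (payload.thinking_budget === undefined || supports.adaptive_thinking) {
    return { ...payload }
  }
  const { thinking_budget, ...rest } = payload
  log(`Dropping thinking_budget=${thinking_budget} for ${payload.model}`)
  return rest
}
```

The sketch leaves reasoning_effort untouched, so any value the client sends is passed through unchanged.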

Notes

  • The fix for capability reads (capabilities.supports.*) is split out as its own commit (3a65946) so the history shows the root cause clearly.
  • .worktrees/ was added to .gitignore in a separate chore commit so local worktree directories don't pollute git status.
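For readers skimming the history, the capability-read fix mentioned in the first note amounts to looking the flags up one level deeper. The sketch below contrasts the two reads as the commit message describes them; the function names and the `ModelCapabilitiesSketch` shape are hypothetical, and the "before" variant is a reconstruction from that description, not the removed code.

```typescript
interface ModelCapabilitiesSketch {
  supports?: { adaptive_thinking?: boolean; reasoning_effort?: boolean }
  [key: string]: unknown
}

// Reconstruction of the old behavior: the flag was looked up at the top
// level of `capabilities`, where Copilot does not publish it.
function supportsReasoningEffortBefore(
  capabilities: ModelCapabilitiesSketch,
): boolean {
  return capabilities["reasoning_effort"] === true
}

// After the fix: read the flag from `capabilities.supports`.
function supportsReasoningEffortAfter(
  capabilities: ModelCapabilitiesSketch,
): boolean {
  return capabilities.supports?.reasoning_effort === true
}
```

With a model entry shaped like `{ supports: { reasoning_effort: true } }`, the first function reports no support and the second reports support, which is why thinking was always stripped before the fix.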

lyzgeorge and others added 9 commits April 13, 2026 19:59
Normalize reasoning effort, thinking budget, and Anthropic reasoning streams so both proxy surfaces stay aligned with Copilot model capabilities.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot advertises adaptive_thinking and reasoning_effort under `capabilities.supports`, not at the top level of `capabilities`. The previous gate looked at the wrong field, so Anthropic `thinking` was always stripped and reasoning never reached upstream for /v1/messages. Read the correct fields and gate each surface on what the model actually supports.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a Reasoning & Extended Thinking section to the README, highlight the feature in the intro and features list, and cover the capability gating with new handler tests for the Anthropic /v1/messages surface and additional cases for /v1/chat/completions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…type

feat: allow arbitrary `reasoning_effort` values via pass-through