feat(session): expose maxOutputTokens on llm adapters #63

Merged

uchouT merged 1 commit into main from feat/adapter-max-output-tokens on May 6, 2026

Conversation

uchouT (Collaborator) commented May 6, 2026

Summary

  • Add maxOutputTokens?: number to AnthropicAdapterOptions and OpenAICompatibleOptions so consumers can configure the per-request output cap once at adapter creation time
  • New fallback chain for the request max_tokens: completeOptions.maxTokens (per-call override) > options.maxOutputTokens (adapter default) > 4096 (built-in safety net); existing behaviour is preserved when neither is set. A sketch of the resolution order follows this list.
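
A minimal TypeScript sketch of that resolution order. Only maxOutputTokens, maxTokens, and the 4096 default come from this PR; the interface shapes and the helper name are illustrative assumptions:

```ts
// Illustrative shapes; the real AnthropicAdapterOptions /
// OpenAICompatibleOptions carry more fields than shown here.
interface AdapterOptions {
  /** Adapter-level default output cap, set once at creation time. */
  maxOutputTokens?: number;
}

interface CompleteOptions {
  /** Per-call override; wins over the adapter-level default. */
  maxTokens?: number;
}

const DEFAULT_MAX_OUTPUT_TOKENS = 4096; // built-in safety net

// Hypothetical helper mirroring the fallback chain described above.
function resolveMaxTokens(
  completeOptions: CompleteOptions,
  options: AdapterOptions,
): number {
  return (
    completeOptions.maxTokens ??    // 1. per-call override
    options.maxOutputTokens ??      // 2. adapter default
    DEFAULT_MAX_OUTPUT_TOKENS       // 3. built-in safety net
  );
}
```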

Why

Both adapters previously hard-coded ?? 4096 for the request max_tokens. That cap is too low for non-trivial outputs: long tool-call argument JSONs (e.g. multi-item results) and long synthesis/integration outputs hit it and get truncated, and the downstream parsers then throw on the unterminated JSON. Without a way for embedders to raise the cap at the adapter level, every consumer had to either patch the source or pass maxTokens at every single call site.

Note: Anthropic's messages.create API requires max_tokens, so the adapter must always send a value — maxOutputTokens only adjusts which value is sent, not whether one is sent.
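
For illustration, a hypothetical consumer-side sketch; the factory name createAnthropicAdapter and the shape of complete() are assumptions, not the library's confirmed API:

```ts
// Hypothetical import path and factory name, for illustration only.
import { createAnthropicAdapter } from "./anthropic-adapter";

const adapter = createAnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  maxOutputTokens: 16_384, // adapter-level default sent as max_tokens
});

// A single call can still override the adapter default:
const result = await adapter.complete({
  messages: [{ role: "user", content: "Summarise the design doc." }],
  maxTokens: 32_768, // per-call override wins over maxOutputTokens
});
```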

Test plan

  • Unit tests for fallback precedence on both adapters (default 4096 / options.maxOutputTokens / completeOptions.maxTokens wins); see the sketch after this list
  • Stream path uses the same resolved value as the complete path (Anthropic adapter)
  • All existing session/core/devtools tests still pass (146 + 285 + 15)
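
A sketch of the precedence tests, assuming a Vitest-style runner and an inline resolver standing in for the adapters' real request construction:

```ts
import { describe, expect, it } from "vitest";

// Hypothetical resolver mirroring the fallback chain under test;
// the real tests exercise the adapters' request construction instead.
const resolveMaxTokens = (
  completeOptions: { maxTokens?: number },
  options: { maxOutputTokens?: number },
): number => completeOptions.maxTokens ?? options.maxOutputTokens ?? 4096;

describe("max_tokens fallback precedence", () => {
  it("falls back to the built-in 4096 when nothing is set", () => {
    expect(resolveMaxTokens({}, {})).toBe(4096);
  });

  it("prefers the adapter-level maxOutputTokens over the default", () => {
    expect(resolveMaxTokens({}, { maxOutputTokens: 16_384 })).toBe(16_384);
  });

  it("lets the per-call maxTokens win over everything", () => {
    expect(
      resolveMaxTokens({ maxTokens: 32_768 }, { maxOutputTokens: 16_384 }),
    ).toBe(32_768);
  });
});
```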

uchouT merged commit 111a671 into main on May 6, 2026. 2 checks passed.