feat(worker-model): prefer non-reasoning models and disable thinking on worker calls #122
Open
Replace path-only file dedup with range-aware coverage checks that consider offset/limit parameters. Earlier reads are only collapsed when a later read covers the same or wider line range, preventing over-deduplication where a narrow later read would incorrectly collapse an earlier full-file read. Closes #117
…on worker calls
Part A: extend ModelInfo.capabilities with reasoning?: boolean and update selectWorkerCandidates() to sort non-reasoning models before reasoning models at equal cost. OpenCode's getProviderModels() now captures model.reasoning from provider.list() so the flag is populated at runtime.
Part B: add thinking?: boolean to LLMClient.prompt() opts. All 7 core worker call sites (distillation, meta-distill, curation, consolidation, validation Phase 1+2, query expansion) pass thinking: false. Pi adapter forwards it as thinkingEnabled: false to complete(). Gateway and OpenCode adapters are documented: Gateway never triggers thinking; OpenCode cannot honor the hint and relies on Part A routing to non-reasoning models instead.
Adds 4 new tests for reasoning-preference ordering in selectWorkerCandidates.
Summary
Part A:
ModelInfo.capabilities gains reasoning?: boolean. selectWorkerCandidates() sorts non-reasoning models before reasoning models at equal cost, so the cheapest non-reasoning variant goes through validation before any reasoning-capable model at the same price tier. OpenCode's getProviderModels() now captures model.reasoning from provider.list().
Part B:
LLMClient.prompt() opts gain thinking?: boolean. All 7 core worker call sites pass thinking: false: distillation (2), meta-distill (1), curation (2), worker-model validation Phase 1+2 (2), query expansion (1). The Pi adapter forwards it as thinkingEnabled: false to complete(). Gateway already doesn't trigger thinking; the OpenCode SDK has no thinking toggle and relies on Part A instead.
Why
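Background workers (distillation, curation, query expansion) are single-turn text-in/text-out. Any reasoning/thinking tokens they produce are billed and then silently discarded. Two independent levers close this gap: Part A routes worker calls to non-reasoning models where possible, and Part B disables thinking on the worker calls themselves.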
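For reviewers who want a concrete picture of the Part B lever, here is a minimal sketch of how an adapter might translate the thinking hint into a thinkingEnabled parameter. The PromptOpts and CompleteParams shapes and the toCompleteParams helper are hypothetical names made up for illustration; they are not taken from the diff, and the real Pi adapter may structure this differently.

```typescript
// Hypothetical, simplified shapes; not the project's actual definitions.
interface PromptOpts {
  thinking?: boolean; // hint from the caller; worker call sites pass false
}

interface CompleteParams {
  prompt: string;
  thinkingEnabled?: boolean; // name taken from the PR text; surrounding shape is assumed
}

// Build the parameters for a completion call. The thinking hint is forwarded
// only when the caller set it, so the backend default stays untouched otherwise.
function toCompleteParams(text: string, opts: PromptOpts = {}): CompleteParams {
  const params: CompleteParams = { prompt: text };
  if (opts.thinking !== undefined) {
    params.thinkingEnabled = opts.thinking;
  }
  return params;
}

// A worker-style call opts out of thinking explicitly.
const workerParams = toCompleteParams("Summarize this session.", { thinking: false });
```

Forwarding the flag only when it is defined keeps non-worker callers on whatever default the backend uses, which seems to match the intent that only worker calls change behavior.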
Testing
selectWorkerCandidates tests: non-reasoning preferred at equal cost, ordering preserved, fallback when only a reasoning model is available, undefined treated as non-reasoning (backwards compat)
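The ordering these tests describe can be sketched as a comparator. This is an illustration under stated assumptions, not the code in this PR: the simplified ModelInfo shape, the scalar cost field, and the sortWorkerCandidates name are all hypothetical, and the real selectWorkerCandidates() likely applies further filtering.

```typescript
// Hypothetical, simplified shape; the real ModelInfo has more fields.
interface ModelInfo {
  id: string;
  cost: number; // assumed single scalar cost, for illustration only
  capabilities?: { reasoning?: boolean };
}

// Sort by cost ascending; at equal cost, non-reasoning models come first.
// A missing `reasoning` flag is treated as non-reasoning (backwards compat).
function sortWorkerCandidates(models: ModelInfo[]): ModelInfo[] {
  return [...models].sort((a, b) => {
    if (a.cost !== b.cost) return a.cost - b.cost;
    const aReasoning = a.capabilities?.reasoning === true ? 1 : 0;
    const bReasoning = b.capabilities?.reasoning === true ? 1 : 0;
    return aReasoning - bReasoning;
  });
}

const ordered = sortWorkerCandidates([
  { id: "r", cost: 1, capabilities: { reasoning: true } },
  { id: "n", cost: 1, capabilities: { reasoning: false } },
  { id: "u", cost: 1 },
  { id: "cheap-r", cost: 0.5, capabilities: { reasoning: true } },
]).map((m) => m.id);
// ordered is ["cheap-r", "n", "u", "r"]
```

Cost still dominates in this sketch: a cheaper reasoning model sorts ahead of a pricier non-reasoning one, and the reasoning flag only breaks ties. Because Array.prototype.sort is stable, models that tie on both keys keep their input order.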