feat(worker-model): prefer non-reasoning models and disable thinking on worker calls by BYK · Pull Request #122 · BYK/loreai

BYK · 2026-05-05T09:33:41Z

Summary

Part A: ModelInfo.capabilities gains reasoning?: boolean. selectWorkerCandidates() sorts non-reasoning models before reasoning models at equal cost, so the cheapest non-reasoning variant is validated before any reasoning-capable model in the same price tier. OpenCode's getProviderModels() now captures model.reasoning from provider.list().
Part B: The opts argument of LLMClient.prompt() gains thinking?: boolean. All 7 core worker call sites pass thinking: false: distillation (2), meta-distill (1), curation (2), worker-model validation Phase 1+2 (2), query expansion (1). The Pi adapter forwards it as thinkingEnabled: false to complete(). The Gateway adapter already never triggers thinking; the OpenCode SDK has no thinking toggle and relies on Part A instead.
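
The tie-break described in Part A can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `cost` field, the `sortWorkerCandidates` name, and the sample model ids are assumptions made for the example; only `capabilities.reasoning` comes from the description above.

```typescript
// Hypothetical shape: only `capabilities.reasoning` is named in the PR.
interface ModelInfo {
  id: string;
  cost: number; // assumed: a single comparable price figure
  capabilities: { reasoning?: boolean };
}

// Order candidates by cost, breaking ties in favor of non-reasoning models.
function sortWorkerCandidates(models: ModelInfo[]): ModelInfo[] {
  return [...models].sort((a, b) => {
    if (a.cost !== b.cost) return a.cost - b.cost;
    // An undefined flag counts as non-reasoning (backwards compatible).
    const aReasoning = a.capabilities.reasoning === true ? 1 : 0;
    const bReasoning = b.capabilities.reasoning === true ? 1 : 0;
    return aReasoning - bReasoning;
  });
}

const ordered = sortWorkerCandidates([
  { id: "reasoning-small", cost: 1, capabilities: { reasoning: true } },
  { id: "plain-small", cost: 1, capabilities: {} },
  { id: "plain-large", cost: 5, capabilities: { reasoning: false } },
]);

console.log(ordered.map((m) => m.id));
// → [ 'plain-small', 'reasoning-small', 'plain-large' ]
```

Cost stays the primary key, so a reasoning model is still chosen when it is strictly cheaper or when it is the only candidate.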

Why

Background workers (distillation, curation, query expansion) are single-turn, text-in/text-out calls. Any reasoning/thinking tokens they generate are billed and then silently discarded. Two independent levers close this gap:

Part A ensures that model selection itself routes away from reasoning-capable models when a non-reasoning variant at the same cost exists and passes quality checks.
Part B is a defensive safety net for Pi, where thinking already defaults to off but the explicit flag makes the intent clear. It also documents the limitation on OpenCode.
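
How an adapter might forward the Part B hint can be sketched like this. The `PromptOpts`, `CompleteOpts`, and `makePiLikeClient` names and the `complete` signature are hypothetical stand-ins, not the real Pi or loreai APIs; only the `thinking` to `thinkingEnabled` mapping is taken from the description.

```typescript
// Hypothetical option shapes; only the two flag names come from the PR.
interface PromptOpts {
  thinking?: boolean;
}
interface CompleteOpts {
  thinkingEnabled?: boolean;
}

// Assumed signature for the underlying completion call.
type Complete = (text: string, opts: CompleteOpts) => Promise<string>;

// Adapter sketch: forward the hint only when the caller set it explicitly.
function makePiLikeClient(complete: Complete) {
  return {
    prompt(text: string, opts: PromptOpts = {}): Promise<string> {
      const forwarded: CompleteOpts = {};
      if (opts.thinking !== undefined) {
        forwarded.thinkingEnabled = opts.thinking;
      }
      return complete(text, forwarded);
    },
  };
}

// Stub backend that records the options it receives.
const seen: CompleteOpts[] = [];
const client = makePiLikeClient(async (text, opts) => {
  seen.push(opts);
  return `echo: ${text}`;
});

// A worker call site states its intent explicitly.
const pending = client.prompt("summarize this session", { thinking: false });
pending.then((reply) => console.log(reply, seen[0]));
```

An adapter with no such toggle would simply ignore `opts.thinking`, which is why the OpenCode path depends on Part A's model routing instead.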

Testing

4 new selectWorkerCandidates tests: non-reasoning preferred at equal cost, ordering preserved, fallback when only a reasoning model is available, undefined treated as non-reasoning (backwards compatibility)
Full suite: 677 pass, 0 fail

Replace path-only file dedup with range-aware coverage checks that consider offset/limit parameters. Earlier reads are only collapsed when a later read covers the same or wider line range, preventing over-deduplication where a narrow later read would incorrectly collapse an earlier full-file read. Closes #117
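
The coverage rule in this commit message could look roughly like the sketch below. The `FileRead` shape, the 0-based `offset`, the "missing `limit` means read to end of file" convention, and the function names are all assumptions made for illustration, not the implementation in the commit.

```typescript
// Hypothetical read record: `offset` is the first line (0-based here),
// `limit` is a line count, and a missing `limit` means "to end of file".
interface FileRead {
  path: string;
  offset?: number;
  limit?: number;
}

// True when `later` spans the same or a wider line range than `earlier`.
function covers(later: FileRead, earlier: FileRead): boolean {
  if (later.path !== earlier.path) return false;
  const laterStart = later.offset ?? 0;
  const earlierStart = earlier.offset ?? 0;
  const laterEnd =
    later.limit === undefined ? Infinity : laterStart + later.limit;
  const earlierEnd =
    earlier.limit === undefined ? Infinity : earlierStart + earlier.limit;
  return laterStart <= earlierStart && laterEnd >= earlierEnd;
}

// Keep a read unless some later read of the same file covers it.
function dedupReads(reads: FileRead[]): FileRead[] {
  return reads.filter(
    (read, i) => !reads.slice(i + 1).some((later) => covers(later, read)),
  );
}

const kept = dedupReads([
  { path: "a.ts" }, // full-file read
  { path: "a.ts", offset: 10, limit: 5 }, // narrow later read: does not cover the full read
  { path: "b.ts", offset: 0, limit: 20 }, // partial read
  { path: "b.ts" }, // later full read covers the partial one
]);

console.log(kept);
```

With path-only dedup the first `a.ts` read would have been collapsed by the narrow one; here only the partial `b.ts` read is dropped.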

…on worker calls

Part A: extend ModelInfo.capabilities with reasoning?: boolean and update selectWorkerCandidates() to sort non-reasoning models before reasoning models at equal cost. OpenCode's getProviderModels() now captures model.reasoning from provider.list() so the flag is populated at runtime.

Part B: add thinking?: boolean to LLMClient.prompt() opts. All 7 core worker call sites (distillation, meta-distill, curation, consolidation, validation Phase 1+2, query expansion) pass thinking: false. Pi adapter forwards it as thinkingEnabled: false to complete(). Gateway and OpenCode adapters documented: Gateway never triggers thinking; OpenCode cannot honor the hint and relies on Part A routing to non-reasoning models instead.

Adds 4 new tests for reasoning-preference ordering in selectWorkerCandidates.

BYK added 2 commits May 2, 2026 23:22

BYK enabled auto-merge (squash) May 5, 2026 09:33

BYK wants to merge 2 commits into main from feat/non-reasoning-worker-models
