Model routing: use Flash for scout/tool-heavy work and Pro for synthesis

## Problem

Some work shapes are high-throughput and evidence-gathering heavy but not deep-reasoning heavy: GitHub triage, read-only repo scouting, repeated file search, issue clustering, PR/check status collection, and first-pass log categorization. These are good candidates for Flash/scout routing, with Pro reserved for planning, risk judgment, final synthesis, or high-stakes code changes.

Evidence from redacted maintainer-private CodeWhale log scans:

- 24 turns contained 8 or more GitHub-oriented tool calls.
- 22 of those GitHub-heavy turns had no observed delegation/RLM routing.
- 56 turns contained 8 or more read/search calls and no observed RLM routing.
- The largest GitHub-heavy parent turn made 114 GitHub-oriented calls.
- These are exactly the shapes where Flash/helper workers should collect structured evidence while the stronger model synthesizes.
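
A minimal sketch of how the "8 or more calls, no delegation" pattern in the scan above could be flagged per turn. The `TurnStats` type, its field names, and `is_missed_scout_opportunity` are hypothetical and not part of CodeWhale. Only the threshold of 8 comes from the numbers above.

```python
from dataclasses import dataclass

# Threshold from the log scan above: 8 or more calls of one kind in a turn.
TOOL_HEAVY_THRESHOLD = 8


@dataclass
class TurnStats:
    """Hypothetical per-turn counters; field names are illustrative only."""
    github_calls: int = 0
    read_search_calls: int = 0
    delegated: bool = False  # True if any delegation/RLM routing was observed


def is_missed_scout_opportunity(
    turn: TurnStats, threshold: int = TOOL_HEAVY_THRESHOLD
) -> bool:
    """Return True for turns that were tool-heavy but ran without delegation."""
    tool_heavy = (
        turn.github_calls >= threshold or turn.read_search_calls >= threshold
    )
    return tool_heavy and not turn.delegated
```

A turn with 114 GitHub-oriented calls and no delegation would be flagged, while the same turn with `delegated=True` would not.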

No prompts, raw outputs, secrets, paths, or transcript text are copied here.

## Desired Behavior

CodeWhale should classify low-risk, tool-heavy scout work and route it to cheaper/faster helper models by default.

Suggested policy:

- Pro handles top-level planning, architecture, safety-sensitive judgment, final merge/release decisions, and synthesis.
- Flash handles read-only scouting, issue/PR inventory, search batches, duplicate detection, status collection, and first-pass summaries.
- Pro can override a Flash scout when the task becomes high-risk, ambiguous, destructive, or policy-sensitive.
- The route is visible in the UI: parent model, child model, reasoning effort, why the route was chosen, and whether the route is cost-saving or quality-preserving.
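
One possible shape for the route information listed in the last bullet, sketched as a small record the UI could render. `RouteReceipt`, `RouteGoal`, and the `"low"` effort value are assumptions made for illustration. The issue names the fields but does not define a schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RouteGoal(Enum):
    """Whether the route was chosen to save cost or to preserve quality."""
    COST_SAVING = "cost-saving"
    QUALITY_PRESERVING = "quality-preserving"


@dataclass(frozen=True)
class RouteReceipt:
    """Hypothetical record of one routing decision, for display in the UI."""
    parent_model: str
    child_model: Optional[str]  # None when the parent handles the work itself
    reasoning_effort: str       # effort labels are an assumption, e.g. "low"
    reason: str
    goal: RouteGoal

    def summary(self) -> str:
        """Render a one-line, human-readable description of the route."""
        child = self.child_model or "none"
        return (
            f"parent={self.parent_model} child={child} "
            f"effort={self.reasoning_effort} goal={self.goal.value} "
            f"reason={self.reason}"
        )


receipt = RouteReceipt(
    parent_model="deepseek-v4-pro",
    child_model="deepseek-v4-flash",
    reasoning_effort="low",
    reason="read-only issue/PR inventory",
    goal=RouteGoal.COST_SAVING,
)
```

Calling `receipt.summary()` yields a single line covering all five fields, which could back a tooltip or a status row.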

## Acceptance Criteria

- Add a routing classifier for scout/tool-heavy work: GitHub triage, read-only search, status polling, large-log first pass, and verification-only jobs.
- Child/scout creation can request `deepseek-v4-flash` with an appropriate effort level while the parent remains on `deepseek-v4-pro`.
- A route receipt explains why Flash or Pro was selected.
- Users can configure conservative/aggressive routing behavior.
- High-risk actions such as merges, pushes, release publishing, destructive shell, credentials, provider policy, and legal/branding stay parent/Pro-gated unless explicitly approved.
- Tests cover route selection for GitHub triage, read-only scouting, code modification, release decision, and destructive operations.
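
The criteria above could be exercised against a classifier along these lines. This is a sketch under stated assumptions, not a proposed implementation: the task-kind strings, the `Mode` enum, and the meaning of "aggressive" (unclassified read-only work also goes to Flash) are all hypothetical. Only the two model names come from this issue, and explicit-approval handling is left out.

```python
from enum import Enum
from typing import Tuple

PRO = "deepseek-v4-pro"
FLASH = "deepseek-v4-flash"


class Mode(Enum):
    CONSERVATIVE = "conservative"
    AGGRESSIVE = "aggressive"


# Hypothetical task-kind labels derived from the lists in this issue.
HIGH_RISK_KINDS = {
    "merge", "push", "release_publish", "destructive_shell",
    "credentials", "provider_policy", "legal_branding",
}
JUDGMENT_KINDS = {
    "planning", "architecture", "synthesis",
    "release_decision", "code_modification",
}
SCOUT_KINDS = {
    "github_triage", "read_only_search", "status_polling",
    "log_first_pass", "verification_only",
}


def select_model(
    task_kind: str,
    read_only: bool = False,
    mode: Mode = Mode.CONSERVATIVE,
) -> Tuple[str, str]:
    """Return (model, reason) for a task; the reason feeds the route receipt."""
    if task_kind in HIGH_RISK_KINDS:
        return PRO, "high-risk action stays parent/Pro-gated"
    if task_kind in JUDGMENT_KINDS:
        return PRO, "planning, judgment, or code change"
    if task_kind in SCOUT_KINDS and read_only:
        return FLASH, "low-risk scout/tool-heavy work"
    # Assumption: aggressive mode also offloads unclassified read-only work.
    if mode is Mode.AGGRESSIVE and read_only:
        return FLASH, "aggressive mode: unclassified read-only work"
    return PRO, "default to parent model"
```

For example, `select_model("github_triage", read_only=True)` selects Flash, while `select_model("destructive_shell", read_only=True, mode=Mode.AGGRESSIVE)` still selects Pro because the high-risk check runs first.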

## Related

- #1676 Dual mode: Pro for reasoning plus Flash for execution.
- #2024 delegation opportunity detection.
- #2025 GitHub triage scout.
- #1888 control-plane semantics.
- #720 per-model prompt addendum dispatcher.


Model routing: use Flash for scout/tool-heavy work and Pro for synthesis #2027