feat: reduce planner noise for small models + structured artifact tool #23

Merged: m62624 merged 8 commits into main from feat/planner-status-finish-fill-cleanup on Jun 15, 2026
m62624 (Owner) commented Jun 15, 2026
  • TUI resize fixes

m62624 and others added 8 commits June 15, 2026 17:53
Strip internal/noise fields from buildPlannerStatusText that do not drive
the model's next action:

- Drop runtime dump (creationMethod, compatibilityMode, nextStep,
  questions*, compactBoundaries, idle/stuck/debug ids, requires*, broken,
  blockedReason) — state is already reflected in Lifecycle Decision and
  Next Required Action.
- Replace "## Effective Settings" with a focused "## Languages" block
  (metadata.* only); drop idle/skills/contracts tuning knobs.
- Trim "## Git And Worktree" to git essentials (drop activeBranches,
  mergeTargets, actualHEAD).
- Show "## Debug Mode" only while a debug session is active.
- Keep only enabled/guidance lines from the contracts section.

Add an explicit worktree-location line ("You are in worktree X — work
here") and a bold pointer to re-read the current stage instruction, so
status answers "where am I / what's the goal / what next" up front.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After a transition the workflow tool returned a generic "Call
planner_status before choosing the next planner action", forcing the
small model to fetch the heavy status after every step. It now emits a
compact hint built from the post-transition state.

- Add buildNextStepHint(state) (src/runtime/next-step-hint.ts): worktree
  location, current step, goal, first required action, exit condition,
  and the next move — derived from the state machine.
- Branch points list every allowed planner_finish_step target and label
  loop-back targets (e.g. run_final_tests -> implement_task) so the model
  sees both the forward and the fix/loop path.
- Compact steps point to planner_request_compact only when the boundary
  is enabled; otherwise fall through to the linear next step.
- Wire it into formatWorkflowToolResult using result.state (the step we
  moved INTO), replacing the old pre-transition lifecycle footer.

The hint is deliberately lighter than planner_status (no Stage Behavior,
allowed-tool dumps, or full instruction body); status stays the heavier
source of truth.
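
A minimal sketch of how such a hint could be derived from a transition table. The table, the `HintState` shape, and every step name except the run_final_tests -> implement_task loop-back are hypothetical; the real buildNextStepHint in src/runtime/next-step-hint.ts may be structured differently.

```typescript
interface FinishTarget { step: string; loopBack: boolean }
interface StepDef { goal: string; targets: FinishTarget[]; isCompactStep?: boolean }
interface HintState { worktree: string; step: string; compactEnabled: boolean }

// Hypothetical transition table. Only the run_final_tests -> implement_task
// loop-back is taken from the commit message; the rest is made up.
const STEP_TABLE: { [step: string]: StepDef } = {
  implement_task: {
    goal: "Implement the active task",
    targets: [{ step: "run_final_tests", loopBack: false }],
  },
  run_final_tests: {
    goal: "Run the full test suite",
    targets: [
      { step: "compact_before_summary", loopBack: false },
      { step: "implement_task", loopBack: true },
    ],
  },
  compact_before_summary: {
    goal: "Optionally compact context",
    isCompactStep: true,
    targets: [{ step: "write_summary", loopBack: false }],
  },
};

// Builds a compact hint from the state we moved INTO (post-transition).
function buildNextStepHintSketch(state: HintState): string {
  const def = STEP_TABLE[state.step];
  if (def === undefined) {
    return `Unknown step ${state.step}. Call planner_status.`;
  }
  const lines = [`Worktree: ${state.worktree}`, `Step: ${state.step}`, `Goal: ${def.goal}`];
  // Compact steps point to planner_request_compact only when the boundary is enabled.
  if (def.isCompactStep === true && state.compactEnabled) {
    lines.push("Next: call planner_request_compact");
    return lines.join("\n");
  }
  // Otherwise list every allowed target and label the loop-back ones.
  lines.push("Next: planner_finish_step with one of:");
  for (const target of def.targets) {
    lines.push(`- ${target.step}${target.loopBack ? " (loop back to fix)" : ""}`);
  }
  return lines.join("\n");
}
```

Listing the loop-back target next to the forward target is what lets a small model see the fix path without fetching the full status.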

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add four planner wrapper tools so the small model fills artifacts from
arguments instead of hand-formatting markdown, and funnel the structured
ones through validation:

- planner_tdd_submit: structured per-section fields (Pre-Implementation
  Proof Contract, Post-Implementation Counterexample Review, Task Merge
  Scope Audit). The wrapper assembles tdd.md, validates required fields
  immediately, and merges incrementally across steps (src/runtime/
  tdd-form.ts). Field/section definitions are now shared with the
  tdd-evidence validators.
- planner_discovery_submit: takes body + verificationProtocol[] and always
  emits a well-formed "## Verification Protocol" section.
- planner_plan_submit / planner_summary_submit: content-arg writers for
  the open-ended plan.md / final_summary.md.
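
The structured submit flow (merge incrementally, validate immediately, assemble the markdown) might look roughly like this. The section titles come from the list above; the field names, the markdown layout, and all function names are assumptions, not the contents of src/runtime/tdd-form.ts.

```typescript
type Fields = { [field: string]: string };
type TddForm = { [section: string]: Fields };

// Section titles come from the commit message; the field names are hypothetical.
const REQUIRED: { [section: string]: string[] } = {
  "Pre-Implementation Proof Contract": ["claim", "failingTest"],
  "Post-Implementation Counterexample Review": ["counterexamples"],
  "Task Merge Scope Audit": ["filesTouched"],
};

// Merge a new submission into what earlier steps already stored (no mutation).
function mergeTddForm(stored: TddForm, submitted: TddForm): TddForm {
  const merged: TddForm = {};
  for (const source of [stored, submitted]) {
    for (const section of Object.keys(source)) {
      const target = merged[section] || (merged[section] = {});
      for (const field of Object.keys(source[section])) {
        target[field] = source[section][field];
      }
    }
  }
  return merged;
}

// Report missing or empty required fields, but only for sections present so far.
function validateTddForm(form: TddForm): string[] {
  const errors: string[] = [];
  for (const section of Object.keys(form)) {
    for (const field of REQUIRED[section] || []) {
      const value = form[section][field];
      if (value === undefined || value.trim() === "") {
        errors.push(`${section}: missing ${field}`);
      }
    }
  }
  return errors;
}

// Assemble the markdown so the model never hand-formats it.
function renderTddMd(form: TddForm): string {
  const blocks: string[] = [];
  for (const section of Object.keys(REQUIRED)) {
    if (form[section] === undefined) continue;
    const body = Object.keys(form[section]).map((f) => `- ${f}: ${form[section][f]}`);
    blocks.push([`## ${section}`].concat(body).join("\n"));
  }
  return blocks.join("\n\n");
}
```

Keeping the required-field table in one constant mirrors the idea of sharing field and section definitions between the form and the validators.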

Enforcement (src/guard/project-mutation.ts): built-in edit/write are now
blocked on the structured artifacts that have a wrapper (goal.md,
questions.md, and the active task's tdd.md), with a message naming the
right tool. Open-ended, append-heavy artifacts (plan.md, discovery.md,
final_summary.md) stay editable so the model can append across the
lifecycle; their submit tools are a convenience.
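
The guard rule can be sketched as a path check against the widened guard state. The `goalMd` field, the `<tasksDir>/<activeTaskId>/tdd.md` layout, and the placeholder tool labels for goal.md and questions.md are assumptions; only planner_tdd_submit is named in this PR.

```typescript
// Guard state. questionsMd, tasksDir, and activeTaskId are named in the commit
// message; goalMd and the path layout below are assumptions.
interface GuardState {
  goalMd: string;
  questionsMd: string;
  tasksDir: string;
  activeTaskId: string | null;
}

// Returns a block message for a built-in edit/write, or null when it is allowed.
function checkBuiltInWrite(targetPath: string, state: GuardState): string | null {
  const guarded: { path: string; tool: string }[] = [
    // The wrapper names for these two files are not given in this PR.
    { path: state.goalMd, tool: "the goal wrapper tool" },
    { path: state.questionsMd, tool: "the questions wrapper tool" },
  ];
  // Only the ACTIVE task's tdd.md is protected.
  if (state.activeTaskId !== null) {
    guarded.push({
      path: `${state.tasksDir}/${state.activeTaskId}/tdd.md`,
      tool: "planner_tdd_submit",
    });
  }
  for (const entry of guarded) {
    if (entry.path === targetPath) {
      return `Blocked: ${targetPath} is a structured artifact. Use ${entry.tool} instead.`;
    }
  }
  // plan.md, discovery.md, final_summary.md and everything else stay editable.
  return null;
}
```

Returning the tool name in the message is the important part: the model gets a concrete next action rather than a bare refusal.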

Wire-up: register the tools in the tool-policy step allowlists and index.ts,
and widen the built-in guard state to carry questionsMd/tasksDir/activeTaskId.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Rewrite all 12 bundled stage instructions (instructions/defaults/*.md)
to a consistent structure and align drifted duplicate blocks, while
preserving every distinct rule. Reflect the changes from the rest of
this branch:

- Route artifact writing through the new fill-tools: tdd.md via
  planner_tdd_submit (and state that built-in edit/write cannot modify
  it), discovery.md via planner_discovery_submit (body +
  verificationProtocol), plan.md via planner_plan_submit, final_summary
  via planner_summary_submit.
- Lean on the richer planner_finish_step hint: the recurring footer now
  says the finish_step result already names the next step, goal, and
  worktree, and to call planner_status only when the full step rule or
  instruction is needed.
- Consolidate discovery's two Fundamental Rules blocks (1-6) into one
  section; keep the planning Integration-vs-New-Entity and execution
  Uncertainty->Question rules as named rules.
- Trim verbose diagnostics into compact bullets (~310 fewer lines).

Parser-critical "## auto-compact"/"## manual-compact" headings are kept
in exactly the files that had them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The workspace overlay only repainted when its render signature changed
(clock, content, focus/input). Terminal size was not part of the
signature, so after a resize the idle tick computed the same signature
and never called requestRender() — the overlay stayed frozen at the old
dimensions until the next content change.

Include the terminal columns/rows in computeSignature() so a resize is
detected on the next tick and triggers a redraw. render() already
recomputes on width/height change; it just was not being asked to.
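
The fix can be illustrated with a small signature-and-tick sketch. The `OverlaySnapshot` shape and the `IdleTicker` class are hypothetical stand-ins for the workspace overlay, and the counter stands in for requestRender().

```typescript
// Hypothetical snapshot of everything that should force a repaint.
interface OverlaySnapshot {
  clock: string;
  content: string;
  focused: boolean;
  input: string;
  columns: number; // terminal width, newly part of the signature
  rows: number;    // terminal height, newly part of the signature
}

// JSON.stringify avoids delimiter collisions between adjacent fields.
function computeSignatureSketch(s: OverlaySnapshot): string {
  return JSON.stringify([s.clock, s.content, s.focused, s.input, s.columns, s.rows]);
}

class IdleTicker {
  private lastSignature = "";
  public renderRequests = 0;

  // Called on every idle tick; repaints only when the signature changed.
  tick(snapshot: OverlaySnapshot): void {
    const signature = computeSignatureSketch(snapshot);
    if (signature !== this.lastSignature) {
      this.lastSignature = signature;
      this.renderRequests += 1; // stands in for requestRender()
    }
  }
}
```

Without `columns` and `rows` in the array, two ticks on either side of a resize would produce identical signatures and the second tick would be skipped, which is the frozen-overlay behavior described above.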

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
State plainly that the extension is built for local models and, at
runtime, is driven by a small local model through Pi, while cloud LLMs
assist development. Add the experiment warning verbatim so it is clear
this is not a guarantee of better output and can make results worse.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The previous resize fix made the workspace request a redraw on resize,
but the overlay still could not grow past its startup size. overlayOptions()
is evaluated once at show time, and it returned absolute width/maxHeight
from the initial terminal. resolveOverlayLayout re-runs each render but an
absolute width only clamps down to a smaller terminal — it never expands —
so shrink-and-restore worked while enlarging beyond the startup size (or
going fullscreen) stayed pinned.

Return relative geometry instead: width "100%" and maxHeight "100%" track
the live terminal both ways, and margin.bottom reserves the footer rows.
The component already derives its render height from the live terminal, so
it fills the larger area without a double reserve.
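
To see why absolute geometry pins the overlay while relative geometry tracks the terminal, here is a simplified model of a layout resolver. It is an assumption about how such a resolver could behave, not the TUI library's actual resolveOverlayLayout; `resolveSize`, `resolveLayoutSketch`, and the option shape are hypothetical.

```typescript
type Size = number | string; // absolute cell count, or a percentage such as "100%"

interface OverlayOptionsSketch {
  width: Size;
  maxHeight: Size;
  margin: { bottom: number }; // rows reserved for the footer
}

interface Terminal { columns: number; rows: number }

function resolveSize(size: Size, available: number): number {
  if (typeof size === "number") {
    // An absolute size can only clamp down; it never grows past its own value.
    return Math.min(size, available);
  }
  // parseFloat reads the leading number, so "100%" yields 100.
  return Math.floor((available * parseFloat(size)) / 100);
}

// Re-run on every render against the live terminal size.
function resolveLayoutSketch(options: OverlayOptionsSketch, terminal: Terminal) {
  const usableRows = Math.max(0, terminal.rows - options.margin.bottom);
  return {
    width: resolveSize(options.width, terminal.columns),
    height: resolveSize(options.maxHeight, usableRows),
  };
}

// Options captured once at show time on an 80x24 terminal (old behavior).
const absoluteOptions: OverlayOptionsSketch = { width: 80, maxHeight: 22, margin: { bottom: 2 } };
// Relative options (new behavior) carry no startup dimensions at all.
const relativeOptions: OverlayOptionsSketch = { width: "100%", maxHeight: "100%", margin: { bottom: 2 } };
```

Under this model, resolving `absoluteOptions` against a 120x40 terminal still yields a width of 80, while `relativeOptions` yields 120 columns and 38 rows.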

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
buildNextStepHint described the next move but not which wrapper is
permitted at the current step, so a small model could still reach for a
built-in write/edit. Add a "Tools allowed now: ..." line sourced from
getAllowedPlannerWrapperTools(state) — e.g. at discovery/scan_project_structure
it names planner_discovery_submit and the contract tools — so the model
picks the right planner tool in the moment. The hint is reused by both
planner_finish_step and start_step.
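
A sketch of how the extra hint line could be produced. The allowlist table, the `stage/step` key format, and the `Sketch` function names are assumptions; the contract tools mentioned above are omitted because their names are not given here.

```typescript
interface ToolHintState { stage: string; step: string }

// Hypothetical per-step allowlist keyed by "stage/step". The discovery entry
// follows the example in the commit message; the execution entry is made up.
const WRAPPER_ALLOWLIST: { [key: string]: string[] } = {
  "discovery/scan_project_structure": ["planner_discovery_submit"],
  "execution/implement_task": ["planner_tdd_submit"],
};

// Stand-in for getAllowedPlannerWrapperTools(state).
function getAllowedPlannerWrapperToolsSketch(state: ToolHintState): string[] {
  return WRAPPER_ALLOWLIST[`${state.stage}/${state.step}`] || [];
}

// Produces the extra hint line appended to the next-step hint.
function toolsAllowedLine(state: ToolHintState): string {
  const tools = getAllowedPlannerWrapperToolsSketch(state);
  return tools.length > 0
    ? `Tools allowed now: ${tools.join(", ")}`
    : "Tools allowed now: (none)";
}
```

Sourcing the line from the same allowlist that the tool policy enforces keeps the hint and the guard from drifting apart.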

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions bot added the feat label on Jun 15, 2026
m62624 merged commit 3f7761c into main on Jun 15, 2026
2 checks passed
m62624 deleted the feat/planner-status-finish-fill-cleanup branch on June 15, 2026 13:45