From b37d22cc91aaaa3a948e70abc3fb9c62cc3ba48e Mon Sep 17 00:00:00 2001
From: Henry Lach
Date: Sun, 10 May 2026 19:26:59 -0400
Subject: [PATCH] task(TP-196, TP-197): stage segment-engine hardening + dashboard-progress packets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Triage closure of the 5 segment-related issues that survived the #51 closure (because they're polish/hardening, not core feature work):

- #462: Harden .DONE authority for multi-segment tasks → TP-196
- #502: SegmentScopeMode single enum gating all signals → TP-196
- #503: Regression tests for SegmentScopeMode prompt injection → TP-196
- #508: Lane-runner segment completion check before respawn → TP-196
- #464: Dashboard segment-level progress indicators → TP-197

All 5 issues triaged as still relevant (none obsolete). Grouped into 2 task packets by file-scope cohesion:

## TP-196 — Multi-segment engine hardening (#462, #502, #503, #508)

Size M, Review Level 2 (Plan + Code), per-step reviews. All four issues overlap heavily in lane-runner.ts, execution.ts, resume.ts, discovery.ts. Sequencing within the task:

- Step 2: #502 first (foundational SegmentScopeMode promotion)
- Step 3: #462 (.DONE authority guards built atop #502's flag)
- Step 4: #508 (early-exit optimization)
- Step 5: #503 (regression tests covering #502/#503 surfaces)

## TP-197 — Dashboard segment-level progress indicators (#464)

Size S-M, Review Level 1 (Plan Only). Kept separate from TP-196:

- Different file domain: dashboard/public/ (vanilla JS, intentionally out of Biome lint scope per code-quality-gates spec section 3)
- Different validation approach: manual visual verification vs. unit-test-driven
- Different reviewer level historically (UX-only changes are Level 1)
- No file overlap with TP-196's scope

## CONTEXT.md

Next Task ID bumped TP-196 → TP-198. The unassigned Tier-1.5 follow-ups (TS strictness ratchet, CHANGELOG fragments) per the code-quality-gates spec will claim TP-198 / TP-199 when staged.

## Issue comments posted

Each of the 5 issues received a triage-update comment explaining which TP absorbs the work and why. Will auto-close when the absorbing PRs merge.

## Discovery probe

0 errors, 2 pending tasks, 0 deps. Ready for /orch when scheduled.
---
 taskplane-tasks/CONTEXT.md | 2 +-
 .../PROMPT.md | 220 ++++++++++++++++++
 .../STATUS.md | 164 +++++++++++++
 .../PROMPT.md | 180 ++++++++++++++
 .../STATUS.md | 128 ++++++++++
 5 files changed, 693 insertions(+), 1 deletion(-)
 create mode 100644 taskplane-tasks/TP-196-multi-segment-engine-hardening/PROMPT.md
 create mode 100644 taskplane-tasks/TP-196-multi-segment-engine-hardening/STATUS.md
 create mode 100644 taskplane-tasks/TP-197-dashboard-segment-progress/PROMPT.md
 create mode 100644 taskplane-tasks/TP-197-dashboard-segment-progress/STATUS.md

diff --git a/taskplane-tasks/CONTEXT.md b/taskplane-tasks/CONTEXT.md
index 20c4cab1..9cda5969 100644
--- a/taskplane-tasks/CONTEXT.md
+++ b/taskplane-tasks/CONTEXT.md
@@ -2,7 +2,7 @@

 **Last Updated:** 2026-05-10
 **Status:** Active
-**Next Task ID:** TP-196
+**Next Task ID:** TP-198

 ---

diff --git a/taskplane-tasks/TP-196-multi-segment-engine-hardening/PROMPT.md b/taskplane-tasks/TP-196-multi-segment-engine-hardening/PROMPT.md
new file mode 100644
index 00000000..33696931
--- /dev/null
+++ b/taskplane-tasks/TP-196-multi-segment-engine-hardening/PROMPT.md
@@ -0,0 +1,220 @@
+# Task: TP-196 - Multi-segment engine hardening: `.DONE` authority + scope-mode unification + early-exit optimization + test hardening
+
+**Created:** 2026-05-10
+**Size:** M
+
+## Review Level: 2 (Plan and Code)
+
+**Assessment:** Bundles 4 segment-engine follow-up issues (#462, #502, #503, #508) into one task. All four touch overlapping files (`lane-runner.ts`, `execution.ts`, `resume.ts`, `discovery.ts`) and share a conceptual theme: hardening multi-segment execution against edge cases and drift. Plan review evaluates the unification design (single authoritative `SegmentScopeMode` flag, defense-in-depth `.DONE` guards). Code review evaluates the per-fix correctness and test adequacy. Per-step reviews fit naturally — each issue's work is independent enough that a step boundary maps to an issue.
+
+**Score:** 4/8 — Blast radius: 2, Pattern novelty: 1, Security: 0, Reversibility: 1
+
+## Canonical Task Folder
+
+```
+taskplane-tasks/TP-196-multi-segment-engine-hardening/
+├── PROMPT.md ← This file (immutable above --- divider)
+├── STATUS.md ← Execution state (worker updates this)
+├── .reviews/ ← Reviewer output (created by the orchestrator runtime)
+└── .DONE ← Created when complete
+```
+
+## Mission
+
+Close out four segment-engine polish/hardening issues that survived the closure of #51 (multi-repo task execution) because they're defense-in-depth, not core feature work:
+
+- **#462** — Harden `.DONE` authority for multi-segment tasks (monitor, resume, discovery guards)
+- **#502** — Segment scope mode should be a single enum gating all segment signals
+- **#503** — Add regression tests for SegmentScopeMode prompt injection
+- **#508** — Lane-runner should check segment completion before spawning next iteration
+
+These four are conceptually cohesive and overlap heavily in file scope. Bundling lets the worker reuse the segment-engine context once and ship a single coherent hardening pass.
+
+By the end of TP-196:
+- `.DONE` cannot prematurely terminate a multi-segment task in monitor/resume/discovery edge cases (#462)
+- `SegmentScopeMode` is the single authoritative flag — env vars, tool registration, and execution branches all gate on it (#502)
+- Regression tests verify both `FULL_TASK` and `SEGMENT_SCOPED` prompt content + the polyrepo single-segment case + the legacy/partial-marker fallback (#503)
+- Lane-runner skips wasted iteration when all segment checkboxes are already complete (#508)
+- All existing tests still pass; new behavioral tests cover the four fixes
+
+## Dependencies
+
+**None** — all referenced predecessor work is merged. The following are informational cross-references:
+
+- TP-081 / TP-133 / TP-134 / TP-135 (multi-repo task execution Phase B-E, shipped): the foundational segment infrastructure these guards harden.
+- TP-145 (already shipped): the four-layer `.DONE` defense that #462 builds atop.
+- TP-501 (already shipped, predecessor for #502/#503): the SegmentScopeMode prompt-injection fix that #502 unifies and #503 regression-tests.
+- TP-194 (gates flip, shipped v0.30.0): means `typecheck` / `lint` / `format:check` are now hard gates — any new code in this task must keep them green.
+
+## Context to Read First
+
+> Only list docs the worker actually needs. Less is better.
+
+**Tier 2 (area context):**
+- `taskplane-tasks/CONTEXT.md`
+
+**Tier 3 (load only if needed):**
+- `docs/specifications/taskplane/multi-repo-task-execution.md` — operative spec for the segment subsystem (the broader context for all 4 issues)
+- Each issue's body, fetched via `gh issue view `:
+  - `gh issue view 462`
+  - `gh issue view 502`
+  - `gh issue view 503`
+  - `gh issue view 508`
+- `extensions/taskplane/lane-runner.ts` — segment-scope-mode computation; iteration loop (pre-spawn check site for #508)
+- `extensions/taskplane/execution.ts` — `resolveTaskMonitorState` (monitor guard for #462); segment env var + tool registration (#502)
+- `extensions/taskplane/resume.ts` — `collectDoneTaskIdsForResume` / reconciliation (resume guard for #462)
+- `extensions/taskplane/discovery.ts` — `.DONE` skip logic in segmented contexts (discovery safeguard for #462)
+- `extensions/tests/segment-scoped-lane-runner.test.ts` — existing segment scope test file (add #503 assertions here)
+- `extensions/tests/lane-runner-v2.test.ts` — update existing segment contracts if changed by #502 work
+
+## Environment
+
+- **Workspace:** `extensions/` (engine + tests)
+- **Services required:** None
+
+## File Scope
+
+> The orchestrator uses this to avoid merge conflicts. Worker hydrates Step 2-5
+> with the specific files touched per issue based on the plan in Step 1.
+
+- `extensions/taskplane/lane-runner.ts` (segment-scope computation, iteration loop)
+- `extensions/taskplane/execution.ts` (monitor guard, env var, tool registration)
+- `extensions/taskplane/resume.ts` (resume guard)
+- `extensions/taskplane/discovery.ts` (discovery safeguard)
+- `extensions/taskplane/types.ts` (if `SegmentScopeMode` enum needs to be promoted to a first-class type — likely)
+- `extensions/tests/segment-scoped-lane-runner.test.ts` (#503 assertions)
+- `extensions/tests/lane-runner-v2.test.ts` (segment contract updates)
+- New test files as needed (e.g., `extensions/tests/done-authority-multi-segment.test.ts` for #462 edge cases)
+- `CHANGELOG.md` — `[Unreleased]` entry under `Fixed` (or `Internal` if framed as hardening)
+
+## Steps
+
+> **Hydration:** STATUS.md tracks outcomes per-issue. Worker expands Steps 2-5
+> with concrete checkboxes after Step 1 plan-review APPROVE.
+
+### Step 0: Preflight
+
+- [ ] On `main` (lane worktree, fresh from v0.30.0 release)
+- [ ] All four gates pass on baseline: `npm run typecheck` exit 0, `npm run lint` exit 0, `npm run format:check` exit 0, `npm run test:fast` 3627+
+- [ ] All four issue bodies read: #462, #502, #503, #508
+- [ ] Tier 3 context files read
+- [ ] Live grep verification: confirm `stepSegmentMap && currentRepoId && repoStepNumbers` is still the condition pattern referenced in #502 (or document the post-TP-194 equivalent)
+- [ ] Decision: introduce a `SegmentScopeMode` enum/type in `types.ts` (vs. inline string union)? Recommendation in Discoveries.
+
+### Step 1: Plan all four fixes
+
+> ⚠️ Plan-review checkpoint. Reviewer evaluates architectural cohesion across the 4 issues.
+
+- [ ] #462 design: monitor guard (suppress `.DONE` as success signal for known non-final active segments), resume guard (don't accept `.DONE` for incomplete frontier), discovery safeguard (sanity check or doctor warning). Document each guard's exact check + the "fail-loud vs auto-recover" stance per guard.
+- [ ] #502 design: `SegmentScopeMode` promotion to a first-class type; gate env var, tool registration, and execution branches on it. List every site that currently checks `stepSegmentMap && currentRepoId` and the unified-condition replacement.
+- [ ] #503 design: test file structure (extend existing `segment-scoped-lane-runner.test.ts` vs new dedicated file). Per-case checklist matches the 4 scenarios in the issue body.
+- [ ] #508 design: pre-spawn segment-completion check site in `lane-runner.ts` iteration loop. Document exit-condition semantics (skip to segment-completion handling vs. break to next-task).
+- [ ] Cross-issue coordination: any interaction between #462's monitor guard and #508's pre-spawn check? Document.
+- [ ] Drafts in Discoveries.
+
+### Step 2: Implement #502 first (foundational refactor)
+
+> ⚠️ Code-review fires after this step.
+
+> Rationale: promoting `SegmentScopeMode` to a first-class type creates the
+> authoritative flag that #462 and #508 can also reference. Doing this first
+> avoids retrofitting after the other work lands.
+
+- [ ] `SegmentScopeMode` promoted (likely as enum in `types.ts`)
+- [ ] `lane-runner.ts` computes it once + threads via lane config
+- [ ] `execution.ts` `TASKPLANE_ACTIVE_SEGMENT_ID` env var gated on it
+- [ ] `execution.ts` `request_segment_expansion` tool registration gated on it
+- [ ] Scattered `stepSegmentMap && currentRepoId` checks replaced with single-flag reference
+- [ ] Targeted tests pass; full fast suite passes
+
+### Step 3: Implement #462 guards
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] Monitor guard in `resolveTaskMonitorState` (`execution.ts`)
+- [ ] Resume guard in `collectDoneTaskIdsForResume` (`resume.ts`)
+- [ ] Discovery safeguard in `discovery.ts` (sanity check or doctor warning per plan decision)
+- [ ] 3-4 behavioral tests covering: non-final unlink failure, transient `.DONE` monitor race, resume with `.DONE` + incomplete frontier
+- [ ] Full fast suite passes
+
+### Step 4: Implement #508 early-exit optimization
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] Pre-spawn segment-completion check in `lane-runner.ts` iteration loop
+- [ ] Exit-condition wiring per plan decision (skip to segment-completion handling)
+- [ ] Behavioral test asserting wasted iteration is skipped when all segment checkboxes are pre-complete
+- [ ] Full fast suite passes
+
+### Step 5: Implement #503 prompt-injection regression tests
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] `FULL_TASK` prompt assertions: includes `SegmentScopeMode: FULL_TASK`, NOT `Active segment ID`, NOT segment-scoped checkbox block
+- [ ] `SEGMENT_SCOPED` prompt assertions: includes `SegmentScopeMode: SEGMENT_SCOPED`, `Active segment ID`, segment-scoped checkbox block, "Other segments in this step (NOT yours)"
+- [ ] Polyrepo single-segment regression: worker proceeds beyond Step 0 (does not exit after one step)
+- [ ] Legacy/partial-marker case: fallback behavior does not silently one-step scope
+- [ ] Tests pass in isolation + full fast suite
+
+### Step 6: Testing & Verification
+
+> ZERO test failures allowed. ALL FOUR GATES must remain green (post-TP-194: typecheck, lint, format:check, tests).
+
+- [ ] `npm run typecheck` exits 0
+- [ ] `npm run lint` exits 0
+- [ ] `npm run format:check` exits 0
+- [ ] `npm run test:fast` passes (target: 3627+ baseline + new tests from this task; record final count)
+- [ ] Full integration suite passes
+- [ ] CLI smoke clean
+
+### Step 7: Documentation & Delivery
+
+- [ ] CHANGELOG entry under `[Unreleased]` → `Fixed` (or `Internal` if framed as hardening):
+  - Title: `**Multi-segment engine hardening (TP-196, #462 + #502 + #503 + #508)**`
+  - Body: 2-3 paragraph summary covering: (1) `.DONE` authority guards, (2) SegmentScopeMode unification + regression tests, (3) wasted-iteration elimination, (4) validation (tests + gates green)
+- [ ] Discoveries logged: per-issue final fix summary; any latent bugs uncovered during hardening
+- [ ] Step boundaries committed with `feat(TP-196): ...` / `fix(TP-196): ...` / `test(TP-196): ...` prefixes
+- [ ] Issue-close comments drafted in Discoveries for #462, #502, #503, #508 — to be posted by operator after PR merges
+
+## Documentation Requirements
+
+**Must Update:**
+- `CHANGELOG.md` — Fixed/Internal entry per Step 7
+
+**Check If Affected:**
+- `docs/specifications/taskplane/multi-repo-task-execution.md` — if any of these fixes change a contract documented there, update; otherwise leave alone
+
+## Completion Criteria
+
+- [ ] All four issues' acceptance criteria met (per their issue bodies)
+- [ ] All four CI gates pass (`typecheck`, `lint`, `format:check`, `test:fast`)
+- [ ] Per-step plan + code reviews APPROVE'd
+- [ ] CHANGELOG entry added
+- [ ] Issue-close comment drafts ready for operator
+
+## Git Commit Convention
+
+Commits happen at **step boundaries** AND at issue boundaries within combined steps. All commits MUST include the task ID:
+
+- **Step completion:** `chore(TP-196): complete Step N — description`
+- **Per-issue fix:** `fix(TP-196, #): description`
+- **Test addition:** `test(TP-196, #): description`
+
+## Do NOT
+
+- **Don't split into separate PRs unless plan-review reveals a clear architectural split.** The 4 issues are bundled deliberately because they share files and the segment-engine mental model.
+- **Don't break the post-TP-194 hard gates.** Any change must keep `typecheck` / `lint` / `format:check` all exit 0. The reviewer agent now downgrades APPROVE → REVISE on any failing gate, so plan accordingly.
+- **Don't change behavior beyond what each issue specifies.** Hardening = guards + drift prevention + tests, not new feature work.
+- **Don't address the dashboard segment-progress visibility issue (#464)** — that's TP-197's scope, separate file domain (`dashboard/public/`).
+- **Don't load docs not listed in "Context to Read First."**
+- **Don't commit without the `TP-196` prefix.**
+
+---
+
+## Amendments (Added During Execution)
+
+
diff --git a/taskplane-tasks/TP-196-multi-segment-engine-hardening/STATUS.md b/taskplane-tasks/TP-196-multi-segment-engine-hardening/STATUS.md
new file mode 100644
index 00000000..7c5091b7
--- /dev/null
+++ b/taskplane-tasks/TP-196-multi-segment-engine-hardening/STATUS.md
@@ -0,0 +1,164 @@
+# TP-196: Multi-segment engine hardening — Status
+
+**Current Step:** Not Started
+**Status:** 🔵 Ready for Execution
+**Last Updated:** 2026-05-10
+**Review Level:** 2
+**Review Counter:** 0
+**Iteration:** 0
+**Size:** M
+
+> **Hydration:** Worker expands Steps 2-5 with concrete per-file checkboxes
+> after Step 1 plan-review APPROVE. Each step maps to one of the 4 absorbed
+> issues (#462, #502, #503, #508).
+
+> **⚠️ Post-TP-194 hard-gate environment.** All four code-quality gates
+> (typecheck, lint, format:check, tests) are now required at PR time. The
+> reviewer agent downgrades APPROVE → REVISE on any failure. Plan accordingly.
+
+---
+
+### Step 0: Preflight
+**Status:** ⬜ Not Started
+
+- [ ] On `main` (fresh from v0.30.0)
+- [ ] All four gates pass on baseline (typecheck 0, lint 0, format:check 0, tests 3627+)
+- [ ] All four issue bodies read: #462, #502, #503, #508
+- [ ] Tier 3 context files read (lane-runner.ts segment scope, execution.ts monitor + tool registration, resume.ts reconciliation, discovery.ts skip logic, segment-scoped-lane-runner test file)
+- [ ] Live grep verification of `#502` condition pattern
+- [ ] Decision: SegmentScopeMode promotion to first-class enum/type (recommendation in Discoveries)
+
+---
+
+### Step 1: Plan all four fixes
+**Status:** ⬜ Not Started
+
+> ⚠️ Plan-review checkpoint.
+
+- [ ] #462 design (3 guards + edge-case tests)
+- [ ] #502 design (SegmentScopeMode promotion + gate sites)
+- [ ] #503 design (test file structure + 4 scenarios)
+- [ ] #508 design (pre-spawn check site + exit-condition semantics)
+- [ ] Cross-issue coordination documented
+- [ ] Drafts in Discoveries
+
+---
+
+### Step 2: Implement #502 first (foundational refactor)
+**Status:** ⬜ Not Started
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] `SegmentScopeMode` promoted to first-class type
+- [ ] `lane-runner.ts` threads via lane config
+- [ ] `execution.ts` env var + tool registration gated
+- [ ] Scattered `stepSegmentMap && currentRepoId` checks unified
+- [ ] Targeted + full fast suite pass
+
+---
+
+### Step 3: Implement #462 guards
+**Status:** ⬜ Not Started
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] Monitor guard in `resolveTaskMonitorState`
+- [ ] Resume guard in `collectDoneTaskIdsForResume`
+- [ ] Discovery safeguard
+- [ ] 3-4 behavioral tests for edge cases
+- [ ] Full fast suite passes
+
+---
+
+### Step 4: Implement #508 early-exit optimization
+**Status:** ⬜ Not Started
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] Pre-spawn segment-completion check
+- [ ] Exit-condition wiring
+- [ ] Behavioral test asserting wasted iteration skipped
+- [ ] Full fast suite passes
+
+---
+
+### Step 5: Implement #503 prompt-injection regression tests
+**Status:** ⬜ Not Started
+
+> ⚠️ Code-review fires after this step.
+
+- [ ] FULL_TASK assertions
+- [ ] SEGMENT_SCOPED assertions
+- [ ] Polyrepo single-segment regression
+- [ ] Legacy/partial-marker fallback case
+- [ ] Tests pass in isolation + full suite
+
+---
+
+### Step 6: Testing & Verification
+**Status:** ⬜ Not Started
+
+> ZERO test failures allowed. ALL FOUR GATES green.
+
+- [ ] `npm run typecheck` exit 0
+- [ ] `npm run lint` exit 0
+- [ ] `npm run format:check` exit 0
+- [ ] `npm run test:fast` passes (target: 3627+ + new tests; record final count)
+- [ ] Full integration suite passes
+- [ ] CLI smoke clean
+
+---
+
+### Step 7: Documentation & Delivery
+**Status:** ⬜ Not Started
+
+- [ ] CHANGELOG entry under [Unreleased] → Fixed (or Internal)
+- [ ] Discoveries logged: per-issue final fix summary
+- [ ] Issue-close comment drafts for #462, #502, #503, #508 in Discoveries
+- [ ] All commits include `TP-196` prefix
+
+---
+
+## Reviews
+
+| # | Type | Step | Verdict | File |
+|---|------|------|---------|------|
+
+---
+
+## Discoveries
+
+| Discovery | Disposition | Location |
+|-----------|-------------|----------|
+
+---
+
+## Execution Log
+
+| Timestamp | Action | Outcome |
+|-----------|--------|---------|
+| 2026-05-10 | Task staged | PROMPT.md and STATUS.md created (bundles #462/#502/#503/#508) |
+
+---
+
+## Blockers
+
+*None*
+
+---
+
+## Notes
+
+**Why bundle 4 issues into one task:**
+
+All 4 touch overlapping files (`lane-runner.ts`, `execution.ts`, `resume.ts`, `discovery.ts`, `segment-scoped-lane-runner.test.ts`). The segment-engine mental model is consistent across all of them — `.DONE` authority guards (#462), scope-mode unification (#502), regression tests for scope mode (#503), and early-exit optimization (#508). Bundling lets the worker reuse the context once and ship a coherent hardening pass.
+
+If plan-review reveals a clear architectural split during Step 1, splitting is allowed but should be explicit (and the spec should document why).
+
+**Sequencing within the task:**
+
+#502 is implemented FIRST because it promotes `SegmentScopeMode` to a first-class type that #462 and #508 can also reference. Implementing it first avoids retrofitting the others. #503 (tests for #502) is the last implementation step — gives the most stable surface to write assertions against.
+
+**Hard-gate compliance:**
+
+Post-TP-194, the reviewer agent downgrades APPROVE → REVISE on any failing `typecheck` / `lint` / `format:check`. This is the first task to run entirely under hard gates; the worker should expect that gate failures will be surfaced in code reviews and cannot be ignored. Plan accordingly: don't break gates anywhere mid-step.
diff --git a/taskplane-tasks/TP-197-dashboard-segment-progress/PROMPT.md b/taskplane-tasks/TP-197-dashboard-segment-progress/PROMPT.md
new file mode 100644
index 00000000..3d4f51ec
--- /dev/null
+++ b/taskplane-tasks/TP-197-dashboard-segment-progress/PROMPT.md
@@ -0,0 +1,180 @@
+# Task: TP-197 - Dashboard segment-level progress indicators for multi-segment tasks
+
+**Created:** 2026-05-10
+**Size:** S-M
+
+## Review Level: 1 (Plan Only)
+
+**Assessment:** UX improvement, scoped to dashboard rendering. No engine changes. The data is already available (lane snapshots include `segmentId`; the segment frontier is tracked in batch state). Worker needs to (1) surface segment metadata via the dashboard server API and (2) render it in the dashboard UI. Plan review evaluates the API + visual design; code review is skipped because dashboard JS work has been Level 1 historically and the test gate is the test suite + manual operator inspection during a polyrepo run.
+
+**Score:** 2/8 — Blast radius: 1, Pattern novelty: 1, Security: 0, Reversibility: 0
+
+## Canonical Task Folder
+
+```
+taskplane-tasks/TP-197-dashboard-segment-progress/
+├── PROMPT.md ← This file (immutable above --- divider)
+├── STATUS.md ← Execution state (worker updates this)
+├── .reviews/ ← Reviewer output (created by the orchestrator runtime)
+└── .DONE ← Created when complete
+```
+
+## Mission
+
+Close [#464](https://github.com/HenryLach/taskplane/issues/464) — Dashboard: segment-level progress indicators for multi-segment tasks.
+
+The dashboard correctly avoids prematurely marking multi-segment lanes as complete (per TP-145's `.DONE` suppression for non-final segments), but the side effect is that **operators have no way to tell what's happening during the suppression window**. A lane sits "running" with no segment-level signal, no completion-of-current-segment indicator, no clear "we're on segment 1 of 3" framing. In wave 2+ batches where all tasks are mid-segment, the entire wave appears stuck.
+
+By the end of TP-197:
+- Each lane row in the dashboard shows segment context when applicable (e.g., `segment 2/3 · shared-libs`)
+- Segment completion status is visually distinct (✅ for done segments, ⏳ for in-progress, ⬚ for pending)
+- The progress bar reflects the **current segment's** checkpoint progress, not just the overall task progress
+- Non-final segment completion produces a visual indicator even though `.DONE` is suppressed
+- Single-segment / single-repo tasks render identically to today (no regression for the common case)
+
+## Dependencies
+
+**None** — all referenced predecessor work is merged. Informational cross-references:
+
+- TP-145 (already shipped): `.DONE` suppression for non-final segments. The visual gap this task fills is the SIDE EFFECT of that correctness fix.
+- TP-081 / TP-133 / TP-134 / TP-135 (multi-repo task execution, shipped): provides the segment frontier + lane snapshot segmentId data that this task surfaces.
+- TP-485 (task title in dashboard, shipped v0.28.7): the row-layout precedent for adding new visual content to a lane row.
+- TP-485 follow-up (widened task-title row, shipped v0.28.8): the grid-layout precedent for spanning new content across columns 3-6.
+
+## Context to Read First
+
+> Only list docs the worker actually needs. Less is better.
+
+**Tier 2 (area context):**
+- `taskplane-tasks/CONTEXT.md`
+
+**Tier 3 (load only if needed):**
+- Issue body: `gh issue view 464`
+- `dashboard/public/app.js` — lane-row rendering; this is the file where the new indicators live
+- `dashboard/public/style.css` — lane-row layout; check existing grid for cols 3-6 (per TP-485 follow-up) and how segment indicators fit
+- `dashboard/server.cjs` — API endpoint that serves batch state to the dashboard frontend; may need to surface segment data more explicitly
+- `extensions/taskplane/types.ts` — `RuntimeLaneSnapshot` shape (has `segmentId` already), `PersistedSegmentRecord` shape
+- `extensions/taskplane/persistence.ts` — batch state shape (where `segments[]` lives), so worker knows what data is canonically available
+- `docs/specifications/taskplane/multi-repo-task-execution.md` section "Lane snapshots" — confirms what segment metadata is on disk
+
+## Environment
+
+- **Workspace:** `dashboard/` (rendering) + `extensions/taskplane/` (data shape, if API extension needed)
+- **Services required:** None
+
+## File Scope
+
+> The orchestrator uses this to avoid merge conflicts.
+
+- `dashboard/public/app.js` — segment-indicator rendering
+- `dashboard/public/style.css` — segment-indicator styling
+- `dashboard/server.cjs` — segment data on the API response, if not already present (verify in Step 0)
+- `extensions/tests/.test.ts` — optional: dashboard server unit test if the API shape changes
+- `CHANGELOG.md` — `[Unreleased]` entry under `Enhanced`
+
+## Steps
+
+> **Hydration:** Worker expands Step 2 with concrete render-site checkboxes after Step 1 plan-review.
+
+### Step 0: Preflight
+
+- [ ] On `main` (lane worktree, fresh from v0.30.0)
+- [ ] All four gates pass on baseline (typecheck 0, lint 0, format:check 0, tests 3627+)
+- [ ] Issue #464 read in full
+- [ ] Tier 3 context files read
+- [ ] Verify on the API side: does `dashboard/server.cjs` already surface segment data to the frontend, or does the worker need to extend the API endpoint?
+- [ ] Identify a real-world test case: a recent multi-segment batch in `.pi/runtime//` whose data we can inspect to validate the rendering
+
+### Step 1: Plan the API + visual design
+
+> ⚠️ Plan-review checkpoint. Reviewer evaluates UX + API shape.
+
+- [ ] API design: what segment metadata does the dashboard server expose? (current segment ID + status, segment frontier completion state, repoId per segment). Document the exact JSON shape.
+- [ ] Visual design: how should segment context appear in a lane row? Recommendation: a horizontal pill row under the task title (e.g., `✅ shared-libs · ⏳ web-client · ⬚ administration`) with the current segment highlighted. Consider compact vs. expanded variants.
+- [ ] Progress-bar behavior: how does it reflect segment-level progress vs. task-level? Options: (a) show segment progress when segmented, task progress otherwise; (b) two-tone bar (filled portion = done segments; ghost portion = current segment progress).
+- [ ] Single-segment fallback: confirm rendering is IDENTICAL to today for non-segmented tasks (no regression).
+- [ ] Mobile/narrow-viewport: dashboard's current responsive behavior — confirm the new indicators degrade gracefully.
+- [ ] Drafts in Discoveries.
+
+### Step 2: Implement the data plumbing
+
+> Plan-reviewer must have APPROVED Step 1 before proceeding.
+
+- [ ] Extend `dashboard/server.cjs` API response with segment data (only if needed per Step 0 verification)
+- [ ] Add typed shape to `dashboard/public/app.js` (or inline types) for the new API field
+- [ ] Verify the API response on a real running batch (use the test case identified in Step 0)
+
+### Step 3: Implement the visual rendering
+
+- [ ] Segment indicator pill row in lane rows (per plan)
+- [ ] CSS styling for ✅ / ⏳ / ⬚ states (per plan)
+- [ ] Progress-bar segment-aware logic (per plan)
+- [ ] Single-segment fallback: visual regression test on a non-segmented batch (manual operator check; no automation possible for visual UX)
+- [ ] Browser-side smoke: load a recent multi-segment batch's data into the dashboard and confirm rendering
+
+### Step 4: Testing & Verification
+
+> ZERO test failures allowed. ALL FOUR GATES must remain green.
+
+- [ ] `npm run typecheck` exits 0
+- [ ] `npm run lint` exits 0
+- [ ] `npm run format:check` exits 0
+- [ ] `npm run test:fast` passes (target: 3627+ baseline)
+- [ ] Full integration suite passes
+- [ ] Manual visual verification: load a multi-segment batch (e.g., the polyrepo test workspace's most recent run) and confirm:
+  - Indicators render
+  - Current segment is visually distinct
+  - Completed segments show ✅
+  - Pending segments show ⬚
+  - Progress bar reflects current segment's progress
+  - Single-segment task in the same batch renders identically to current behavior
+
+### Step 5: Documentation & Delivery
+
+- [ ] CHANGELOG entry under `[Unreleased]` → `Enhanced`:
+  - Title: `**Dashboard segment-level progress indicators (TP-197, #464)**`
+  - Body: 1-2 paragraph summary covering: (1) the visibility gap during `.DONE` suppression, (2) the indicator design (per-segment pills + segment-aware progress bar), (3) backwards-compatibility (single-segment tasks unchanged)
+- [ ] Discoveries logged: API shape used, design choices, any visual regressions found+fixed
+- [ ] Issue-close comment drafted for #464 in Discoveries (for operator to post after PR merges)
+- [ ] All commits include `TP-197` prefix
+
+## Documentation Requirements
+
+**Must Update:**
+- `CHANGELOG.md` — Enhanced entry per Step 5
+
+**Check If Affected:**
+- `docs/user-guide/dashboard.md` (if exists) — document the new segment indicators
+
+## Completion Criteria
+
+- [ ] #464's acceptance criteria met (visible segment context + completion status + segment-aware progress bar)
+- [ ] All four CI gates pass
+- [ ] Plan review APPROVE'd (Level 1)
+- [ ] Manual visual verification confirms rendering
+- [ ] CHANGELOG entry added
+- [ ] Issue-close comment draft ready
+
+## Git Commit Convention
+
+Commits at **step boundaries**, all with `TP-197` prefix:
+
+- **Step completion:** `feat(TP-197): complete Step N — description`
+- **API extension:** `feat(TP-197): expose segment metadata via dashboard server`
+- **UI changes:** `feat(TP-197): render segment indicators in lane rows`
+
+## Do NOT
+
+- **Don't change `.DONE` suppression behavior** — that's TP-145 and is correct as-is. This task adds visibility ON TOP of suppression, not in lieu of.
+- **Don't address segment-engine hardening** (#462, #502, #503, #508) — that's TP-196's scope.
+- **Don't break the post-TP-194 hard gates** — typecheck/lint/format:check are now required.
+- **Don't add `dashboard/public/`** to the Biome lint scope — it's intentionally out of scope per the code-quality-gates spec. This task touches the same files but keeps the lint exclusion intact.
+- **Don't expand scope** beyond #464's stated UX gap. If exploring reveals additional dashboard gaps, document in Discoveries and propose separate tasks.
+- **Don't load docs not listed in "Context to Read First."**
+- **Don't commit without the `TP-197` prefix.**
+
+---
+
+## Amendments (Added During Execution)
+
+
diff --git a/taskplane-tasks/TP-197-dashboard-segment-progress/STATUS.md b/taskplane-tasks/TP-197-dashboard-segment-progress/STATUS.md
new file mode 100644
index 00000000..23b41835
--- /dev/null
+++ b/taskplane-tasks/TP-197-dashboard-segment-progress/STATUS.md
@@ -0,0 +1,128 @@
+# TP-197: Dashboard segment-level progress indicators — Status
+
+**Current Step:** Not Started
+**Status:** 🔵 Ready for Execution
+**Last Updated:** 2026-05-10
+**Review Level:** 1
+**Review Counter:** 0
+**Iteration:** 0
+**Size:** S-M
+
+> **Hydration:** Worker expands Step 2/3 with concrete render-site checkboxes
+> after Step 1 plan-review APPROVE.
+
+> **⚠️ Post-TP-194 hard-gate environment.** All four code-quality gates
+> (typecheck, lint, format:check, tests) are now required at PR time.
+ +--- + +### Step 0: Preflight +**Status:** ⬜ Not Started + +- [ ] On `main` (fresh from v0.30.0) +- [ ] All four gates pass on baseline +- [ ] Issue #464 read in full +- [ ] Tier 3 context files read (dashboard/public/app.js, style.css, server.cjs, types.ts segment shapes) +- [ ] API verification: does `dashboard/server.cjs` already surface segment data, or does the worker need to extend it? +- [ ] Real-world test case identified (recent multi-segment batch in `.pi/runtime//`) + +--- + +### Step 1: Plan the API + visual design +**Status:** ⬜ Not Started + +> ⚠️ Plan-review checkpoint. + +- [ ] API design + JSON shape documented +- [ ] Visual design (pill row + progress-bar behavior) documented +- [ ] Single-segment fallback confirmed (no regression for non-segmented tasks) +- [ ] Mobile/narrow-viewport considered +- [ ] Drafts in Discoveries + +--- + +### Step 2: Implement the data plumbing +**Status:** ⬜ Not Started + +- [ ] `dashboard/server.cjs` API extended (if needed) +- [ ] Frontend types added for new API shape +- [ ] API response verified on real running batch + +--- + +### Step 3: Implement the visual rendering +**Status:** ⬜ Not Started + +- [ ] Segment indicator pill row +- [ ] CSS styling for ✅ / ⏳ / ⬚ states +- [ ] Progress-bar segment-aware logic +- [ ] Single-segment fallback visual regression-checked +- [ ] Browser-side smoke on real batch + +--- + +### Step 4: Testing & Verification +**Status:** ⬜ Not Started + +> ZERO test failures allowed. ALL FOUR GATES green. 
+ +- [ ] `npm run typecheck` exits 0 +- [ ] `npm run lint` exits 0 +- [ ] `npm run format:check` exits 0 +- [ ] `npm run test:fast` passes (3627+ baseline) +- [ ] Full integration suite passes +- [ ] Manual visual verification on multi-segment batch + +--- + +### Step 5: Documentation & Delivery +**Status:** ⬜ Not Started + +- [ ] CHANGELOG entry under [Unreleased] → Enhanced +- [ ] Discoveries logged +- [ ] Issue-close comment draft for #464 +- [ ] All commits include the `TP-197` prefix + +--- + +## Reviews + +| # | Type | Step | Verdict | File | +|---|------|------|---------|------| + +--- + +## Discoveries + +| Discovery | Disposition | Location | +|-----------|-------------|----------| + +--- + +## Execution Log + +| Timestamp | Action | Outcome | +|-----------|--------|---------| +| 2026-05-10 | Task staged | PROMPT.md and STATUS.md created | + +--- + +## Blockers + +*None* + +--- + +## Notes + +**Why this is separate from TP-196:** + +TP-196 handles segment-engine hardening (`.DONE` authority, scope-mode unification, regression tests, early-exit optimization) — all in `extensions/taskplane/` files. TP-197 is purely a dashboard UX concern in `dashboard/public/` files. Different file domain, different test approach (TP-196 is unit/integration-test-driven; TP-197 is manual-visual-verification-driven). Bundling would dilute both. + +**Manual verification expected:** + +Unlike most tasks, the success criterion for TP-197 is partially visual: does the indicator look right? Does the progress bar make sense? The worker should therefore expect to load the dashboard against a real batch and inspect it manually, after which the operator does a final visual check before merge. There's no automated way to test "the UI looks correct." + +**dashboard/public/ stays out of Biome lint scope:** + +Per the code-quality-gates spec (section 3, non-goals), `dashboard/public/` is intentionally vanilla JS, out of lint scope. This task touches those files but does NOT add them to lint scope.
The `.biome.json` exclusion for `dashboard/public/**` stays in place. A separate future task could opt in to dashboard linting if and when there is demand.