Merged
Commits
32 commits
7da832a
chore(TP-196): complete Step 0 — preflight + SegmentScopeMode de…
HenryLach May 10, 2026
26737e5
chore(TP-197): step 0 preflight complete
HenryLach May 10, 2026
bb50b55
plan(TP-196): Step 1 — per-issue design + cross-issue coordination
HenryLach May 10, 2026
8f55f0a
docs(TP-197): step 1 design plan drafted (ready for plan review)
HenryLach May 10, 2026
38fca2c
docs(TP-197): address R001 plan review - move pill row to grid row 3 …
HenryLach May 10, 2026
2fd6af1
feat(TP-196, #502): promote SegmentScopeMode to first-class type + un…
HenryLach May 10, 2026
d831ed6
chore(TP-197): steps 1-2 complete (plan APPROVE), hydrate step 3 impl…
HenryLach May 10, 2026
a60616b
hydrate(TP-196): add R002 revision items to Step 2
HenryLach May 10, 2026
0be59dd
fix(TP-196, #502): gate segment-prompt block on isSegmentScoped (R002)
HenryLach May 10, 2026
56cdf35
chore(TP-196): mark Step 2 complete (code review APPROVE after R002 f…
HenryLach May 10, 2026
72eb4b5
feat(TP-197): render per-segment status pill row in dashboard (#464)
HenryLach May 10, 2026
761ef5a
docs(TP-197): CHANGELOG + use-the-dashboard.md updates for segment pi…
HenryLach May 10, 2026
a94397d
chore(TP-197): mark task complete (all 5 steps done, all gates green)
HenryLach May 10, 2026
787c93a
checkpoint: TP-197 task artifacts (.DONE, STATUS.md)
HenryLach May 10, 2026
a73a484
feat(TP-196, #462): .DONE authority guards (monitor + resume + discov…
HenryLach May 10, 2026
790478e
chore(TP-196): mark Step 3 complete (code review APPROVE)
HenryLach May 11, 2026
e614de0
feat(TP-196, #508): pre-spawn segment-completion check eliminates was…
HenryLach May 11, 2026
c3694ec
fix(TP-196, #508): extract shouldSkipSpawnForCompleteSegment + add be…
HenryLach May 11, 2026
b369eca
fix(TP-196): biome-format Step 4 test wrapping (R006)
HenryLach May 11, 2026
1a2398a
chore(TP-196): mark Step 4 complete (code review APPROVE)
HenryLach May 11, 2026
346c01f
test(TP-196, #503): SegmentScopeMode prompt-injection regression tests
HenryLach May 11, 2026
bf35251
fix(TP-196, #503): strengthen polyrepo single-segment regression test…
HenryLach May 11, 2026
22dbcd6
chore(TP-196): mark Step 5 complete (code review APPROVE)
HenryLach May 11, 2026
818c787
chore(TP-196): complete Step 6 — testing & verification (all gat…
HenryLach May 11, 2026
752e97f
docs(TP-196): Step 7 — CHANGELOG entry + issue-close drafts + ST…
HenryLach May 11, 2026
a471ba2
checkpoint: TP-196 task artifacts (.DONE, STATUS.md)
HenryLach May 11, 2026
7716cc2
merge: wave 1 lane 2 — TP-197
HenryLach May 11, 2026
994badc
merge: wave 1 lane 1 — TP-196
HenryLach May 11, 2026
533a6fa
fix(TP-197): make 3-row grid opt-in via .has-segments class (sage pos…
HenryLach May 11, 2026
46d7b69
fix(TP-197): wave-chip parallelization for all waves + pending pill c…
HenryLach May 11, 2026
e8cd194
fix(TP-197): treat laneNumber=0 as sentinel 'unallocated' in wave-chi…
HenryLach May 11, 2026
f63cfad
fix(TP-197): merge-agents panel cleanup — remove dead columns + bump …
HenryLach May 11, 2026
88 changes: 88 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,94 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Enhanced

- **Dashboard segment-level progress indicators (TP-197, #464):** Multi-segment
task rows now show a horizontal pill row of per-segment status badges —
one pill per segment with a status icon (✅ succeeded · ⏳ running · ⬚
pending · ❌ failed · ⏸ stalled · ↷ skipped) plus the segment’s repo
ID. The currently-executing segment is visually emphasized. This closes
the operator-visibility gap introduced by TP-145’s `.DONE` suppression for
non-final segments: previously, multi-segment lanes sat “running” with
no segment-level signal during the suppression window, which made wave 2+
batches where all tasks were mid-segment appear stuck. With the pill row
in place, operators can see at a glance which segments have finished,
which is running, and which remain. The progress bar itself is unchanged
— TP-174 already made it segment-scoped via the V2 lane snapshot’s
per-segment counts; the new pill row provides the missing context that
makes the existing bar legible as “current segment’s progress.”

Backwards-compatibility: single-segment tasks render an empty pill row
(auto-collapsed grid sub-row), so the DOM and visual layout for
non-segmented batches are identical to before. The pill row lives in a
new grid row 3 of `.task-row` (cols 3–7), mirroring the
`task-title-subtitle` pattern from TP-485, and is intentionally placed
*outside* the `.task-step` cell so the existing `@media (max-width: 900px)`
rule that hides `.task-step` does not hide segment context on narrow
viewports. No `dashboard/server.cjs` change was required — the existing
API response already exposed `batch.segments[]`, `task.segmentIds`, and
`runtimeLaneSnapshots[*].segmentId`.
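
As a rough illustration of how the dashboard can consume those existing fields, the sketch below builds a segmentId-to-status lookup from `batch.segments[]`. The record fields `segmentId` and `status` and the names `buildStatusMapSketch` and `statusFor` are assumptions made for this example; the real `buildSegmentStatusMap()` in `app.js` may differ.

```javascript
// Hypothetical sketch: assumes each entry of batch.segments[] looks like
// { segmentId: string, status: string }. Not the project's actual helper.
function buildStatusMapSketch(batch) {
  const map = new Map();
  const segments = Array.isArray(batch?.segments) ? batch.segments : [];
  for (const seg of segments) {
    // Skip malformed records rather than throwing in the render path.
    if (seg && typeof seg.segmentId === "string" && typeof seg.status === "string") {
      map.set(seg.segmentId, seg.status);
    }
  }
  return map;
}

// Segments missing from the map are treated as "pending", matching the
// fallback the pill row applies when a status lookup comes back empty.
function statusFor(map, segmentId) {
  return map.get(segmentId) || "pending";
}
```

Defaulting unknown segments to "pending" keeps the pill count equal to `task.segmentIds.length` even when a segment has no persisted record yet.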

### Fixed

- **Multi-segment engine hardening (TP-196, #462 + #502 + #503 + #508):**
closes four follow-up issues from the multi-repo task execution rollout
with a single coherent hardening pass against the multi-segment engine.

- **`.DONE` authority guards (#462)** — three defense-in-depth checks now
refuse to honor a stale or premature `.DONE` in multi-segment tasks:
(a) `resolveTaskMonitorState` (`execution.ts`) accepts an optional
`multiSegmentContext: { isFinalSegment, segmentId }` parameter; when
`isFinalSegment === false` and `.DONE` is present, Priority 1 is
skipped and a WARN is logged via `execLog`; `monitorLanes` populates
this context from `task.segmentIds` + `task.activeSegmentId`. (b)
`collectDoneTaskIdsForResume` (`resume.ts`) now refuses to add a
taskId to the done set when persisted segment records exist AND any
segment is not `succeeded`/`skipped` — the task re-reconciles instead
of silently being marked complete. (c) A new exported
`checkDoneAuthoritySafeguard` helper (`discovery.ts`) emits a
doctor-style `console.warn` when `.DONE` coexists with unchecked
STATUS.md checkboxes during area scans. The pre-existing TP-135
"keeps .DONE authoritative even when segment frontier is incomplete"
test was updated to assert the inverted (post-#462) contract.

- **SegmentScopeMode unification (#502 + #503)** — promotes the
FULL_TASK / SEGMENT_SCOPED decision to a first-class
`SegmentScopeMode = "FULL_TASK" | "SEGMENT_SCOPED"` type in `types.ts`
plus a `computeSegmentScopeMode(stepSegmentMap, repoStepNumbers,
currentRepoId, currentStepNumber)` helper in `lane-runner.ts`. The
iteration loop now derives both the authoritative `segmentScopeMode`
and the legacy `isSegmentScoped` boolean alias from one call, and
the segment-prompt injection block is gated on `isSegmentScoped`
instead of the previous scattered `stepSegmentMap && currentRepoId
&& repoStepNumbers && remainingSteps.length > 0` composite. A new
behavioural regression suite
(`extensions/tests/segment-scope-mode-prompt.test.ts`, 9 tests
across 4 describe blocks) mocks `spawnAgent` to capture the worker
prompt + env + system prompt and verifies the FULL_TASK,
SEGMENT_SCOPED, polyrepo single-segment, and legacy/partial-marker
contracts end-to-end.

- **Wasted-iteration elimination (#508)** — lane-runner now performs
an explicit pre-spawn segment-completion check between the existing
`remainingSteps.length === 0` guard and the `totalIterations++`
increment, delegating to a new pure helper
`shouldSkipSpawnForCompleteSegment(statusContent, repoStepNumbers,
currentRepoId)`. When every segment-scoped step for the active repo
is already complete, the loop logs `"Pre-spawn segment-completion
check"` and breaks before incurring a worker spawn. Behavioural
test (`extensions/tests/early-exit-segment-spawn-skip.test.ts`)
mocks `agent-host.spawnAgent` via `mock.module` and asserts
`spawnAgentCallCount === 0` for a fixture worktree whose checkboxes
are pre-checked.

- **Validation:** typecheck, lint, and format:check all exit 0. The fast
test suite reports 3678 pass / 0 fail / 1 skip, a net gain of 51 new tests
spread across 3 new test files plus targeted updates to
`segment-scoped-lane-runner.test.ts`, `resume-segment-frontier.test.ts`,
and `engine-runtime-v2-routing.test.ts` (slice-window widening for
the longer `resolveTaskMonitorState` body).
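
The resume-side guard described in (b) above can be sketched as a pure predicate. This is an illustration under stated assumptions, not the code in `resume.ts`: the function name `canTrustDoneMarker` and the `{ status }` record shape are hypothetical.

```javascript
// Hypothetical sketch of the #462 resume guard. Any status other than
// "succeeded" or "skipped" blocks the .DONE marker from being honored.
const TERMINAL_OK = new Set(["succeeded", "skipped"]);

function canTrustDoneMarker(hasDoneFile, segmentRecords) {
  if (!hasDoneFile) return false;
  // No persisted segment records: the guard does not apply, so .DONE
  // remains authoritative.
  if (!Array.isArray(segmentRecords) || segmentRecords.length === 0) return true;
  // Persisted records exist: every segment must have finished cleanly.
  return segmentRecords.every((seg) => TERMINAL_OK.has(seg?.status));
}
```

When the predicate returns false, the task would be left out of the done set and re-reconciled, which is the behaviour the changelog entry describes.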

## [0.30.0] - 2026-05-10

### Fixed
139 changes: 124 additions & 15 deletions dashboard/public/app.js
@@ -382,6 +382,53 @@ function taskSegmentProgress(task, segmentStatusMap, forcedActiveSegmentId) {
};
}

+ // TP-197 (#464): Render a horizontal pill row of per-segment status badges for a
+ // multi-segment task. Each pill shows an icon + repoId for one segment. The icon
+ // reflects the segment's status (succeeded / running / pending / failed / stalled /
+ // skipped). The current segment (the one actively executing on its lane) gets an
+ // emphasis class. Returns "" for single-segment tasks so the rendered DOM is
+ // byte-identical to today for the non-segmented common case (no regression).
+ //
+ // Consumes:
+ //   - task.segmentIds: string[] (ordered, from PersistedTaskRecord)
+ //   - segmentStatusMap: Map<segmentId, PersistedSegmentStatus> built by
+ //     buildSegmentStatusMap() from batch.segments[]
+ //   - activeSegmentId: string|null — current executing segment (from V2 lane
+ //     snapshot's segmentId, or the task's activeSegmentId field)
+ function taskSegmentPillRow(task, segmentStatusMap, activeSegmentId) {
+   const segmentIds = Array.isArray(task?.segmentIds)
+     ? task.segmentIds.filter(id => typeof id === "string")
+     : [];
+   if (segmentIds.length <= 1) return "";
+
+   // Status -> { icon, className } table. Keep emoji simple/monospace-friendly.
+   // ✅ succeeded, ⏳ running, ⬚ pending, ❌ failed, ⏸ stalled, ↷ skipped.
+   const styles = {
+     succeeded: { icon: "\u2705", cls: "seg-succeeded" },
+     running: { icon: "\u23F3", cls: "seg-running" },
+     pending: { icon: "\u2B1A", cls: "seg-pending" },
+     failed: { icon: "\u274C", cls: "seg-failed" },
+     stalled: { icon: "\u23F8", cls: "seg-stalled" },
+     skipped: { icon: "\u21B7", cls: "seg-skipped" },
+   };
+
+   const pills = segmentIds.map((segId) => {
+     const status = segmentStatusMap.get(segId) || "pending";
+     const style = styles[status] || styles.pending;
+     const parsed = parseSegmentId(segId);
+     const repoLabel = parsed?.repoId || segId;
+     const isCurrent = activeSegmentId && segId === activeSegmentId;
+     const currentCls = isCurrent ? " seg-pill-current" : "";
+     const title = `${segId} \u00b7 ${status}`;
+     return `<span class="seg-pill ${style.cls}${currentCls}" title="${escapeHtml(title)}">`
+       + `<span class="seg-pill-icon">${style.icon}</span>`
+       + `<span class="seg-pill-label">${escapeHtml(repoLabel)}</span>`
+       + `</span>`;
+   }).join("");
+
+   return `<div class="task-segment-row">${pills}</div>`;
+ }
+
function laneActiveSegmentInfo(v2snap, laneTasks, segmentStatusMap) {
if (!v2snap || !v2snap.segmentId) return null;
const parsed = parseSegmentId(v2snap.segmentId);
@@ -620,7 +667,13 @@ function renderSummary(batch) {
// their assigned lane: tasks on the same lane render with `→` (serial),
// tasks on different lanes render with ` | ` (parallel). Tooltip shows
// the expanded lane breakdown.
- const { compact, tooltip } = formatWaveLaneBreakdown(taskIds, batch.lanes || [], i + 1);
+ // TP-197 post-merge fold: pass `batch.tasks` as the task→lane source.
+ // The previous arg `batch.lanes` only carries live Runtime V2 lane
+ // state for the *currently active* wave — past/future wave chips
+ // would fall back to comma-separated. `batch.tasks[].laneNumber` is
+ // persisted for the entire batch lifecycle, so all waves render with
+ // the correct parallelization separator regardless of active state.
+ const { compact, tooltip } = formatWaveLaneBreakdown(taskIds, batch.lanes || [], batch.tasks || [], i + 1);
const titleAttr = tooltip ? ` title="${escapeHtml(tooltip)}"` : "";
wavesHtml += `<span class="wave-chip ${cls}"${titleAttr}>W${i + 1} [${compact}]</span>`;
});
@@ -648,16 +701,46 @@ function renderSummary(batch) {
* are shown with the previous flat formatting and no tooltip is generated
* — this preserves backward compatibility with future-wave display.
*/
- function formatWaveLaneBreakdown(taskIds, lanes, waveNumber) {
+ function formatWaveLaneBreakdown(taskIds, lanes, tasks, waveNumber) {
if (!Array.isArray(taskIds) || taskIds.length === 0) {
return { compact: "", tooltip: "" };
}
- // Build taskId → laneNumber map for the lanes that have any of these tasks.
+ // Build taskId → laneNumber map. Prefer the persisted-per-task
+ // `tasks[i].laneNumber` (covers all waves, lifecycle-stable). Fall back
+ // to live `lanes[]` only when tasks data is missing or doesn't carry
+ // laneNumber for a given task.
+ //
+ // TP-197 post-merge fold: the previous implementation read ONLY from
+ // `lanes`, which is Runtime V2 live state and only populated for the
+ // currently active wave. That caused inactive waves' chips to fall back
+ // to comma-separated display (no parallelization indicator), giving the
+ // impression that the separator changed as the batch progressed. Using
+ // the persisted `tasks[].laneNumber` makes the indicator stable across
+ // all waves regardless of active state.
const taskToLane = new Map();
+ if (Array.isArray(tasks)) {
+   for (const t of tasks) {
+     // Persistence assigns `laneNumber: 0` as a sentinel meaning
+     // "unallocated" (see persistence.ts:1378 — `lane?.laneNumber ??
+     // outcome?.laneNumber ?? 0`). Real lane numbers start at 1. We must
+     // skip 0 here so future-wave tasks (which all have the 0 sentinel
+     // until their wave starts) don't get falsely grouped under a fake
+     // "lane 0" and rendered as serial.
+     if (
+       t &&
+       t.taskId &&
+       typeof t.laneNumber === "number" &&
+       t.laneNumber >= 1 &&
+       !taskToLane.has(t.taskId)
+     ) {
+       taskToLane.set(t.taskId, t.laneNumber);
+     }
+   }
+ }
+ // Fallback: anything `tasks` didn't cover, try `lanes` (live state).
for (const lane of lanes) {
if (!lane || !Array.isArray(lane.taskIds)) continue;
for (const tid of lane.taskIds) {
- // First lane to claim a task wins (lanes shouldn't overlap, but be defensive).
if (!taskToLane.has(tid)) taskToLane.set(tid, lane.laneNumber);
}
}
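
The lookup order in the hunk above (persisted `tasks[].laneNumber` first, live `lanes[]` as fallback, `0` treated as unallocated) can be exercised in isolation. The sketch below is a standalone restatement for illustration; the function name `buildTaskToLane` and the sample task IDs are hypothetical and not part of `app.js`.

```javascript
// Standalone sketch of the lookup order: persisted tasks first, live lanes
// as a fallback for anything the tasks array did not cover.
function buildTaskToLane(tasks, lanes) {
  const taskToLane = new Map();
  for (const t of Array.isArray(tasks) ? tasks : []) {
    // laneNumber 0 is the "unallocated" sentinel; real lanes start at 1.
    if (
      t &&
      t.taskId &&
      typeof t.laneNumber === "number" &&
      t.laneNumber >= 1 &&
      !taskToLane.has(t.taskId)
    ) {
      taskToLane.set(t.taskId, t.laneNumber);
    }
  }
  for (const lane of Array.isArray(lanes) ? lanes : []) {
    if (!lane || !Array.isArray(lane.taskIds)) continue;
    for (const tid of lane.taskIds) {
      if (!taskToLane.has(tid)) taskToLane.set(tid, lane.laneNumber);
    }
  }
  return taskToLane;
}

// TP-1 keeps its persisted lane (2) even though a live lane also lists it;
// TP-2 carries the 0 sentinel and stays unmapped; TP-3 comes from live state.
const map = buildTaskToLane(
  [{ taskId: "TP-1", laneNumber: 2 }, { taskId: "TP-2", laneNumber: 0 }],
  [{ laneNumber: 1, taskIds: ["TP-1", "TP-3"] }],
);
console.log([...map.entries()]);
```

Leaving sentinel tasks out of the map, rather than mapping them to lane 0, is what lets future-wave chips fall back to the flat display instead of being drawn as one serial lane.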
@@ -859,8 +942,24 @@ function renderLanesTasks(batch, sessions) {
stepHtml = `<span style="color:var(--text-faint)">${escapeHtml(task.exitReason || "—")}</span>`;
}

+ // TP-197 (#464): Compute the per-segment pill row for multi-segment tasks.
+ // Returns "" for single-segment tasks (no DOM regression for the common case).
+ // For multi-segment tasks we render the pill row in the task-row's grid row 3
+ // (via .task-segment-row CSS) and suppress the inline "Segment N/T: repo" text
+ // in detailBits to avoid duplicating signal — the pill row already shows the
+ // current segment (via seg-pill-current) and total count (via pill count).
+ const segmentPillRowHtml = taskSegmentPillRow(
+   task,
+   segmentStatusMap,
+   v2snap && v2snap.taskId === task.taskId ? v2snap.segmentId : (segmentInfo?.segmentId || null),
+ );
+ const hasSegmentPillRow = segmentPillRowHtml !== "";
+
const detailBits = [];
- if (segmentInfo) {
+ if (segmentInfo && !hasSegmentPillRow) {
+   // Single-segment + non-segmented tasks: existing inline text (unchanged).
+   // Multi-segment tasks: suppressed because the new pill row carries the same
+   // information more legibly.
detailBits.push(`<span class="task-segment-progress" title="${escapeHtml(segmentInfo.segmentId || segmentProgressText(segmentInfo))}">${escapeHtml(segmentProgressText(segmentInfo))}</span>`);
}
if (showPacketHome) {
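
The third argument passed to `taskSegmentPillRow` in the hunk above decides which pill receives the `seg-pill-current` emphasis. A minimal sketch of that selection as a named helper follows; `resolveActiveSegmentId` and the sample segment IDs are hypothetical, introduced only to make the precedence explicit.

```javascript
// Sketch of the precedence used when picking the active segment:
// 1. the V2 lane snapshot, but only if it belongs to this task;
// 2. otherwise the segment info already computed for the task;
// 3. otherwise null (no pill is emphasized).
function resolveActiveSegmentId(v2snap, task, segmentInfo) {
  if (v2snap && v2snap.taskId === task.taskId) return v2snap.segmentId;
  return segmentInfo?.segmentId || null;
}

const task = { taskId: "T1" };
// Snapshot for the same task wins over the fallback.
const fromSnapshot = resolveActiveSegmentId(
  { taskId: "T1", segmentId: "seg-a" }, task, { segmentId: "seg-b" });
// Snapshot for a different task is ignored.
const fromFallback = resolveActiveSegmentId(
  { taskId: "T2", segmentId: "seg-a" }, task, { segmentId: "seg-b" });
// No information at all yields null.
const none = resolveActiveSegmentId(null, task, null);
```

Checking `v2snap.taskId` matters because a lane snapshot describes whichever task the lane is currently running, which may not be the row being rendered.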
@@ -950,8 +1049,17 @@ function renderLanesTasks(batch, sessions) {
const titleHtml = task.taskTitle
? `<div class="task-title-subtitle">${escapeHtml(task.taskTitle)}</div>`
: "";
+ // TP-197 (#464): segmentPillRowHtml is empty for single-segment tasks so
+ // the rendered DOM is byte-identical to today for non-segmented tasks.
+ // For multi-segment tasks it renders as grid-row 3 of .task-row.
+ // Sage post-merge fold: the .has-segments class opts the .task-row
+ // grid into a 3-row template only when we actually have a pill row;
+ // otherwise the default 2-row template preserves single-segment task
+ // spacing exactly (an unconditional 3-row template would add an 8px
+ // row-gap even when row 3 is empty, breaking the no-regression contract).
+ const taskRowClass = hasSegmentPillRow ? "task-row has-segments" : "task-row";
html += `
- <div class="task-row">
+ <div class="${taskRowClass}">
<span class="task-icon"><span class="status-dot ${task.status}"></span></span>
<span class="task-actions">${eyeHtml}</span>
<span class="task-id status-${task.status}">${escapeHtml(task.taskId)}${showRepos ? repoBadgeHtml(tRepo, "repo-badge-task") : ""}</span>
@@ -960,6 +1068,7 @@ function renderLanesTasks(batch, sessions) {
<span>${progressHtml}</span>
<span class="task-step">${stepHtml}${workerHtml}</span>
${titleHtml}
+ ${segmentPillRowHtml}
</div>`;
html += reviewerRowHtml;
}
@@ -1054,7 +1163,15 @@ function renderMergeAgents(batch, sessions) {
}

let html = '<table class="merge-table"><thead><tr>';
- html += '<th>Wave</th><th>Status</th><th>Session</th><th>Telemetry</th><th>Session ID</th><th>Details</th>';
+ // TP-197 post-merge fold: removed 'Session ID' and 'Details' columns.
+ // SESSION ID was hardcoded to '—' in every row — dead weight.
+ // DETAILS only populated for `mr.failureReason` (rare failure cases);
+ // for the common all-merges-succeeded case it's always '—' too.
+ // When a real failure happens, the operator sees status='failed' in
+ // the Status column and can dig into engine logs for the reason —
+ // we'll re-add a focused DETAILS column if/when we have meaningful
+ // structured failure-reason data to surface in the dashboard table.
+ html += '<th>Wave</th><th>Status</th><th>Session</th><th>Telemetry</th>';
html += '</tr></thead><tbody>';

// Track sessions shown in wave result rows so we don't duplicate them below
@@ -1135,10 +1252,6 @@ function renderMergeAgents(batch, sessions) {
html += `<td class="merge-session-cell">${effectiveAlive ? escapeHtml(effectiveSession) : "—"}</td>`;
// Full telemetry cell
html += `<td class="merge-telemetry-cell">${mergeTelemetryHtml(mergeTel, effectiveAlive)}</td>`;
- html += `<td>`;
- html += '<span class="merge-no-data">—</span>';
- html += `</td>`;
- html += `<td class="merge-detail-cell">${mr.failureReason ? escapeHtml(mr.failureReason) : "—"}</td>`;
html += `</tr>`;

// Per-repo sub-rows: show when workspace mode has repo results
@@ -1159,8 +1272,6 @@ function renderMergeAgents(batch, sessions) {
html += `<td><span class="status-badge ${rrStatusCls}">${rr.status}</span></td>`;
html += `<td class="merge-session-cell">${rrLanes}</td>`;
html += `<td></td>`; /* telemetry placeholder */
- html += `<td></td>`; /* attach placeholder */
- html += `<td class="merge-detail-cell">${rrDetail}</td>`;
html += `</tr>`;
}
}
@@ -1177,8 +1288,6 @@ function renderMergeAgents(batch, sessions) {
html += `<td class="merge-session-cell">${escapeHtml(sess)}</td>`;
// Full telemetry cell for active merge session
html += `<td class="merge-telemetry-cell">${mergeTelemetryHtml(sessTel, true)}</td>`;
- html += `<td>—</td>`;
- html += `<td>—</td>`;
html += `</tr>`;
}
