From a2488f435492b9a2f298136117d41956b0c7a73d Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 01:25:57 -0300 Subject: [PATCH 01/67] docs: ADRs for fork divergences vs upstream Record only actual discrepancies vs unclebob/swarm-forge, one ADR each: - 0001 permanent fork, synced by merge (sync policy: merge not rebase, additive, rerere) - 0002 idle gate + clear-first Stop-hook delivery (the only engine discrepancy; presence side channel for the awake notification) Plus CONTEXT.md: minimal glossary of fork-specific terms only. Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 17 +++++++++++++ .../0001-permanent-fork-synced-by-merge.md | 12 ++++++++++ ...0002-idle-gate-and-clear-first-delivery.md | 24 +++++++++++++++++++ 3 files changed, 53 insertions(+) create mode 100644 CONTEXT.md create mode 100644 docs/adr/0001-permanent-fork-synced-by-merge.md create mode 100644 docs/adr/0002-idle-gate-and-clear-first-delivery.md diff --git a/CONTEXT.md b/CONTEXT.md new file mode 100644 index 0000000..0406eb8 --- /dev/null +++ b/CONTEXT.md @@ -0,0 +1,17 @@ +# SwarmForge Fork + +A permanent fork of `unclebob/swarm-forge` (rationale in `docs/adr/`). This glossary holds only terms whose fork-specific meaning is already settled; terms are added as decisions are made, not in advance. + +## Language + +**Idle gate**: +The rule that a role does nothing until it receives a handoff — no startup work, scanning, installing, or self-assigned tasks. The single line is "Wait for a handoff. Do not act without one." +_Avoid_: startup guard, wait condition + +**Ready notification** (presence signal): +The startup "I'm awake" message each role sends to the specifier. Informational only — it tells the operator the role launched. Stamped a distinct `presence` type and excluded from the _Delivery sequence_; in the fork's idle model readiness is implicit (a role at idle with an empty queue is ready). 
+_Avoid_: awake handoff, ready handoff + +**Delivery sequence**: +The fork's Stop-hook steps that start a work handoff on an idle receiver: `/clear` → re-inject the role bundle (`codex`/`grok` only) → send the task message. Runs only for work handoffs, never for presence pings. (Upstream instead types the message straight into the terminal with no clear.) +_Avoid_: inject, dispatch diff --git a/docs/adr/0001-permanent-fork-synced-by-merge.md b/docs/adr/0001-permanent-fork-synced-by-merge.md new file mode 100644 index 0000000..a71f38c --- /dev/null +++ b/docs/adr/0001-permanent-fork-synced-by-merge.md @@ -0,0 +1,12 @@ +--- +status: accepted +--- + +# Permanent fork of unclebob/swarm-forge, synced by merge + +This repo is a permanent fork of `unclebob/swarm-forge` (remote `upstream`); nothing is contributed back. Upstream moves fast, so we keep current by **merging** `upstream/` into our branches — never rebasing — because the fork is published/shared and rebasing would rewrite shared history and re-surface every conflict on each sync. `git rerere` is enabled (`rerere.enabled`, `rerere.autoupdate`) so conflict resolutions replay automatically. Every divergence should be **additive** (a new file or an appended rule) and recorded as its own ADR in this directory; a non-additive edit to an upstream line is a conscious, documented cost. Two branches are maintained: `main` (shared scripts + these docs) and `six-pack` (runnable: role prompts, `swarmforge.conf`, templates). + +## Considered options + +- **Rebase onto upstream** — rejected: the fork is shared/published; rebasing rewrites history others may track and re-resolves every conflict each cycle. +- **Snapshot upstream and stop tracking** — rejected: the goal is to stay current with a fast-moving upstream, not freeze it. 
diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md new file mode 100644 index 0000000..21e639d --- /dev/null +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -0,0 +1,24 @@ +--- +status: accepted +--- + +# Idle gate and clear-first delivery via a per-role Stop hook + +The fork uses upstream's handoff harness as-is (queue, scripts, append-only `logbook.jsonl`); the only engine discrepancy is here. Upstream agents do setup work at startup and never clear context between tasks (delivery is an immediate `tmux send-keys`; "don't interrupt" relies on the terminal app buffering typed input). The fork instead requires each role to (1) do nothing until it receives a handoff and (2) begin every task with a cleared context: + +- **Idle gate** — a prompt rule ("Wait for a handoff. Do not act without one.") plus removal of the startup-install directives from role prompts (install work moves to the setup skill). Pure additive prompt edits. +- **Clear-first delivery** — a per-role **Stop hook** that, on idle, drains a durable inbox: if a *work* handoff is waiting it runs `/clear` → re-inject the role bundle (`codex`/`grok` only; for `claude` the bundle lives in the system prompt and survives `/clear`) → deliver the message. This needs one **non-additive** transport change: a handoff must be dropped into a durable inbox the hook reads, not typed straight into the terminal, so `/clear` cannot race buffered input. That single redirect is the recurring sync-friction point. + +"Ready" is therefore implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. 
+ +## Considered options + +- **Keep a full fork-owned harness for delivery** — rejected: a parallel harness conflicts with upstream's actively-developed one on every sync, so we ride upstream's and diverge only as above. +- **Rely on upstream's app-buffering model, skip `/clear`** — rejected: loses the required per-task context reset. +- **Orchestrator-in-code** (`docs/proposals/2026-06-11-factory-line-refactor.md`) — deferred as a future bet; it maximizes divergence and is a re-architecture, not a sync move. + +## Open + +- Inbox location: reuse upstream's `.swarmforge/handoffs/queue/` vs a fork-owned dir. +- Suppress upstream's immediate send-keys nudge globally vs per-cleared-role. +- `/clear` cost is backend-dependent; the current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so bundle re-injection is required today. From 13fb738f385c7191151b5e99618196ec8e139281 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 01:45:52 -0300 Subject: [PATCH 02/67] docs(adr-0002): settle two-case delivery; universal re-injection; Claude Code first - idle/busy marker is the one new piece; reuse upstream's queue as the inbox - two cases: busy -> Stop hook delivers on next stop; idle -> deliver now (clear first either way) - /clear clears the session for any agent -> re-injection is universal (no backend split) - implement via Claude Code hooks first; codex/grok delivery pending Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 2 +- ...0002-idle-gate-and-clear-first-delivery.md | 31 ++++++++++++------- 2 files changed, 20 insertions(+), 13 deletions(-) diff --git a/CONTEXT.md b/CONTEXT.md index 0406eb8..02b5328 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -13,5 +13,5 @@ The startup "I'm awake" message each role sends to the specifier. 
Informational _Avoid_: awake handoff, ready handoff **Delivery sequence**: -The fork's Stop-hook steps that start a work handoff on an idle receiver: `/clear` → re-inject the role bundle (`codex`/`grok` only) → send the task message. Runs only for work handoffs, never for presence pings. (Upstream instead types the message straight into the terminal with no clear.) +The steps that start a work handoff on a receiver: `/clear` → re-inject the role bundle → send the task message. Runs for work handoffs only, never for presence pings. Delivered immediately if the receiver is idle, or by its Stop hook when it next stops if busy. (Upstream instead types the message straight into the terminal with no clear.) _Avoid_: inject, dispatch diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md index 21e639d..960538b 100644 --- a/docs/adr/0002-idle-gate-and-clear-first-delivery.md +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -2,23 +2,30 @@ status: accepted --- -# Idle gate and clear-first delivery via a per-role Stop hook +# Idle gate and clear-first delivery -The fork uses upstream's handoff harness as-is (queue, scripts, append-only `logbook.jsonl`); the only engine discrepancy is here. Upstream agents do setup work at startup and never clear context between tasks (delivery is an immediate `tmux send-keys`; "don't interrupt" relies on the terminal app buffering typed input). The fork instead requires each role to (1) do nothing until it receives a handoff and (2) begin every task with a cleared context: +The fork uses upstream's handoff harness as-is (queue, scripts, `logbook.jsonl`); the only engine discrepancy is **delivery**. Upstream does setup work at startup and never clears context between tasks — it types each handoff straight into the terminal and lets the terminal buffer it whether the agent is working or not. 
The fork instead requires every role to (1) do nothing until it receives a handoff and (2) start each task from a cleared session. -- **Idle gate** — a prompt rule ("Wait for a handoff. Do not act without one.") plus removal of the startup-install directives from role prompts (install work moves to the setup skill). Pure additive prompt edits. -- **Clear-first delivery** — a per-role **Stop hook** that, on idle, drains a durable inbox: if a *work* handoff is waiting it runs `/clear` → re-inject the role bundle (`codex`/`grok` only; for `claude` the bundle lives in the system prompt and survives `/clear`) → deliver the message. This needs one **non-additive** transport change: a handoff must be dropped into a durable inbox the hook reads, not typed straight into the terminal, so `/clear` cannot race buffered input. That single redirect is the recurring sync-friction point. +**Idle gate** — a prompt rule ("Wait for a handoff. Do not act without one.") plus removal of the startup-install directives from role prompts (install work moves to a separate setup skill). Additive prompt edits. -"Ready" is therefore implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. +**Clear-first delivery** — `/clear` clears the session for **any** agent, so it cannot be sent to a working agent. Delivery therefore must know whether the receiver is idle or busy. Upstream tracks no such state, so the fork adds a minimal per-role **idle/busy marker**. Delivery then has two cases, both running `/clear` → re-inject the role bundle → send the task message: + +- receiver **busy** — the handoff waits in upstream's queue (`.swarmforge/handoffs/queue/`); the receiver's **Stop hook** delivers it when the agent next stops. +- receiver **idle** — deliver immediately, because no stop will occur to trigger the hook. 
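
The two cases above can be sketched as a small state machine. This is a minimal, single-process Python sketch, not the fork's implementation: `Role`, `Marker`, `hand_off`, `on_stop`, and the step labels are hypothetical names, the in-memory list only stands in for upstream's `.swarmforge/handoffs/queue/`, and first-in-first-out ordering is an assumption the ADR does not state.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Marker(Enum):
    """Hypothetical per-role idle/busy marker."""
    IDLE = "idle"
    BUSY = "busy"


@dataclass
class Role:
    name: str
    marker: Marker = Marker.IDLE
    queue: List[str] = field(default_factory=list)  # stand-in for the handoff queue
    sent: List[str] = field(default_factory=list)   # everything sent to the agent


def deliver(role: Role, task: str) -> None:
    """Clear-first delivery: /clear, re-inject the role bundle, send the task."""
    role.marker = Marker.BUSY
    role.sent += ["/clear", "re-inject role bundle", f"task: {task}"]


def hand_off(role: Role, task: str) -> str:
    """Sender side: deliver now if the receiver is idle, otherwise leave it queued."""
    if role.marker is Marker.IDLE:
        deliver(role, task)
        return "delivered"
    role.queue.append(task)
    return "queued"


def on_stop(role: Role) -> None:
    """Stop hook: deliver the next queued handoff, or mark the role idle."""
    if role.queue:
        deliver(role, role.queue.pop(0))  # FIFO order is an assumption
    else:
        role.marker = Marker.IDLE
```

In this sketch a busy receiver never sees `/clear` mid-task: the handoff sits in the queue until `on_stop` runs, and the same three-step sequence is used in both cases.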
+ +The marker is set *busy* when a delivery starts and *idle* when the Stop hook finds the queue empty; the hook re-checks the queue before declaring idle to close the narrow "went idle just as a sender judged it busy" race. + +**Re-injection is universal.** `/clear` wipes the session regardless of backend, so the role bundle is always re-sent after `/clear`. + +**Claude Code first.** Both the marker and the delivery ride Claude Code's hook system (the Stop hook). The fork's delivery replaces upstream's immediate terminal-typing only for the roles it manages. The `claude` backend is supported now; roles on `codex`/`grok` keep upstream's delivery until their hook-based equivalent is built — **pending implementation**. + +Ready is implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. ## Considered options -- **Keep a full fork-owned harness for delivery** — rejected: a parallel harness conflicts with upstream's actively-developed one on every sync, so we ride upstream's and diverge only as above. -- **Rely on upstream's app-buffering model, skip `/clear`** — rejected: loses the required per-task context reset. -- **Orchestrator-in-code** (`docs/proposals/2026-06-11-factory-line-refactor.md`) — deferred as a future bet; it maximizes divergence and is a re-architecture, not a sync move. +- **Rely on upstream's type-into-terminal model and skip `/clear`** — rejected: loses the required per-task session reset. +- **Orchestrator-in-code** (`docs/proposals/2026-06-11-factory-line-refactor.md`) — deferred; a re-architecture, not a sync move. -## Open +## Pending implementation -- Inbox location: reuse upstream's `.swarmforge/handoffs/queue/` vs a fork-owned dir. -- Suppress upstream's immediate send-keys nudge globally vs per-cleared-role. 
-- `/clear` cost is backend-dependent; the current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so bundle re-injection is required today. +- `codex`/`grok` hook-based delivery (Claude Code first). The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. From 6e75b82d4b81f71bb9879816addcb78311b334a7 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 02:59:00 -0300 Subject: [PATCH 03/67] docs(adr): capture five accepted pipeline divergences + one open item MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ADRs 0003-0008 define fork divergences vs upstream (definition only, not implementation): - 0003 setup is a one-time, stack-aware skill (additive file, no merge conflict); run path does no project setup - 0004 rework routes back to the stage whose decision it exposes; local fixes stay with finder; at most one bounce - 0005 QA refutes (assumes build fails spec + tests too weak) rather than confirms; includes conversion fidelity - 0006 harness-enforced holdout — status: proposed (open item) - 0007 UX Engineer role that fixes against UX Intent (six-pack) - 0008 terminal Integrator lands only behind a green CI gate, no local-merge fallback CONTEXT.md gains glossary terms: Setup skill, Integrator, UX Engineer, UX Intent, Refuting QA, Back-routing. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 24 +++++++++++++++++++ docs/adr/0003-setup-is-a-one-time-skill.md | 24 +++++++++++++++++++ docs/adr/0004-rework-routes-back.md | 23 ++++++++++++++++++ docs/adr/0005-qa-refutes-not-confirms.md | 23 ++++++++++++++++++ docs/adr/0006-harness-enforced-holdout.md | 19 +++++++++++++++ docs/adr/0007-ux-engineer-role.md | 27 ++++++++++++++++++++++ docs/adr/0008-integrator-role.md | 26 +++++++++++++++++++++ 7 files changed, 166 insertions(+) create mode 100644 docs/adr/0003-setup-is-a-one-time-skill.md create mode 100644 docs/adr/0004-rework-routes-back.md create mode 100644 docs/adr/0005-qa-refutes-not-confirms.md create mode 100644 docs/adr/0006-harness-enforced-holdout.md create mode 100644 docs/adr/0007-ux-engineer-role.md create mode 100644 docs/adr/0008-integrator-role.md diff --git a/CONTEXT.md b/CONTEXT.md index 02b5328..c0f96ba 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -15,3 +15,27 @@ _Avoid_: awake handoff, ready handoff **Delivery sequence**: The steps that start a work handoff on a receiver: `/clear` → re-inject the role bundle → send the task message. Runs for work handoffs only, never for presence pings. Delivered immediately if the receiver is idle, or by its Stop hook when it next stops if busy. (Upstream instead types the message straight into the terminal with no clear.) _Avoid_: inject, dispatch + +**Setup skill**: +The one-time, stack-aware step that makes a project swarm-ready — installs the project's language quality tools, enables session tracking, grants the agents' permissions, pins skill versions. Ships inside the swarm install and is the first thing the operator runs. The run path (`./swarm`) does no project setup; it stops if the skill has not run. (Upstream instead installs tooling per-role at startup.) +_Avoid_: preflight, bootstrap, onboarding + +**Integrator**: +The terminal role that lands finished work. 
From the QA-approved commit it opens a pull request, gates on CI, merges only on green, runs the post-merge verification, and notifies the specifier — one PR per feature. It never merges locally: CI is a hard precondition, so a project without CI is not swarm-ready (setup ensures CI; see [[project-fork-divergence-adr-structure]] / ADR 0003). CI failures route to the owning role via [[back-routing]]. (Upstream has no integrator — the specifier merges ad hoc.) +_Avoid_: merger, releaser, deployer + +**UX Engineer** (six-pack only): +The role, immediately after the coder, that runs the built product and fixes visual/usability mismatches in rendering code (leaving a regression check behind) — an engineer that fixes, not a flag-only reviewer. Checks against the feature's _UX Intent_ and any optional design inputs the feature references. Skips (passes through) when the feature has no UX Intent. Routes back to the coder via [[back-routing]] when a fix needs a model-state change. Framework-agnostic; the visual-testing tool is named by the constitution. +_Avoid_: UX Reviewer, designer + +**UX Intent**: +The section the specifier authors inline in the feature file stating, in concrete observable terms, what a feature should look and feel like. Part of the swarm and the _UX Engineer_'s primary target. Distinct from optional project design inputs (DESIGN.md, EXPERIENCE.md, mockups) — those are not swarm-owned; the specifier merely references them from the feature file when they exist. +_Avoid_: design spec, UX requirements + +**Refuting QA**: +QA's posture in the fork: assume the build does not meet the spec and the acceptance tests are too weak to notice, until proven otherwise — attack the specified contract rather than run a checklist and confirm. Bounded by the spec (unspecified gaps route back to the specifier, they are not QA pass/fail). 
Includes _conversion fidelity_: a QA procedure converted into an executable script must encode the procedure's full intent, not a green version that asserts nothing (_test theater_). (Upstream QA confirms the spec is met and fixes what fails.) +_Avoid_: verification, acceptance check, confirm + +**Back-routing**: +Sending rework back to the stage whose decision it exposes as flawed, instead of resolving it where it was found. The trigger is any finding that an earlier stage's work must change — a bug, a refactor blocked by a bad earlier decision, or a design/spec revision. Applies only to _structural_ rework (re-opening an earlier stage's job: an ambiguous/missing spec, a weak/missing test, a design that can't hold the behavior); _local_ work the finder can resolve without re-opening an earlier decision stays with the finder. Routes back at most once. (Upstream fixes everything in place.) +_Avoid_: rejection, escalation, bounce, defect back-routing diff --git a/docs/adr/0003-setup-is-a-one-time-skill.md b/docs/adr/0003-setup-is-a-one-time-skill.md new file mode 100644 index 0000000..45f17e9 --- /dev/null +++ b/docs/adr/0003-setup-is-a-one-time-skill.md @@ -0,0 +1,24 @@ +--- +status: accepted +--- + +# Setup is a one-time skill, not in-execution work + +Adapting a project to the swarm — installing the project's language quality tools (mutation, CRAP, DRY, the Acceptance Pipeline commands), enabling session tracking, granting the permissions the agents need, pinning skill versions — lives in a **setup skill** that ships inside the swarm install and is the first thing the operator runs. The run path does no project setup. + +**Execution installs nothing.** `./swarm` still fetches its own code when missing (the program obtaining itself, not project setup) and still does per-launch plumbing (worktrees, sessions, copying constitution files). It never adapts the project to its stack. 
If the project has not been set up, `./swarm` stops and says so rather than installing anything. + +**The only edits to upstream files are four role-prompt lines.** The "At startup, install the language tools" directives in `coder`, `QA`, `cleaner`, and `hardener` are removed; that install work moves into the setup skill and runs once. ADR 0002 already removes these same lines for the idle gate (a role does nothing until handed off); here they go for a second, complementary reason — tool install is a one-time setup step, not per-task startup work. The removal is the seam between the two decisions; neither owns it alone. + +**Why a skill rather than functions added to the launch script.** A skill is a new fork-owned file, so it adds zero upstream merge-conflict surface — exactly the additive divergence ADR 0001 asks for. Adding setup functions inside `swarmforge.sh` would instead edit an upstream-tracked file, a permanent conflict point on every sync. A skill also lets setup *reason about the stack* (which tools for Go vs Java vs Clojure, which gates matter), which a deterministic script cannot. + +**Why replace rather than overlay.** Setup is an explicit one-time step; the run path stays pure "start the agents." The accepted cost is that the swarm no longer self-installs project tooling on first run — the operator runs the setup skill once before the first `./swarm`. Any setup step this moves out of the run path is named and documented so the divergence stays auditable. + +## Considered options + +- **Add setup as functions inside `swarmforge.sh`** — rejected: edits an upstream-tracked file (a permanent merge-conflict surface, against ADR 0001's additive rule) and a deterministic script cannot adapt to the project's stack. +- **Overlay — skill adds the fork's extras while execution keeps installing** — rejected: leaves setup split across two places and keeps the run path doing setup work, defeating the purpose. 
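
The "execution installs nothing" rule in this ADR amounts to a guard on the run path. A minimal Python sketch of such a guard follows, under loud assumptions: the marker file name `swarm-ready`, its location under `.swarmforge/`, and the names `check_swarm_ready` and `NotSetUp` are all hypothetical, since the ADR leaves the real marker to implementation.

```python
from pathlib import Path

# Hypothetical marker path; the ADR does not decide what the setup skill writes.
READY_MARKER = Path(".swarmforge") / "swarm-ready"


class NotSetUp(RuntimeError):
    """Raised when the run path is started on a project the setup skill has not prepared."""


def check_swarm_ready(project_root):
    """Return the marker path if setup has run; otherwise stop without installing anything."""
    marker = Path(project_root) / READY_MARKER
    if not marker.is_file():
        # Stop and say so; the run path never tries to repair the project itself.
        raise NotSetUp(f"{project_root}: run the setup skill once before ./swarm")
    return marker
```

The guard only reads the filesystem. Keeping it free of any install or repair step is the point of the decision: all adaptation to the project's stack stays in the setup skill.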
+ +## Pending implementation + +- The skill itself: stack detection, the exact tooling/permissions/pins it writes, how it is shipped inside the install, and the "swarm-ready" marker `./swarm` checks before launching. diff --git a/docs/adr/0004-rework-routes-back.md b/docs/adr/0004-rework-routes-back.md new file mode 100644 index 0000000..192dba4 --- /dev/null +++ b/docs/adr/0004-rework-routes-back.md @@ -0,0 +1,23 @@ +--- +status: accepted +--- + +# Rework routes back to its cause + +Upstream fixes a problem wherever it is found — the QA role's prompt says plainly "fix bugs found by the QA suite." That keeps the line moving but lets fixes pile up downstream of the stage that caused them, and the responsible stage never learns it did its job wrong. The fork instead sends the work **back to the stage whose decision it exposes as flawed**, so the fix lands at the cause. + +The trigger is not only a defect. Any finding that an earlier stage's work must change routes back — a failing behavior (a bug), a refactor blocked because the structure rests on a bad earlier decision, or a design/spec revision surfaced when a later stage tries to hold a behavior the specification can't carry. A defect is the most obvious case, not the only one. + +**Only structural rework routes back.** It routes back when resolving it means re-opening an earlier stage's job — an ambiguous or missing specification, a weak or missing acceptance test, a design that can't hold the behavior. The stage that owns that work gets it back and corrects the root cause. **Local** work — anything the finder can resolve without re-opening an earlier stage's decision — stays with the finder. Routing a contained, local change backward only adds a round trip and teaches no one. + +**Rework routes back at most once.** If it comes back still unresolved, the finder resolves it in place and flags it. This caps the cost and stops two stages volleying the same item indefinitely. 
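
The routing rule described above (structural rework goes back, local work stays, at most one bounce) can be written as a single decision function. This is a minimal Python sketch with hypothetical names (`Finding`, `route`, `MAX_BOUNCES`); how a finding is attributed to its origin stage is not settled by this ADR, so `origin` is simply supplied by the caller here.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_BOUNCES = 1  # "rework routes back at most once"


@dataclass
class Finding:
    finder: str        # stage that found the problem
    origin: str        # stage whose decision the finding exposes (attribution assumed given)
    structural: bool   # True if resolving it re-opens an earlier stage's job
    bounces: int = 0   # how many times it has already been routed back


def route(finding: Finding) -> Tuple[str, bool]:
    """Return (stage that resolves the finding, whether it is flagged)."""
    if not finding.structural:
        # Local work stays with the finder.
        return finding.finder, False
    if finding.bounces < MAX_BOUNCES:
        # Structural rework goes back to its cause.
        return finding.origin, False
    # It already bounced once and came back unresolved: the finder
    # resolves it in place and flags it.
    return finding.finder, True
```

For example, `route(Finding("QA", "specifier", structural=True))` yields `("specifier", False)`, while the same finding with `bounces=1` yields `("QA", True)`.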
+ +## Considered options + +- **Route every finding back to its origin** — rejected: the line ping-pongs and a trivial local change becomes a round trip that teaches nothing; the cost is paid for findings that don't carry a lesson. +- **Keep upstream's fix-in-place** — rejected: rework accumulates as downstream patches and the stage that caused it is never corrected, so the same class of problem recurs. + +## Pending implementation + +- How a finding is attributed to an origin stage (the line must be able to trace it back to the spec, test, or design that owns it). +- Where the rule lives in the role prompts (runnable change, `six-pack`). diff --git a/docs/adr/0005-qa-refutes-not-confirms.md b/docs/adr/0005-qa-refutes-not-confirms.md new file mode 100644 index 0000000..6b5c560 --- /dev/null +++ b/docs/adr/0005-qa-refutes-not-confirms.md @@ -0,0 +1,23 @@ +--- +status: accepted +--- + +# QA refutes rather than confirms + +Upstream QA verifies that the accepted specification is met and fixes what fails — a *confirm* posture. It converts the specifier's written QA procedures into executable scripts and runs them through the real user interface. The fork flips the posture: QA assumes the build does **not** meet the spec and the acceptance tests are too weak to notice, until it proves otherwise. Its job is to make the claim "this meets the spec and the tests prove it" *fail*. + +**Refute against the spec, not beyond it.** QA attacks the specified contract — it hunts specified-but-untested behavior, proves the acceptance tests too weak to catch a real violation, and throws inputs designed to break the specified behavior. It does **not** invent new requirements. A genuinely unspecified gap it stumbles on is not a QA pass/fail; it is a finding that routes back to the specifier. This keeps QA adversarial but bounded, so it never blocks the line on behavior no one agreed to. 
+ +**Conversion fidelity.** When QA turns the specifier's written procedures into executable scripts, the script must encode the procedure's full intent — not a weakened version that passes. QA refutes its *own* conversion. This is the highest-leverage guard in the line because the QA end-to-end suite is the one suite the hardener's mutation testing explicitly does not cover: a weak conversion ("test theater" — a green test that asserts nothing real) that hides there is caught by nothing else. + +**Findings route back; QA owns the attack, not the routing.** A structural weakness QA surfaces routes back to its cause (a weak acceptance test or an ambiguous spec → the specifier); a local defect QA fixes in place — per the back-routing decision. Refuting QA is the engine that *generates* structural findings; it needs no routing rule of its own. + +## Considered options + +- **Keep upstream's confirm posture** — rejected: a confirming QA passes test theater (green suites that assert nothing); the defects that survive an otherwise-complete pipeline are exactly the ones a checklist confirms. +- **Refute beyond the spec** — rejected: unbounded; QA becomes a fuzzer that blocks the line on unspecified behavior. Unspecified gaps route back to the specifier instead. + +## Pending implementation + +- Prompt change on `six-pack`. +- Whether QA's converted end-to-end suite should itself be mutation-tested (the hardener currently ignores it) — the objective way to detect a theatrical conversion rather than relying on QA's self-judgment. diff --git a/docs/adr/0006-harness-enforced-holdout.md b/docs/adr/0006-harness-enforced-holdout.md new file mode 100644 index 0000000..3e34cca --- /dev/null +++ b/docs/adr/0006-harness-enforced-holdout.md @@ -0,0 +1,19 @@ +--- +status: proposed +--- + +# Harness-enforced holdout of the QA suite + +**Open item — not decided.** Recorded so the gap is visible; needs a decision before any work. 
+ +Upstream already holds the end-to-end QA suite back from the coder: the coder's prompt says "ignore the specifier's end-to-end QA suite." But the wall is **honor-system only** — roles run in separate worktrees, yet the coder bases its work on the specifier's accepted commit, and the QA suite is part of that commit. The files sit in the coder's own working tree; nothing but a prompt instruction stops it from reading them. + +The "AI Software Factory" reference argues a reachable validation criterion is a gamed one — "if the coding agent can see the tests, it will game them" — so the protection that counts is *mechanical*, not instructional. This item proposes making the holdout **harness-enforced**: the QA suite is physically absent from the coder's reach (for example, the specifier commits it on a path or branch the coder never bases on, or the harness strips QA-suite files from the coder's worktree), so "ignore it" becomes "cannot reach it." + +It is filed as a candidate, not a decision, because enforcing a true holdout in a shared-git, peer-role swarm is non-trivial and may not be worth its cost: the fork already backs the visible test layers with mutation testing and an adversarial (refuting) QA suite, which is detection rather than prevention. Whether to add prevention on top is the open question. + +## Open questions + +- Can the QA suite be kept out of the coder's worktree without breaking the specifier→coder→QA handoff flow (the coder must still build against the spec, just not the QA suite)? +- Is harness-enforced prevention worth it given the fork already has mutation + refuting QA as detection? +- Does the same concern apply to the Gherkin acceptance tests, or only the QA suite? (The coder must see and build the Gherkin runner, so those likely cannot be walled off.) 
diff --git a/docs/adr/0007-ux-engineer-role.md b/docs/adr/0007-ux-engineer-role.md new file mode 100644 index 0000000..b3333b2 --- /dev/null +++ b/docs/adr/0007-ux-engineer-role.md @@ -0,0 +1,27 @@ +--- +status: accepted +--- + +# UX Engineer role and UX Intent + +Upstream has no UX role — nothing in the line owns whether the product is *usable*, only whether it is correct. The fork adds a **UX Engineer** (six-pack only) that runs the built product and **fixes** visual and usability mismatches in rendering code, leaving a regression check behind. It is an engineer, not a flag-only reviewer: the fork's pattern is that every stage fixes in place and leaves a durable artifact, so a report-only role is the anti-pattern it rejects. + +**It checks against UX Intent.** The specifier authors a **UX Intent** section inline in the feature file — concrete, observable statements of what the feature should look and feel like. UX Intent is part of the swarm and travels with the feature. A feature with no UX Intent is the signal to skip: the UX Engineer passes straight through to the next stage, the same "no work, no handoff" pattern used elsewhere. + +**Optional design inputs are referenced, not owned.** When a project supplies design artifacts — a DESIGN.md (visual system), an EXPERIENCE.md (interaction and feel), mockups (concrete visual targets) — the specifier **references** them from the feature file, and the UX Engineer consults them alongside UX Intent. These are optional project inputs; the swarm neither defines, scaffolds, nor requires them. This replaces the earlier design's automatic "nearest-file" resolution with an explicit reference from the one canonical artifact. + +**Framework-agnostic.** The role defines the *class* of check — the running product matches its stated UX — and leaves the specific visual-testing tool to the project's constitution. No terminal-UI assumptions live in the role. 
+ +**Placement and routing.** The UX Engineer sits immediately after the coder, so the downstream roles (cleaner, architect, hardener, QA) see implementation and rendering code together in one pass rather than running twice. When a mismatch cannot be fixed in rendering alone and needs a model-state change, it routes back to the coder — using the back-routing rule already decided (`0004`), not a separate mechanism. + +## Considered options + +- **A flag-only UX reviewer** — rejected: produces a handback with no durable artifact; the fork's pattern is fix-in-place. +- **The swarm owns/scaffolds DESIGN.md and friends** — rejected: those are optional project inputs, referenced not owned; the swarm should not impose a design system. +- **Automatic nearest-file resolution of design docs** — superseded: explicit references from the feature file are clearer and need no walk-up. +- **Place the UX role late (after the hardener)** — rejected: prior batch evidence showed it made the cleaner, architect, and hardener each run twice per feature. + +## Pending implementation + +- Six-pack only: new `ux-engineer` role prompt; UX Intent authoring in the specifier and the feature template; coder reads UX Intent; `swarmforge.conf` adds the window after the coder. +- Routing follows `0004`. diff --git a/docs/adr/0008-integrator-role.md b/docs/adr/0008-integrator-role.md new file mode 100644 index 0000000..0f9b6db --- /dev/null +++ b/docs/adr/0008-integrator-role.md @@ -0,0 +1,26 @@ +--- +status: accepted +--- + +# Integrator role lands work behind a CI gate + +Upstream has no integrator: when QA signals done, the **specifier** merges the work ad hoc (a local `git merge`) and asks for the next feature. There is no gate between "QA passed" and "landed on the main branch." The fork adds a dedicated **integrator** as the terminal stage of the line that owns *landing* the work — and nothing lands except through a green CI gate. 
+ +**Landing is PR + CI, with no fallback.** From the QA-approved commit the integrator opens a pull request, watches CI, and merges only when CI is green; then it runs the post-merge verification and notifies the specifier. It never merges locally — a local merge is exactly what the specifier already did, so the integrator's whole value is that the main branch only ever receives green-CI'd work. **CI is therefore a hard precondition, not optional:** a project without CI is not swarm-ready, and ensuring CI is in place belongs to project setup (`0003`). + +**One PR per feature.** Rework updates the same PR; a second PR is never opened for the same feature. + +**Failure routing reuses back-routing.** A CI failure routes to the role that owns it — a failing test to the coder, a failing cleanliness gate to the cleaner, a failing architecture check to the architect; a trivially autofixable failure (lint/format) the integrator fixes in place on the PR branch and re-runs. This is the back-routing rule already decided (`0004`) with the integrator as the finder, capped the same way: after the cap it stops and reports rather than looping. + +**The specifier stops merging.** Merging moves entirely to the integrator, so the specifier no longer needs the main checkout — it moves from the `master` worktree to its own worktree and starts each feature from a clean reset to the default branch. + +## Considered options + +- **Keep the specifier merging (no integrator)** — rejected: conflates deciding *what* to build with landing it safely, and provides no gate before the main branch. +- **Local-merge fallback when CI is absent** — rejected: a local merge is what the specifier already does; the integrator exists for the green-CI gate, so CI is required, not optional. + +## Pending implementation + +- Runnable branches (`six-pack`; `four-pack` where present): new terminal `integrator` role; `swarmforge.conf` window; specifier worktree change and removal of its merge step. 
+- The PR/CI mechanism (platform, e.g. `gh`) named at implementation. +- CI-in-place enforced as a setup precondition (`0003`); routing per `0004`. From 21ff2e06c1e00817ee2876558793d62404533fec Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:09:04 -0300 Subject: [PATCH 04/67] docs(adr-0006): decide harness-enforced holdout via sparse-checkout Settle the open item as accepted. The end-to-end QA suite is held out mechanically, not by prompt instruction: - mechanism: git sparse-checkout excludes the suite's pinned path from each role worktree (absent from disk, still tracked in the commit) -- rejecting rm-from-worktree (commit would drop it) and separate-branch (more flow, no extra protection) - scope: hidden from all implementer roles; visible only to the specifier (authors it) and QA (runs it) - precondition: specifier writes the suite under a pinned path; existing coder "ignore it" line stays as defense-in-depth Backed by verification_loop.md: holdout leakage "must be enforced architecturally," not instructionally. Adds CONTEXT.md term "QA holdout". Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 4 ++++ docs/adr/0006-harness-enforced-holdout.md | 28 +++++++++++++++-------- 2 files changed, 23 insertions(+), 9 deletions(-) diff --git a/CONTEXT.md b/CONTEXT.md index c0f96ba..20736ee 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -39,3 +39,7 @@ _Avoid_: verification, acceptance check, confirm **Back-routing**: Sending rework back to the stage whose decision it exposes as flawed, instead of resolving it where it was found. The trigger is any finding that an earlier stage's work must change — a bug, a refactor blocked by a bad earlier decision, or a design/spec revision. Applies only to _structural_ rework (re-opening an earlier stage's job: an ambiguous/missing spec, a weak/missing test, a design that can't hold the behavior); _local_ work the finder can resolve without re-opening an earlier decision stays with the finder. 
Routes back at most once. (Upstream fixes everything in place.) _Avoid_: rejection, escalation, bounce, defect back-routing + +**QA holdout**: +The end-to-end QA suite kept physically out of reach of every role that shapes the implementation, so it stays a blind test. The harness sparse-checks-out the suite's pinned path from each role worktree except the specifier's (which authors it) and QA's (which runs it) — present in the commit, absent from disk. Distinct from upstream's prompt-level "ignore it," which leaves the files in the coder's worktree. Covers only the end-to-end QA suite; the Gherkin acceptance tests stay visible because the coder builds and runs them. (Upstream walls it by instruction only.) +_Avoid_: hidden tests, secret suite, test isolation diff --git a/docs/adr/0006-harness-enforced-holdout.md b/docs/adr/0006-harness-enforced-holdout.md index 3e34cca..daa496c 100644 --- a/docs/adr/0006-harness-enforced-holdout.md +++ b/docs/adr/0006-harness-enforced-holdout.md @@ -1,19 +1,29 @@ --- -status: proposed +status: accepted --- # Harness-enforced holdout of the QA suite -**Open item — not decided.** Recorded so the gap is visible; needs a decision before any work. +Upstream holds the end-to-end QA suite back from the coder by prompt instruction alone: the coder's prompt says "ignore the specifier's end-to-end QA suite," but the files sit in the coder's own worktree (every worktree is `git worktree add -B … HEAD`, a full checkout of the commit the specifier wrote the suite into). The wall is honor-system. The fork makes it **mechanical**: the QA suite is physically absent from the worktree of every role that shapes the implementation, so "ignore it" becomes "cannot reach it." -Upstream already holds the end-to-end QA suite back from the coder: the coder's prompt says "ignore the specifier's end-to-end QA suite." 
But the wall is **honor-system only** — roles run in separate worktrees, yet the coder bases its work on the specifier's accepted commit, and the QA suite is part of that commit. The files sit in the coder's own working tree; nothing but a prompt instruction stops it from reading them. +**Why mechanical, not instructional.** The verification-loop reference is explicit that the scenario suite is a *holdout* — "never visible to the code generation agent" — and names the failure mode directly: "holdout leakage … must be enforced architecturally (filesystem isolation, separate repos, access controls)," not by a prompt. A holdout the implementer can read is a holdout the implementer can quietly fit to; the suite then stops being a blind test and QA running it proves nothing. This is the prevention layer that the detection layers (mutation testing + refuting QA, ADR 0005) cannot supply: detection catches a gamed suite after the fact; the wall stops the gaming. -The "AI Software Factory" reference argues a reachable validation criterion is a gamed one — "if the coding agent can see the tests, it will game them" — so the protection that counts is *mechanical*, not instructional. This item proposes making the holdout **harness-enforced**: the QA suite is physically absent from the coder's reach (for example, the specifier commits it on a path or branch the coder never bases on, or the harness strips QA-suite files from the coder's worktree), so "ignore it" becomes "cannot reach it." +**Mechanism: `git sparse-checkout`, not file deletion.** The worktree-prep step the harness already runs sets a sparse-checkout on each role worktree that excludes the QA-suite path. Sparse-checkout makes the file *absent from disk but still tracked in the commit* — so the role cannot read it, yet its commit cannot accidentally drop it downstream. 
Naive deletion (`rm` from the worktree) was rejected for exactly this reason: the role commits with `git add`, the deletion gets staged, and the suite vanishes for QA. A separate QA-only branch was rejected as more flow change for no extra protection. -It is filed as a candidate, not a decision, because enforcing a true holdout in a shared-git, peer-role swarm is non-trivial and may not be worth its cost: the fork already backs the visible test layers with mutation testing and an adversarial (refuting) QA suite, which is detection rather than prevention. Whether to add prevention on top is the open question. +**Scope: hide from implementers, keep for author and verifier.** The exclusion applies to every worktree *except* the specifier's (`master` — it authors the suite) and QA's (it runs the suite — it is the verifier). Coder, UX Engineer, cleaner, architect, and hardener all touch the implementation before QA and so are walled. The integrator never touches implementation; its worktree is irrelevant either way. -## Open questions +**Precondition: a fixed QA-suite path.** For the harness to exclude the suite it must live at a deterministic path; the specifier writes the end-to-end QA suite under a pinned location (e.g. `qa/`). This is the only added convention. The existing coder-prompt "ignore it" line stays as defense-in-depth. -- Can the QA suite be kept out of the coder's worktree without breaking the specifier→coder→QA handoff flow (the coder must still build against the spec, just not the QA suite)? -- Is harness-enforced prevention worth it given the fork already has mutation + refuting QA as detection? -- Does the same concern apply to the Gherkin acceptance tests, or only the QA suite? (The coder must see and build the Gherkin runner, so those likely cannot be walled off.) +**Scope boundary: only the end-to-end QA suite.** The Gherkin acceptance tests and the acceptance pipeline stay fully visible — the coder builds and runs them. 
The holdout is the specifier's end-to-end QA suite alone. + +## Considered options + +- **Keep upstream's prompt-level "ignore it" (detection-only)** — rejected: an implementer that can read the holdout can fit to it; the reference doc calls instructional holdouts a leak that must be closed architecturally. Mutation + refuting QA detect a weak suite but do not stop the implementation being shaped to the visible one. +- **Harness deletes the QA path from the worktree** — rejected: the role's commit stages the deletion and the suite disappears downstream for QA. +- **Specifier commits the QA suite to a separate QA-only branch** — rejected: more handoff-graph complexity (QA must merge code + QA branch) for no protection sparse-checkout doesn't already give. + +## Pending implementation + +- Add the sparse-checkout exclusion to the worktree-prep step (`six-pack`/scripts), keyed to skip the specifier(master) and QA worktrees. +- Pin the end-to-end QA-suite path in the specifier prompt. +- Confirm sparse-checkout interacts cleanly with the coder→cleaner→…→QA handoff commits (the excluded path must survive each role's commit untouched). 
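The worktree-prep exclusion described in ADR 0006 could be sketched roughly as below. This is only an illustration under stated assumptions, not the harness's actual code: the function names `hide_qa_suite` and `prepare_role_worktree`, the role labels, and the `qa` default path are hypothetical (the ADR gives `qa/` only as an example), and the `--no-cone` flag assumes a reasonably recent Git.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the ADR 0006 holdout; names and the default path are assumptions.

# Hide the pinned QA-suite directory from one role worktree.
# Usage: hide_qa_suite <worktree-dir> [qa-path]
hide_qa_suite() {
  local worktree="$1"
  local qa_path="${2:-qa}"                 # assumed pinned location of the end-to-end suite
  local exclude='!/'"${qa_path}"'/'        # gitignore-style negation pattern, e.g. !/qa/
  # Non-cone patterns: check out everything, then exclude the QA directory.
  # The files leave the disk but stay tracked in the index and in every commit.
  git -C "$worktree" sparse-checkout set --no-cone '/*' "$exclude"
}

# Apply the exclusion to every role except the author and the verifier.
# Usage: prepare_role_worktree <role> <worktree-dir> [qa-path]
prepare_role_worktree() {
  local role="$1" worktree="$2" qa_path="${3:-qa}"
  case "$role" in
    specifier|qa|QA) ;;                    # these roles keep the suite on disk
    *) hide_qa_suite "$worktree" "$qa_path" ;;
  esac
}
```

Because sparse entries carry the skip-worktree bit, a later `git add -A` in a walled worktree should not stage the suite as deleted, which is the property the ADR relies on; the last pending item (confirming the path survives each handoff commit) is still worth checking against the real harness.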
From e1642412c6de9cec99b52b7c0ece6cb1ac554f6b Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:17:53 -0300 Subject: [PATCH 05/67] docs(adr-0009): feature files open with a structured spec header Record the feature-template divergence (idea L) from the real artifact on backup/six-pre-reset, not the idea-file summary: - structured comment header above the Gherkin: TRACKING, CONTRACT, CONSTRAINTS, SEQUENCING, NFR, SIDE EFFECTS, SCOPE (each with an Ask:/Format:); the spec-authoring layer the scenarios can't carry - six-pack adds an 8th section, UX INTENT (home for ADR 0007); four-pack has the 7 only, since the UX Engineer is six-pack-only - address every section; SEQUENCING/SIDE EFFECTS/UX INTENT default to `none` (a deliberate answer; `none` UX INTENT = UX Engineer skips) - impl note: six-pack specifier prompt still says "seven sections" but the template has eight -- fix to eight/all on landing Adds CONTEXT.md term "Spec header". Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 4 ++++ docs/adr/0009-feature-file-spec-header.md | 26 +++++++++++++++++++++++ 2 files changed, 30 insertions(+) create mode 100644 docs/adr/0009-feature-file-spec-header.md diff --git a/CONTEXT.md b/CONTEXT.md index 20736ee..cfbc8de 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -43,3 +43,7 @@ _Avoid_: rejection, escalation, bounce, defect back-routing **QA holdout**: The end-to-end QA suite kept physically out of reach of every role that shapes the implementation, so it stays a blind test. The harness sparse-checks-out the suite's pinned path from each role worktree except the specifier's (which authors it) and QA's (which runs it) — present in the commit, absent from disk. Distinct from upstream's prompt-level "ignore it," which leaves the files in the coder's worktree. Covers only the end-to-end QA suite; the Gherkin acceptance tests stay visible because the coder builds and runs them. (Upstream walls it by instruction only.) 
_Avoid_: hidden tests, secret suite, test isolation + +**Spec header**: +The structured block of comment sections the specifier fills in at the top of every feature file, above the Gherkin scenarios — the spec-authoring layer that states what the scenarios cannot: contract, constraints, sequencing, NFRs, side effects, scope (and, six-pack only, _UX Intent_). The scenarios are the contract by example; the spec header is the contract's surrounding intent. Every section is addressed; several default to `none` (a deliberate answer). Comments only, so the Gherkin parser ignores them. (Upstream feature files are pure Gherkin with no header.) +_Avoid_: preamble, comment block, feature description diff --git a/docs/adr/0009-feature-file-spec-header.md b/docs/adr/0009-feature-file-spec-header.md new file mode 100644 index 0000000..9b8b53b --- /dev/null +++ b/docs/adr/0009-feature-file-spec-header.md @@ -0,0 +1,26 @@ +--- +status: accepted +--- + +# Feature files open with a structured spec header + +Upstream feature files are pure Gherkin: a `Feature:` line, then scenarios. The fork prepends a **structured spec header** — a block of comment sections the specifier fills in before writing any scenario, captured in a template (`swarmforge/templates/feature.feature`) that the specifier starts every feature from. + +The header is the **spec-authoring layer** the reference verification loop puts ahead of the scenarios (its Step 1): the Gherkin scenarios are the contract *by example*, but they cannot state what is out of scope, what was assumed, what non-functional targets apply, or what side effects must be observed. The header carries exactly that — the WHAT/WHY around the examples — so those concerns are stated once, up front, where every downstream role reads them. 
+ +**Sections (four-pack):** `TRACKING` (traceability to an issue), `CONTRACT` (every input, every response shape and status, fields deliberately absent), `CONSTRAINTS` (dataset bounds, validation, exclusions), `SEQUENCING` (ordering / async dependencies, defaults `none`), `NFR` (latency, idempotency key+window, in-flight UI, error distinguishability), `SIDE EFFECTS` (public-contract changes, derived artifacts to regenerate, defaults `none`), `SCOPE` (`Does NOT:` exclusions and `ASSUMED:` assumptions). Each section pairs an `Ask:` (the questions that elicit it) with a `Format:` (how to write the answer). + +**Six-pack adds an eighth section, `UX INTENT`**, with four dimensions — Visual Composition, Information Hierarchy, Interaction Feel, State Transitions — written as concrete observable statements. Its content and semantics are owned by ADR 0007; the header is merely its home in the feature file. It is six-pack-only because the UX Engineer that consumes it is six-pack-only. + +**Address every section; do not fill every section.** `SEQUENCING`, `SIDE EFFECTS`, and (six-pack) `UX INTENT` default to `none`. `none` is a deliberate answer, not a skipped one — and for `UX INTENT`, `none` is the signal that tells the UX Engineer to pass through (ADR 0007). The sections are comments (`#`), so the Gherkin parser ignores them and the acceptance pipeline is unaffected. + +## Considered options + +- **Free-form prose header** (what melech-mini-apps feature files actually do) — rejected: no guarantee that scope, NFR, or side effects ever get stated; relies entirely on specifier memory. The structured `Ask:`/`Format:` turns each concern into a forced question. +- **No header (stay pure Gherkin like upstream)** — rejected: scope exclusions, assumptions, and NFRs stay implicit, and 0007's UX Intent would have no documented home in the artifact. 
+- **Design a fork-specific minimal header** — rejected: the template already exists, is internally coherent, and already integrates the fork's other decisions (UX Intent, side-effects-to-observe for refuting QA). Re-deriving it adds nothing. + +## Pending implementation + +- Template already drafted on `four-pack` (7 sections) and `six-pack` (8, with `UX INTENT`); land both. +- Specifier phase 1 starts from the template and addresses all header sections before scenarios. Fix the stale count in the **six-pack** specifier prompt: it says "complete all seven header sections" but the six-pack template has eight — change to "eight" (or "all"). Four-pack's "seven" is correct. From 8929068299bdbccbc63554b6c474f07ba7ddd414 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:22:12 -0300 Subject: [PATCH 06/67] docs(adr-0010,0011): split observation harness into surface + fidelity Idea P recorded from the real backup/six-pre-reset artifacts, split into two independent divergences (plus idea Q folded into 0010): - 0010 surface-harness doctrine: live-verification roles (QA both packs, UX Engineer six-pack) drive the running system through its real surface via a declared per-surface tool (surface tool table in engineering.prompt); every surface carries a mandatory idle baseline scenario -- the direct fix for the tetris idle-state defects. Replaces QA's "through the UI only" with "through the declared surface harness" + every Expected bullet -> assertion or NOT AUTOMATED (idea Q). No surface field in project.prompt (resolves the stale monolith table row) - 0011 dependency-fidelity-manifest: new dependency-manifest.prompt declaring deps by tier 1/2/3 with machine-readable gaps the specifier and QA refuse to build on; twin-authorship + post-interaction-state rules; specifier-owned, defaults (none) 0005 gains a cross-ref: conversion fidelity is audited by 0010's bullet->assertion-or-NOT-AUTOMATED rule. 
CONTEXT.md gains: Surface harness, Baseline scenario, Fidelity manifest, Dependency tier. Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 16 ++++++++++ docs/adr/0005-qa-refutes-not-confirms.md | 1 + docs/adr/0010-surface-harness-doctrine.md | 29 +++++++++++++++++++ docs/adr/0011-dependency-fidelity-manifest.md | 26 +++++++++++++++++ 4 files changed, 72 insertions(+) create mode 100644 docs/adr/0010-surface-harness-doctrine.md create mode 100644 docs/adr/0011-dependency-fidelity-manifest.md diff --git a/CONTEXT.md b/CONTEXT.md index cfbc8de..5c84815 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -47,3 +47,19 @@ _Avoid_: hidden tests, secret suite, test isolation **Spec header**: The structured block of comment sections the specifier fills in at the top of every feature file, above the Gherkin scenarios — the spec-authoring layer that states what the scenarios cannot: contract, constraints, sequencing, NFRs, side effects, scope (and, six-pack only, _UX Intent_). The scenarios are the contract by example; the spec header is the contract's surrounding intent. Every section is addressed; several default to `none` (a deliberate answer). Comments only, so the Gherkin parser ignores them. (Upstream feature files are pure Gherkin with no header.) _Avoid_: preamble, comment block, feature description + +**Surface harness**: +The way the live-verification roles (QA always; the _UX Engineer_ on six-pack) drive the running system through its real production interface — a declared per-surface tool (tmux/PTY for a TUI, Playwright for web, an HTTP client for an API, event injection for a headless service) chosen from the constitution's surface tool table. Replaces upstream's mechanically-silent "through the user interface only," which let in-process function calls pass as interface verification. Every surface also carries a _baseline scenario_. The role identifies the surface from the codebase; nothing declares it in `project.prompt`. 
+_Avoid_: UI test, e2e harness, driver + +**Baseline scenario**: +The permanent idle/no-op scenario committed alongside a surface's flow scenarios, asserting the system is stable when nothing is happening — TUI: no input, identical consecutive captures, zero scrollback growth; web: idle load with no console errors; headless: a no-op event changes no state. It catches idle-state defects that flow scenarios never observe because flow scenarios only assert while the user is acting. +_Avoid_: smoke test, idle test, sanity check + +**Fidelity manifest**: +The constitution sub-file (`dependency-manifest.prompt`) declaring every dependency beyond the system itself by _dependency tier_, each as `name: tier N; implementation; gaps: `. A declared gap is binding: the specifier and QA refuse to write or accept any scenario that rests on it, so a known emulator limitation can never pass as covered behavior. Specifier-owned; defaults to `(none)`. +_Avoid_: mock list, dependency doc, services file + +**Dependency tier**: +The fidelity level at which a dependency is provided, declared in the _fidelity manifest_. Tier 1 — owned infrastructure run locally as the real engine (Postgres in Docker); tier 2 — stateful protocol-level emulation (vendor-official > third-party > swarm-built twin as last resort); tier 3 — external domain the swarm does not own, wire-level stubbed against a referenced contract. The system itself is always implicit, never a tier. +_Avoid_: mock level, fidelity grade diff --git a/docs/adr/0005-qa-refutes-not-confirms.md b/docs/adr/0005-qa-refutes-not-confirms.md index 6b5c560..71031b2 100644 --- a/docs/adr/0005-qa-refutes-not-confirms.md +++ b/docs/adr/0005-qa-refutes-not-confirms.md @@ -20,4 +20,5 @@ Upstream QA verifies that the accepted specification is met and fixes what fails ## Pending implementation - Prompt change on `six-pack`. 
+- Conversion fidelity is made auditable by the surface-harness conversion rule (ADR 0010): every Expected bullet maps to a harness assertion or is marked `NOT AUTOMATED — `, so a dropped bullet is visible rather than taken on QA's word. - Whether QA's converted end-to-end suite should itself be mutation-tested (the hardener currently ignores it) — the objective way to detect a theatrical conversion rather than relying on QA's self-judgment. diff --git a/docs/adr/0010-surface-harness-doctrine.md b/docs/adr/0010-surface-harness-doctrine.md new file mode 100644 index 0000000..277be04 --- /dev/null +++ b/docs/adr/0010-surface-harness-doctrine.md @@ -0,0 +1,29 @@ +--- +status: accepted +--- + +# Live verification runs through a declared surface harness + +Two defects (a screen blink and a runaway key-repeat) once survived a 250-scenario, eight-role pipeline. The cause was structural: no gate ever drove the *running* system through its real production interface — every check ran below the surface, against functions and return values. The fork closes this with a **surface harness doctrine**: the roles that own live verification drive the running system through its actual surface, using a declared tool, and every surface carries a permanent idle baseline. + +This is the reference verification loop's execute-and-observe layer (its Steps 5–7) made concrete: build the real thing, drive it through its surface, assert on what comes out. + +**Surface tool table (in `engineering.prompt`).** Following the existing language-tool-table pattern, the constitution declares the harness tool per surface type: tmux/PTY for a TUI (`send-keys -l` for raw input at controlled timing, `capture-pane` for screen state over time), Playwright for web, an HTTP client for HTTP APIs, event-injection-at-ingress for headless services. 
Roles owning live verification — **QA** (both packs) and the **UX Engineer** (six-pack, ADR 0007) — identify the project's surface *from the codebase* and acquire the matching tool before their first harness run, exactly as they acquire language tools. + +**No surface field in `project.prompt`.** Roles read the code to know the surface; an explicit declaration would be a meaningless placeholder until the project is customised. (The pre-reset summary table mentioned a `project.prompt` surface field — that was superseded by this decision; the real artifacts carry no such field.) + +**Every surface carries a mandatory baseline scenario**, committed alongside the flow scenarios: TUI → idle stability (no input, consecutive captures identical, zero scrollback growth); web → idle page loads with no console errors; headless → a no-op event produces no state change. The baseline is what the tetris defects would have hit — they were *idle-state* failures invisible to any flow test, because flow tests only assert while the user is acting. + +**QA verifies through the declared surface harness, not "the UI" (idea Q).** Upstream QA's "operate through the user interface only" was right in intent but mechanically silent — it let in-process function calls masquerade as UI verification. The fork replaces the phrase with "through the declared surface harness," and adds an auditable conversion rule: **every Expected bullet maps to a harness assertion, or is explicitly marked `NOT AUTOMATED — `.** This is the mechanism that makes the conversion-fidelity guard of ADR 0005 checkable rather than a matter of QA's word — a silently dropped bullet becomes a visible marker. Findings route back per ADR 0004. + +## Considered options + +- **Keep "through the UI only"** — rejected: no mechanical referent, so in-process calls and constant-checks wore the name of behavioral verification; this is exactly how the tetris defects slipped through. 
+- **Flow scenarios only, no idle baseline** — rejected: the defects were idle-state, which no flow scenario observes; the baseline is the part that actually closes the gap. +- **Declare the surface in `project.prompt`** — rejected: placeholder until customisation, and agents can read the surface from the code. + +## Pending implementation + +- Add the surface tool table + context-driven acquisition rule to `engineering.prompt` on `four-pack` and `six-pack`. +- Change QA's "through the UI only" to "through the declared surface harness" and add the Expected-bullet → assertion / `NOT AUTOMATED` rule in `QA.prompt` (both packs). +- Require the per-surface baseline scenario to be committed with every feature's flow scenarios. diff --git a/docs/adr/0011-dependency-fidelity-manifest.md b/docs/adr/0011-dependency-fidelity-manifest.md new file mode 100644 index 0000000..d632feb --- /dev/null +++ b/docs/adr/0011-dependency-fidelity-manifest.md @@ -0,0 +1,26 @@ +--- +status: accepted +--- + +# Dependencies are declared by fidelity tier in a manifest + +A scenario that rests on an emulated dependency the emulator does not actually implement passes green and proves nothing — the system was never exercised against the behavior the scenario claims to cover. The fork makes dependency fidelity **explicit and refusable** through a new constitution sub-file, `swarmforge/dependency-manifest.prompt`, that declares every dependency beyond the system itself by fidelity tier. This is the reference loop's Digital-Twin discipline: a twin is only trustworthy if its fidelity — and its gaps — are stated. + +**A separate constitution file, not `project.prompt`.** The manifest holds project-specific dependency data that would clutter `project.prompt`; it lives in its own file, auto-resolved by the same bundle resolver as the other constitution sub-files. It ships on both packs and defaults to `(none)` — a project with no external dependencies declares nothing. 
+ +**Three tiers (the system itself is always implicit).** Tier 1 — owned infrastructure run locally as the real engine (e.g. Postgres in Docker). Tier 2 — stateful, protocol-level emulation (preference order: vendor-official emulator > established third-party > a swarm-built twin only as last resort). Tier 3 — external domain the swarm does not own (third-party APIs, other teams' services), wire-level stubbed against a referenced contract. Entry format: `name: tier N; implementation; gaps: `. + +**Declared gaps are machine-readable and binding.** The specifier and QA must not write or accept scenarios that rest on a declared gap — so a known emulator limitation can never masquerade as covered behavior. Supporting rules: every harness scenario starts from a declared seed state and resets dependency state between scenarios; tier-2/3 dependencies must expose post-interaction state for assertion (the message landed in the emulator's outbox), so scenarios assert *effects*, not only the system's own surface; and a swarm-built twin must not be authored by the role that wrote the system code it emulates, and must be validated against recorded real traffic or the vendor's official SDK tests. + +**The specifier owns the manifest.** Before writing scenarios it reads the manifest; if a feature touches an external system not yet declared, it stops, proposes name/tier/implementation/gaps to the user, and waits for approval before adding the entry — tier assignment is an architectural decision the user must own, mirroring the other specifier approval gates. + +## Considered options + +- **Free-form mocking guidance in a role prompt** — rejected: gaps stay prose and un-actionable, so nothing can mechanically *refuse* a scenario built on one. +- **Put dependency data in `project.prompt`** — rejected: mixes volatile project-specific data into the shared project constitution; a separate auto-resolved file keeps it isolated. 
+- **No tiers, just "mock externals"** — rejected: collapses the real distinction between owned infra run for real (tier 1) and wire-stubbed third parties (tier 3), and loses the fidelity-preference order that keeps twins honest. + +## Pending implementation + +- Add `swarmforge/dependency-manifest.prompt` (tier definitions inline, body `(none)`) on `four-pack` and `six-pack`. +- Add the read-manifest / propose-on-undeclared rule to `specifier.prompt` (both packs); QA's refusal of gap-resting scenarios is part of refuting QA (ADR 0005). From dad5f9aba7f0cad905bd8121af5df61a76d2cc20 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:23:30 -0300 Subject: [PATCH 07/67] docs(adr-0012): per-role model/effort/advisor in swarmforge.conf Record idea U from the backup design. Optional inline key=value tail on window lines (existing 4-field lines unchanged; upstream hard-rejects !=4 fields, so this is a real but backward-compatible parser change): - model (all backends), effort (claude/copilot/grok), advisor (claude only); unsupported keys silently ignored - per-role not per-backend; no pre-populated values (fully opt-in) - lands on main (swarmforge.sh); runnable configs stay topology-only - impl note: verify `claude --advisor` actually exists; treat as reserved-but-inert if not No new CONTEXT.md vocabulary. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../adr/0012-per-role-model-effort-advisor.md | 41 +++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 docs/adr/0012-per-role-model-effort-advisor.md diff --git a/docs/adr/0012-per-role-model-effort-advisor.md b/docs/adr/0012-per-role-model-effort-advisor.md new file mode 100644 index 0000000..52f1cb0 --- /dev/null +++ b/docs/adr/0012-per-role-model-effort-advisor.md @@ -0,0 +1,41 @@ +--- +status: accepted +--- + +# Per-role model, effort, and advisor in `swarmforge.conf` + +Different roles have different compute needs — the architect reasoning about design warrants a more capable model than the coder grinding through an implementation slice. Upstream's only per-role knob is the agent backend (`window `); model, effort, and advisor are absent. The fork adds **optional per-role overrides** without breaking any existing config. + +**Syntax: an inline `key=value` tail on the window line.** The existing four fields parse exactly as before; any fields beyond position four are read as `key=value` pairs stored per role. Upstream rejects lines that are not exactly four fields, so this is a genuine parser change, but it is backward compatible — a four-field line still works untouched. 
+
+```conf
+# before (still valid)
+window coder claude coder
+
+# after (opt-in per role)
+window specifier claude specifier model=opus effort=xhigh advisor=sonnet
+window coder claude coder model=sonnet effort=high
+window architect codex architect model=o3
+```
+
+**Three keys, mapped to CLI flags per backend; unsupported keys are silently ignored:**
+
+| Key | Applies to | Mapping |
+|-----|-----------|---------|
+| `model` | all backends | `claude`/`copilot`/`grok`: `--model ` · `codex`: `-c model=""` |
+| `effort` | claude, copilot, grok | `--effort ` (codex has no effort flag — skipped) |
+| `advisor` | claude only | `--advisor ` (ignored for other backends) |
+
+**Per-role granularity, not per-backend.** Two `claude` roles can run different models; a global per-backend setting would throw away the value of the role abstraction. **No pre-populated values** ship in the runnable configs — those express topology (roles + worktrees), not opinions about model cost. The feature is fully opt-in: operators add keys only to the lines they care about.
+
+## Considered options
+
+- **Per-backend global settings** — rejected: collapses the role distinction; you could not give the architect a stronger model than the coder when both run the same backend.
+- **A separate config block / file for model settings** — rejected: splits a role's definition across two places; the inline tail keeps everything about a role on one line.
+- **Pre-populate sensible defaults in the runnable configs** — rejected: bakes cost/quality opinions into topology files and creates drift to maintain; opt-in keeps the configs neutral.
+
+## Pending implementation
+
+- `main`: extend `parse_config` in `swarmforge.sh` to accept ≥4 fields and read the `key=value` tail into per-role maps; extend `launch_role` to append the mapped flags per backend when set. (Script lives on `main`; the conf grammar is exercised there.)
+- Verify each mapped flag actually exists on the target CLI before relying on it — in particular confirm `claude --advisor` is a real flag; if not, treat `advisor` as reserved-but-inert until the CLI supports it. +- Runnable configs (`four-pack`/`six-pack`) stay topology-only — no keys added. From 5f0081dab6acbeac99ce3f46d168caf413dcaaf6 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:35:48 -0300 Subject: [PATCH 08/67] docs(adr): recapture lost role detail; lean ADR style (divergence + why only) Audit of the real backup/six-pre-reset prompts vs the intent-written ADRs surfaced lost decisions and contradictions; fix them and adopt the house style (no rejected-options sections, no historical/legacy notes): - 0007: add the universal visual-quality bar (AI-aesthetic anti-patterns, type hierarchy, WCAG contrast) the UX Engineer enforces regardless of project input; make the durable artifact concrete (observation-harness/, golden snapshots, rendering invariants); DESIGN.md is referenced-only (no scaffold, no walk-up) - 0010: add the observation-harness/ committed regression record that QA re-executes - 0008: integrator hands off to the curator; post-merge gate; N=3 cap - 0004: two-scope cap -- one bounce per finding, N=3 cycles per feature - 0005: conversion fidelity audited via 0010's bullet->assertion rule - strip "## Considered options" from every ADR; remove legacy mapping - CONTEXT.md: add Observation harness; update Back-routing for the caps Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 6 +++++- docs/adr/0001-permanent-fork-synced-by-merge.md | 5 ----- .../0002-idle-gate-and-clear-first-delivery.md | 5 ----- docs/adr/0003-setup-is-a-one-time-skill.md | 5 ----- docs/adr/0004-rework-routes-back.md | 7 +------ docs/adr/0005-qa-refutes-not-confirms.md | 5 ----- docs/adr/0006-harness-enforced-holdout.md | 6 ------ docs/adr/0007-ux-engineer-role.md | 16 +++++++--------- docs/adr/0008-integrator-role.md | 12 +++++------- 
docs/adr/0009-feature-file-spec-header.md | 6 ------ docs/adr/0010-surface-harness-doctrine.md | 10 +++------- docs/adr/0011-dependency-fidelity-manifest.md | 6 ------ docs/adr/0012-per-role-model-effort-advisor.md | 6 ------ 13 files changed, 21 insertions(+), 74 deletions(-) diff --git a/CONTEXT.md b/CONTEXT.md index 5c84815..f6c5497 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -37,7 +37,7 @@ QA's posture in the fork: assume the build does not meet the spec and the accept _Avoid_: verification, acceptance check, confirm **Back-routing**: -Sending rework back to the stage whose decision it exposes as flawed, instead of resolving it where it was found. The trigger is any finding that an earlier stage's work must change — a bug, a refactor blocked by a bad earlier decision, or a design/spec revision. Applies only to _structural_ rework (re-opening an earlier stage's job: an ambiguous/missing spec, a weak/missing test, a design that can't hold the behavior); _local_ work the finder can resolve without re-opening an earlier decision stays with the finder. Routes back at most once. (Upstream fixes everything in place.) +Sending rework back to the stage whose decision it exposes as flawed, instead of resolving it where it was found. The trigger is any finding that an earlier stage's work must change — a bug, a refactor blocked by a bad earlier decision, or a design/spec revision. Applies only to _structural_ rework (re-opening an earlier stage's job: an ambiguous/missing spec, a weak/missing test, a design that can't hold the behavior); _local_ work the finder can resolve without re-opening an earlier decision stays with the finder. Two caps: a single finding bounces back at most once, and a feature tolerates at most three back-route cycles total (N=3, tracked by a routing count in the handoff) before the role stops and asks the user. (Upstream fixes everything in place.) 
_Avoid_: rejection, escalation, bounce, defect back-routing **QA holdout**: @@ -56,6 +56,10 @@ _Avoid_: UI test, e2e harness, driver The permanent idle/no-op scenario committed alongside a surface's flow scenarios, asserting the system is stable when nothing is happening — TUI: no input, identical consecutive captures, zero scrollback growth; web: idle load with no console errors; headless: a no-op event changes no state. It catches idle-state defects that flow scenarios never observe because flow scenarios only assert while the user is acting. _Avoid_: smoke test, idle test, sanity check +**Observation harness**: +The project `observation-harness/` directory holding the committed, re-runnable surface scenarios — the per-surface _baseline scenario_ plus one set per verified flow — that form the permanent regression record. Authored by the live-verification role (the _UX Engineer_ on six-pack) using the _surface harness_ tool, and re-executed by QA before final verification; a user-facing surface with no scenarios is a finding that routes back. (Upstream has no such artifact.) +_Avoid_: e2e folder, regression dir + **Fidelity manifest**: The constitution sub-file (`dependency-manifest.prompt`) declaring every dependency beyond the system itself by _dependency tier_, each as `name: tier N; implementation; gaps: `. A declared gap is binding: the specifier and QA refuse to write or accept any scenario that rests on it, so a known emulator limitation can never pass as covered behavior. Specifier-owned; defaults to `(none)`. 
_Avoid_: mock list, dependency doc, services file diff --git a/docs/adr/0001-permanent-fork-synced-by-merge.md b/docs/adr/0001-permanent-fork-synced-by-merge.md index a71f38c..9895a6d 100644 --- a/docs/adr/0001-permanent-fork-synced-by-merge.md +++ b/docs/adr/0001-permanent-fork-synced-by-merge.md @@ -5,8 +5,3 @@ status: accepted # Permanent fork of unclebob/swarm-forge, synced by merge This repo is a permanent fork of `unclebob/swarm-forge` (remote `upstream`); nothing is contributed back. Upstream moves fast, so we keep current by **merging** `upstream/` into our branches — never rebasing — because the fork is published/shared and rebasing would rewrite shared history and re-surface every conflict on each sync. `git rerere` is enabled (`rerere.enabled`, `rerere.autoupdate`) so conflict resolutions replay automatically. Every divergence should be **additive** (a new file or an appended rule) and recorded as its own ADR in this directory; a non-additive edit to an upstream line is a conscious, documented cost. Two branches are maintained: `main` (shared scripts + these docs) and `six-pack` (runnable: role prompts, `swarmforge.conf`, templates). - -## Considered options - -- **Rebase onto upstream** — rejected: the fork is shared/published; rebasing rewrites history others may track and re-resolves every conflict each cycle. -- **Snapshot upstream and stop tracking** — rejected: the goal is to stay current with a fast-moving upstream, not freeze it. diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md index 960538b..5099056 100644 --- a/docs/adr/0002-idle-gate-and-clear-first-delivery.md +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -21,11 +21,6 @@ The marker is set *busy* when a delivery starts and *idle* when the Stop hook fi Ready is implicit (idle + empty queue = ready). 
Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. -## Considered options - -- **Rely on upstream's type-into-terminal model and skip `/clear`** — rejected: loses the required per-task session reset. -- **Orchestrator-in-code** (`docs/proposals/2026-06-11-factory-line-refactor.md`) — deferred; a re-architecture, not a sync move. - ## Pending implementation - `codex`/`grok` hook-based delivery (Claude Code first). The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. diff --git a/docs/adr/0003-setup-is-a-one-time-skill.md b/docs/adr/0003-setup-is-a-one-time-skill.md index 45f17e9..a2016d6 100644 --- a/docs/adr/0003-setup-is-a-one-time-skill.md +++ b/docs/adr/0003-setup-is-a-one-time-skill.md @@ -14,11 +14,6 @@ Adapting a project to the swarm — installing the project's language quality to **Why replace rather than overlay.** Setup is an explicit one-time step; the run path stays pure "start the agents." The accepted cost is that the swarm no longer self-installs project tooling on first run — the operator runs the setup skill once before the first `./swarm`. Any setup step this moves out of the run path is named and documented so the divergence stays auditable. -## Considered options - -- **Add setup as functions inside `swarmforge.sh`** — rejected: edits an upstream-tracked file (a permanent merge-conflict surface, against ADR 0001's additive rule) and a deterministic script cannot adapt to the project's stack. -- **Overlay — skill adds the fork's extras while execution keeps installing** — rejected: leaves setup split across two places and keeps the run path doing setup work, defeating the purpose. 
- ## Pending implementation - The skill itself: stack detection, the exact tooling/permissions/pins it writes, how it is shipped inside the install, and the "swarm-ready" marker `./swarm` checks before launching. diff --git a/docs/adr/0004-rework-routes-back.md b/docs/adr/0004-rework-routes-back.md index 192dba4..6add5f7 100644 --- a/docs/adr/0004-rework-routes-back.md +++ b/docs/adr/0004-rework-routes-back.md @@ -10,12 +10,7 @@ The trigger is not only a defect. Any finding that an earlier stage's work must **Only structural rework routes back.** It routes back when resolving it means re-opening an earlier stage's job — an ambiguous or missing specification, a weak or missing acceptance test, a design that can't hold the behavior. The stage that owns that work gets it back and corrects the root cause. **Local** work — anything the finder can resolve without re-opening an earlier stage's decision — stays with the finder. Routing a contained, local change backward only adds a round trip and teaches no one. -**Rework routes back at most once.** If it comes back still unresolved, the finder resolves it in place and flags it. This caps the cost and stops two stages volleying the same item indefinitely. - -## Considered options - -- **Route every finding back to its origin** — rejected: the line ping-pongs and a trivial local change becomes a round trip that teaches nothing; the cost is paid for findings that don't carry a lesson. -- **Keep upstream's fix-in-place** — rejected: rework accumulates as downstream patches and the stage that caused it is never corrected, so the same class of problem recurs. +**Two caps, at two scopes.** A *single finding* routes back to its cause **at most once**: if it returns still unresolved, the finder resolves it in place and flags it, so two stages never volley the same item. 
Independently, a *feature* tolerates **at most three back-route cycles total** (depth cap N=3), tracked by a routing count carried in the handoff trail; after the third the routing role stops and asks the user rather than looping. The first cap stops ping-pong on one issue; the second stops a feature from churning through endless distinct bounces. (The role prompts — ux-engineer, integrator — carry the N=3 feature-level cap.) ## Pending implementation diff --git a/docs/adr/0005-qa-refutes-not-confirms.md b/docs/adr/0005-qa-refutes-not-confirms.md index 71031b2..cc5ab72 100644 --- a/docs/adr/0005-qa-refutes-not-confirms.md +++ b/docs/adr/0005-qa-refutes-not-confirms.md @@ -12,11 +12,6 @@ Upstream QA verifies that the accepted specification is met and fixes what fails **Findings route back; QA owns the attack, not the routing.** A structural weakness QA surfaces routes back to its cause (a weak acceptance test or an ambiguous spec → the specifier); a local defect QA fixes in place — per the back-routing decision. Refuting QA is the engine that *generates* structural findings; it needs no routing rule of its own. -## Considered options - -- **Keep upstream's confirm posture** — rejected: a confirming QA passes test theater (green suites that assert nothing); the defects that survive an otherwise-complete pipeline are exactly the ones a checklist confirms. -- **Refute beyond the spec** — rejected: unbounded; QA becomes a fuzzer that blocks the line on unspecified behavior. Unspecified gaps route back to the specifier instead. - ## Pending implementation - Prompt change on `six-pack`. 
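The two back-routing caps recorded in ADR 0004 (one bounce per finding, three cycles per feature) can be sketched as a small decision function. This is only an illustration of how the two scopes interact, not the role prompts' wording. `HandoffTrail`, `decide`, the returned labels, and the order in which the two caps are checked are all assumptions.

```python
from dataclasses import dataclass, field

FEATURE_CYCLE_CAP = 3  # N=3 back-route cycles per feature, per ADR 0004


@dataclass
class HandoffTrail:
    """Hypothetical stand-in for the routing state carried in the handoff."""
    routing_count: int = 0
    bounced_findings: set = field(default_factory=set)


def decide(trail, finding_id, structural):
    """Return what the finder does with one finding."""
    if not structural:
        return "fix-in-place"            # local work stays with the finder
    if finding_id in trail.bounced_findings:
        return "fix-in-place-and-flag"   # per-finding cap: at most one bounce
    if trail.routing_count >= FEATURE_CYCLE_CAP:
        return "stop-and-ask-user"       # per-feature cap reached
    trail.bounced_findings.add(finding_id)
    trail.routing_count += 1
    return "route-back"
```

Under this reading, a finding that comes back unresolved never consumes a second cycle, so the feature-level count only grows with distinct findings.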
diff --git a/docs/adr/0006-harness-enforced-holdout.md b/docs/adr/0006-harness-enforced-holdout.md index daa496c..58bd199 100644 --- a/docs/adr/0006-harness-enforced-holdout.md +++ b/docs/adr/0006-harness-enforced-holdout.md @@ -16,12 +16,6 @@ Upstream holds the end-to-end QA suite back from the coder by prompt instruction **Scope boundary: only the end-to-end QA suite.** The Gherkin acceptance tests and the acceptance pipeline stay fully visible — the coder builds and runs them. The holdout is the specifier's end-to-end QA suite alone. -## Considered options - -- **Keep upstream's prompt-level "ignore it" (detection-only)** — rejected: an implementer that can read the holdout can fit to it; the reference doc calls instructional holdouts a leak that must be closed architecturally. Mutation + refuting QA detect a weak suite but do not stop the implementation being shaped to the visible one. -- **Harness deletes the QA path from the worktree** — rejected: the role's commit stages the deletion and the suite disappears downstream for QA. -- **Specifier commits the QA suite to a separate QA-only branch** — rejected: more handoff-graph complexity (QA must merge code + QA branch) for no protection sparse-checkout doesn't already give. - ## Pending implementation - Add the sparse-checkout exclusion to the worktree-prep step (`six-pack`/scripts), keyed to skip the specifier(master) and QA worktrees. diff --git a/docs/adr/0007-ux-engineer-role.md b/docs/adr/0007-ux-engineer-role.md index b3333b2..1c8d1bf 100644 --- a/docs/adr/0007-ux-engineer-role.md +++ b/docs/adr/0007-ux-engineer-role.md @@ -8,20 +8,18 @@ Upstream has no UX role — nothing in the line owns whether the product is *usa **It checks against UX Intent.** The specifier authors a **UX Intent** section inline in the feature file — concrete, observable statements of what the feature should look and feel like. UX Intent is part of the swarm and travels with the feature. 
A feature with no UX Intent is the signal to skip: the UX Engineer passes straight through to the next stage, the same "no work, no handoff" pattern used elsewhere. -**Optional design inputs are referenced, not owned.** When a project supplies design artifacts — a DESIGN.md (visual system), an EXPERIENCE.md (interaction and feel), mockups (concrete visual targets) — the specifier **references** them from the feature file, and the UX Engineer consults them alongside UX Intent. These are optional project inputs; the swarm neither defines, scaffolds, nor requires them. This replaces the earlier design's automatic "nearest-file" resolution with an explicit reference from the one canonical artifact. +**Optional design inputs are referenced, not owned.** When a project supplies design artifacts — a DESIGN.md (visual system), an EXPERIENCE.md (interaction and feel), mockups (concrete visual targets) — the specifier **references** them from the feature file, and the UX Engineer consults them alongside UX Intent. These are optional project inputs; the swarm neither defines, scaffolds, nor requires them, and does not walk the directory tree to discover them — the only link is an explicit reference from the feature file. -**Framework-agnostic.** The role defines the *class* of check — the running product matches its stated UX — and leaves the specific visual-testing tool to the project's constitution. No terminal-UI assumptions live in the role. 
+**It also enforces a universal visual-quality bar, independent of any project input.** Beyond UX Intent and any DESIGN.md, the role applies a fixed standard the prompt enumerates: AI-aesthetic anti-patterns (unjustified default purple/indigo, gradient noise, uniformly maximal rounding, oversized equal padding, shadow-heavy chrome, missing loading/error/empty states), type-hierarchy rules (primary content must dominate; no skipped heading levels), and colour rules including **WCAG contrast minimums (4.5:1 normal, 3:1 large) and "colour is never the sole state indicator."** These hold even when a feature has no UX Intent and the project has no DESIGN.md — they are the floor, not project preferences. -**Placement and routing.** The UX Engineer sits immediately after the coder, so the downstream roles (cleaner, architect, hardener, QA) see implementation and rendering code together in one pass rather than running twice. When a mismatch cannot be fixed in rendering alone and needs a model-state change, it routes back to the coder — using the back-routing rule already decided (`0004`), not a separate mechanism. +**It leaves a durable, re-runnable artifact.** "Fixes, leaves a regression check behind" is concrete: per verified flow the role commits re-runnable scenarios to `observation-harness/` using the project's surface tool (ADR 0010), plus golden-file snapshots per state and rendering invariants for structural properties. These are the permanent regression record and must pass against the committed code — and QA re-executes them downstream (ADR 0010), routing back here if a user-facing surface has none. -## Considered options +**Framework-agnostic.** The role defines the *class* of check — the running product matches its stated UX — and leaves the specific visual-testing tool to the project's constitution. No terminal-UI assumptions live in the role. 
-- **A flag-only UX reviewer** — rejected: produces a handback with no durable artifact; the fork's pattern is fix-in-place. -- **The swarm owns/scaffolds DESIGN.md and friends** — rejected: those are optional project inputs, referenced not owned; the swarm should not impose a design system. -- **Automatic nearest-file resolution of design docs** — superseded: explicit references from the feature file are clearer and need no walk-up. -- **Place the UX role late (after the hardener)** — rejected: prior batch evidence showed it made the cleaner, architect, and hardener each run twice per feature. +**Placement and routing.** The UX Engineer sits immediately after the coder, so the downstream roles (cleaner, architect, hardener, QA) see implementation and rendering code together in one pass rather than running twice. When a mismatch cannot be fixed in rendering alone and needs a model-state change, it routes back to the coder — using the back-routing rule already decided (`0004`), not a separate mechanism. The back-route message carries what UX Intent says, what the implementation does, what must change, and the current routing count; the role observes the N=3 feature-level cap (`0004`) and stops to ask the user after the third cycle. ## Pending implementation - Six-pack only: new `ux-engineer` role prompt; UX Intent authoring in the specifier and the feature template; coder reads UX Intent; `swarmforge.conf` adds the window after the coder. -- Routing follows `0004`. +- Routing follows `0004`; durable artifact (`observation-harness/`, snapshots, rendering invariants) follows `0010`. +- DESIGN.md is referenced from the feature file only — the specifier does not scaffold it and the ux-engineer does not walk the tree to find it. 
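The WCAG contrast minimums that ADR 0007 cites (4.5:1 for normal text, 3:1 for large text) are mechanically checkable. The sketch below uses the relative-luminance and contrast-ratio formulas from WCAG 2.x. The function names and the 8-bit sRGB tuple input are choices made for this example, not anything the role prompt prescribes.

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an 8-bit sRGB colour (r, g, b)."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(ch) for ch in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(fg, bg, large_text=False):
    """True when the pair meets the 4.5:1 (normal) or 3:1 (large) floor."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

A check like this covers only the contrast rule. "Colour is never the sole state indicator" and the anti-pattern list still need a reviewer or a rendering invariant.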
diff --git a/docs/adr/0008-integrator-role.md b/docs/adr/0008-integrator-role.md index 0f9b6db..a5830a0 100644 --- a/docs/adr/0008-integrator-role.md +++ b/docs/adr/0008-integrator-role.md @@ -6,21 +6,19 @@ status: accepted Upstream has no integrator: when QA signals done, the **specifier** merges the work ad hoc (a local `git merge`) and asks for the next feature. There is no gate between "QA passed" and "landed on the main branch." The fork adds a dedicated **integrator** as the terminal stage of the line that owns *landing* the work — and nothing lands except through a green CI gate. -**Landing is PR + CI, with no fallback.** From the QA-approved commit the integrator opens a pull request, watches CI, and merges only when CI is green; then it runs the post-merge verification and notifies the specifier. It never merges locally — a local merge is exactly what the specifier already did, so the integrator's whole value is that the main branch only ever receives green-CI'd work. **CI is therefore a hard precondition, not optional:** a project without CI is not swarm-ready, and ensuring CI is in place belongs to project setup (`0003`). +**Landing is PR + CI, with no fallback.** From the QA-approved commit the integrator opens a pull request, watches CI, and merges only when CI is green; then it runs a **post-merge gate** — it watches the resulting main-branch CI run and, if the project defines a full verification suite, runs that on green too — before handing off. It never merges locally — a local merge is exactly what the specifier already did, so the integrator's whole value is that the main branch only ever receives green-CI'd work. **CI is therefore a hard precondition, not optional:** a project without CI is not swarm-ready, and ensuring CI is in place belongs to project setup (`0003`). 
+ +**It hands off to the curator.** The integrator is the last *code* stage, but not the last stage: on a green landing it notifies the **curator** (ADR 0013), which promotes the run's retro knowledge and only then releases the specifier for the next feature. **One PR per feature.** Rework updates the same PR; a second PR is never opened for the same feature. -**Failure routing reuses back-routing.** A CI failure routes to the role that owns it — a failing test to the coder, a failing cleanliness gate to the cleaner, a failing architecture check to the architect; a trivially autofixable failure (lint/format) the integrator fixes in place on the PR branch and re-runs. This is the back-routing rule already decided (`0004`) with the integrator as the finder, capped the same way: after the cap it stops and reports rather than looping. +**Failure routing reuses back-routing.** A CI failure routes to the role that owns it — a failing test to the coder, a failing cleanliness gate to the cleaner, a failing architecture check to the architect; a trivially autofixable failure (lint/format) the integrator fixes in place on the PR branch and re-runs. This is the back-routing rule already decided (`0004`) with the integrator as the finder, capped at N=3 (`0004`): it tracks the cycle depth by counting its own failure comments on the PR, and after three it posts a final `FAILED: depth cap reached` comment and stops rather than looping. The post-merge gate's CI-red is routed the same way as pre-merge. **The specifier stops merging.** Merging moves entirely to the integrator, so the specifier no longer needs the main checkout — it moves from the `master` worktree to its own worktree and starts each feature from a clean reset to the default branch. -## Considered options - -- **Keep the specifier merging (no integrator)** — rejected: conflates deciding *what* to build with landing it safely, and provides no gate before the main branch. 
-- **Local-merge fallback when CI is absent** — rejected: a local merge is what the specifier already does; the integrator exists for the green-CI gate, so CI is required, not optional. - ## Pending implementation - Runnable branches (`six-pack`; `four-pack` where present): new terminal `integrator` role; `swarmforge.conf` window; specifier worktree change and removal of its merge step. - The PR/CI mechanism (platform, e.g. `gh`) named at implementation. - CI-in-place enforced as a setup precondition (`0003`); routing per `0004`. +- Terminal handoff target is the curator (`0013`), not the specifier; autofixable lint/format is the integrator's only allowed code change. diff --git a/docs/adr/0009-feature-file-spec-header.md b/docs/adr/0009-feature-file-spec-header.md index 9b8b53b..c2911ce 100644 --- a/docs/adr/0009-feature-file-spec-header.md +++ b/docs/adr/0009-feature-file-spec-header.md @@ -14,12 +14,6 @@ The header is the **spec-authoring layer** the reference verification loop puts **Address every section; do not fill every section.** `SEQUENCING`, `SIDE EFFECTS`, and (six-pack) `UX INTENT` default to `none`. `none` is a deliberate answer, not a skipped one — and for `UX INTENT`, `none` is the signal that tells the UX Engineer to pass through (ADR 0007). The sections are comments (`#`), so the Gherkin parser ignores them and the acceptance pipeline is unaffected. -## Considered options - -- **Free-form prose header** (what melech-mini-apps feature files actually do) — rejected: no guarantee that scope, NFR, or side effects ever get stated; relies entirely on specifier memory. The structured `Ask:`/`Format:` turns each concern into a forced question. -- **No header (stay pure Gherkin like upstream)** — rejected: scope exclusions, assumptions, and NFRs stay implicit, and 0007's UX Intent would have no documented home in the artifact. 
-- **Design a fork-specific minimal header** — rejected: the template already exists, is internally coherent, and already integrates the fork's other decisions (UX Intent, side-effects-to-observe for refuting QA). Re-deriving it adds nothing. - ## Pending implementation - Template already drafted on `four-pack` (7 sections) and `six-pack` (8, with `UX INTENT`); land both. diff --git a/docs/adr/0010-surface-harness-doctrine.md b/docs/adr/0010-surface-harness-doctrine.md index 277be04..fc8282e 100644 --- a/docs/adr/0010-surface-harness-doctrine.md +++ b/docs/adr/0010-surface-harness-doctrine.md @@ -10,17 +10,13 @@ This is the reference verification loop's execute-and-observe layer (its Steps 5 **Surface tool table (in `engineering.prompt`).** Following the existing language-tool-table pattern, the constitution declares the harness tool per surface type: tmux/PTY for a TUI (`send-keys -l` for raw input at controlled timing, `capture-pane` for screen state over time), Playwright for web, an HTTP client for HTTP APIs, event-injection-at-ingress for headless services. Roles owning live verification — **QA** (both packs) and the **UX Engineer** (six-pack, ADR 0007) — identify the project's surface *from the codebase* and acquire the matching tool before their first harness run, exactly as they acquire language tools. -**No surface field in `project.prompt`.** Roles read the code to know the surface; an explicit declaration would be a meaningless placeholder until the project is customised. (The pre-reset summary table mentioned a `project.prompt` surface field — that was superseded by this decision; the real artifacts carry no such field.) +**No surface field in `project.prompt`.** Roles read the code to know the surface; an explicit declaration would be a meaningless placeholder until the project is customised. 
**Every surface carries a mandatory baseline scenario**, committed alongside the flow scenarios: TUI → idle stability (no input, consecutive captures identical, zero scrollback growth); web → idle page loads with no console errors; headless → a no-op event produces no state change. The baseline is what the tetris defects would have hit — they were *idle-state* failures invisible to any flow test, because flow tests only assert while the user is acting. -**QA verifies through the declared surface harness, not "the UI" (idea Q).** Upstream QA's "operate through the user interface only" was right in intent but mechanically silent — it let in-process function calls masquerade as UI verification. The fork replaces the phrase with "through the declared surface harness," and adds an auditable conversion rule: **every Expected bullet maps to a harness assertion, or is explicitly marked `NOT AUTOMATED — `.** This is the mechanism that makes the conversion-fidelity guard of ADR 0005 checkable rather than a matter of QA's word — a silently dropped bullet becomes a visible marker. Findings route back per ADR 0004. - -## Considered options +**The harness scenarios are committed and re-run, not throwaway.** Per verified flow, the live-verification role commits re-runnable scenarios to a project `observation-harness/` directory using the surface tool — alongside the per-surface baseline — as a permanent regression record that must pass against the committed code (on six-pack the UX Engineer authors these, ADR 0007; it also adds golden-file snapshots per state and rendering invariants for structural properties). **QA re-executes the committed `observation-harness/` scenarios before its own final verification**, and routes back (ADR 0004) if a user-facing surface exists but has no scenarios. This is what makes the surface check durable: a defect fixed once stays fixed because its scenario re-runs every cycle. 
-- **Keep "through the UI only"** — rejected: no mechanical referent, so in-process calls and constant-checks wore the name of behavioral verification; this is exactly how the tetris defects slipped through. -- **Flow scenarios only, no idle baseline** — rejected: the defects were idle-state, which no flow scenario observes; the baseline is the part that actually closes the gap. -- **Declare the surface in `project.prompt`** — rejected: placeholder until customisation, and agents can read the surface from the code. +**QA verifies through the declared surface harness, not "the UI" (idea Q).** Upstream QA's "operate through the user interface only" was right in intent but mechanically silent — it let in-process function calls masquerade as UI verification. The fork replaces the phrase with "through the declared surface harness," and adds an auditable conversion rule: **every Expected bullet maps to a harness assertion, or is explicitly marked `NOT AUTOMATED — `.** This is the mechanism that makes the conversion-fidelity guard of ADR 0005 checkable rather than a matter of QA's word — a silently dropped bullet becomes a visible marker. Findings route back per ADR 0004. ## Pending implementation diff --git a/docs/adr/0011-dependency-fidelity-manifest.md b/docs/adr/0011-dependency-fidelity-manifest.md index d632feb..c27f814 100644 --- a/docs/adr/0011-dependency-fidelity-manifest.md +++ b/docs/adr/0011-dependency-fidelity-manifest.md @@ -14,12 +14,6 @@ A scenario that rests on an emulated dependency the emulator does not actually i **The specifier owns the manifest.** Before writing scenarios it reads the manifest; if a feature touches an external system not yet declared, it stops, proposes name/tier/implementation/gaps to the user, and waits for approval before adding the entry — tier assignment is an architectural decision the user must own, mirroring the other specifier approval gates. 
-## Considered options - -- **Free-form mocking guidance in a role prompt** — rejected: gaps stay prose and un-actionable, so nothing can mechanically *refuse* a scenario built on one. -- **Put dependency data in `project.prompt`** — rejected: mixes volatile project-specific data into the shared project constitution; a separate auto-resolved file keeps it isolated. -- **No tiers, just "mock externals"** — rejected: collapses the real distinction between owned infra run for real (tier 1) and wire-stubbed third parties (tier 3), and loses the fidelity-preference order that keeps twins honest. - ## Pending implementation - Add `swarmforge/dependency-manifest.prompt` (tier definitions inline, body `(none)`) on `four-pack` and `six-pack`. diff --git a/docs/adr/0012-per-role-model-effort-advisor.md b/docs/adr/0012-per-role-model-effort-advisor.md index 52f1cb0..0cddac6 100644 --- a/docs/adr/0012-per-role-model-effort-advisor.md +++ b/docs/adr/0012-per-role-model-effort-advisor.md @@ -28,12 +28,6 @@ window architect codex architect model=o3 **Per-role granularity, not per-backend.** Two `claude` roles can run different models; a global per-backend setting would throw away the value of the role abstraction. **No pre-populated values** ship in the runnable configs — those express topology (roles + worktrees), not opinions about model cost. The feature is fully opt-in: operators add keys only to the lines they care about. -## Considered options - -- **Per-backend global settings** — rejected: collapses the role distinction; you could not give the architect a stronger model than the coder when both run the same backend. -- **A separate config block / file for model settings** — rejected: splits a role's definition across two places; the inline tail keeps everything about a role on one line. -- **Pre-populate sensible defaults in the runnable configs** — rejected: bakes cost/quality opinions into topology files and creates drift to maintain; opt-in keeps the configs neutral. 
- ## Pending implementation - `main`: extend `parse_config` in `swarmforge.sh` to accept ≥4 fields and read the `key=value` tail into per-role maps; extend `launch_role` to append the mapped flags per backend when set. (Script lives on `main`; the conf grammar is exercised there.) From 1a24b8aa453b334ba7eaa8091e99c92ca4b0d772 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:37:20 -0300 Subject: [PATCH 09/67] docs(adr-0013,0014): curator role + .agents knowledge injection (idea V) Record the knowledge-promotion loop (issue #20), split into two divergences: - 0013 curator: terminal role after the integrator that promotes session retros to versioned repo knowledge via one self-merging PR, then releases the specifier. Capture-everything with scope tags; single discard gate (non-inferable check); routing ladder (gate > AGENTS.md > role file > reference > skill > upstream > ledger); append-only ledger; budgets 60/40; integrator->curator->specifier; empty run never stalls - 0014 .agents contract: AGENTS.md + .agents/ written only by the curator, versioned in the repo (not ~/.claude), injected into each role bundle at launch (AGENTS.md for all, role file role-scoped; references by pointer); missing files silently skipped CONTEXT.md: add Curator, Promoted knowledge, Knowledge ledger. Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 12 ++++++++++ docs/adr/0013-curator-knowledge-promotion.md | 25 ++++++++++++++++++++ docs/adr/0014-agents-knowledge-injection.md | 17 +++++++++++++ 3 files changed, 54 insertions(+) create mode 100644 docs/adr/0013-curator-knowledge-promotion.md create mode 100644 docs/adr/0014-agents-knowledge-injection.md diff --git a/CONTEXT.md b/CONTEXT.md index f6c5497..54ec8ab 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -67,3 +67,15 @@ _Avoid_: mock list, dependency doc, services file **Dependency tier**: The fidelity level at which a dependency is provided, declared in the _fidelity manifest_. 
Tier 1 — owned infrastructure run locally as the real engine (Postgres in Docker); tier 2 — stateful protocol-level emulation (vendor-official > third-party > swarm-built twin as last resort); tier 3 — external domain the swarm does not own, wire-level stubbed against a referenced contract. The system itself is always implicit, never a tier. _Avoid_: mock level, fidelity grade + +**Curator**: +The terminal role, after the integrator, that turns a run's session retros into versioned repo knowledge via one self-merging PR, then releases the specifier for the next feature. Makes no code changes — writes only _promoted knowledge_. An empty run notifies the specifier immediately; the line never stalls on it. (Upstream has no such role; lessons live only in unread retros.) +_Avoid_: librarian, archivist, scribe + +**Promoted knowledge**: +The project-versioned knowledge contract the _curator_ writes and the launcher injects into role bundles: a root `AGENTS.md` (universal invariants + navigation) and `.agents/` (per-role files, references, skills, the enforcement-gate backlog, the _knowledge ledger_). Lives in the repo, not `~/.claude`, so a fresh clone carries every lesson. `AGENTS.md` and the role's file are injected into that role's bundle at launch; references load on demand by pointer. (Upstream bundles only the constitution and role prompt.) +_Avoid_: docs, memory, knowledge base + +**Knowledge ledger**: +`.agents/ledger.md` — the append-only audit the _curator_ writes, one never-pruned line per processed retro item (`date | session-id | role | failure-class | verdict`). Makes recurrence provable: an item rejected before and seen again has proven itself worth promoting. 
+_Avoid_: changelog, history, log diff --git a/docs/adr/0013-curator-knowledge-promotion.md b/docs/adr/0013-curator-knowledge-promotion.md new file mode 100644 index 0000000..45a2087 --- /dev/null +++ b/docs/adr/0013-curator-knowledge-promotion.md @@ -0,0 +1,25 @@ +--- +status: accepted +--- + +# Curator role and the knowledge-promotion loop + +Upstream ends the line at QA: the specifier merges and asks for the next feature, and whatever the run *learned* — a wrong path taken, a convention discovered, a gate that should have existed — lives only in a session retro that no one reads again. The fork adds a terminal **curator** role, after the integrator, that turns those retros into **versioned repo knowledge** via one self-merging PR per run, then releases the specifier for the next feature. + +**Pipeline position: integrator → curator → specifier.** The integrator notifies the curator on a green landing; the curator promotes the run's knowledge and only then notifies the specifier. An empty run (no unprocessed retros) notifies the specifier immediately with no PR — the line never stalls on the curator. The curator makes no code changes; it may only write `AGENTS.md` and files under `.agents/` (ADR 0014). + +**Capture everything; discard once, at the curator.** The retro skill tags every action with a scope — `project | swarmforge | skill | ephemeral` — and captures all of them without filtering for "obviousness." The single discard gate is the curator's **non-inferable check**: could a future agent reach this fix from the error output and the files it names, with no foreknowledge? If yes, it is not worth promoting. Putting the one filter here, not at capture, means nothing is lost before a consistent judge sees it. + +**Promote to the highest rung that fits (the routing ladder).** A mechanical fix (config line, CI gate, script guard) goes to the enforcement-gate backlog — a gate beats documentation. 
Otherwise: `AGENTS.md` for universal invariants, `.agents/roles/.md` for one role's operational knowledge, `.agents/references/.md` for deep dives (each needs a pointer line or it never loads), `.agents/skills//` only on the second occurrence of a need, `.agents/upstream/.md` for `swarmforge`-scoped items, ledger-only for ephemeral and rejected. A learning whose fix is global routes *up* the ladder, never into `AGENTS.md`, and is discarded only when the gap is already mechanically closed. Every item is rewritten from a phenomenon ("X can fail because Y") into a rule ("every X MUST Z because Y") before it is promoted. + +**The ledger is the append-only audit.** `.agents/ledger.md` records one never-pruned line per processed item — `date | session-id | role | failure-class | verdict` — so recurrence is provable: an item rejected before and now recurring has proven itself non-trivial and is promoted rather than rejected again. + +**The curator self-merges from day one.** The knowledge PR is merged in-role with no user confirmation; the PR body (a metric line plus one verbatim bullet per promoted rule) and the ledger are the asynchronous review surface. Budgets hold the knowledge small: `AGENTS.md` ≤ 60 lines, each role file ≤ 40 — over budget, the stalest or now-inferable lines are pruned and ledgered. + +**Loop health is self-reported.** Each PR body carries running totals (`promoted | rejected | upstream | ephemeral`). Kill criterion: fewer than three promotions that survive contact with later sessions over 90 days → disable the curator window; the ledger and promoted docs stay. + +## Pending implementation + +- `six-pack` then `four-pack`: new `curator` role prompt; `swarmforge.conf` gains the curator window (last); rewire — integrator notifies the curator, specifier waits on the curator before the next feature, `workflow.prompt` documents the integrator→curator→specifier chain. 
+- `main`: upgrade the `agent-retro` skill — scope tag on every action, capture-first (no pre-filter), and an autonomous mode that marks actions `pending-curation` without prompting a human. +- Pairs with ADR 0014 (the `.agents/` contract the curator writes and the launcher injects). diff --git a/docs/adr/0014-agents-knowledge-injection.md b/docs/adr/0014-agents-knowledge-injection.md new file mode 100644 index 0000000..eea76c3 --- /dev/null +++ b/docs/adr/0014-agents-knowledge-injection.md @@ -0,0 +1,17 @@ +--- +status: accepted +--- + +# `.agents/` knowledge contract injected into every bundle + +Promoted knowledge is worthless if it never reaches the agent that needs it. Upstream bundles only the constitution and the role prompt into an agent's context, so there is no channel for project-specific, accumulated knowledge. The fork defines a versioned knowledge contract in the project repo and **injects it into every role bundle at launch**, closing the loop the curator (ADR 0013) feeds. + +**The contract lives in the project repo, under `.agents/` plus a root `AGENTS.md`.** `AGENTS.md` is the navigation map and universal invariants (≤ 60 lines); `.agents/roles/.md` is one role's operational knowledge (≤ 40 lines); `.agents/references/.md` holds deep dives reached by pointer; `.agents/skills//` holds promoted procedures; `.agents/backlog.md` is the enforcement-gate backlog; `.agents/ledger.md` is the append-only audit. All of it is written only by the curator and **versioned in the project**, not in `~/.claude` — so a fresh clone carries every promoted lesson and nothing depends on a machine's local memory. + +**Injection is automatic and role-scoped.** When the launcher builds a role's bundle it appends, when the files exist, the root `AGENTS.md` (so every role gets the universal invariants) and that role's `.agents/roles/.md` (so a role gets only its own operational knowledge). 
References are not injected — they load on demand when an included line points to them, which is why every reference must be pointed at from `AGENTS.md` or a role file. Missing files are silently skipped: a project that has not bootstrapped its knowledge yet launches cleanly with no knowledge blocks. + +## Pending implementation + +- `main`: extend the bundle generator (`write_agent_instruction_file` in `swarmforge.sh`) to append `AGENTS.md` and `.agents/roles/.md` from the project root when present, and add the preamble sentence telling the agent these knowledge files (and on-demand references) are included. +- Acceptance: a scratch project with an `AGENTS.md` → every generated bundle carries it; adding `.agents/roles/coder.md` → only the coder's bundle gains it; removing both → bundles generate with no knowledge blocks and no errors. +- Pairs with ADR 0013 (the curator is the only writer of this contract). From 1af7118cfdd28f610f3729ece919bab59ddd066f Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 03:38:40 -0300 Subject: [PATCH 10/67] docs(adr-0015,0016): platform-feasibility stop rule + boundary-logic detection Final pipeline divergences (ideas R, S): - 0015 R: constitution rule (workflow.prompt) -- a spec requirement that conflicts with real platform capability stops and reports to the user; a workaround comment in code is the smell the rule fired and was suppressed. Narrow to spec-vs-platform, binds all roles. - 0016 S: cleaner (six-pack) / refactorer (four-pack) also scans boundary files at a ~15-20 mutation-site threshold (vs 100 for testable files); above = logic in an adapter shell, extract before handoff. Plus the "tested only through a stripped view = untested" anti-pattern. Idea T (evidence-as-code) is covered by 0010's observation-harness loop; G/H/I are rejected (no divergence, no ADR). 
Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/adr/0015-platform-feasibility-stop-rule.md | 15 +++++++++++++++ docs/adr/0016-boundary-logic-detection.md | 16 ++++++++++++++++ 2 files changed, 31 insertions(+) create mode 100644 docs/adr/0015-platform-feasibility-stop-rule.md create mode 100644 docs/adr/0016-boundary-logic-detection.md diff --git a/docs/adr/0015-platform-feasibility-stop-rule.md b/docs/adr/0015-platform-feasibility-stop-rule.md new file mode 100644 index 0000000..cf9c8c5 --- /dev/null +++ b/docs/adr/0015-platform-feasibility-stop-rule.md @@ -0,0 +1,15 @@ +--- +status: accepted +--- + +# Platform-feasibility stop rule + +Upstream has no rule for what a role does when a spec requirement conflicts with what the platform can actually deliver. So the role improvises — it ships a silent workaround and leaves a code comment acknowledging the conflict, and behavior diverges from the spec with no one having decided that trade-off. The fork adds a constitution rule: **when a spec requirement conflicts with a real platform capability, stop and report to the user before proceeding.** + +**The workaround comment is the smell.** A comment in the code acknowledging a spec-vs-platform conflict is the signal that this rule fired and was suppressed — it is treated as a defect, not an accepted note. + +**Narrow on purpose.** This is not a general "stop when confused" rule; it fires specifically on spec-versus-platform-capability conflicts. It lives in the constitution (`workflow.prompt`), so it binds every role that reads the constitution rather than being repeated per role. + +## Pending implementation + +- `four-pack` + `six-pack`: add the rule to `swarmforge/constitution/workflow.prompt`. 
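A minimal sketch of how the "workaround comment is the smell" rule in ADR 0015 could be checked mechanically. ADR 0015 names the smell but does not define a detection list, so the marker phrases, the comment syntaxes (`#` and `//`), and the function name `find_suppressed_conflicts` are all assumptions made for illustration, not part of the fork.

```python
import re

# Hypothetical marker phrases: ADR 0015 describes the smell, it does not
# prescribe these patterns. Tune them per project.
SMELL = re.compile(
    r"(workaround|spec (says|requires)|platform (does not|doesn't|cannot|can't) support)",
    re.IGNORECASE,
)

# Assumed comment syntaxes: '#' and '//' line comments only.
COMMENT = re.compile(r"(#|//)\s*(.*)$")


def find_suppressed_conflicts(source: str) -> list[tuple[int, str]]:
    """Return (line number, comment text) for comments that look like an
    acknowledged spec-vs-platform conflict."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = COMMENT.search(line)
        if match and SMELL.search(match.group(2)):
            hits.append((lineno, match.group(2).strip()))
    return hits
```

This is a heuristic: it will also match `//` inside a URL string and it misses block comments, so a hit is a prompt for a human or a role to stop and report, not a verdict.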
diff --git a/docs/adr/0016-boundary-logic-detection.md b/docs/adr/0016-boundary-logic-detection.md new file mode 100644 index 0000000..0a1ddba --- /dev/null +++ b/docs/adr/0016-boundary-logic-detection.md @@ -0,0 +1,16 @@ +--- +status: accepted +--- + +# Boundary-logic detection + +Boundary files — environmentally-unsuitable adapter shells like TUI drivers, OS input handlers, and environment adapters — are excluded by design from every quality tool's worklist, because they can't run under test. Upstream leaves it there, so pure logic that gets embedded in a boundary file is invisible to mutation, CRAP, and coverage alike. The fork closes that hole: the cleaner (six-pack) / refactorer (four-pack) **also scans boundary files**, at a lower threshold, and extracts the logic when it finds too much. + +**A lower threshold, because boundary files should be thin.** Testable source keeps the existing 100-mutation-site split threshold; boundary files trigger at ~15–20 sites — above that, the file holds implementation, not adaptation, and the logic is extracted to a testable module before handoff. Extraction funnels that logic into the normal mutation/CRAP/coverage loops automatically, so no new test type is needed. + +**"Tested only through a stripped view" counts as untested.** A test that asserts a simplified projection of the output — ANSI-stripped text when the real output includes the escape codes and newline placement the function exists to add — does not cover the behavior. This is an explicit anti-pattern the cleaner/refactorer treats as missing coverage. + +## Pending implementation + +- `six-pack`: extend `swarmforge/roles/cleaner.prompt` to scan boundary files at the ~15–20 site threshold and add the stripped-view anti-pattern. +- `four-pack`: same in `swarmforge/roles/refactorer.prompt`. 
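The two thresholds in ADR 0016 can be sketched as a small decision function. This is an illustration under stated assumptions: the ADR gives "~15–20" for boundary files, so the cut-off of 20 is an arbitrary pick within that range; `SourceFile`, `action_for`, and the idea that a mutation tool supplies the site count are hypothetical names, and "above the threshold" is read as a strict greater-than.

```python
from dataclasses import dataclass

# Thresholds from ADR 0016. The ADR states "~15-20" for boundary files;
# 20 is an assumed cut-off for this sketch, not a value the ADR fixes.
TESTABLE_SITE_THRESHOLD = 100
BOUNDARY_SITE_THRESHOLD = 20


@dataclass(frozen=True)
class SourceFile:
    path: str
    is_boundary: bool    # adapter shell: TUI driver, OS input handler, ...
    mutation_sites: int  # hypothetical count supplied by the mutation tool


def action_for(f: SourceFile) -> str:
    """Return the cleaner/refactorer action implied by the site count."""
    if f.is_boundary:
        if f.mutation_sites > BOUNDARY_SITE_THRESHOLD:
            # Too much logic for a thin shell: move it to a testable module.
            return "extract"
        return "ok"
    if f.mutation_sites > TESTABLE_SITE_THRESHOLD:
        # Existing split threshold for testable source.
        return "split"
    return "ok"
```

For example, a boundary file with 35 sites yields `"extract"`, while a testable file with the same count is left alone. Once extracted, the logic is ordinary testable source and falls under the 100-site rule, which matches the ADR's point that no new test type is needed.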
From 75c38658c0abee1066fd3aae0f651c2db058a673 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 12:48:29 -0300 Subject: [PATCH 11/67] docs: fork change manifest + migration recovery docs (impl playbook) Permanent rebase index (one row per divergence: change + target + recover-from branch:path + ADR) plus per-row recovery detail for the main script layer, six-pack role prompts, and the setup-swarm skill. Records the section-C ADR assignments and the decisions settled in the what's-missing pass. Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/fork-change-manifest.md | 113 ++++++++++++++++++++ docs/migrations/0003-setup-skill-sources.md | 55 ++++++++++ docs/migrations/main-script-layer.md | 34 ++++++ docs/migrations/six-pack-role-prompts.md | 56 ++++++++++ 4 files changed, 258 insertions(+) create mode 100644 docs/fork-change-manifest.md create mode 100644 docs/migrations/0003-setup-skill-sources.md create mode 100644 docs/migrations/main-script-layer.md create mode 100644 docs/migrations/six-pack-role-prompts.md diff --git a/docs/fork-change-manifest.md b/docs/fork-change-manifest.md new file mode 100644 index 0000000..e17f379 --- /dev/null +++ b/docs/fork-change-manifest.md @@ -0,0 +1,113 @@ +# Fork change manifest + +Compact, permanent record of **every divergence to apply on top of a pristine `upstream`**, one line per change. Rationale lives in the ADRs (`docs/adr/`) — this file is *where + what + source*, not *why*. Use it to (re)apply the fork after any upstream sync. + +## Sync policy (ADR 0001) + +- `main`, `six-pack`, `four-pack` are kept **identical to `upstream/`** and advanced by **merge** (`git merge upstream/`), never rebase. `rerere` replays conflict resolutions. +- **four-pack is frozen (decision 2026-06-14): no fork divergences are applied to it.** Only `main` and `six-pack` carry changes. (Open: whether four-pack is still resynced to upstream to honor "keep == upstream", or left as-is — see below.) 
+- Every item below is **additive** (new file or appended rule) wherever possible; a non-additive edit to an upstream line is marked **[edit]** and is a conscious, documented conflict point. +- **Delivery routing:** `main` ← scripts + skills + docs/ADRs · `six-pack` ← role prompts, constitution articles, templates, manifest, `swarmforge.conf`. +- Never push `main` without explicit request; **never** push `upstream` (`gh` defaults to upstream → always `--repo gabadi/swarm-forge`). + +## Source legend + +- **ADR** — `docs/adr/NNNN-*.md` (decision + rationale + `## Pending implementation`). +- **B6** — `backup/six-pre-reset` (real pre-reset six-pack artifacts: prompts, manifest, template, conf). Re-merge onto *current* prompts; do **not** copy whole files (they predate current upstream; some carry behavior the ADRs removed). +- **I20A** — `feat/issue-20-a-retro-skill-upgrade` (`swarmforge/skills/agent-retro/`, `AGENTS.md`). +- **I20B** — `feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md` (locked curator-loop spec, PRs A→B→C→D; **spec wins** over issue #20; budgets AGENTS.md ≤60 / role files ≤40). + +## Per-row recovery docs (exact recover-from `branch:path` + delta + STRIP per item) + +- `docs/migrations/main-script-layer.md` — all Section A + Section C `swarmforge.sh`/scripts rows. **⚠ Idea B + cmux + M3 + executing-fields are one entangled ~400-line restructure — gating decision: keep the full cmux delivery model or rebuild lean on upstream's harness.** +- `docs/migrations/six-pack-role-prompts.md` — all Section B/C role-prompt rows + the 3 new roles + final conf window order + the STRIP table (backup content ADRs reversed). +- `docs/migrations/0003-setup-skill-sources.md` — setup skill design recovery (net-new, no code). + +--- + +## A. `main` — scripts / skills / docs + +Script path: `swarmforge/scripts/swarmforge.sh`. Skills path: `swarmforge/skills/`. 
+ +| ADR | Change (one line) | Where | Source | +|-----|-------------------|-------|--------| +| 0006 | In `prepare_worktrees` (`git worktree add`, ~L331) add `git sparse-checkout` excluding the pinned QA-suite path for **every worktree except specifier(`master`) and QA**; verify the path survives each role's handoff commit. | `swarmforge.sh` `prepare_worktrees` | ADR 0006 · **NET-NEW (no impl)** | +| 0012 | `parse_config` (~L182, today rejects ≠4 fields) → accept **≥4 fields**, parse `key=value` tail into a per-role map; `launch_role` (~L414) → append mapped flags per backend. **[edit]** | `swarmforge.sh` | ADR 0012 · recover `backup/main-pre-reset` · **advisor = `advisorModel` in settings.local.json, not `--advisor`** ✅ | +| 0014 | `write_agent_instruction_file` (~L389) → append project-root `AGENTS.md` + `.agents/roles/.md` when present, plus a preamble sentence; missing files silently skipped. | `swarmforge.sh` | ADR 0014 + I20B(PR-B) · **needs Idea B first** | +| 0013 | Upgrade `agent-retro` skill: per-action **scope tag** (`project\|swarmforge\|skill\|ephemeral`), **capture-first** (no pre-filter), **autonomous** mode marking actions `pending-curation` without a human prompt. | `swarmforge/skills/agent-retro/` | ADR 0013 + I20A + I20B(PR-A) | +| 0003 | New **`setup-swarm` skill** (stack detection; writes tooling/permissions/skill-pins/session-tracking; emits the **swarm-ready marker** `.swarmforge/setup-complete`); **setup-first** — operator runs `/setup-swarm` as step one, `./swarm` only **guards** on the marker and refuses if unset (never auto-runs setup). Absorbs Idea O scaffold. *Impl details open: marker format, stack detection (no backup artifact).* | `swarmforge/skills/setup-swarm/` (new) | ADR 0003 | + +--- + +## B. `six-pack` — prompts / constitution / templates / conf + +Roles: `swarmforge/roles/*.prompt` · constitution: `swarmforge/constitution/articles/*.prompt` · `swarmforge/swarmforge.conf`. 
+ +| ADR | Change (one line) | Where | Source | +|-----|-------------------|-------|--------| +| 0002/0003 | Remove the `At startup, install/make-ready …` directive(s): `coder`:9, `QA`:7, `cleaner`:19, `hardender`:8–9. **[edit]** | `roles/*.prompt` | ADR 0002, 0003 | +| 0002 | Add idle-gate rule to each role prompt: "Wait for a handoff. Do not act without one." | `roles/*.prompt` | ADR 0002 | +| 0009 | Add `swarmforge/templates/feature.feature` — **8-section** spec header (TRACKING/CONTRACT/CONSTRAINTS/SEQUENCING/NFR/SIDE EFFECTS/SCOPE + UX INTENT). | `templates/feature.feature` (new) | ADR 0009 + B6 | +| 0009 | Specifier phase 1 starts from the template, addresses **all** sections before scenarios; fix stale count "seven" → **"eight"/"all"**. | `roles/specifier.prompt` | ADR 0009 + B6 | +| 0011 | Add `swarmforge/dependency-manifest.prompt` (3 tier defs inline + Rules section, body `(none)`); auto-resolved by the bundle resolver. | `dependency-manifest.prompt` (new) | ADR 0011 · recover `feat/baseline-scenarios-six` (**obs-harness-six over-deleted the Rules section**) | +| 0011 | Specifier reads the manifest before scenarios; on an undeclared external system → stop, propose name/tier/impl/gaps, wait for approval. | `roles/specifier.prompt` | ADR 0011 + B6 | +| 0010 | Add **surface-tool table** + context-driven acquisition rule (tmux/PTY · Playwright · HTTP client · ingress event-injection) to `engineering.prompt`. | `constitution/articles/engineering.prompt` | ADR 0010 + B6 | +| 0010 | Require a per-surface **baseline scenario** committed with every feature's flow scenarios (idle stability / no console errors / no-op event = no state change). | spec-header + role prompts | ADR 0010 | +| 0015 | Add platform-feasibility **stop rule** to `workflow.prompt` (spec-vs-platform conflict → stop & report; a workaround comment is a defect). 
| `constitution/articles/workflow.prompt` | ADR 0015 | +| 0005 | Rewrite QA to a **refute** posture (assume build fails spec & tests are weak; attack within the spec; conversion fidelity); replace "Fix bugs found by the QA suite…" (`QA`:14) — local fix in place, structural routes back. **[edit]** | `roles/QA.prompt` | ADR 0005 + B6 | +| 0010 | QA: replace "through the user interface only" (`QA`:13) with "**through the declared surface harness**"; add **every Expected bullet → a harness assertion or `NOT AUTOMATED — `**; QA re-executes committed `observation-harness/`, routes back if a user-facing surface has none. **[edit]** | `roles/QA.prompt` | ADR 0010 + B6 | +| 0004 | Add back-routing rule to role prompts: structural finding routes to its origin stage; local stays with finder; single-finding cap (back **once**) + feature cap **N=3** via routing count in the handoff trail (ux-engineer & integrator carry N=3). | `roles/*.prompt` | ADR 0004 | +| 0007 | Add **UX Engineer** role after coder (runs product, fixes rendering vs UX Intent, universal visual-quality bar incl. WCAG 4.5:1/3:1, writes `observation-harness/` + snapshots + rendering invariants; routes back per 0004 N=3); add conf window after coder. **Strip** DESIGN.md scaffold/walk-up from B6 draft. | `roles/ux-engineer.prompt` (new) + `swarmforge.conf` | ADR 0007 + B6 | +| 0007 | Coder reads UX Intent; specifier authors the UX INTENT section. | `roles/coder.prompt`, `roles/specifier.prompt` | ADR 0007 | +| 0008 | Add terminal **integrator** role (PR + green CI, post-merge gate, one PR/feature, autofix lint only, **hands off to curator**); add conf window. | `roles/integrator.prompt` (new) + `swarmforge.conf` | ADR 0008 + B6 | +| 0008 | Specifier **stops merging**: drop merge step (specifier:36), move specifier off `master` to its own worktree, reset to default branch per feature. 
**[edit]** | `roles/specifier.prompt` + `swarmforge.conf` | ADR 0008 | +| 0013 | Add terminal **curator** role (promotes retros → `.agents/`+`AGENTS.md` via one self-merging PR, then releases specifier; empty run = pass-through); rewire **integrator→curator→specifier**; conf curator window last; document chain in `workflow.prompt`. | `roles/curator.prompt` (new) + `swarmforge.conf` + `workflow.prompt` | ADR 0013 + B6 + I20B(PR-C) | +| — | **hardener** rendering-invariant property tests for pure rendering fns (state→string) — **unmanifested divergence found in audit**; consistent w/ 0007/0010. | `roles/hardender.prompt:18` | recover `backup/six-pre-reset` | +| 0016 | `cleaner` also scans **boundary files** at ~15–20-site threshold (vs 100 for testable source), extracts logic to a testable module; add the "stripped-view = untested" anti-pattern. | `roles/cleaner.prompt` | ADR 0016 + B6 | + +--- + +## C. Uncaptured implemented divergences — NO ADR (recover from backup, else lost on rebase) + +The 16 ADRs document the **behavioral/prompt** layer but not the **`main`-side script infrastructure**. The items below are real, implemented divergences with **no ADR**, living only in the monolith ADR (`backup/main-pre-reset:docs/adr/0001-fork-divergence.md`, "§Idea X") + the backup/feat branches. **Each verified as still a divergence vs current `upstream/main` (2026-06-14).** They are prerequisites/peers of Section A — a clean rebase that follows only the ADRs would drop them. **Decide per item: write an ADR, or carry as a manifest row.** + +| Idea | Divergence (one line) | Verified vs upstream | Source artifact | ADR? | +|------|----------------------|----------------------|-----------------|------| +| B | **Prompt-bundle inlining** — `write_agent_instruction_file` emits XML envelope `` + `resolve_prompt_bundle` (BFS over `*.prompt` refs, dedup). 
**KEEP (decision 2026-06-14).** Must be **disentangled from cmux**: port the resolver + envelope onto upstream's tmux harness and wire the bundle into upstream's delivery (NOT cmux's `write_deliver_script`). Prerequisite for M3/0014. | upstream has the naive read-recursively form only | `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` (`resolve_prompt_bundle`, `write_agent_instruction_file`); re-base, don't lift | **0017** | +| F | **Auto-compaction on role worktrees** — `write_worktree_permissions` merges into `.claude/settings.local.json`: `autoCompactEnabled:true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE:"88"`, `CLAUDE_CODE_AUTO_COMPACT_WINDOW:"200000"`. | absent upstream | `backup/main-pre-reset` (commit 08e7f25); `mono §Idea F:207` | **0020** | +| J | **Session-retro plumbing** — `agent-retro` uses `entire session current`→`session info --transcript >/tmp`; fallback `~/.claude/projects/`; Codex-schema risk accepted; `agent-retro before idle` line in every role prompt. | absent upstream | `feat/issue-20-a…:swarmforge/skills/agent-retro/`; `mono §Idea J:189` | extend **0013** | +| N | **`./swarm upgrade`** — refresh scripts(main)+prompts(source branch)+skills; `install-pins.conf` SHA pinning; `.swarmforge/source-branch` tracking; auto-install skills on first launch via `.swarmforge/skills-installed`. | absent upstream | `mono §Idea N:88` | **0018** | +| O | **Install scaffold** — `.gitignore` gen (`logbook.json`,`tmp/`,`.swarmforge/`); default-branch probe→`swarmforge.conf`; permission allow-rules. **Overlaps setup-swarm skill (0003).** | absent upstream | `mono §Idea O:326` | folds into **0003** | +| — | **Autonomous permission mode** — `--permission-mode auto` (not `acceptEdits`) in `launch_role`. | upstream = `acceptEdits` (L433/442) | `backup/main-pre-reset` (commit 1097233) | **0019** | +| — | **cmux multiplexer backend** — `swarm-mux.sh`. **DROP — not wanted in the new fork (decision 2026-06-14).** Stay on upstream's tmux harness. 
Dropping this is what un-tangles Idea B / executing-fields / M3. | no mux file upstream | n/a — not reapplied | **DROP** | +| — | **`executing` logbook entry carries `{message,hash,sender}`** for session-restart recovery (ADR 0002 names only the idle/busy marker). | absent upstream | `feat/main-executing-context-fields:swarmforge/scripts/swarmforge.sh` | extend **0002** | +| — | **retro-triage skill** — `.claude/skills/retro-triage/` (~219 lines), diagnosis-first batch retro. **KEEP — restore (decision 2026-06-14).** Byte-identical on all branches; recover as-is. | absent upstream | `feat/issue-20-a…:.claude/skills/retro-triage/SKILL.md` | **0021** | +| — | **Self-referencing fork URL** — `./swarm` self-fetch points at the fork. | upstream points at unclebob | `backup/main-pre-reset` (commit ded6019) | row-only | +| — | **Richer `CONTEXT.md` glossary** — Task / Logbook / Prompt bundle / Bundle cache / Landing / Depth cap / full logbook-status spec; leaner than the backup version. | n/a (docs) | `backup/main-pre-reset:CONTEXT.md` | doc-merge | + +Not-lost / already consistent (no action): curator budget **60/40** (ADR 0013 + I20B spec win over backup prompts' stale 150/300); DESIGN.md **not scaffolded** (ADR 0007 wins over `mono §Idea M`); back-routing **to owning stage** (ADR 0004 wins over `mono §Idea E` "always to coder"). Genuinely rejected (no recover): ideas **G, H, I**. +Also unimplemented draft, not a divergence: `backup/main-pre-reset:docs/proposals/2026-06-11-factory-line-refactor.md` (architecture audit; status draft). + +--- + +## Cross-cutting invariants (do not break while applying) + +- **observation-harness/** is shared: ux-engineer writes (0007), doctrine (0010), QA re-executes (0010), hardener honors rendering invariants — keep consistent. +- **Back-route N=3** (0004) referenced by ux-engineer & integrator — keep the routing-count-in-handoff mechanic. 
+- **Refuting QA (0005)** is *new*; the B6 QA draft already has the 0010 surface-harness wording — **merge both** when writing QA.prompt. +- **DESIGN.md** is referenced-from-feature-file only (0007) — when porting B6 specifier/ux-engineer, delete scaffold-on-absence and nearest-file walk-up. +- **Curator PRs land in order** A→B→C→D (I20B); everything else is independently landable. + +## Still open (decisions / unknowns) + +*(resolved 2026-06-14, grilling session — what's-missing pass)* + +0. **Section C scope** — RESOLVED. All Section-C items kept (cmux already dropped). ADR assignments: B→**0017**, N→**0018**, auto-permission→**0019**, F→**0020**, retro-triage→**0021**; J→extend **0013**, executing-fields→extend **0002**, O→folds into **0003**; self-url→row-only; CONTEXT glossary→doc-merge. **Idea B remains a hard prerequisite for M3/ADR 0014.** +1. **ADR 0003 setup-swarm skill** — idea-K conflict RESOLVED: setup is **setup-first** (operator runs `/setup-swarm` as step one); `./swarm` **guards** on the `.swarmforge/setup-complete` marker and refuses if absent — it never auto-runs setup. Skill **renamed `setup` → `setup-swarm`**. Idea O folds in. *Remaining impl details (not blockers): marker content format, stack-detection mechanism, per-language tool selection — captured in `docs/migrations/0003-setup-skill-sources.md`.* +2. **ADR 0002 clear-first on six-pack** — RESOLVED: the model column is **configuration** (governed by ADR 0012's per-role model), not an architectural decision. No codex-hook work is added. ADR 0002 stands as written — clear-first is claude-first; codex roles keep upstream delivery as a documented property. +3. *(resolved earlier)* cmux **DROPPED** (stay on upstream tmux harness); Idea-B bundle-inlining **KEPT** but disentangled — port `resolve_prompt_bundle` + XML envelope onto upstream's harness, re-base executing-fields/M3 on it. ADR 0012 `--advisor` resolved (`advisorModel` in `settings.local.json`). +4. 
**four-pack** — RESOLVED: kept as a **pure merge-mirror of `upstream/four-pack`** (no fork content ever) to honor ADR 0001's "all branches == upstream"; resync via merge-only. +5. **PR shape for implementation** — DEFERRED to implementation time (does not affect the ADR set). Note the one-difference-per-ADR rule; likely grouped by layer + dependency (B → 0014/M3 → executing-fields ordered). + +**Overriding constraint (all items):** keep the diff vs upstream as small as possible — translate to the minimal additive form, do not lift the pre-reset implementation. See `feedback-minimize-upstream-diff` memory. + diff --git a/docs/migrations/0003-setup-skill-sources.md b/docs/migrations/0003-setup-skill-sources.md new file mode 100644 index 0000000..e3f11aa --- /dev/null +++ b/docs/migrations/0003-setup-skill-sources.md @@ -0,0 +1,55 @@ +# Migration source list — ADR 0003 setup skill + +Working source list to implement the **setup skill** (ADR 0003) without losing decisions already made in the pre-reset work. ADR 0003 decided *that* setup becomes a one-time skill; the *how* lives scattered across idea-K, the monolith ADR, ideas N/O, and the "At startup, install…" lines being removed. There is **no implemented setup skill in any branch** (confirmed) — this is design recovery, not code recovery. + +Refs: `idea-K` = `origin/docs/ideas-backlog:docs/ideas/idea-K-setup-preflight.md` · `mono` = `backup/main-pre-reset:docs/adr/0001-fork-divergence.md` · `ADR` = `docs/adr/0003-setup-is-a-one-time-skill.md`. + +## ✅ Resolved (2026-06-14): setup-first, guard-only; skill renamed `setup-swarm` + +- **idea-K** (auto-run on first launch) is **superseded.** `./swarm` never runs setup; the auto-run + stale `backup/main-pre-reset:CLAUDE.md:12` line are dead. +- **ADR 0003 form wins:** setup is **setup-first** — the operator runs `/setup-swarm` as the project's *first* action. 
`./swarm` is the *second* action and only **guards**: if `.swarmforge/setup-complete` is absent it refuses and tells the operator to run `setup-swarm` first. +- **Skill renamed `setup` → `setup-swarm`** (operator-facing `/setup-swarm`). Glossary updated (`CONTEXT.md`: `setup-swarm`, `swarm-ready marker`). Skill path: `swarmforge/skills/setup-swarm/`. + +## Decisions already made (cite before re-deciding) + +- Setup is a **skill** (fork-owned file, zero upstream conflict), not a `swarmforge.sh` function — `ADR`. +- Run path installs **no project tooling**; `./swarm` still self-fetches scripts, does worktree/session plumbing, **and auto-installs the swarm's own `entire` skills (pin-aware `ensure_skills_installed`, owned by ADR 0018)**; stops if the project isn't set up — `ADR`. *(Decision 2026-06-14: launcher infra-bootstrap stays automatic; only project provisioning is gated by the setup-swarm marker. See `main-script-layer.md` Idea N row.)* +- Skill **reasons about the stack** (Go vs Java vs Clojure → which tools/gates) — that's the point of a skill over a script — `ADR`. +- `entire enable --no-github --telemetry=false` (no `--agent`; hooks added separately) — `idea-K`, `mono §Idea K`. +- Backends derived from `swarmforge.conf` col 3 → `entire agent add ` per unique value; no user input — `idea-K`, `mono §K:178`. +- Warn-and-continue if `entire` absent (setup never blocks the swarm) — `idea-K`, `mono §K:182`. +- No `./swarm setup` subcommand; force re-run = operator deletes the marker — `idea-K`, `mono §K:180`. +- Idea G (per-tech engineering template system) **rejected** — adding a language is 2–3 lines in the shared table — `idea-G`, `mono:69`. 
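Two of the decisions above (the marker guard and the backend derivation) can be sketched as follows. This is a minimal illustration, not the fork's code: the function names `require_setup_marker` and `list_conf_backends` are hypothetical, the refusal text is a placeholder, and the conf layout is assumed to be `window <role> <backend> <worktree>`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not taken from swarmforge.sh.

# Guard: refuse to continue unless the swarm-ready marker exists.
# The message wording is a placeholder (the real text is an impl detail).
require_setup_marker() {
  local project_root="$1"
  if [ ! -f "$project_root/.swarmforge/setup-complete" ]; then
    echo "Project is not set up. Run setup-swarm first." >&2
    return 1
  fi
}

# Print each unique backend (column 3 of "window" rows), one per line.
# Assumes rows look like: window <role> <backend> <worktree>
list_conf_backends() {
  local conf="$1"
  awk '$1 == "window" && NF >= 3 { print $3 }' "$conf" | sort -u
}
```

Each line printed by `list_conf_backends` is what setup would hand to `entire agent add`, one call per unique value and with no operator input.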
+ +## What the setup skill must take over (from the removed "At startup, install…" lines) + +| Category | Detail | Removed-line source | +|----------|--------|---------------------| +| Mutation/CRAP/DRY tools | language mutation + CRAP + DRY, from `engineering.prompt` | `upstream/six-pack:roles/cleaner.prompt:19`, `hardender.prompt:8`, `QA.prompt:7` | +| Acceptance Pipeline (APS) | ensure pipeline in place; build `gherkin-parser` + `gherkin-mutator` from `github.com/unclebob/Acceptance-Pipeline-Specification` | `upstream/six-pack:roles/coder.prompt:9`, `hardender.prompt:9` | +| Session tracking | `entire enable …` + `entire agent add ` per conf backend | `idea-K`, `mono §K` | +| ~~Skill pins~~ → **ADR 0018, not setup-swarm** | `entire` skills at pinned SHA (`install-pins.conf` `ENTIRE_SKILLS_SHA`); 11 skills + `agent-retro` to `.claude/skills/`. **Moved out of setup-swarm (decision 2026-06-14):** this is launcher infra-bootstrap, auto-installed by `./swarm` (`ensure_skills_installed`, pin-aware). Documented in **ADR 0018 (Idea N)**. | `mono §Idea N:100` | +| Permissions | write to `.claude/settings.json`: `Bash(gh pr merge*)` (integrator), `Bash(git reset --hard origin/)` (specifier) | `mono §Idea O:334` | +| Install scaffold | `.gitignore` ← `logbook.json`, `tmp/`, `.swarmforge/`; default-branch probe `git symbolic-ref refs/remotes/origin/HEAD` → `swarmforge.conf` | `mono §Idea O:330-332` | + +Note four-pack equivalents exist (architect/refactorer/coder) but four-pack is **frozen** — six-pack rows above are what matters. + +## Swarm-ready marker + +- Path **`.swarmforge/setup-complete`**; `./swarm` checks it before role launch; absent → refuse (ADR 0003 form). Operator deletes to force re-run. — `idea-K`, `mono §K:180`, `ADR`. +- **Marker content (defaulted 2026-06-14, impl detail):** timestamp + swarmforge SHA (debuggable); refusal message text is impl-level. Not an ADR decision. + +## Open design questions — resolved 2026-06-14 + +1. 
**Stack detection mechanism** — **RESOLVED: the skill's own domain, not an ADR decision.** setup-swarm is a *skill* precisely because it *reasons* about the stack; the ADR must not prescribe a rigid probe list (that would contradict why it's a skill). The `SKILL.md` reads the repo, infers the stack, and asks the operator only when genuinely ambiguous. +2. **Marker format** — defaulted (see above): timestamp + swarmforge SHA. Impl detail. +3. **How the skill ships** — path `swarmforge/skills/setup-swarm/SKILL.md`, mirroring `agent-retro`. Settled. +4. **Re-run / staleness trigger** — RESOLVED: *project* re-setup = operator deletes the marker (manual, by design). *Skill* staleness = `./swarm` auto-(re)installs pin-aware at launch (ADR 0018), no manual trigger needed. +5. **Idea O scope boundary** — RESOLVED: setup-swarm absorbs `.gitignore`/default-branch/permissions (Idea O); the **`entire` skill install moved OUT to ADR 0018** (launcher bootstrap). No `./swarm install` subcommand. +6. **Per-language tool selection** — **RESOLVED: the skill's domain (same as #1).** The skill reasons from `engineering.prompt`'s tool table; behavior on no-match is skill-level judgment (ask the operator), not an ADR rule. + +## Cross-references + +- Pairs with **Idea N (install/upgrade)** and **Idea O (install scaffold)** — both implemented pre-reset, both **without an ADR**; see the manifest's "Uncaptured implemented divergences" section. The setup skill overlaps their territory and must be designed jointly. +- The removed "At startup" lines are also removed for the idle-gate reason (ADR 0002) — shared seam. + diff --git a/docs/migrations/main-script-layer.md b/docs/migrations/main-script-layer.md new file mode 100644 index 0000000..2bb6a81 --- /dev/null +++ b/docs/migrations/main-script-layer.md @@ -0,0 +1,34 @@ +# Migration recovery — `main` script layer (`swarmforge/scripts/`) + +Per-divergence recovery for everything that touches the launch script on `main`. 
Base = pristine `upstream/main` (`swarmforge/scripts/swarmforge.sh`, ~554 lines, naive form). Primary source = `backup/main-pre-reset` (~1109 lines — all script divergences stacked linearly). **Re-merge onto current upstream; do not copy the whole file.** + +## ⚠ The entanglement (read first) + +**Idea B (bundle inlining) + cmux backend + M3 (0014) + executing-fields are NOT independent patches.** In `backup/main-pre-reset` they are one ~400-line restructure of the handoff/delivery model: +- cmux refactor introduces `write_deliver_script`, `write_notify_script`, `write_stop_hook`, `write_worktree_notify_wrapper` and **deletes** upstream's `install_shared_constitution_articles`, `sync_worktree_scripts`, `write_tmux_env_file`. +- Idea B's `resolve_prompt_bundle` + rewritten `write_agent_instruction_file` (XML envelope) produce the bundle that `write_deliver_script` passes via `$BUNDLE_PATH`. +- M3 (0014) is a 7-line addendum (commit `1b84895`) **inside** the rewritten `write_agent_instruction_file`. +- executing-fields (commit `a133c71`) live **inside** `write_deliver_script` and `write_stop_hook` heredocs. + +**DECIDED (2026-06-14): cmux is DROPPED; Idea B is KEPT.** So do NOT lift the cmux commit. Instead **disentangle**: port `resolve_prompt_bundle` + the XML-envelope `write_agent_instruction_file` onto upstream's current tmux harness, wire the resolved bundle into upstream's delivery path (not cmux's `write_deliver_script`), then layer M3/0014 and re-base executing-fields onto that. Idea F (auto-compaction) wiring is independent. Skip everything cmux: `swarm-mux.sh`, `swarm-stop.sh`, the `write_deliver_script`/`write_notify_script`/`write_stop_hook` family, `MUX_TARGETS`. 
+ +## Recovery table + +| Row | Recover from | Delta vs upstream / notes | +|-----|-------------|---------------------------| +| **M1 / 0006** sparse-checkout in `prepare_worktrees` | **NET-NEW — no source anywhere** | Write fresh: `git sparse-checkout` excluding the pinned QA path on every worktree except specifier(`master`)+QA; verify path survives handoff commits. Prereq: QA path pinned in specifier prompt. | +| **M2 / 0012** per-role model/effort/advisor | `backup/main-pre-reset:swarmforge.sh` `parse_config`(~L212), `launch_role`(~L870), arrays `ROLE_MODELS/EFFORTS/ADVISORS`(~L42); commits `93f8c5d`, `d467ab7` | `!= 4`→`< 4` + `key=value` loop; per-backend flag locals. **Advisor is NOT a `--advisor` flag** — `write_worktree_advisor` writes `advisorModel` to `.claude/settings.local.json`. ✅ resolves the "does claude --advisor exist" open item. | +| **Idea B** bundle inlining | `backup/main-pre-reset:swarmforge.sh` `resolve_prompt_bundle`(~L797)+`write_agent_instruction_file`(~L825); same on `feat/main-executing-context-fields`, `feat/issue-20-b` | Replace upstream's 2-line "read constitution recursively" heredoc with BFS resolver + XML `` envelope. **Prereq for M3.** Entangled with cmux (see above). | +| **M3 / 0014** append AGENTS.md + .agents/roles | `backup/main-pre-reset:swarmforge.sh` (commit `1b84895`, inside `write_agent_instruction_file`) | 7-line loop appending `AGENTS.md` + `.agents/roles/.md` `` blocks before envelope close + preamble sentence. Cannot land without Idea B. | +| **Idea F** auto-compaction | `backup/main-pre-reset:swarmforge.sh` `write_worktree_permissions`(~L679); commit `93f8c5d` | New fn → `.claude/settings.local.json`: `autoCompactEnabled:true`, `PCT_OVERRIDE:"88"`, `WINDOW:"200000"`; called from `prepare_worktrees`. Shares the file with `write_worktree_advisor` (M2) — both use read-modify-write python3. 
| +| **executing-fields** | `feat/main-executing-context-fields:swarmforge.sh` (commit `a133c71`, clean +25/-10) | `executing` logbook entry carries `{message,hash,sender}`; in the source it lives inside the cmux-era `write_deliver_script` + `write_stop_hook` heredocs. Do not cherry-pick `a133c71` as-is: cmux is dropped (decision 2026-06-14), so re-base the change onto upstream's delivery path once the Idea B port is in. Partially fills ADR 0002. | +| **Idea N** upgrade/skills (ADR 0018) | `backup/main-pre-reset:swarmforge.sh` `install_skills`+`ensure_skills_installed`(~L946) + new file `swarmforge/scripts/install-pins.conf`; **`swarm` bootstrap** (root, runnable branches) commit `8994322` adds `upgrade`/`write_source_branch`/`download_from_main` | `swarmforge.sh` part lands on `main`; the `upgrade` subcommand + `.swarmforge/source-branch` live in the root `swarm` file which is **on six-pack/four-pack, not main**. **Decision (2026-06-14): `ensure_skills_installed` STAYS at launch — auto-(re)install the `entire` skills pin-aware, as before.** This is launcher infra-bootstrap (peer of self-fetch + worktree/session plumbing), explicitly allowed; it does NOT violate idle-gate/setup-first, which govern *role* behavior and *project* provisioning, not the launcher bootstrapping its own deps. Skill install is **owned by ADR 0018, not setup-swarm (0003)**. `./swarm upgrade` = explicit refresh of scripts(main) + prompts(`source-branch`) + forced skill reinstall (clears `skills-installed`). | +| **Idea O** install scaffold | `backup/main-pre-reset:swarmforge.sh` `ensure_initial_gitignore`(~L105)+`ensure_runtime_git_excludes`(~L152)+`remove_nonessential_clone_files`(~L165) | `.gitignore`/excludes expansion is implemented (additive). **default-branch probe + permission allow-rules are NET-NEW** → fold into ADR 0003 setup skill. | +| **auto-permission** (ADR 0019) | `backup/main-pre-reset:swarmforge.sh` `launch_role` (commit `1097233`) | `--permission-mode acceptEdits`→`auto` for claude+grok (upstream L433/442). 
**`auto` verified a real flag value** (Claude Code v2.1.177 choices: acceptEdits, auto, bypassPermissions, default, dontAsk, plan — unlike the phantom `--advisor`). **Decision (2026-06-14): keep `auto`.** Rationale for the ADR: roles run unattended → any prompt is a silent hang; `acceptEdits` still prompts on bash/tool calls. `dontAsk` (deterministic, allow-list-only) was considered and **rejected for allow-list maintenance burden** across every language/tool the swarm runs; `bypassPermissions` rejected (ignores all safety, worktrees aren't sandboxed). `auto` needs ~no config and ships safety rails (blocks force-push-to-main, mass-delete). Consequence: setup-swarm's allow-rules (Idea O) stay a **small, targeted, advisory** set, not a load-bearing whitelist. | +| **cmux** | `backup/main-pre-reset:swarmforge/scripts/swarm-mux.sh` (175 lines, net-new) + `swarm-stop.sh`(66) + `swarmlog.sh`(16); + ~400 lines of `swarmforge.sh` restructure | Largest divergence; see entanglement. Source `swarm-mux.sh` at ~L169; `MUX_TARGETS` array; new write_* fns. | +| **self-url** | root `swarm` bootstrap (commit `ded6019`, runnable branches) | `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"`. Not on `main`. | + +## What lands where +- **`main` rebase needs:** M1, M2, M3, Idea B, F, O (gitignore part), auto-permission, executing-fields, cmux (`swarmforge.sh` + `swarm-mux.sh`/`swarm-stop.sh`/`swarmlog.sh`), `install-pins.conf`, `install_skills`. +- **Root `swarm` bootstrap (six-pack/four-pack, NOT main):** Idea N `upgrade` subcommand, `source-branch`, self-url. + diff --git a/docs/migrations/six-pack-role-prompts.md b/docs/migrations/six-pack-role-prompts.md new file mode 100644 index 0000000..6ab10b1 --- /dev/null +++ b/docs/migrations/six-pack-role-prompts.md @@ -0,0 +1,56 @@ +# Migration recovery — six-pack role prompts + +Per-role recovery for `swarmforge/roles/*.prompt`. Base = `upstream/six-pack`. 
**Re-merge deltas onto current upstream prompts; do not copy whole backup files** (they predate upstream and carry content ADRs reversed — see STRIP table). Primary source = `backup/six-pre-reset` unless noted. + +Universal add to **every** role prompt: idle-gate line `"Wait for a handoff. Do not act without one."` (0002) and `"Run agent-retro before going idle."` Back-routing (0004) general rule has **no backup source** — author fresh from ADR 0004 wherever a role needs it (structural finding → origin stage once; local → fix in place; single-finding back-once cap). + +## Existing roles — deltas + +| Role | Re-merge (recover-from `backup/six-pre-reset` unless noted) | STRIP / fix | +|------|------------------------------------------------------------|-------------| +| **coder** | idle-gate; UX-Intent read line (0007); handoff `notify cleaner`→`notify ux-engineer` (0007) | STRIP `## Acceptance Pipeline` block (upstream L8–11, the "At startup… APS" bullets) (0003) | +| **QA** ⚠ | idle-gate; **0010** surface-harness: L13 "through the user interface only"→"through the project surface harness only" + Expected-bullet→assertion/`NOT AUTOMATED` rule + re-execute `observation-harness/` + route-back-if-missing; handoff →`notify integrator` (0008) | STRIP `## Startup Tools` (L7) (0003); `logbook.json`→keep upstream `logbook.jsonl`. **0005 refute posture has NO backup source — author fresh**, replacing L14 "Fix bugs found by the QA suite…" with structural→route-back / local→fix-in-place. Merge 0005 (new) + 0010 (backup) into one prompt. | +| **cleaner** | idle-gate; **0016** boundary-file scan (>15 mutation sites → extract) + stripped-view-as-untested anti-pattern (cleanest source: `feat/baseline-scenarios-six`) | STRIP `At startup, install…` (L19) (0003) | +| **hardender** | idle-gate; rendering-invariant property-test line (L18 — **unmanifested divergence**, see note) | STRIP `## Startup Tools` (L8–9) (0003). 
STRIP backup's `"merge all queued architect handoffs together"` — **unauthorized, no ADR**; keep upstream's "batch in sorted filename order". | +| **specifier** ⚠ | idle-gate; **0008** worktree reset `git reset --hard origin/` via `git symbolic-ref` (recover from `feat/six-pack-pipeline-order-and-scaffold`, NOT backup); **0008** handoff L36 "merge the changes and ask the user"→"When the curator notifies you… ask the user for the next feature"; **0007** UX-Intent authoring; **0009** start from template + "seven"→**"eight"**; **0011** read dependency-manifest + propose-on-undeclared (recover from `backup`/`feat/issue-20-c`, NOT pipeline-order which dropped it) | STRIP DESIGN.md walk-up + scaffold-on-absence (0007); STRIP backup's `git merge --ff-only origin/master` startup (0008, also hardcodes `master`) | + +⚠ **QA and specifier are the complex merges** — multiple overlapping layers, several from different branches. Apply carefully. + +## STRIP / STALE table (backup content ADRs reversed) +| Stale content | In | Reversed by | +|---------------|-----|-------------| +| DESIGN.md walk-up + scaffold | specifier, ux-engineer | ADR 0007 (reference-from-feature-file only) | +| "seven header sections" | specifier | ADR 0009 (six-pack = eight) | +| `git merge --ff-only origin/master` startup | specifier | ADR 0008 (specifier stops merging; `master` stale) | +| "merge all queued architect handoffs together" | hardender | no ADR — keep upstream sorted-batch | +| `logbook.json` | QA | upstream renamed → `logbook.jsonl` | +| curator budgets 150/300 | curator | ADR 0013 + locked spec = 60/40 | + +## New roles (net-new files) + +### ux-engineer (ADR 0007) — recover `backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt` (≡ `origin/feat/obs-harness-six`; NOT pipeline-order/baseline which lack the `observation-harness/` commit step) +Outline: identity+idle · skip if no `## UX Intent` (→notify cleaner) · UX-Intent verification across Visual Composition/Information 
Hierarchy/Interaction Feel/State Transitions by running the binary · fix rendering only (back-route to coder for model-state, N=3) · durable artifacts: golden snapshots + rendering invariants + `observation-harness/` scenarios via surface tool · run test suite · `## Visual quality standards` (AI-aesthetic anti-patterns, type hierarchy, WCAG 4.5:1/3:1) · notify cleaner. +**STRIP:** DESIGN.md walk-up; make DESIGN.md fix-authority conditional on a feature-file reference (not tree discovery). + +### integrator (ADR 0008) — recover `backup/six-pre-reset:swarmforge/roles/integrator.prompt` (≡ `feat/issue-20-c`; NOT baseline-scenarios-six which still says "notify specifier") +Outline: identity+idle · own landing, one PR/feature, autofix-lint-only · steps: receive from QA → branch `feat/` → `gh pr create` → watch CI → green: `gh pr merge --squash --delete-branch` + post-merge gate → **notify curator** → CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`) → agent-retro. +**FIX (locked spec wins):** step 7 must add "Include the specifier handoff name and the post-merge master commit hash." + +### curator (ADR 0013/0014) — authoritative source = `feat/issue-20-b:docs/specs/issue-20-knowledge-promotion-loop.md` **PR C2 verbatim block** (branch `curator.prompt` artifacts have STALE 150/300 budgets — do not cargo-cult) +Outline: identity+idle · only writes `AGENTS.md`+`.agents/` · sources `~/.claude/worklog/retros/*.md` · routing ladder (backlog→AGENTS.md≤60→roles≤40→references→skills-on-2nd→upstream→ledger) · ledger `date|session-id|role|failure-class|verdict|summary` · lifecycle (empty-run→pass-through, knowledge branch, self-merge PR with metric line, move retros to processed/, notify specifier) · 9-check per-item algorithm (scope→recurrence→non-inferable→rule-not-phenomenon→dup/contradiction→global-fix-routing→trigger-load-fit→evidence-pull→sizing). 
+**Companion changes (locked spec, not on any branch):** specifier wait-on-curator (PR C4); `workflow.prompt` integrator→curator→specifier chain bullet (PR C5). + +## Final `swarmforge.conf` window order (recover `feat/issue-20-c` for 8 windows + curator from `backup/six-pre-reset`) +``` +window specifier codex specifier # was: codex master (0008 moves specifier off master) +window coder codex coder +window ux-engineer codex ux-engineer # 0007: after coder +window cleaner codex cleaner +window architect codex architect +window hardender codex hardender +window QA codex QA +window integrator codex integrator # 0008: after QA +window curator codex curator # 0013: last (only in backup/six-pre-reset) +``` +Note: all roles still on `codex` → clear-first (0002) inert until roles move to `claude` or codex hooks built (open item). `default_branch` is per-feature specifier logic, not a conf field. + From 1fb38b721413ce5682f915e2bbe82394199decae Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 12:48:30 -0300 Subject: [PATCH 12/67] docs(adr-0017,0018,0019,0020,0021): document section-C script-layer divergences Promote the uncaptured main-side script divergences to ADRs: inlined prompt bundle (idea B, translated onto the tmux harness), swarm self-pin/upgrade (idea N), autonomous permission mode (--permission-mode auto, verified real), worktree auto-compaction (idea F), and the retro-triage operator skill. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/adr/0017-inlined-prompt-bundle.md | 20 +++++++++++++++++++ .../0018-swarm-pins-and-upgrades-itself.md | 20 +++++++++++++++++++ docs/adr/0019-autonomous-permission-mode.md | 17 ++++++++++++++++ docs/adr/0020-worktree-auto-compaction.md | 15 ++++++++++++++ docs/adr/0021-retro-triage-skill.md | 20 +++++++++++++++++++ 5 files changed, 92 insertions(+) create mode 100644 docs/adr/0017-inlined-prompt-bundle.md create mode 100644 docs/adr/0018-swarm-pins-and-upgrades-itself.md create mode 100644 docs/adr/0019-autonomous-permission-mode.md create mode 100644 docs/adr/0020-worktree-auto-compaction.md create mode 100644 docs/adr/0021-retro-triage-skill.md diff --git a/docs/adr/0017-inlined-prompt-bundle.md b/docs/adr/0017-inlined-prompt-bundle.md new file mode 100644 index 0000000..14fff03 --- /dev/null +++ b/docs/adr/0017-inlined-prompt-bundle.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# The agent context is one inlined, deduplicated prompt bundle + +Upstream builds a role's launch context by concatenating its constitution and role prompt, following `*.prompt` references with a simple recursive read — no deduplication and no structure, just text appended to text. The fork replaces this with a **resolved prompt bundle**: a breadth-first walk over the `*.prompt` reference graph that visits each file once (dedup by resolved path, already-visited references skipped so a cycle cannot loop), emitted as a single XML envelope `` with each source file in its own `` block. + +**The bundle is the unit of delivery, not just of launch.** Clear-first delivery (ADR 0002) wipes the session with `/clear` and then *re-injects the role bundle* before every task. That re-injection needs a single, complete, deduplicated context to re-send — which is exactly what the resolver produces. A naive recursive concatenation is fine to build once at launch but is the wrong shape to re-send reliably on every handoff. 
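The breadth-first, visit-once walk described above might look roughly like the sketch below. It is not the fork's `resolve_prompt_bundle`: the names `canonical_path` and `resolve_bundle_order` are hypothetical, and the reference syntax is an assumption (any token of path characters ending in `.prompt`, resolved relative to the file that mentions it). It prints only the visit order and leaves the envelope out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: breadth-first walk over *.prompt references.
# Assumption: a reference is a token of path characters ending in ".prompt",
# resolved relative to the directory of the file that mentions it.

NL='
'

# Print the canonical absolute path of an existing file.
canonical_path() {
  ( cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd -P)" "$(basename "$1")" )
}

# Print each reachable prompt file exactly once, in breadth-first order.
resolve_bundle_order() {
  local queue visited current ref target
  [ -f "$1" ] || return 1
  queue="$(canonical_path "$1")"
  visited="$queue"
  while [ -n "$queue" ]; do
    # Pop the head of the newline-separated queue.
    current="${queue%%"$NL"*}"
    case "$queue" in
      *"$NL"*) queue="${queue#*"$NL"}" ;;
      *) queue="" ;;
    esac
    printf '%s\n' "$current"
    # The pattern admits no whitespace or glob characters, so the
    # unquoted expansion below splits into one word per reference.
    for ref in $(grep -oE '[A-Za-z0-9_./-]+\.prompt' "$current"); do
      case "$ref" in
        /*) target="$ref" ;;
        *) target="$(dirname "$current")/$ref" ;;
      esac
      [ -f "$target" ] || continue
      target="$(canonical_path "$target")"
      # Dedup by resolved path: skip anything already queued or emitted.
      if ! printf '%s\n' "$visited" | grep -Fxq -- "$target"; then
        visited="$visited$NL$target"
        queue="${queue:+$queue$NL}$target"
      fi
    done
  done
}
```

Marking a file as visited when it is enqueued, rather than when it is emitted, is what keeps a cycle (`a` references `b`, `b` references `a`) from looping or emitting a file twice.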
+ +**It is the prerequisite for knowledge injection.** ADR 0014 appends the project's `AGENTS.md` and the role's `.agents/` file into this same envelope. There is nowhere to append them, and no well-defined boundary to append them at, until the context is a structured bundle rather than flat concatenated text. 0014 and the session-restart `executing` fields (ADR 0002) both build on top of the bundle. + +**Why an XML envelope.** Explicit `` boundaries let the agent tell its constitution from its role prompt from its promoted knowledge, instead of inferring breaks in a wall of concatenated text; and the BFS dedup keeps a cross-referenced constitution (articles, the dependency manifest) from appearing two or three times. + +This divergence is taken in its **minimal translated form**: the resolver and envelope are ported onto upstream's current tmux delivery harness, not lifted from the pre-reset implementation where they were entangled with the dropped cmux backend. + +## Pending implementation + +- `main`: replace the recursive-read heredoc in `write_agent_instruction_file` with `resolve_prompt_bundle` (BFS, dedup by resolved path) emitting the `` envelope; wire the resolved bundle into upstream's delivery path. Source: `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` — re-base onto current upstream, do not copy. +- Prerequisite for ADR 0014 (`.agents` injection) and the ADR 0002 `executing`-field recovery; both re-base on the bundle. diff --git a/docs/adr/0018-swarm-pins-and-upgrades-itself.md b/docs/adr/0018-swarm-pins-and-upgrades-itself.md new file mode 100644 index 0000000..208053a --- /dev/null +++ b/docs/adr/0018-swarm-pins-and-upgrades-itself.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# The swarm pins and upgrades its own dependencies + +The swarm depends on an external skill set (the `entire` skills) that it installs into the target project's `.claude/skills/`. 
The fork makes that dependency **pinned and upgradable**: a SHA recorded in `swarmforge/scripts/install-pins.conf`, installed automatically at launch, and refreshable through an explicit `./swarm upgrade`. + +**Pinned, not floating.** `install-pins.conf` records `ENTIRE_SKILLS_SHA`; the swarm installs exactly that SHA and writes it to `.swarmforge/skills-installed`. Moving versions means bumping the pin and committing it on `main` — so two runs weeks apart install identical skills, and an upstream skill change can never alter a run mid-flight. + +**Auto-install is launcher bootstrap, not project setup.** `ensure_skills_installed` runs at launch: if the recorded sentinel matches the pin it does nothing, otherwise it (re)installs. This is the program fetching its own dependencies — the same category as `./swarm` self-fetching its scripts — and is deliberately kept separate from the two things that are *not* automatic: project provisioning (the `setup-swarm` skill, ADR 0003) and role work (the idle gate, ADR 0002). It does not contradict "roles do nothing at startup": the launcher, not any role, installs the skills, and it does so before a single role starts. Skill installation therefore lives here, not in `setup-swarm`. + +**`./swarm upgrade` refreshes the installation.** It re-pulls the scripts (from `main`) and the role prompts (from the branch recorded in `.swarmforge/source-branch`) and forces a skill reinstall. `source-branch` is written on first run so `upgrade` knows whether a checkout's prompts came from `six-pack` or `four-pack`. + +**Why the swarm needs this at all.** A tool whose job is to adapt arbitrary projects must itself be reproducible and updatable in place; without a pin, runs drift; without `upgrade`, an operator's only way to take a fix is to re-clone. + +## Pending implementation + +- `main`: `install_skills` + `ensure_skills_installed` (pin-aware, idempotent) in `swarmforge.sh`, plus the new `swarmforge/scripts/install-pins.conf`. 
Source: `backup/main-pre-reset` (`~L946`). +- Runnable branches (`six-pack`/`four-pack`, root `swarm` bootstrap — not `main`): the `upgrade` subcommand, `download_from_main`, `write_source_branch`, and `.swarmforge/source-branch` tracking. Source: `swarm` bootstrap commit `8994322`. diff --git a/docs/adr/0019-autonomous-permission-mode.md b/docs/adr/0019-autonomous-permission-mode.md new file mode 100644 index 0000000..28934eb --- /dev/null +++ b/docs/adr/0019-autonomous-permission-mode.md @@ -0,0 +1,17 @@ +--- +status: accepted +--- + +# Roles run unattended in autonomous permission mode + +Upstream launches the `claude` and `grok` roles with `--permission-mode acceptEdits`, which auto-approves file edits but still raises an interactive permission prompt on every bash/tool call. The fork's roles run **fully unattended** in isolated worktrees — there is no human present to answer that prompt, so for the fork the prompt is not a safety net, it is a silent hang. The fork launches with `--permission-mode auto`. + +**Why `auto` and not the other never-prompt modes.** Claude Code offers three modes that never block on a prompt: `auto`, `dontAsk`, and `bypassPermissions`. `bypassPermissions` ignores all allow/deny rules and ships no safety checks — unacceptable for worktrees that touch a real repository and the network. `dontAsk` is deterministic but runs only an explicit allow-list and denies everything else, which would mean building and maintaining an exhaustive command allow-list spanning every language and tool the swarm drives — ongoing complexity the fork chooses not to take on. `auto` keeps roles moving with near-zero configuration while retaining built-in guardrails (it still refuses force-pushes to the main branch, mass deletion, and similar high-blast-radius actions). Because `auto` is in force, the permission allow-rules that `setup-swarm` writes (ADR 0003) stay a small, targeted, advisory set rather than a load-bearing whitelist. 
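As an illustration of the per-backend outcome of this decision, the sketch below maps a backend name to its permission flags. The helper `permission_flags_for` is hypothetical (upstream sets the flag inline in `launch_role`), and the mapping simply restates this ADR: `claude` and `grok` get `--permission-mode auto`, `codex` gets no permission-mode flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative and not part of swarmforge.sh.
# Prints the permission flags a backend would be launched with under this ADR.
permission_flags_for() {
  case "$1" in
    claude|grok)
      # Unattended roles: never block on an interactive prompt.
      printf '%s\n' '--permission-mode auto'
      ;;
    codex)
      # Launched with no permission-mode flag at all.
      ;;
    *)
      echo "unknown backend: $1" >&2
      return 1
      ;;
  esac
}
```

Keeping the mapping in one place would make the one-word divergence easy to re-check after each upstream sync, though the ADR itself only requires the change on the two launch lines.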
+ +**This is a real mode, deliberately verified.** `auto` is one of Claude Code's documented `--permission-mode` values — unlike the per-role advisor knob of ADR 0012, which turned out to have no CLI flag and had to be written to settings instead. The lesson there was applied here before committing to the divergence. + +The `codex` backend launches with no permission-mode flag at all, so this change touches only the `claude` and `grok` launch lines. + +## Pending implementation + +- `main`: change `--permission-mode acceptEdits` → `auto` on the `claude` and `grok` lines in `launch_role` (a one-word change on each line; reapply on every upstream sync). Source: `backup/main-pre-reset` commit `1097233`. diff --git a/docs/adr/0020-worktree-auto-compaction.md b/docs/adr/0020-worktree-auto-compaction.md new file mode 100644 index 0000000..0aa26b7 --- /dev/null +++ b/docs/adr/0020-worktree-auto-compaction.md @@ -0,0 +1,15 @@ +--- +status: accepted +--- + +# Role worktrees auto-compact before context overflow + +A swarm role can run a long, many-turn session — build, run the suite, read failures, fix, re-verify — that walks its context toward the model's window limit. Upstream leaves context management to the client's defaults. The fork provisions each role worktree so the role **compacts its own context before it overflows** rather than failing partway through a task. + +**The settings.** Each worktree's `.claude/settings.local.json` is given `autoCompactEnabled: true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE: "88"` (compact at 88% of the window) and `CLAUDE_CODE_AUTO_COMPACT_WINDOW: "200000"`. The thresholds are tunable; these are the fork's current defaults, set to leave headroom ahead of a hard limit so compaction happens on the role's terms, not as a crash. + +**Why per-worktree `settings.local.json`.** The file is fork-owned and not upstream-tracked, so writing to it adds no merge-conflict surface — the additive divergence ADR 0001 asks for. 
It is also the same provisioning seam already used to write the per-role advisor (ADR 0012); both perform a read-modify-write into this one file, so they share a single mechanism rather than each inventing its own. + +## Pending implementation + +- `main`: write the three keys into each worktree's `.claude/settings.local.json` (a `write_worktree_permissions` step, or folded into the existing advisor writer), called from `prepare_worktrees`; share the read-modify-write with the ADR 0012 advisor writer. Source: `backup/main-pre-reset` (`write_worktree_permissions`, ~L679; commit `93f8c5d`). diff --git a/docs/adr/0021-retro-triage-skill.md b/docs/adr/0021-retro-triage-skill.md new file mode 100644 index 0000000..146bd9d --- /dev/null +++ b/docs/adr/0021-retro-triage-skill.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# retro-triage: operator root-cause diagnosis, distinct from the curator + +The fork keeps a `retro-triage` skill: an operator-invoked tool that turns a *batch* of session retros into a validated, cross-session **root-cause diagnosis** from which a human files issues. It lives in `.claude/skills/` (an operator tool), not `swarmforge/skills/` (the skills the swarm's own roles run). + +**Why it exists alongside the curator.** The curator (ADR 0013) already consumes session retros — but autonomously, one item at a time, to promote agent-facing knowledge into the repo. retro-triage is its complement, not a duplicate. The curator fixes *"the swarm doesn't **know** X"* — a missing rule becomes repo knowledge. retro-triage fixes *"the swarm is **structurally doing** X wrong"* — a pipeline, tooling, or strategy defect becomes a filed issue for a human to act on. The structural causes it hunts (one upstream decision surfacing as different pains across five roles) are precisely what a per-item consumer like the curator cannot see, because they live *across* retros and below any single retro's notice. 
+ +**Diagnosis is the product, not sorting.** The skill exists to prevent two failure modes that occurred in real runs: codifying a workaround as a win (a slick technique that only exists to cope with a self-inflicted problem is evidence of cost, not a pattern), and inheriting a retro's own proposed fix (the retro reports a symptom; its suggested fix is a hypothesis, not a finding). Every root cause is recorded with re-pullable receipts — a transcript quote by session id, git output, a `file:line` — and validated against the artifacts; an unvalidated cause is not a finding and cannot be filed. + +**Why `.claude/skills/`, not `swarmforge/skills/`.** It is a human's meta-analysis tool, not a step any swarm role executes. Keeping it with the operator skills leaves the swarm's own skill set to the things the swarm itself runs (`agent-retro`, `setup-swarm`). + +**Sharing the retro pool without starvation.** Both the curator and retro-triage read `~/.claude/worklog/retros/`. They must not consume each other's unseen retros: the curator processes and archives retros to `processed/` each run (ADR 0013), and retro-triage reads the full history — live pool plus archive — while tracking its own consolidation independently of the curator's mark. Neither destroys what the other has not yet seen. + +## Pending implementation + +- `main`: restore `.claude/skills/retro-triage/SKILL.md` as-is (byte-identical across branches). Source: `feat/issue-20-a-retro-skill-upgrade`. +- Make retro-triage's retro detector glob the curator's `processed/` archive in addition to the live `~/.claude/worklog/retros/` directory, so curated retros remain visible to a later diagnosis. 
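The detector change in the last bullet of ADR 0021 could be sketched roughly as follows. This rests on assumptions the ADR does not state: that the curator's archive sits at `processed/` directly under the retro pool, and that retros are `*.md` files. The name `list_retros` is hypothetical.

```shell
# Hypothetical detector: list retros from the live pool plus the curator's
# archive. The "processed/" location and the "*.md" suffix are assumptions.
list_retros() {
  # $1 = retro pool directory (defaults to the path named in ADR 0021)
  pool="${1:-$HOME/.claude/worklog/retros}"
  for dir in "$pool" "$pool/processed"; do
    # A missing archive is not an error: the curator may never have run.
    [ -d "$dir" ] || continue
    find "$dir" -maxdepth 1 -type f -name '*.md'
  done | sort
}
```

Reading both directories on every run keeps retro-triage independent of the curator's archiving: a retro stays visible whichever side of the move it is on.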
From 181cb4b458911787b01938f78b92d7c2d9d415b1 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 12:48:30 -0300 Subject: [PATCH 13/67] docs(adr-0002,0003,0010,0013): extend ADRs + CONTEXT glossary 0002 session-restart executing {message,hash,sender}; 0013 session-retro transcript capture + before-idle trigger (idea J); 0003 setup-swarm rename, setup-first/guard model (idea-K dropped) + install scaffold (idea O); 0010 hardener rendering-invariant property tests. Glossary: setup-swarm, swarm-ready marker, session retro, retro-triage, prompt bundle. Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 22 ++++++++++++++++--- ...0002-idle-gate-and-clear-first-delivery.md | 5 ++++- docs/adr/0003-setup-is-a-one-time-skill.md | 10 ++++++--- docs/adr/0010-surface-harness-doctrine.md | 3 +++ docs/adr/0013-curator-knowledge-promotion.md | 3 +++ 5 files changed, 36 insertions(+), 7 deletions(-) diff --git a/CONTEXT.md b/CONTEXT.md index 54ec8ab..f62125f 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -16,9 +16,17 @@ _Avoid_: awake handoff, ready handoff The steps that start a work handoff on a receiver: `/clear` → re-inject the role bundle → send the task message. Runs for work handoffs only, never for presence pings. Delivered immediately if the receiver is idle, or by its Stop hook when it next stops if busy. (Upstream instead types the message straight into the terminal with no clear.) _Avoid_: inject, dispatch -**Setup skill**: -The one-time, stack-aware step that makes a project swarm-ready — installs the project's language quality tools, enables session tracking, grants the agents' permissions, pins skill versions. Ships inside the swarm install and is the first thing the operator runs. The run path (`./swarm`) does no project setup; it stops if the skill has not run. (Upstream instead installs tooling per-role at startup.) 
-_Avoid_: preflight, bootstrap, onboarding +**Prompt bundle** (role bundle): +The single, structured context a role launches and re-launches with: its constitution and role prompt resolved into one deduplicated XML envelope, into which _promoted knowledge_ (`AGENTS.md` + the role's `.agents/` file) is also injected. It is the unit re-sent on every _delivery sequence_ after `/clear`, not just built once at launch. (Upstream concatenates the prompt files with a plain recursive read, with no dedup or structure.) +_Avoid_: context blob, prompt file, instruction file + +**setup-swarm** (the skill): +The one-time, stack-aware step that makes a project swarm-ready — installs the project's language quality tools, enables session tracking, grants the agents' permissions, pins skill versions, and emits the _swarm-ready marker_. Ships inside the swarm install. **It is the operator's first action on a project** (`/setup-swarm`), before the run path is ever invoked. The run path (`./swarm`) performs no **project provisioning** and never triggers the skill itself; it only **guards** — if the marker is absent it refuses and tells the operator to run `setup-swarm` first. (It still bootstraps the swarm's *own* runtime skills automatically — launcher infrastructure, distinct from project provisioning.) (Upstream instead installs tooling per-role at startup.) +_Avoid_: setup skill, preflight, bootstrap, onboarding + +**Swarm-ready marker**: +The file (`.swarmforge/setup-complete`) that `setup-swarm` writes to record that a project has been made swarm-ready. The run path guards on its presence; the operator deletes it to force a re-run. There is no `./swarm setup` subcommand. +_Avoid_: setup flag, ready file, lock **Integrator**: The terminal role that lands finished work. From the QA-approved commit it opens a pull request, gates on CI, merges only on green, runs the post-merge verification, and notifies the specifier — one PR per feature. 
It never merges locally: CI is a hard precondition, so a project without CI is not swarm-ready (setup ensures CI; see [[project-fork-divergence-adr-structure]] / ADR 0003). CI failures route to the owning role via [[back-routing]]. (Upstream has no integrator — the specifier merges ad hoc.) @@ -79,3 +87,11 @@ _Avoid_: docs, memory, knowledge base **Knowledge ledger**: `.agents/ledger.md` — the append-only audit the _curator_ writes, one never-pruned line per processed retro item (`date | session-id | role | failure-class | verdict`). Makes recurrence provable: an item rejected before and seen again has proven itself worth promoting. _Avoid_: changelog, history, log + +**Session retro**: +The single per-role, per-session retrospective the `agent-retro` skill writes (automatically, as each role's last step before idle) to the shared retro pool. A symptom report from one role's one session under a keyhole view — its proposed fixes are hints, never findings. The shared input consumed independently by both the _curator_ and _retro-triage_; neither destroys what the other has not yet seen. +_Avoid_: retrospective, session log, postmortem + +**Retro-triage**: +The operator-invoked analysis (the `retro-triage` skill) that turns a *batch* of _session retros_ into a validated, cross-session **root-cause diagnosis** from which a human files issues. Distinct from the _curator_: the curator is autonomous and per-item and fixes "the swarm doesn't *know* X" by promoting agent-facing knowledge into the repo; retro-triage is manual and cross-batch and fixes "the swarm is *structurally doing* X wrong" by surfacing causes no single retro names (and that the per-item curator structurally cannot see) for pipeline/tooling/strategy changes a human must make. Diagnosis is the product, validated against transcripts and git artifacts — not the retros' own framing. 
+_Avoid_: triage, consolidation, retro processing diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md index 5099056..c537fa0 100644 --- a/docs/adr/0002-idle-gate-and-clear-first-delivery.md +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -21,6 +21,9 @@ The marker is set *busy* when a delivery starts and *idle* when the Stop hook fi Ready is implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. +**Session-restart recovery.** The idle/busy marker records *whether* a role is working, not *what* it is working on. So the `executing` logbook entry carries the in-flight task itself — `{message, hash, sender}`: the handoff message being acted on, the commit hash it started from, and who sent it. If a role's session dies and is restarted mid-task, that is enough to resume the task rather than lose it along with the handoff. (Upstream's `executing` entry records no such context.) These fields live inside the delivery and Stop-hook scripts, so they re-base on the prompt bundle of ADR 0017. + ## Pending implementation -- `codex`/`grok` hook-based delivery (Claude Code first). The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. +- `codex`/`grok` hook-based delivery (Claude Code first). The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. The `claude`/`codex` choice is a per-role configuration knob (ADR 0012), not an architectural decision; no `codex` hook work is required for this ADR to stand. 
+- Add the `{message, hash, sender}` fields to the `executing` logbook entry written in the delivery script and the Stop hook, re-based onto the ADR 0017 bundle delivery. Source: `feat/main-executing-context-fields` commit `a133c71`. diff --git a/docs/adr/0003-setup-is-a-one-time-skill.md b/docs/adr/0003-setup-is-a-one-time-skill.md index a2016d6..c4c87af 100644 --- a/docs/adr/0003-setup-is-a-one-time-skill.md +++ b/docs/adr/0003-setup-is-a-one-time-skill.md @@ -4,9 +4,9 @@ status: accepted # Setup is a one-time skill, not in-execution work -Adapting a project to the swarm — installing the project's language quality tools (mutation, CRAP, DRY, the Acceptance Pipeline commands), enabling session tracking, granting the permissions the agents need, pinning skill versions — lives in a **setup skill** that ships inside the swarm install and is the first thing the operator runs. The run path does no project setup. +Adapting a project to the swarm — installing the project's language quality tools (mutation, CRAP, DRY, the Acceptance Pipeline commands), enabling session tracking, and granting the permissions the agents need — lives in a **`setup-swarm` skill** that ships inside the swarm install. It is the operator's *first* action on a project (`/setup-swarm`); the run path does no project provisioning. (Installing the swarm's *own* pinned `entire` skills is launcher bootstrap, not project setup — that belongs to ADR 0018.) -**Execution installs nothing.** `./swarm` still fetches its own code when missing (the program obtaining itself, not project setup) and still does per-launch plumbing (worktrees, sessions, copying constitution files). It never adapts the project to its stack. If the project has not been set up, `./swarm` stops and says so rather than installing anything. +**Setup runs first; the run path only guards.** `setup-swarm` writes a **swarm-ready marker** (`.swarmforge/setup-complete`) when it finishes. 
`./swarm` checks that marker before launching any role and, if it is absent, refuses and tells the operator to run `setup-swarm` first — it never runs setup itself. (An earlier design had `./swarm` auto-run setup on first launch; that is superseded — setup is an explicit operator step and the launcher merely verifies it happened.) `./swarm` still fetches its own code when missing, bootstraps its own pinned skills (ADR 0018), and does per-launch plumbing (worktrees, sessions, copying constitution files); it never adapts the *project* to its stack. The operator deletes the marker to force a re-run. **The only edits to upstream files are four role-prompt lines.** The "At startup, install the language tools" directives in `coder`, `QA`, `cleaner`, and `hardender` are removed; that install work moves into the setup skill and runs once. ADR 0002 already removes these same lines for the idle gate (a role does nothing until handed off); here they go for a second, complementary reason — tool install is a one-time setup step, not per-task startup work. The removal is the seam between the two decisions; neither owns it alone. @@ -14,6 +14,10 @@ Adapting a project to the swarm — installing the project's language quality to **Why replace rather than overlay.** Setup is an explicit one-time step; the run path stays pure "start the agents." The accepted cost is that the swarm no longer self-installs project tooling on first run — the operator runs the setup skill once before the first `./swarm`. Any setup step this moves out of the run path is named and documented so the divergence stays auditable. 
+**Setup also lays down the project scaffold.** Beyond tooling, the skill writes the one-time repository scaffold the swarm assumes: a `.gitignore` covering the swarm's runtime artifacts (`logbook.json`, `tmp/`, `.swarmforge/`), the project's default branch probed once (`git symbolic-ref refs/remotes/origin/HEAD`) and recorded in `swarmforge.conf`, and a small, targeted set of permission allow-rules in `.claude/settings.json` (for example `Bash(gh pr merge*)` for the integrator, `Bash(git reset --hard origin/)` for the specifier). Under autonomous permission mode (ADR 0019) those allow-rules are advisory hints rather than a load-bearing whitelist, so the set is kept deliberately small. + ## Pending implementation -- The skill itself: stack detection, the exact tooling/permissions/pins it writes, how it is shipped inside the install, and the "swarm-ready" marker `./swarm` checks before launching. +- The `setup-swarm` skill, shipped at `swarmforge/skills/setup-swarm/` (mirroring `agent-retro`): it reasons about the stack and writes the project tooling, session tracking (`entire enable …` plus `entire agent add ` per `swarmforge.conf` backend), the permission allow-rules, and the `.gitignore`/default-branch scaffold, then writes the marker. *How* it detects the stack is the skill's own domain — deliberately not prescribed here, since reasoning about the stack is the whole reason setup is a skill and not a script. +- `main`: `./swarm` checks `.swarmforge/setup-complete` before launching roles and refuses (with a message to run `setup-swarm`) if it is absent. +- The `entire` skill install is **not** part of this skill — it is launcher bootstrap (ADR 0018). 
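As a rough illustration of the run-path guard that ADR 0003 describes, the check might look like the sketch below. Only the marker path `.swarmforge/setup-complete` comes from the ADR; the function name `require_swarm_ready`, its argument, and the message wording are hypothetical.

```shell
# Hypothetical guard: refuse to launch unless setup-swarm has written its
# marker. It never runs setup itself; it only reports and fails.
require_swarm_ready() {
  # $1 = project root (defaults to the current directory)
  marker="${1:-.}/.swarmforge/setup-complete"
  if [ ! -f "$marker" ]; then
    echo "Project is not swarm-ready: run setup-swarm first ($marker is missing)." >&2
    return 1
  fi
  return 0
}
```

A launcher could call `require_swarm_ready "$PWD" || exit 1` before starting any role. Deleting the marker makes the guard refuse again, which matches the documented way to force a re-run.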
diff --git a/docs/adr/0010-surface-harness-doctrine.md b/docs/adr/0010-surface-harness-doctrine.md index fc8282e..7626d1c 100644 --- a/docs/adr/0010-surface-harness-doctrine.md +++ b/docs/adr/0010-surface-harness-doctrine.md @@ -18,8 +18,11 @@ This is the reference verification loop's execute-and-observe layer (its Steps 5 **QA verifies through the declared surface harness, not "the UI" (idea Q).** Upstream QA's "operate through the user interface only" was right in intent but mechanically silent — it let in-process function calls masquerade as UI verification. The fork replaces the phrase with "through the declared surface harness," and adds an auditable conversion rule: **every Expected bullet maps to a harness assertion, or is explicitly marked `NOT AUTOMATED — `.** This is the mechanism that makes the conversion-fidelity guard of ADR 0005 checkable rather than a matter of QA's word — a silently dropped bullet becomes a visible marker. Findings route back per ADR 0004. +**The hardener pins pure rendering with property tests.** Where rendering is a pure function of state (`state → string`), the hardener writes property-based tests over that function — the structural complement to the UX Engineer's golden snapshots and rendering invariants. A snapshot pins one concrete state's output; a property pins the rule across the input space (every state renders without truncation, every cell stays within bounds), catching the rendering defects that no single captured frame happens to exhibit. + ## Pending implementation - Add the surface tool table + context-driven acquisition rule to `engineering.prompt` on `four-pack` and `six-pack`. - Change QA's "through the UI only" to "through the declared surface harness" and add the Expected-bullet → assertion / `NOT AUTOMATED` rule in `QA.prompt` (both packs). - Require the per-surface baseline scenario to be committed with every feature's flow scenarios. 
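The difference between a snapshot and a property, as described for the hardener in ADR 0010, can be sketched as below. Both functions are hypothetical and are not taken from the fork: `render_cell` stands in for any pure `state → string` renderer, and the check walks a small grid of states and widths rather than using a property-testing library.

```shell
# Hypothetical pure renderer (state -> string): pads or clips a label to a
# fixed cell width.
render_cell() {
  # $1 = state text, $2 = cell width
  printf "%-${2}.${2}s" "$1"
}

# Property: for every state and width in the grid, the rendered cell is
# exactly `width` characters wide, so it never overflows its bounds.
check_cell_width_property() {
  width=1
  while [ "$width" -le 20 ]; do
    state=''
    len=0
    while [ "$len" -le 40 ]; do
      out="$(render_cell "$state" "$width")"
      if [ "${#out}" -ne "$width" ]; then
        echo "property violated: state='$state' width=$width got ${#out}" >&2
        return 1
      fi
      # Grow the state by one character so the next pass tests a longer input.
      state="${state}$((len % 10))"
      len=$((len + 1))
    done
    width=$((width + 1))
  done
  return 0
}
```

A golden snapshot would pin one of these outputs. The loop pins the rule for all of them, so a width that breaks only for long states is caught even if no captured frame happens to contain one.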
+- `six-pack`: add the rendering-invariant property-test rule for pure rendering functions to `hardender.prompt`. Source: recover from `backup/six-pre-reset`. diff --git a/docs/adr/0013-curator-knowledge-promotion.md b/docs/adr/0013-curator-knowledge-promotion.md index 45a2087..e6cdf08 100644 --- a/docs/adr/0013-curator-knowledge-promotion.md +++ b/docs/adr/0013-curator-knowledge-promotion.md @@ -18,8 +18,11 @@ Upstream ends the line at QA: the specifier merges and asks for the next feature **Loop health is self-reported.** Each PR body carries running totals (`promoted | rejected | upstream | ephemeral`). Kill criterion: fewer than three promotions that survive contact with later sessions over 90 days → disable the curator window; the ledger and promoted docs stay. +**Retros are captured automatically, from the transcript, before idle.** The loop only has something to promote because every role runs `agent-retro` as its last step before going idle — a line added to every role prompt — so a retro is produced for each role-session with no one asking. The skill reconstructs the session from its transcript rather than the role's from-memory account: it extracts via the `entire` CLI (`entire session current` → `session info --transcript`), falling back to Claude Code's `~/.claude/projects/` transcript path when `entire` is absent. Grounding the retro in the transcript is what lets the curator (and `retro-triage`, ADR 0021) judge against what actually happened, not what the role remembers happening. + ## Pending implementation - `six-pack` then `four-pack`: new `curator` role prompt; `swarmforge.conf` gains the curator window (last); rewire — integrator notifies the curator, specifier waits on the curator before the next feature, `workflow.prompt` documents the integrator→curator→specifier chain. 
- `main`: upgrade the `agent-retro` skill — scope tag on every action, capture-first (no pre-filter), and an autonomous mode that marks actions `pending-curation` without prompting a human. +- `main`: `agent-retro` transcript capture (`entire session current` → `session info --transcript`, with the `~/.claude/projects/` fallback); add the "run `agent-retro` before going idle" line to every role prompt. Source: `feat/issue-20-a-retro-skill-upgrade`. - Pairs with ADR 0014 (the `.agents/` contract the curator writes and the launcher injects). From fbc4c2777a4beb5dda442ef5b50dde02877ce2d9 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 13:01:24 -0300 Subject: [PATCH 14/67] docs: fix cross-doc inconsistencies from code review Resolve six consistency/accuracy issues across the fork-divergence ADRs: - four-pack freeze: scrub "apply to four-pack" instructions from ADRs 0008/0009/0010/0011/0013/0015/0016 (frozen per ADR 0001 / manifest) - logbook.json -> logbook.jsonl (real runtime file; handoff-lib.sh:33) - ADR 0012 advisor: no --advisor flag; written as advisorModel into settings.local.json (per backup write_worktree_advisor) - manifest Section C: drop stale "16 ADRs / no ADR" (now 0017-0021 assigned) - CONTEXT.md: remove leaked memory-slug wikilink, keep ADR 0003 ref - ADR 0006 holdout: key sparse-checkout on specifier role, not the master worktree name (ADR 0008 renames it to specifier) Co-Authored-By: Claude Opus 4.8 (1M context) --- CONTEXT.md | 2 +- docs/adr/0003-setup-is-a-one-time-skill.md | 2 +- docs/adr/0006-harness-enforced-holdout.md | 4 ++-- docs/adr/0008-integrator-role.md | 2 +- docs/adr/0009-feature-file-spec-header.md | 6 +++--- docs/adr/0010-surface-harness-doctrine.md | 6 +++--- docs/adr/0011-dependency-fidelity-manifest.md | 6 +++--- docs/adr/0012-per-role-model-effort-advisor.md | 6 +++--- docs/adr/0013-curator-knowledge-promotion.md | 2 +- docs/adr/0015-platform-feasibility-stop-rule.md | 2 +- docs/adr/0016-boundary-logic-detection.md | 
3 +-- docs/fork-change-manifest.md | 6 +++--- docs/migrations/0003-setup-skill-sources.md | 2 +- docs/migrations/main-script-layer.md | 2 +- 14 files changed, 25 insertions(+), 26 deletions(-) diff --git a/CONTEXT.md b/CONTEXT.md index f62125f..c5fe15f 100644 --- a/CONTEXT.md +++ b/CONTEXT.md @@ -29,7 +29,7 @@ The file (`.swarmforge/setup-complete`) that `setup-swarm` writes to record that _Avoid_: setup flag, ready file, lock **Integrator**: -The terminal role that lands finished work. From the QA-approved commit it opens a pull request, gates on CI, merges only on green, runs the post-merge verification, and notifies the specifier — one PR per feature. It never merges locally: CI is a hard precondition, so a project without CI is not swarm-ready (setup ensures CI; see [[project-fork-divergence-adr-structure]] / ADR 0003). CI failures route to the owning role via [[back-routing]]. (Upstream has no integrator — the specifier merges ad hoc.) +The terminal role that lands finished work. From the QA-approved commit it opens a pull request, gates on CI, merges only on green, runs the post-merge verification, and notifies the specifier — one PR per feature. It never merges locally: CI is a hard precondition, so a project without CI is not swarm-ready (setup ensures CI; see ADR 0003). CI failures route to the owning role via [[back-routing]]. (Upstream has no integrator — the specifier merges ad hoc.) _Avoid_: merger, releaser, deployer **UX Engineer** (six-pack only): diff --git a/docs/adr/0003-setup-is-a-one-time-skill.md b/docs/adr/0003-setup-is-a-one-time-skill.md index c4c87af..7d4f295 100644 --- a/docs/adr/0003-setup-is-a-one-time-skill.md +++ b/docs/adr/0003-setup-is-a-one-time-skill.md @@ -14,7 +14,7 @@ Adapting a project to the swarm — installing the project's language quality to **Why replace rather than overlay.** Setup is an explicit one-time step; the run path stays pure "start the agents." 
The accepted cost is that the swarm no longer self-installs project tooling on first run — the operator runs the setup skill once before the first `./swarm`. Any setup step this moves out of the run path is named and documented so the divergence stays auditable. -**Setup also lays down the project scaffold.** Beyond tooling, the skill writes the one-time repository scaffold the swarm assumes: a `.gitignore` covering the swarm's runtime artifacts (`logbook.json`, `tmp/`, `.swarmforge/`), the project's default branch probed once (`git symbolic-ref refs/remotes/origin/HEAD`) and recorded in `swarmforge.conf`, and a small, targeted set of permission allow-rules in `.claude/settings.json` (for example `Bash(gh pr merge*)` for the integrator, `Bash(git reset --hard origin/)` for the specifier). Under autonomous permission mode (ADR 0019) those allow-rules are advisory hints rather than a load-bearing whitelist, so the set is kept deliberately small. +**Setup also lays down the project scaffold.** Beyond tooling, the skill writes the one-time repository scaffold the swarm assumes: a `.gitignore` covering the swarm's runtime artifacts (`logbook.jsonl`, `tmp/`, `.swarmforge/`), the project's default branch probed once (`git symbolic-ref refs/remotes/origin/HEAD`) and recorded in `swarmforge.conf`, and a small, targeted set of permission allow-rules in `.claude/settings.json` (for example `Bash(gh pr merge*)` for the integrator, `Bash(git reset --hard origin/)` for the specifier). Under autonomous permission mode (ADR 0019) those allow-rules are advisory hints rather than a load-bearing whitelist, so the set is kept deliberately small. 
## Pending implementation diff --git a/docs/adr/0006-harness-enforced-holdout.md b/docs/adr/0006-harness-enforced-holdout.md index 58bd199..e6ff310 100644 --- a/docs/adr/0006-harness-enforced-holdout.md +++ b/docs/adr/0006-harness-enforced-holdout.md @@ -10,7 +10,7 @@ Upstream holds the end-to-end QA suite back from the coder by prompt instruction **Mechanism: `git sparse-checkout`, not file deletion.** The worktree-prep step the harness already runs sets a sparse-checkout on each role worktree that excludes the QA-suite path. Sparse-checkout makes the file *absent from disk but still tracked in the commit* — so the role cannot read it, yet its commit cannot accidentally drop it downstream. Naive deletion (`rm` from the worktree) was rejected for exactly this reason: the role commits with `git add`, the deletion gets staged, and the suite vanishes for QA. A separate QA-only branch was rejected as more flow change for no extra protection. -**Scope: hide from implementers, keep for author and verifier.** The exclusion applies to every worktree *except* the specifier's (`master` — it authors the suite) and QA's (it runs the suite — it is the verifier). Coder, UX Engineer, cleaner, architect, and hardener all touch the implementation before QA and so are walled. The integrator never touches implementation; its worktree is irrelevant either way. +**Scope: hide from implementers, keep for author and verifier.** The exclusion applies to every worktree *except* the **specifier's** (it authors the suite) and **QA's** (it runs the suite — it is the verifier). Key the exclusion on the specifier *role*, not a fixed worktree name: it is the `master` worktree on upstream today, but ADR 0008 moves the specifier to its own `specifier` worktree, and this rule must follow it. Coder, UX Engineer, cleaner, architect, and hardener all touch the implementation before QA and so are walled. The integrator never touches implementation; its worktree is irrelevant either way. 
**Precondition: a fixed QA-suite path.** For the harness to exclude the suite it must live at a deterministic path; the specifier writes the end-to-end QA suite under a pinned location (e.g. `qa/`). This is the only added convention. The existing coder-prompt "ignore it" line stays as defense-in-depth. @@ -18,6 +18,6 @@ Upstream holds the end-to-end QA suite back from the coder by prompt instruction ## Pending implementation -- Add the sparse-checkout exclusion to the worktree-prep step (`six-pack`/scripts), keyed to skip the specifier(master) and QA worktrees. +- Add the sparse-checkout exclusion to the worktree-prep step (`six-pack`/scripts), keyed to skip the specifier's worktree (whatever its name — `master` today, `specifier` once ADR 0008 lands) and QA's. - Pin the end-to-end QA-suite path in the specifier prompt. - Confirm sparse-checkout interacts cleanly with the coder→cleaner→…→QA handoff commits (the excluded path must survive each role's commit untouched). diff --git a/docs/adr/0008-integrator-role.md b/docs/adr/0008-integrator-role.md index a5830a0..2a83ae2 100644 --- a/docs/adr/0008-integrator-role.md +++ b/docs/adr/0008-integrator-role.md @@ -18,7 +18,7 @@ Upstream has no integrator: when QA signals done, the **specifier** merges the w ## Pending implementation -- Runnable branches (`six-pack`; `four-pack` where present): new terminal `integrator` role; `swarmforge.conf` window; specifier worktree change and removal of its merge step. +- Runnable branch (`six-pack`): new terminal `integrator` role; `swarmforge.conf` window; specifier worktree change and removal of its merge step. (four-pack is frozen per ADR 0001 / the change manifest.) - The PR/CI mechanism (platform, e.g. `gh`) named at implementation. - CI-in-place enforced as a setup precondition (`0003`); routing per `0004`. - Terminal handoff target is the curator (`0013`), not the specifier; autofixable lint/format is the integrator's only allowed code change. 
diff --git a/docs/adr/0009-feature-file-spec-header.md b/docs/adr/0009-feature-file-spec-header.md
index c2911ce..2262dd6 100644
--- a/docs/adr/0009-feature-file-spec-header.md
+++ b/docs/adr/0009-feature-file-spec-header.md
@@ -8,7 +8,7 @@ Upstream feature files are pure Gherkin: a `Feature:` line, then scenarios. The
 
 The header is the **spec-authoring layer** the reference verification loop puts ahead of the scenarios (its Step 1): the Gherkin scenarios are the contract *by example*, but they cannot state what is out of scope, what was assumed, what non-functional targets apply, or what side effects must be observed. The header carries exactly that — the WHAT/WHY around the examples — so those concerns are stated once, up front, where every downstream role reads them.
 
-**Sections (four-pack):** `TRACKING` (traceability to an issue), `CONTRACT` (every input, every response shape and status, fields deliberately absent), `CONSTRAINTS` (dataset bounds, validation, exclusions), `SEQUENCING` (ordering / async dependencies, defaults `none`), `NFR` (latency, idempotency key+window, in-flight UI, error distinguishability), `SIDE EFFECTS` (public-contract changes, derived artifacts to regenerate, defaults `none`), `SCOPE` (`Does NOT:` exclusions and `ASSUMED:` assumptions). Each section pairs an `Ask:` (the questions that elicit it) with a `Format:` (how to write the answer).
+**Sections (seven base):** `TRACKING` (traceability to an issue), `CONTRACT` (every input, every response shape and status, fields deliberately absent), `CONSTRAINTS` (dataset bounds, validation, exclusions), `SEQUENCING` (ordering / async dependencies, defaults `none`), `NFR` (latency, idempotency key+window, in-flight UI, error distinguishability), `SIDE EFFECTS` (public-contract changes, derived artifacts to regenerate, defaults `none`), `SCOPE` (`Does NOT:` exclusions and `ASSUMED:` assumptions). Each section pairs an `Ask:` (the questions that elicit it) with a `Format:` (how to write the answer).
 
 **Six-pack adds an eighth section, `UX INTENT`**, with four dimensions — Visual Composition, Information Hierarchy, Interaction Feel, State Transitions — written as concrete observable statements. Its content and semantics are owned by ADR 0007; the header is merely its home in the feature file. It is six-pack-only because the UX Engineer that consumes it is six-pack-only.
 
@@ -16,5 +16,5 @@ The header is the **spec-authoring layer** the reference verification loop puts
 
 ## Pending implementation
 
-- Template already drafted on `four-pack` (7 sections) and `six-pack` (8, with `UX INTENT`); land both.
-- Specifier phase 1 starts from the template and addresses all header sections before scenarios. Fix the stale count in the **six-pack** specifier prompt: it says "complete all seven header sections" but the six-pack template has eight — change to "eight" (or "all"). Four-pack's "seven" is correct.
+- Template already drafted on `six-pack` (8 sections, with `UX INTENT`); land it. (four-pack is frozen per ADR 0001 / the change manifest — it keeps pure Gherkin, no header.)
+- Specifier phase 1 starts from the template and addresses all header sections before scenarios. Fix the stale count in the **six-pack** specifier prompt: it says "complete all seven header sections" but the six-pack template has eight — change to "eight" (or "all").
diff --git a/docs/adr/0010-surface-harness-doctrine.md b/docs/adr/0010-surface-harness-doctrine.md
index 7626d1c..a34c4be 100644
--- a/docs/adr/0010-surface-harness-doctrine.md
+++ b/docs/adr/0010-surface-harness-doctrine.md
@@ -8,7 +8,7 @@ Two defects (a screen blink and a runaway key-repeat) once survived a 250-scenar
 
 This is the reference verification loop's execute-and-observe layer (its Steps 5–7) made concrete: build the real thing, drive it through its surface, assert on what comes out.
 
-**Surface tool table (in `engineering.prompt`).** Following the existing language-tool-table pattern, the constitution declares the harness tool per surface type: tmux/PTY for a TUI (`send-keys -l` for raw input at controlled timing, `capture-pane` for screen state over time), Playwright for web, an HTTP client for HTTP APIs, event-injection-at-ingress for headless services. Roles owning live verification — **QA** (both packs) and the **UX Engineer** (six-pack, ADR 0007) — identify the project's surface *from the codebase* and acquire the matching tool before their first harness run, exactly as they acquire language tools.
+**Surface tool table (in `engineering.prompt`).** Following the existing language-tool-table pattern, the constitution declares the harness tool per surface type: tmux/PTY for a TUI (`send-keys -l` for raw input at controlled timing, `capture-pane` for screen state over time), Playwright for web, an HTTP client for HTTP APIs, event-injection-at-ingress for headless services. Roles owning live verification — **QA** and the **UX Engineer** (six-pack, ADR 0007) — identify the project's surface *from the codebase* and acquire the matching tool before their first harness run, exactly as they acquire language tools.
 
 **No surface field in `project.prompt`.** Roles read the code to know the surface; an explicit declaration would be a meaningless placeholder until the project is customised.
 
@@ -22,7 +22,7 @@ This is the reference verification loop's execute-and-observe layer (its Steps 5
 
 ## Pending implementation
 
-- Add the surface tool table + context-driven acquisition rule to `engineering.prompt` on `four-pack` and `six-pack`.
-- Change QA's "through the UI only" to "through the declared surface harness" and add the Expected-bullet → assertion / `NOT AUTOMATED` rule in `QA.prompt` (both packs).
+- Add the surface tool table + context-driven acquisition rule to `engineering.prompt` on `six-pack`. (four-pack is frozen per ADR 0001 / the change manifest; all `six-pack`-only below for the same reason.)
+- Change QA's "through the UI only" to "through the declared surface harness" and add the Expected-bullet → assertion / `NOT AUTOMATED` rule in `QA.prompt` (`six-pack`).
 - Require the per-surface baseline scenario to be committed with every feature's flow scenarios.
 - `six-pack`: add the rendering-invariant property-test rule for pure rendering functions to `hardender.prompt`. Source: recover from `backup/six-pre-reset`.
diff --git a/docs/adr/0011-dependency-fidelity-manifest.md b/docs/adr/0011-dependency-fidelity-manifest.md
index c27f814..f967718 100644
--- a/docs/adr/0011-dependency-fidelity-manifest.md
+++ b/docs/adr/0011-dependency-fidelity-manifest.md
@@ -6,7 +6,7 @@ status: accepted
 
 A scenario that rests on an emulated dependency the emulator does not actually implement passes green and proves nothing — the system was never exercised against the behavior the scenario claims to cover. The fork makes dependency fidelity **explicit and refusable** through a new constitution sub-file, `swarmforge/dependency-manifest.prompt`, that declares every dependency beyond the system itself by fidelity tier. This is the reference loop's Digital-Twin discipline: a twin is only trustworthy if its fidelity — and its gaps — are stated.
 
-**A separate constitution file, not `project.prompt`.** The manifest holds project-specific dependency data that would clutter `project.prompt`; it lives in its own file, auto-resolved by the same bundle resolver as the other constitution sub-files. It ships on both packs and defaults to `(none)` — a project with no external dependencies declares nothing.
+**A separate constitution file, not `project.prompt`.** The manifest holds project-specific dependency data that would clutter `project.prompt`; it lives in its own file, auto-resolved by the same bundle resolver as the other constitution sub-files. It ships on `six-pack` (four-pack is frozen) and defaults to `(none)` — a project with no external dependencies declares nothing.
 
 **Three tiers (the system itself is always implicit).** Tier 1 — owned infrastructure run locally as the real engine (e.g. Postgres in Docker). Tier 2 — stateful, protocol-level emulation (preference order: vendor-official emulator > established third-party > a swarm-built twin only as last resort). Tier 3 — external domain the swarm does not own (third-party APIs, other teams' services), wire-level stubbed against a referenced contract. Entry format: `name: tier N; implementation; gaps: `.
 
@@ -16,5 +16,5 @@ A scenario that rests on an emulated dependency the emulator does not actually i
 
 ## Pending implementation
 
-- Add `swarmforge/dependency-manifest.prompt` (tier definitions inline, body `(none)`) on `four-pack` and `six-pack`.
-- Add the read-manifest / propose-on-undeclared rule to `specifier.prompt` (both packs); QA's refusal of gap-resting scenarios is part of refuting QA (ADR 0005).
+- Add `swarmforge/dependency-manifest.prompt` (tier definitions inline, body `(none)`) on `six-pack`. (four-pack is frozen per ADR 0001 / the change manifest.)
+- Add the read-manifest / propose-on-undeclared rule to `specifier.prompt` (`six-pack`); QA's refusal of gap-resting scenarios is part of refuting QA (ADR 0005).
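Reviewer sketch (not part of the patch): the ADR 0011 hunk above gives the entry shape `name: tier N; implementation; gaps: ...` and the `(none)` default. To make the "refusable" idea concrete, here is a hedged Python illustration of reading such entries and checking whether a scenario rests on a declared gap. The comma-separated gap list, the `Dependency` tuple, and both function names are assumptions of this sketch; the ADR text visible here does not specify how gaps are delimited.

```python
import re
from typing import NamedTuple


class Dependency(NamedTuple):
    name: str
    tier: int
    implementation: str
    gaps: tuple  # assumed: comma-separated behaviours the emulation lacks


# Matches "name: tier N; implementation; gaps: a, b" (gap list may be empty).
_ENTRY = re.compile(
    r"^(?P<name>[^:]+):\s*tier\s+(?P<tier>[123])\s*;"
    r"\s*(?P<impl>[^;]+);\s*gaps:\s*(?P<gaps>.*)$"
)


def parse_manifest(text):
    """Parse manifest body lines; blank lines and '(none)' yield no entries."""
    deps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "(none)":
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognised manifest entry: {line!r}")
        gaps = tuple(g.strip() for g in match["gaps"].split(",") if g.strip())
        deps.append(
            Dependency(match["name"].strip(), int(match["tier"]),
                       match["impl"].strip(), gaps)
        )
    return deps


def rests_on_gap(deps, dependency, behavior):
    """True when the named dependency lists the behaviour as a gap."""
    return any(d.name == dependency and behavior in d.gaps for d in deps)
```

A QA role could, under these assumptions, refuse any scenario for which `rests_on_gap` returns `True` instead of letting it pass green against an emulator that never implemented the behaviour.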
diff --git a/docs/adr/0012-per-role-model-effort-advisor.md b/docs/adr/0012-per-role-model-effort-advisor.md
index 0cddac6..758e19d 100644
--- a/docs/adr/0012-per-role-model-effort-advisor.md
+++ b/docs/adr/0012-per-role-model-effort-advisor.md
@@ -24,12 +24,12 @@ window architect codex architect model=o3
 |-----|-----------|---------|
 | `model` | all backends | `claude`/`copilot`/`grok`: `--model ` · `codex`: `-c model=""` |
 | `effort` | claude, copilot, grok | `--effort ` (codex has no effort flag — skipped) |
-| `advisor` | claude only | `--advisor ` (ignored for other backends) |
+| `advisor` | claude only | written as `advisorModel` into the worktree's `.claude/settings.local.json` — there is **no** `--advisor` CLI flag (ignored for other backends) |
 
 **Per-role granularity, not per-backend.** Two `claude` roles can run different models; a global per-backend setting would throw away the value of the role abstraction. **No pre-populated values** ship in the runnable configs — those express topology (roles + worktrees), not opinions about model cost. The feature is fully opt-in: operators add keys only to the lines they care about.
 
 ## Pending implementation
 
 - `main`: extend `parse_config` in `swarmforge.sh` to accept ≥4 fields and read the `key=value` tail into per-role maps; extend `launch_role` to append the mapped flags per backend when set. (Script lives on `main`; the conf grammar is exercised there.)
-- Verify each mapped flag actually exists on the target CLI before relying on it — in particular confirm `claude --advisor` is a real flag; if not, treat `advisor` as reserved-but-inert until the CLI supports it.
-- Runnable configs (`four-pack`/`six-pack`) stay topology-only — no keys added.
+- `model`/`effort` map to real CLI flags; `advisor` does **not** — there is no `claude --advisor` flag. It is implemented by writing `advisorModel` into each worktree's `.claude/settings.local.json` (a `write_worktree_advisor` step that shares the read-modify-write seam with ADR 0020). Source: `backup/main-pre-reset:swarmforge.sh` `write_worktree_advisor`.
+- Runnable config (`six-pack`) stays topology-only — no keys added. (four-pack is frozen per ADR 0001 / the change manifest.)
diff --git a/docs/adr/0013-curator-knowledge-promotion.md b/docs/adr/0013-curator-knowledge-promotion.md
index e6cdf08..12e7908 100644
--- a/docs/adr/0013-curator-knowledge-promotion.md
+++ b/docs/adr/0013-curator-knowledge-promotion.md
@@ -22,7 +22,7 @@ Upstream ends the line at QA: the specifier merges and asks for the next feature
 
 ## Pending implementation
 
-- `six-pack` then `four-pack`: new `curator` role prompt; `swarmforge.conf` gains the curator window (last); rewire — integrator notifies the curator, specifier waits on the curator before the next feature, `workflow.prompt` documents the integrator→curator→specifier chain.
+- `six-pack` (four-pack is frozen per ADR 0001 / the change manifest): new `curator` role prompt; `swarmforge.conf` gains the curator window (last); rewire — integrator notifies the curator, specifier waits on the curator before the next feature, `workflow.prompt` documents the integrator→curator→specifier chain.
 - `main`: upgrade the `agent-retro` skill — scope tag on every action, capture-first (no pre-filter), and an autonomous mode that marks actions `pending-curation` without prompting a human.
 - `main`: `agent-retro` transcript capture (`entire session current` → `session info --transcript`, with the `~/.claude/projects/` fallback); add the "run `agent-retro` before going idle" line to every role prompt. Source: `feat/issue-20-a-retro-skill-upgrade`.
 - Pairs with ADR 0014 (the `.agents/` contract the curator writes and the launcher injects).
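Reviewer sketch (not part of the patch): the ADR 0012 hunk above says `advisor` is applied by writing `advisorModel` into each worktree's `.claude/settings.local.json` through a read-modify-write step shared with ADR 0020. One way such a seam might look in Python is shown below, under assumptions: `merge_local_settings` is a hypothetical name (the patch's own step is `write_worktree_advisor`, whose body is not shown here), the merge is a shallow top-level update, and the model value in the usage note is a placeholder.

```python
import json
import os
import tempfile
from pathlib import Path


def merge_local_settings(worktree, updates):
    """Merge top-level keys into <worktree>/.claude/settings.local.json.

    Existing keys that are not named in `updates` are preserved, so two
    writers (e.g. advisor and auto-compaction settings) do not clobber
    each other.
    """
    path = Path(worktree) / ".claude" / "settings.local.json"
    path.parent.mkdir(parents=True, exist_ok=True)

    settings = {}
    if path.exists():
        # An empty file is treated as an empty settings object.
        settings = json.loads(path.read_text(encoding="utf-8") or "{}")
    settings.update(updates)

    # Write to a temp file in the same directory, then rename over the
    # target so a crash cannot leave a half-written settings file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)
        handle.write("\n")
    os.replace(tmp_name, path)
    return settings
```

Calling `merge_local_settings(worktree, {"advisorModel": "example-model"})` after another step has already written its own keys would leave those keys in place, which is the property the shared seam needs.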
diff --git a/docs/adr/0015-platform-feasibility-stop-rule.md b/docs/adr/0015-platform-feasibility-stop-rule.md
index cf9c8c5..f969ca6 100644
--- a/docs/adr/0015-platform-feasibility-stop-rule.md
+++ b/docs/adr/0015-platform-feasibility-stop-rule.md
@@ -12,4 +12,4 @@ Upstream has no rule for what a role does when a spec requirement conflicts with
 
 ## Pending implementation
 
-- `four-pack` + `six-pack`: add the rule to `swarmforge/constitution/workflow.prompt`.
+- `six-pack`: add the rule to `swarmforge/constitution/workflow.prompt`. (four-pack is frozen per ADR 0001 / the change manifest.)
diff --git a/docs/adr/0016-boundary-logic-detection.md b/docs/adr/0016-boundary-logic-detection.md
index 0a1ddba..64ee55e 100644
--- a/docs/adr/0016-boundary-logic-detection.md
+++ b/docs/adr/0016-boundary-logic-detection.md
@@ -12,5 +12,4 @@ Boundary files — environmentally-unsuitable adapter shells like TUI drivers, O
 
 ## Pending implementation
 
-- `six-pack`: extend `swarmforge/roles/cleaner.prompt` to scan boundary files at the ~15–20 site threshold and add the stripped-view anti-pattern.
-- `four-pack`: same in `swarmforge/roles/refactorer.prompt`.
+- `six-pack`: extend `swarmforge/roles/cleaner.prompt` to scan boundary files at the ~15–20 site threshold and add the stripped-view anti-pattern. (four-pack — whose equivalent role is `refactorer` — is frozen per ADR 0001 / the change manifest; no change there.)
diff --git a/docs/fork-change-manifest.md b/docs/fork-change-manifest.md
index e17f379..d6fb0d7 100644
--- a/docs/fork-change-manifest.md
+++ b/docs/fork-change-manifest.md
@@ -31,7 +31,7 @@ Script path: `swarmforge/scripts/swarmforge.sh`. Skills path: `swarmforge/skills
 
 | ADR | Change (one line) | Where | Source |
 |-----|-------------------|-------|--------|
-| 0006 | In `prepare_worktrees` (`git worktree add`, ~L331) add `git sparse-checkout` excluding the pinned QA-suite path for **every worktree except specifier(`master`) and QA**; verify the path survives each role's handoff commit. | `swarmforge.sh` `prepare_worktrees` | ADR 0006 · **NET-NEW (no impl)** |
+| 0006 | In `prepare_worktrees` (`git worktree add`, ~L331) add `git sparse-checkout` excluding the pinned QA-suite path for **every worktree except the specifier's and QA's** (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify the path survives each role's handoff commit. | `swarmforge.sh` `prepare_worktrees` | ADR 0006 · **NET-NEW (no impl)** |
 | 0012 | `parse_config` (~L182, today rejects ≠4 fields) → accept **≥4 fields**, parse `key=value` tail into a per-role map; `launch_role` (~L414) → append mapped flags per backend. **[edit]** | `swarmforge.sh` | ADR 0012 · recover `backup/main-pre-reset` · **advisor = `advisorModel` in settings.local.json, not `--advisor`** ✅ |
 | 0014 | `write_agent_instruction_file` (~L389) → append project-root `AGENTS.md` + `.agents/roles/.md` when present, plus a preamble sentence; missing files silently skipped. | `swarmforge.sh` | ADR 0014 + I20B(PR-B) · **needs Idea B first** |
 | 0013 | Upgrade `agent-retro` skill: per-action **scope tag** (`project\|swarmforge\|skill\|ephemeral`), **capture-first** (no pre-filter), **autonomous** mode marking actions `pending-curation` without a human prompt. | `swarmforge/skills/agent-retro/` | ADR 0013 + I20A + I20B(PR-A) |
@@ -69,7 +69,7 @@ Roles: `swarmforge/roles/*.prompt` · constitution: `swarmforge/constitution/art
 
 ## C. Uncaptured implemented divergences — NO ADR (recover from backup, else lost on rebase)
 
-The 16 ADRs document the **behavioral/prompt** layer but not the **`main`-side script infrastructure**. The items below are real, implemented divergences with **no ADR**, living only in the monolith ADR (`backup/main-pre-reset:docs/adr/0001-fork-divergence.md`, "§Idea X") + the backup/feat branches. **Each verified as still a divergence vs current `upstream/main` (2026-06-14).** They are prerequisites/peers of Section A — a clean rebase that follows only the ADRs would drop them. **Decide per item: write an ADR, or carry as a manifest row.**
+The behavioral/prompt-layer ADRs (0002–0016) did not originally cover the **`main`-side script infrastructure**. The items below were uncaptured implemented divergences living only in the monolith ADR (`backup/main-pre-reset:docs/adr/0001-fork-divergence.md`, "§Idea X") + the backup/feat branches — **each since dispositioned** (right-hand `ADR?` column): most now have their own ADR (0017–0021), the rest extend an existing ADR, fold into one, or stay a row here. **Each verified as still a divergence vs current `upstream/main` (2026-06-14).** They are prerequisites/peers of Section A — a clean rebase that follows only the original ADRs would drop them.
 
 | Idea | Divergence (one line) | Verified vs upstream | Source artifact | ADR? |
 |------|----------------------|----------------------|-----------------|------|
@@ -77,7 +77,7 @@ The 16 ADRs document the **behavioral/prompt** layer but not the **`main`-side s
 | F | **Auto-compaction on role worktrees** — `write_worktree_permissions` merges into `.claude/settings.local.json`: `autoCompactEnabled:true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE:"88"`, `CLAUDE_CODE_AUTO_COMPACT_WINDOW:"200000"`. | absent upstream | `backup/main-pre-reset` (commit 08e7f25); `mono §Idea F:207` | **0020** |
 | J | **Session-retro plumbing** — `agent-retro` uses `entire session current`→`session info --transcript >/tmp`; fallback `~/.claude/projects/`; Codex-schema risk accepted; `agent-retro before idle` line in every role prompt. | absent upstream | `feat/issue-20-a…:swarmforge/skills/agent-retro/`; `mono §Idea J:189` | extend **0013** |
 | N | **`./swarm upgrade`** — refresh scripts(main)+prompts(source branch)+skills; `install-pins.conf` SHA pinning; `.swarmforge/source-branch` tracking; auto-install skills on first launch via `.swarmforge/skills-installed`. | absent upstream | `mono §Idea N:88` | **0018** |
-| O | **Install scaffold** — `.gitignore` gen (`logbook.json`,`tmp/`,`.swarmforge/`); default-branch probe→`swarmforge.conf`; permission allow-rules. **Overlaps setup-swarm skill (0003).** | absent upstream | `mono §Idea O:326` | folds into **0003** |
+| O | **Install scaffold** — `.gitignore` gen (`logbook.jsonl`,`tmp/`,`.swarmforge/`); default-branch probe→`swarmforge.conf`; permission allow-rules. **Overlaps setup-swarm skill (0003).** | absent upstream | `mono §Idea O:326` | folds into **0003** |
 | — | **Autonomous permission mode** — `--permission-mode auto` (not `acceptEdits`) in `launch_role`. | upstream = `acceptEdits` (L433/442) | `backup/main-pre-reset` (commit 1097233) | **0019** |
 | — | **cmux multiplexer backend** — `swarm-mux.sh`. **DROP — not wanted in the new fork (decision 2026-06-14).** Stay on upstream's tmux harness. Dropping this is what un-tangles Idea B / executing-fields / M3. | no mux file upstream | n/a — not reapplied | **DROP** |
 | — | **`executing` logbook entry carries `{message,hash,sender}`** for session-restart recovery (ADR 0002 names only the idle/busy marker). | absent upstream | `feat/main-executing-context-fields:swarmforge/scripts/swarmforge.sh` | extend **0002** |
diff --git a/docs/migrations/0003-setup-skill-sources.md b/docs/migrations/0003-setup-skill-sources.md
index e3f11aa..8e09a11 100644
--- a/docs/migrations/0003-setup-skill-sources.md
+++ b/docs/migrations/0003-setup-skill-sources.md
@@ -30,7 +30,7 @@ Refs: `idea-K` = `origin/docs/ideas-backlog:docs/ideas/idea-K-setup-preflight.md
 | Session tracking | `entire enable …` + `entire agent add ` per conf backend | `idea-K`, `mono §K` |
 | ~~Skill pins~~ → **ADR 0018, not setup-swarm** | `entire` skills at pinned SHA (`install-pins.conf` `ENTIRE_SKILLS_SHA`); 11 skills + `agent-retro` to `.claude/skills/`. **Moved out of setup-swarm (decision 2026-06-14):** this is launcher infra-bootstrap, auto-installed by `./swarm` (`ensure_skills_installed`, pin-aware). Documented in **ADR 0018 (Idea N)**. | `mono §Idea N:100` |
 | Permissions | write to `.claude/settings.json`: `Bash(gh pr merge*)` (integrator), `Bash(git reset --hard origin/)` (specifier) | `mono §Idea O:334` |
-| Install scaffold | `.gitignore` ← `logbook.json`, `tmp/`, `.swarmforge/`; default-branch probe `git symbolic-ref refs/remotes/origin/HEAD` → `swarmforge.conf` | `mono §Idea O:330-332` |
+| Install scaffold | `.gitignore` ← `logbook.jsonl`, `tmp/`, `.swarmforge/`; default-branch probe `git symbolic-ref refs/remotes/origin/HEAD` → `swarmforge.conf` | `mono §Idea O:330-332` |
 
 Note four-pack equivalents exist (architect/refactorer/coder) but four-pack is **frozen** — six-pack rows above are what matters.
 
diff --git a/docs/migrations/main-script-layer.md b/docs/migrations/main-script-layer.md
index 2bb6a81..0222c3d 100644
--- a/docs/migrations/main-script-layer.md
+++ b/docs/migrations/main-script-layer.md
@@ -16,7 +16,7 @@ Per-divergence recovery for everything that touches the launch script on `main`.
 
 | Row | Recover from | Delta vs upstream / notes |
 |-----|-------------|---------------------------|
-| **M1 / 0006** sparse-checkout in `prepare_worktrees` | **NET-NEW — no source anywhere** | Write fresh: `git sparse-checkout` excluding the pinned QA path on every worktree except specifier(`master`)+QA; verify path survives handoff commits. Prereq: QA path pinned in specifier prompt. |
+| **M1 / 0006** sparse-checkout in `prepare_worktrees` | **NET-NEW — no source anywhere** | Write fresh: `git sparse-checkout` excluding the pinned QA path on every worktree except the specifier's + QA's (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify path survives handoff commits. Prereq: QA path pinned in specifier prompt. |
 | **M2 / 0012** per-role model/effort/advisor | `backup/main-pre-reset:swarmforge.sh` `parse_config`(~L212), `launch_role`(~L870), arrays `ROLE_MODELS/EFFORTS/ADVISORS`(~L42); commits `93f8c5d`, `d467ab7` | `!= 4`→`< 4` + `key=value` loop; per-backend flag locals. **Advisor is NOT a `--advisor` flag** — `write_worktree_advisor` writes `advisorModel` to `.claude/settings.local.json`. ✅ resolves the "does claude --advisor exist" open item. |
 | **Idea B** bundle inlining | `backup/main-pre-reset:swarmforge.sh` `resolve_prompt_bundle`(~L797)+`write_agent_instruction_file`(~L825); same on `feat/main-executing-context-fields`, `feat/issue-20-b` | Replace upstream's 2-line "read constitution recursively" heredoc with BFS resolver + XML `` envelope. **Prereq for M3.** Entangled with cmux (see above). |
 | **M3 / 0014** append AGENTS.md + .agents/roles | `backup/main-pre-reset:swarmforge.sh` (commit `1b84895`, inside `write_agent_instruction_file`) | 7-line loop appending `AGENTS.md` + `.agents/roles/.md` `` blocks before envelope close + preamble sentence. Cannot land without Idea B. |

From a846c5b59a9cb2cabbb419795a7132b75bdda89d Mon Sep 17 00:00:00 2001
From: 2-gabadi
Date: Sun, 14 Jun 2026 13:43:04 -0300
Subject: [PATCH 15/67] =?UTF-8?q?docs:=20fork=20divergence=20implementatio?=
 =?UTF-8?q?n=20plan=20=E2=80=94=202=20PRs,=20one=20per=20delivery=20branch?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Task-by-task plan (C1-C11 on main, D1-D14 on six-pack) re-applying every documented divergence onto pristine upstream. One PR per delivery branch, each divergence an ordered squashed commit. four-pack frozen; cmux dropped.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 ...26-06-14-fork-divergence-implementation.md | 1232 +++++++++++++++++
 1 file changed, 1232 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md

diff --git a/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md b/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md
new file mode 100644
index 0000000..188b245
--- /dev/null
+++ b/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md
@@ -0,0 +1,1232 @@
+# Fork Divergence Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Re-apply every documented SwarmForge fork divergence (ADRs 0001–0021 + manifest rows) on top of pristine `upstream`, as **two pull requests — one per delivery branch**: one PR on `main` (scripts + skills), one PR on `six-pack` (prompts + constitution + conf + root swarm). Each PR is the minimal additive diff vs upstream, built from ordered, per-divergence commits.
+
+**Architecture:** Two delivery branches. `main` carries scripts + skills + docs/ADRs; `six-pack` carries role prompts, constitution articles, templates, the fidelity manifest, `swarmforge.conf`, and the root `swarm` bootstrap. Every branch is kept identical to its `upstream/` and advanced by **merge**, never rebase (ADR 0001). **four-pack is frozen** — no fork content is ever applied to it (manifest decision 2026-06-14); it stays a pure merge-mirror of `upstream/four-pack`.
+
+**Tech Stack:** zsh (`swarmforge.sh` and the handoff scripts run under zsh — note `${=var}` word-splitting, `typeset -a/-A`, `${var:h}`/`${var:t}` modifiers), Python 3 (settings.local.json read-modify-write), Markdown skills (`SKILL.md`), `*.prompt` plain-text role/constitution files, Gherkin `.feature` templates, `gh` CLI.
+
+---
+
+## Conventions (read before any task)
+
+**Two PRs, two branches.** Exactly one branch and one PR per delivery branch:
+- **PR 1 (MAIN)** — branch `feat/fork-divergences-main` off `origin/main`; all of the MAIN TRACK commits below; PR opened `--base main`.
+- **PR 2 (SIX-PACK)** — branch `feat/fork-divergences-six-pack` off `origin/six-pack`; all of the SIX-PACK TRACK commits below; PR opened `--base six-pack`.
+
+There is **no four-pack PR** (frozen). The two PRs are independent of each other and can proceed in parallel.
+
+**Commits.** Each divergence is one commit on its track branch, applied in the listed order (the order encodes the within-branch dependencies — e.g. the bundle commit precedes the knowledge-injection commit that extends it). One commit per divergence keeps the single PR reviewable and tailored. Do **not** create extra branches or PRs.
+
+**Baseline anchor.** The fork layer is re-applied onto a recorded pristine-upstream baseline (ADR 0001): `main` @ `d947f67` (tag `fork-base/2026-06-14-main`) and `six-pack` @ `cbd1697` (tag `fork-base/2026-06-14-six-pack`). As of 2026-06-14 `origin/main`/`origin/six-pack` equal these exactly, so branching off `origin/` == branching off the tag. The two implementation branches come off the real delivery branches, **not** off this docs branch. If `origin` has since advanced, branch off the tag instead so the diff stays measured against the recorded baseline.
+
+**Merge style.** Fork divergences are **squash-merged** (ADR 0001), so each of these two PRs lands as one clean commit on its delivery branch. Upstream syncs, by contrast, are history-preserving merges (never squashed/rebased — keep upstream's story). A landed commit is never rewritten.
+
+**Pushing.** **Never** push `main`, `six-pack`, or `upstream` directly without explicit request — push only the two feature branches. `gh` defaults to the `unclebob` upstream remote — always pass `--repo gabadi/swarm-forge`.
+
+**Minimize-diff rule (overriding constraint).** Translate each divergence to its smallest additive form vs current upstream. Do **not** lift whole files from the backup branches for existing files — re-merge the delta onto the *current* upstream file. Net-new files (new roles, templates, skills) may be recovered whole, but you MUST apply the STRIP/FIX edits called out per commit (the backup artifacts predate upstream and carry behavior the ADRs reversed).
+
+**Recovery sources.** Recover exact prior content with `git show :`. Key sources: `backup/main-pre-reset` (main script layer), `backup/six-pre-reset` (six-pack prompts/templates), `feat/issue-20-a-retro-skill-upgrade` (agent-retro + retro-triage skills), `feat/issue-20-b-bundle-knowledge-injection` (knowledge-promotion spec / curator), `feat/baseline-scenarios-six` (dependency-manifest, cleaner boundary scan), `feat/six-pack-pipeline-order-and-scaffold` (specifier worktree reset), `feat/issue-20-c-curator-six-pack` (8-window conf, integrator). Line numbers are approximate (`~L###`) — they drift; locate by function/section name, not by line.
+
+**Verification approach.** There is no bash unit-test harness in this repo. "Tests" are: (a) `shellcheck` on changed shell files where available, (b) `zsh -n ` syntax check, (c) a scratch-project smoke run of the generated artifact (e.g. inspect the bundle `write_agent_instruction_file` produces), and (d) `grep` assertions on prompt/skill text. Each commit states the concrete verification command and expected result. Verify after each commit; a whole-track verification runs before each PR is opened.
+
+**Commit message footer.** End every commit body with:
+```
+Co-Authored-By: Claude Opus 4.8 (1M context)
+```
+
+---
+
+## Commit order (within each branch)
+
+**MAIN branch** (`feat/fork-divergences-main`) — commit in this order; the only hard dependency is C3→C2 (knowledge injection extends the bundle envelope). C1–C6, C8, C11 all edit `swarmforge.sh`, so a linear commit order avoids any in-file conflict:
+
+| # | ADR | What | `swarmforge.sh` region / new file |
+|---|-----|------|-----------------------------------|
+| C1 | 0019 | auto-permission | `launch_role` |
+| C2 | 0017 | bundle inlining | `write_agent_instruction_file` + new `resolve_prompt_bundle` |
+| C3 | 0014 | knowledge injection (**after C2**) | `write_agent_instruction_file` |
+| C4 | 0012 | per-role model/effort/advisor | `parse_config`, `launch_role` + new `write_worktree_advisor` |
+| C5 | 0020 | auto-compaction | `prepare_worktrees` + new `write_worktree_permissions` |
+| C6 | 0006 | QA holdout sparse-checkout | `prepare_worktrees` |
+| C7 | 0002-ext | executing-entry fields | handoff scripts (`swarmforge/scripts/*.sh`) |
+| C8 | 0018 | pinned skill install | new `install_skills`/`ensure_skills_installed` + new `install-pins.conf` |
+| C9 | 0013/J | agent-retro skill | new `swarmforge/skills/agent-retro/` |
+| C10 | 0021 | retro-triage skill | new `.claude/skills/retro-triage/` |
+| C11 | 0003 + O | setup-swarm skill + marker guard + scaffold | new `swarmforge/skills/setup-swarm/` + `swarmforge.sh` guard/gitignore |
+
+**SIX-PACK branch** (`feat/fork-divergences-six-pack`) — commit in this order; the order resolves the shared-file sequencing (`specifier.prompt`: D1,D3,D4,D5,D8,D9,D10 · `QA.prompt`: D1,D2,D3,D6,D7,D9 · `swarmforge.conf`: D8,D9,D10 · `workflow.prompt`: D10,D11):
+
+| # | ADR | What | Touches |
+|---|-----|------|---------|
+| D1 | 0002 | idle-gate + agent-retro line | all 6 role prompts |
+| D2 | 0003 | strip startup-install directives | coder, QA, cleaner, hardener |
+| D3 | 0004 | back-routing rule | role prompts |
+| D4 | 0009 | spec-header template + specifier | new `templates/feature.feature`, specifier |
+| D5 | 0011 | fidelity manifest + specifier | new `dependency-manifest.prompt`, specifier |
+| D6 | 0010 | surface harness | `engineering.prompt`, QA |
+| D7 | 0005 | refute QA | QA |
+| D8 | 0007 | UX engineer | new `ux-engineer.prompt`, coder, specifier, `swarmforge.conf` |
+| D9 | 0008 | integrator + specifier stops merging | new `integrator.prompt`, specifier, QA, `swarmforge.conf` |
+| D10 | 0013 | curator + chain rewiring | new `curator.prompt`, integrator, specifier, `workflow.prompt`, `swarmforge.conf` |
+| D11 | 0015 | platform-feasibility stop rule | `workflow.prompt` |
+| D12 | 0016 | cleaner boundary scan | cleaner |
+| D13 | — | hardener rendering invariants | hardener |
+| D14 | 0018 | root swarm upgrade + self-url | root `swarm` |
+
+---
+
+# MAIN TRACK → PR 1
+
+## Setup: create the main branch
+
+- [ ] **Create the single branch for all MAIN commits**
+
+```bash
+git fetch origin && git switch -c feat/fork-divergences-main origin/main
+# If origin/main has advanced past the recorded baseline, branch off the tag instead:
+# git switch -c feat/fork-divergences-main fork-base/2026-06-14-main
+```
+All C1–C11 commits land on this one branch. Do not create per-commit branches. This PR is squash-merged (fork-divergence policy, ADR 0001).
+
+---
+
+## C1: ADR 0019 — autonomous permission mode
+
+**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`launch_role`, the `claude)` and `grok)` arms, ~L433 / ~L442)
+
+- [ ] **Step 1: Locate the two launch arms**
+
+Run: `grep -n "permission-mode acceptEdits" swarmforge/scripts/swarmforge.sh`
+Expected: two hits inside `launch_role` — the `claude)` arm and the `grok)` arm.
+
+- [ ] **Step 2: Apply the edit**
+
+Replace `--permission-mode acceptEdits` with `--permission-mode auto` in both arms. (`auto` is a real Claude Code flag value — verified, unlike the phantom `--advisor`. Roles run unattended, so `acceptEdits` bash/tool prompts hang silently; `auto` ships rails — blocks force-push-to-main and mass-delete.) The `sed -i ''` form below is BSD/macOS syntax; GNU `sed` takes `-i` with no separate argument.
+
+```bash
+sed -i '' 's/--permission-mode acceptEdits/--permission-mode auto/g' swarmforge/scripts/swarmforge.sh
+```
+
+- [ ] **Step 3: Verify**
+
+Run: `grep -c "permission-mode auto" swarmforge/scripts/swarmforge.sh; grep -c "acceptEdits" swarmforge/scripts/swarmforge.sh; zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
+Expected: `2`, `0`, `SYNTAX_OK`.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add swarmforge/scripts/swarmforge.sh
+git commit -m "feat(swarmforge): autonomous permission mode for unattended roles (ADR 0019)"
+```
+
+---
+
+## C2: ADR 0017 — prompt-bundle inlining
+
+**Files:** Modify `swarmforge/scripts/swarmforge.sh` (replace `write_agent_instruction_file` ~L389–413; add `resolve_prompt_bundle`)
+
+Upstream emits two naive "read recursively" lines. The fork pre-resolves the constitution + role prompt into one deduplicated XML envelope. **Disentangle from cmux:** port ONLY `resolve_prompt_bundle` + the envelope `write_agent_instruction_file`. Do NOT port `write_deliver_script`/`write_notify_script`/`write_stop_hook`/`MUX_TARGETS`.
+ +- [ ] **Step 1: Read the current naive function** + +Run: `grep -n "write_agent_instruction_file" swarmforge/scripts/swarmforge.sh` +Confirm it emits `Read swarmforge/constitution.prompt, then read every file it refers to recursively...` and uses globals `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (all set upstream). + +- [ ] **Step 2: Add `resolve_prompt_bundle` above `write_agent_instruction_file`** + +```zsh +resolve_prompt_bundle() { + local role="$1" + typeset -a bundle=() + typeset -A seen=() + typeset -a queue=("$CONSTITUTION_FILE" "$ROLES_DIR/${role}.prompt") + local file rel_path ref ref_abs + + while (( ${#queue[@]} > 0 )); do + file="${queue[1]}" + shift queue + + rel_path="${file#${WORKING_DIR}/}" + [[ ${+seen[$rel_path]} -eq 1 ]] && continue + [[ ! -f "$file" ]] && continue + + seen[$rel_path]=1 + bundle+=("$rel_path") + + while IFS= read -r ref; do + [[ -z "$ref" ]] && continue + ref_abs="$WORKING_DIR/$ref" + [[ ${+seen[$ref]} -eq 0 ]] && queue+=("$ref_abs") + done < <(grep -oE 'swarmforge/[A-Za-z0-9_./-]+\.prompt' "$file" 2>/dev/null || true) + done + + printf '%s\n' "${bundle[@]}" +} +``` + +- [ ] **Step 3: Replace `write_agent_instruction_file` with the envelope form** + +```zsh +write_agent_instruction_file() { + local role="$1" + local prompt_file="$2" + typeset -a bundle_files=() + local rel abs_path + + while IFS= read -r rel; do + [[ -n "$rel" ]] && bundle_files+=("$rel") + done < <(resolve_prompt_bundle "$role") + + { + printf '\n' "$role" + printf '\n' + printf 'This prompt bundle is pre-resolved. 
Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below.\n' + printf '\n' + for rel in "${bundle_files[@]}"; do + abs_path="$WORKING_DIR/$rel" + [[ -f "$abs_path" ]] || continue + printf '\n' "$rel" + cat "$abs_path" + printf '\n\n' + done + printf '\n' + } > "$prompt_file" +} +``` + +- [ ] **Step 4: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Then confirm the function references only `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (set in upstream's init/`parse_config`). For a live check, run the swarm in a scratch dir and inspect a generated `$PROMPTS_DIR/.md` — it should be a single `` envelope with deduped `` blocks, no "read recursively" lines. +Expected: `SYNTAX_OK` + a well-formed envelope. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): pre-resolve role prompt bundle into XML envelope (ADR 0017)" +``` + +--- + +## C3: ADR 0014 — `.agents/` knowledge injection (after C2) + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`write_agent_instruction_file`, as written by C2) + +- [ ] **Step 1: Update the preamble line** + +In `write_agent_instruction_file`, change the `` printf to: + +```zsh + printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below. Project knowledge files (AGENTS.md and your role file under .agents/roles/) are included below when present.\n' +``` + +- [ ] **Step 2: Add the knowledge loop** + +Add `knowledge` to the locals (`local rel abs_path knowledge`). 
Insert **after** the bundle-files `for` loop and **before** `printf '\n'`: + +```zsh + for knowledge in "AGENTS.md" ".agents/roles/${role}.md"; do + abs_path="$WORKING_DIR/$knowledge" + [[ -f "$abs_path" ]] || continue + printf '\n' "$knowledge" + cat "$abs_path" + printf '\n\n' + done +``` + +- [ ] **Step 3: Acceptance** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +In a scratch project with `AGENTS.md` and `.agents/roles/coder.md`: every role's generated bundle carries `AGENTS.md`; only the coder's carries `.agents/roles/coder.md`; removing both produces bundles with no knowledge blocks and no errors. +Expected: `SYNTAX_OK` + the per-role assertions hold. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): inject AGENTS.md + .agents/roles into role bundle (ADR 0014)" +``` + +--- + +## C4: ADR 0012 — per-role model / effort / advisor + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`parse_config`, `launch_role`; add arrays + `write_worktree_advisor`) + +- [ ] **Step 1: Declare the three arrays** + +Next to the existing `ROLES`/`AGENTS`/`SESSIONS` declarations, add: + +```zsh +typeset -a ROLE_MODELS=() +typeset -a ROLE_EFFORTS=() +typeset -a ROLE_ADVISORS=() +``` + +- [ ] **Step 2: Relax field count + parse the kv tail in `parse_config`** + +Change `if (( ${#fields[@]} != 4 )); then` → `if (( ${#fields[@]} < 4 )); then`. 
After the `keyword/role/agent/worktree` assignments, add: + +```zsh + local role_model="" role_effort="" role_advisor="" kv key val kv_i + for (( kv_i = 5; kv_i <= ${#fields[@]}; kv_i++ )); do + kv="${fields[$kv_i]}" + key="${kv%%=*}" + val="${kv#*=}" + case "$key" in + model) role_model="$val" ;; + effort) role_effort="$val" ;; + advisor) role_advisor="$val" ;; + esac + done +``` + +Where the existing arrays are appended, add the parallel appends: + +```zsh + ROLE_MODELS+=("$role_model") + ROLE_EFFORTS+=("$role_effort") + ROLE_ADVISORS+=("$role_advisor") +``` + +- [ ] **Step 3: Add `write_worktree_advisor`** + +```zsh +write_worktree_advisor() { + local worktree_path="$1" + local advisor_model="$2" + local settings_dir="$worktree_path/.claude" + local settings_file="$settings_dir/settings.local.json" + + mkdir -p "$settings_dir" + SETTINGS_FILE="$settings_file" ADVISOR_MODEL="$advisor_model" python3 -c ' +import json, os +p = os.environ["SETTINGS_FILE"] +cfg = {} +try: + with open(p) as f: cfg = json.load(f) +except: pass +cfg["advisorModel"] = os.environ["ADVISOR_MODEL"] +with open(p, "w") as f: json.dump(cfg, f, indent=2) + ' +} +``` + +- [ ] **Step 4: Wire flags into `launch_role`** + +After the existing locals, add: + +```zsh + local role_model="${ROLE_MODELS[$index]}" + local role_effort="${ROLE_EFFORTS[$index]}" + local role_advisor="${ROLE_ADVISORS[$index]}" +``` + +After `write_agent_instruction_file "$role" "$prompt_file"`, add: + +```zsh + [[ -n "$role_advisor" ]] && write_worktree_advisor "$role_worktree" "$role_advisor" +``` + +In the `claude)` arm: + +```zsh + local claude_flags="" + [[ -n "$role_model" ]] && claude_flags+=" --model '$role_model'" + [[ -n "$role_effort" ]] && claude_flags+=" --effort '$role_effort'" +``` +then insert `${claude_flags}` immediately after `claude` in `launch_cmd`. Apply the analogue for `copilot)` (`--model`/`--effort`) and `grok)` (`--model`/`--effort`); for `codex)` use `-c model="$role_model"` only when set. 
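The kv-tail parsing from Step 2 and the flag assembly from Step 4 can be exercised outside the launcher. The following is a minimal standalone sketch, not the code in `swarmforge.sh`: `role_flags_from_conf_line` is a hypothetical helper name, and it assumes the conf line has already been split on whitespace with the four positional fields first.

```bash
# Hypothetical helper for illustration; not part of swarmforge.sh.
# Takes the whitespace-split fields of one conf line and prints the
# extra CLI flags implied by its optional key=value tail.
role_flags_from_conf_line() {
  if [ "$#" -lt 4 ]; then
    echo "error: expected at least 4 fields, got $#" >&2
    return 1
  fi
  shift 4   # drop keyword, role, agent, worktree

  local kv key val flags=""
  for kv in "$@"; do
    key="${kv%%=*}"   # text before the first '='
    val="${kv#*=}"    # text after the first '='
    case "$key" in
      model)  flags="$flags --model '$val'" ;;
      effort) flags="$flags --effort '$val'" ;;
      # advisor is intentionally not a flag: the plan writes it to
      # settings.local.json via write_worktree_advisor instead.
    esac
  done
  printf '%s\n' "$flags"
}
```

Calling it with `window coder claude coder model=opus effort=high advisor=sonnet` yields the model and effort flags, while a plain 4-field line yields an empty string, which mirrors the "both 4-field and 7-field lines parse" expectation.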
+ +- [ ] **Step 5: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Add a temporary conf line `window coder claude coder model=opus effort=high advisor=sonnet` and confirm `parse_config` accepts it; the existing 4-field lines still parse; `advisorModel` lands in the role worktree's `settings.local.json`. +Expected: `SYNTAX_OK` + both 4-field and 7-field lines parse. + +- [ ] **Step 6: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): per-role model/effort/advisor in swarmforge.conf (ADR 0012)" +``` + +--- + +## C5: ADR 0020 — auto-compaction on role worktrees + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (add `write_worktree_permissions`; call in `prepare_worktrees`) + +- [ ] **Step 1: Add `write_worktree_permissions`** + +```zsh +write_worktree_permissions() { + local worktree_path="$1" + local settings_dir="$worktree_path/.claude" + local settings_file="$settings_dir/settings.local.json" + + mkdir -p "$settings_dir" + SETTINGS_FILE="$settings_file" python3 -c ' +import json, os +p = os.environ["SETTINGS_FILE"] +cfg = {} +try: + with open(p) as f: cfg = json.load(f) +except: pass +cfg["autoCompactEnabled"] = True +cfg.setdefault("env", {}) +cfg["env"]["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = "88" +cfg["env"]["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = "200000" +with open(p, "w") as f: json.dump(cfg, f, indent=2) + ' +} +``` + +- [ ] **Step 2: Call it from `prepare_worktrees`** + +Inside the per-role loop, after the `git worktree add` block (and after C4's advisor call site), add: + +```zsh + write_worktree_permissions "$worktree_path" +``` +Both writers JSON-merge `settings.local.json`, so calling both is safe and order-independent. + +- [ ] **Step 3: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +After a scratch run, a role worktree's `.claude/settings.local.json` contains `"autoCompactEnabled": true` and the two `env` overrides (alongside any `advisorModel`). 
+Expected: `SYNTAX_OK` + merged JSON. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): enable auto-compaction on role worktrees (ADR 0020)" +``` + +--- + +## C6: ADR 0006 — harness-enforced QA holdout (sparse-checkout) + +**NET-NEW — no source artifact.** Write fresh. **Files:** Modify `swarmforge/scripts/swarmforge.sh` (`prepare_worktrees`) + +- [ ] **Step 1: Identify the loop variables** + +Run: `grep -n "worktree add\|WORKTREE_NAMES\|ROLES\[" swarmforge/scripts/swarmforge.sh` +Confirm the role variable in `prepare_worktrees`, the specifier worktree (`specifier`, not `master` — ADR 0008), and the QA role name. + +- [ ] **Step 2: Add a pinned QA-path constant** + +Near the top config constants (single source of truth, matches the specifier-authored path): + +```zsh +QA_HOLDOUT_PATH="${SWARMFORGE_QA_HOLDOUT_PATH:-qa-e2e}" +``` + +- [ ] **Step 3: Add conditional sparse-checkout after `git worktree add`** + +Key on the **role** (not worktree name); exclude the holdout from every worktree except specifier's and QA's: + +```zsh + if [[ "$role" != "specifier" && "$role" != "QA" ]]; then + git -C "$worktree_path" sparse-checkout init --no-cone >/dev/null 2>&1 + { + printf '/*\n' + printf '!/%s/\n' "$QA_HOLDOUT_PATH" + } > "$worktree_path/.git/info/sparse-checkout" 2>/dev/null \ + || git -C "$worktree_path" sparse-checkout set --no-cone '/*' "!/${QA_HOLDOUT_PATH}/" >/dev/null 2>&1 + git -C "$worktree_path" read-tree -mu HEAD >/dev/null 2>&1 || true + fi +``` +(Substitute the real role-variable name from Step 1 for `$role`. The holdout stays in the commit/tree — only absent from disk — so it survives each role's handoff commit.) + +- [ ] **Step 4: Verify holdout invisibility + commit survival** + +In a scratch run with a committed `qa-e2e/`: coder/cleaner/architect/hardener worktrees have **no** `qa-e2e/` on disk; specifier + QA **do**; after a role's handoff commit, `git show HEAD:qa-e2e/` still resolves. 
+Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Expected: `SYNTAX_OK` + invisibility/survival hold. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): sparse-checkout the QA holdout from shaping roles (ADR 0006)" +``` + +--- + +## C7: ADR 0002 (extend) — executing-entry context fields + +**Files:** Modify upstream's handoff scripts under `swarmforge/scripts/` (the script that writes the `executing` logbook entry + the notify + stop-hook paths) + +> ⚠ Reference commit `a133c71` is on the **cmux lineage** (its diff is inside `swarmforge.sh` heredocs that don't exist on pristine upstream). Do **not** cherry-pick — re-author the same field semantics onto upstream's separate handoff scripts. + +- [ ] **Step 1: Find the `executing` entry write site** + +Run: `grep -rn '"executing"\|status.*executing\|executing' swarmforge/scripts/` +The write site is one of `receive-handoff.sh` / `complete-handoff.sh` / `handoff-lib.sh` / the deliver step. Read the intended semantics: +Run: `git show a133c71` +Expected: the entry must carry `{status, timestamp, message, hash, sender}` instead of `{status, timestamp}`. + +- [ ] **Step 2: Add the three fields** + +At the write site, extend the JSON object with: `message` (the task message text the delivery already passes), `hash` (the handoff commit hash in scope), `sender` (the sender role resolved from `sessions.tsv` by matching the sender worktree — mirror `notify-agent.sh`'s existing role resolution). Thread `sender` from `notify-agent.sh` → deliver step → stop-hook re-queue path, following upstream's existing argument-passing convention. + +- [ ] **Step 3: Verify** + +Run: `for f in swarmforge/scripts/*.sh; do zsh -n "$f" || echo "BAD: $f"; done; echo CHECKED` +In a scratch run, trigger a delivery and inspect the `executing` line in `logbook.jsonl`. +Expected: `CHECKED` + the entry carries non-empty `message`, `hash`, `sender`. 
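The manual inspection of `logbook.jsonl` in Step 3 could be scripted. This is a hedged sketch only: `check_executing_entry` is a hypothetical name, and it assumes the logbook holds one JSON object per line with the state stored under a `status` key, as the `{status, timestamp, message, hash, sender}` shape suggests. Adjust the key names if upstream's write site differs.

```bash
# Hypothetical check for illustration; not an upstream script.
# Assumes one JSON object per line and a "status" field per entry.
check_executing_entry() {
  local logbook="$1"
  LOGBOOK="$logbook" python3 -c '
import json, os, sys

required = ("message", "hash", "sender")
last = None
with open(os.environ["LOGBOOK"]) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("status") == "executing":
            last = entry  # keep the most recent executing entry

if last is None:
    sys.exit("no executing entry found")
missing = [k for k in required if not last.get(k)]
if missing:
    sys.exit("executing entry missing: " + ", ".join(missing))
print("EXECUTING_OK")
'
}
```

Run it as `check_executing_entry logbook.jsonl` after triggering a delivery; a non-zero exit names the empty or absent fields on stderr.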
+ +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts +git commit -m "feat(swarmforge): carry {message,hash,sender} in executing logbook entry (ADR 0002)" +``` + +--- + +## C8: ADR 0018 — pinned skill install (main half) + +The `upgrade` subcommand + `source-branch` + self-url live in the root `swarm` (six-pack, D14). This is the `main` script half: pin-aware, idempotent skill install at launch (launcher infra-bootstrap — allowed; does not violate idle-gate/setup-first). + +**Files:** Create `swarmforge/scripts/install-pins.conf`; modify `swarmforge/scripts/swarmforge.sh` + +- [ ] **Step 1: Create `install-pins.conf`** + +```bash +cat > swarmforge/scripts/install-pins.conf <<'EOF' +# Pinned external dependency versions for swarm install/upgrade. +# Bump a SHA here and commit on main to pull in a newer version. + +# entireio/skills — installed to .claude/skills/ in the target project +ENTIRE_SKILLS_SHA=4c9a02513c3ec6ebabd9a9dc6bd8240854a218ac +EOF +``` +Confirm the SHA against `backup/main-pre-reset:swarmforge/scripts/install-pins.conf` and bump if it has moved. + +- [ ] **Step 2: Add `install_skills` + `ensure_skills_installed`** + +Run: `git show backup/main-pre-reset:swarmforge/scripts/swarmforge.sh | grep -n "install_skills\|ensure_skills_installed"` +Add `install_skills()` (sources `install-pins.conf`; copies the in-repo `agent-retro` skill into `.claude/skills/`; fetches entire's skills tarball at `$ENTIRE_SKILLS_SHA` into `.claude/skills/`; writes the SHA to `$STATE_DIR/skills-installed`; warns and continues if offline) and `ensure_skills_installed()` (returns early if the sentinel matches the pinned SHA, else calls `install_skills`). Use the canonical bodies from `backup/main-pre-reset` (`~L946`), kept additive. 
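Step 2 describes the sentinel behaviour in prose. A minimal sketch of just the sentinel comparison follows. It is not the canonical body from `backup/main-pre-reset`: the name `ensure_skills_installed_sketch` and its two explicit arguments are assumptions made so the logic runs standalone (the plan's version works from `$STATE_DIR` and `$ENTIRE_SKILLS_SHA`), and the real install work is replaced by a placeholder.

```bash
# Hypothetical sketch of the sentinel logic only; not the canonical body.
# $1 = state directory, $2 = pinned SHA
ensure_skills_installed_sketch() {
  local state_dir="$1" pinned_sha="$2"
  local sentinel="$state_dir/skills-installed"

  # Return early when the recorded SHA already matches the pin.
  if [ -f "$sentinel" ] && [ "$(cat "$sentinel")" = "$pinned_sha" ]; then
    return 0
  fi

  # Placeholder for install_skills: the real function copies agent-retro
  # and fetches the pinned tarball before recording the SHA.
  mkdir -p "$state_dir"
  printf '%s\n' "$pinned_sha" > "$sentinel"
  echo installed
}
```

A second call with the same pin prints nothing, and bumping the pin triggers the install path again, which is the idempotence the Verify step looks for.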
+ +- [ ] **Step 3: Call it in the launch flow** + +After config is parsed and `$STATE_DIR` is known, add: + +```zsh +ensure_skills_installed +``` + +- [ ] **Step 4: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +A second launch is a no-op (sentinel matches); an offline launch warns rather than failing. +Expected: `SYNTAX_OK` + idempotent re-run. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh swarmforge/scripts/install-pins.conf +git commit -m "feat(swarmforge): pin-aware idempotent skill install at launch (ADR 0018)" +``` + +--- + +## C9: ADR 0013 / Idea J — agent-retro skill (net-new) + +upstream/main has no `skills/` dir — this is a net-new add. Source = `feat/issue-20-a-retro-skill-upgrade:swarmforge/skills/agent-retro/`. + +**Files:** Create `swarmforge/skills/agent-retro/` + +- [ ] **Step 1: Recover the skill files** + +```bash +for f in $(git ls-tree -r --name-only feat/issue-20-a-retro-skill-upgrade -- swarmforge/skills/agent-retro); do + mkdir -p "$(dirname "$f")" + git show "feat/issue-20-a-retro-skill-upgrade:$f" > "$f" +done +``` + +- [ ] **Step 2: Verify the four locked behaviors** + +```bash +grep -c "pending-curation" swarmforge/skills/agent-retro/SKILL.md # >= 1 +grep -ci "scope" swarmforge/skills/agent-retro/SKILL.md # >= 2 (tag + table column) +grep -ci "capture" swarmforge/skills/agent-retro/SKILL.md # >= 1 +grep -c "session info --transcript\|.claude/projects" swarmforge/skills/agent-retro/SKILL.md # >= 1 +``` +Expected: all thresholds met. If any is 0, re-check the source branch. + +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/skills/agent-retro +git commit -m "feat(swarmforge): add agent-retro skill — scoped, capture-first, autonomous (ADR 0013)" +``` + +--- + +## C10: ADR 0021 — retro-triage skill (net-new, byte-identical) + +Lives under `.claude/skills/` (operator-invoked), distinct from `swarmforge/skills/`. 
**Files:** Create `.claude/skills/retro-triage/SKILL.md` + +- [ ] **Step 1: Recover byte-identical** + +```bash +mkdir -p .claude/skills/retro-triage +git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md > .claude/skills/retro-triage/SKILL.md +``` + +- [ ] **Step 2: Verify** + +```bash +git diff --no-index <(git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md) .claude/skills/retro-triage/SKILL.md && echo IDENTICAL +wc -l .claude/skills/retro-triage/SKILL.md +``` +Expected: `IDENTICAL`, ~219 lines. + +- [ ] **Step 3: Commit** + +```bash +git add .claude/skills/retro-triage +git commit -m "feat: restore retro-triage skill (ADR 0021)" +``` + +--- + +## C11: ADR 0003 + Idea O — setup-swarm skill, marker guard, scaffold + +NET-NEW skill design (no backup artifact). **Files:** Create `swarmforge/skills/setup-swarm/SKILL.md`; modify `swarmforge/scripts/swarmforge.sh` + +- [ ] **Step 1: Read the design recovery doc** + +Run: `cat docs/migrations/0003-setup-skill-sources.md` +Confirm: setup is **setup-first** (operator runs `/setup-swarm` first); `./swarm` only **guards** on `.swarmforge/setup-complete` and refuses if absent (never auto-runs setup); skill named `setup-swarm`; Idea O folds in; the `entire` skill pins are NOT here (that is C8). + +- [ ] **Step 2: Author `setup-swarm/SKILL.md`** + +Mirror `agent-retro`'s SKILL.md shape; cover, per the design doc: +- **Stack detection** (reason about the language → which quality tools/gates to install — *why* setup is a skill, not a script; don't over-prescribe the mechanism). +- Install the project's mutation/CRAP/DRY tools (those stripped from cleaner/hardener/QA) and APS `gherkin-parser`/`gherkin-mutator` (stripped from coder/hardener). +- Session tracking: `entire enable --no-github --telemetry=false`, then `entire agent add ` per unique backend in `swarmforge.conf` column 3; warn-and-continue if `entire` absent. 
+- Permission allow-rules to `.claude/settings.json` (`Bash(gh pr merge*)` for integrator, `Bash(git reset --hard origin/*)` for specifier) — a small, advisory set, not a load-bearing whitelist (ADR 0019 `auto` already ships rails). +- Scaffold: ensure `.gitignore` covers `logbook.jsonl`, `tmp/`, `.swarmforge/`; probe the default branch (`git symbolic-ref refs/remotes/origin/HEAD`) and record it for the specifier's per-feature reset. +- Emit the swarm-ready marker `.swarmforge/setup-complete` (content: timestamp + swarmforge SHA — impl detail). + +- [ ] **Step 3: Add the marker guard to `swarmforge.sh`** + +Early in the launch flow (before any role launch; distinct from the `ensure_skills_installed` launcher bootstrap), add: + +```zsh +if [[ ! -f "$STATE_DIR/setup-complete" ]]; then + echo -e "${RED}Error:${RESET} project is not swarm-ready. Run /setup-swarm first." >&2 + exit 1 +fi +``` +The guard never runs setup; it only refuses. + +- [ ] **Step 4: Expand the gitignore/excludes scaffold (Idea O)** + +In `ensure_initial_gitignore`, add `logbook.jsonl`, `tmp/` (plus backup's `swarmtools/`/`logs/`/`agent_context/` if still relevant) — each as an idempotent `grep -qx || append` block and in the initial-creation heredoc. In `ensure_runtime_git_excludes`, expand the `for pattern in ...` loop to the same set. Add `remove_nonessential_clone_files` (recover from `backup/main-pre-reset`) and call it once in the init flow. + +- [ ] **Step 5: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Launching without the marker exits with "Run /setup-swarm first"; creating `.swarmforge/setup-complete` lets launch proceed; running twice doesn't duplicate `.gitignore` lines. +Expected: `SYNTAX_OK` + guard + idempotent gitignore. 
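The idempotent `grep -qx || append` block that Step 4 asks for can be sketched as a small helper. `ensure_gitignore_line` is a hypothetical name, not a function in `swarmforge.sh`, and the `-F` flag is an addition so that patterns such as `logbook.jsonl` are matched literally rather than as regular expressions. The sketch assumes an existing file ends with a newline.

```bash
# Hypothetical helper for illustration; not part of swarmforge.sh.
# Appends a pattern to an ignore file only when no identical line exists.
ensure_gitignore_line() {
  local file="$1" pattern="$2"
  touch "$file"
  # -x: match whole lines, -F: treat the pattern as a fixed string.
  grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}
```

Calling it repeatedly for `logbook.jsonl`, `tmp/`, and `.swarmforge/` leaves exactly one line per pattern, which is the "running twice doesn't duplicate `.gitignore` lines" check in Step 5.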
+ +- [ ] **Step 6: Commit** + +```bash +git add swarmforge/skills/setup-swarm swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): setup-swarm skill + swarm-ready marker guard + scaffold (ADR 0003, Idea O)" +``` + +--- + +## Finalize PR 1 (MAIN) + +- [ ] **Step 1: Whole-track verification** + +```bash +zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK +git diff --stat origin/main # review: only intended files changed, all additive +``` +Expected: `SYNTAX_OK`; the diff touches only `swarmforge/scripts/*`, `swarmforge/skills/*`, `.claude/skills/retro-triage/*` — no role prompts, no conf (those are PR 2). + +- [ ] **Step 2: Push the branch** + +```bash +git push -u origin feat/fork-divergences-main +``` + +- [ ] **Step 3: Open the single PR** + +```bash +gh pr create --base main --repo gabadi/swarm-forge \ + --title "feat: fork divergences — main script + skill layer" \ + --body "Re-applies the main-side fork divergences on pristine upstream, one commit per ADR: 0019 auto-permission, 0017 bundle inlining, 0014 knowledge injection, 0012 per-role config, 0020 auto-compaction, 0006 QA holdout, 0002 executing-fields, 0018 skill install, 0013 agent-retro, 0021 retro-triage, 0003 setup-swarm + Idea O. cmux dropped; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Sections A + C)." +``` + +--- + +# SIX-PACK TRACK → PR 2 + +## Setup: create the six-pack branch + +- [ ] **Create the single branch for all SIX-PACK commits** + +```bash +git fetch origin && git switch -c feat/fork-divergences-six-pack origin/six-pack +# If origin/six-pack has advanced past the recorded baseline, branch off the tag instead: +# git switch -c feat/fork-divergences-six-pack fork-base/2026-06-14-six-pack +``` +All D1–D14 commits land on this one branch. This PR is squash-merged (fork-divergence policy, ADR 0001). 
+ +--- + +## D1: ADR 0002 — idle-gate + agent-retro line (all roles) + +**Files:** Modify `swarmforge/roles/{specifier,coder,cleaner,architect,hardender,QA}.prompt` + +- [ ] **Step 1: Add the idle-gate line** + +After the `You are the .` opening of each of the six prompts, insert a blank line then: + +``` +Wait for a handoff. Do not act without one. +``` + +- [ ] **Step 2: Add the agent-retro line** + +As the last bullet of each role's Handoff section: + +``` +- Run `agent-retro` before going idle. +``` + +- [ ] **Step 3: Verify** + +```bash +for r in specifier coder cleaner architect hardender QA; do + grep -q "Wait for a handoff. Do not act without one." "swarmforge/roles/$r.prompt" || echo "MISSING idle-gate: $r" + grep -q "agent-retro\` before going idle" "swarmforge/roles/$r.prompt" || echo "MISSING retro: $r" +done; echo CHECKED +``` +Expected: only `CHECKED`. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles +git commit -m "feat(roles): idle-gate + agent-retro-before-idle on every role (ADR 0002)" +``` + +--- + +## D2: ADR 0003 — strip startup-install directives + +Install work moves to the setup-swarm skill (C11). **Files:** Modify `swarmforge/roles/{coder,QA,cleaner,hardender}.prompt` + +- [ ] **Step 1: Strip the directives** + +- `coder.prompt`: remove the entire `## Acceptance Pipeline` block (the "At startup, make sure the normal acceptance pipeline …" bullets, ~L8–14). +- `QA.prompt`: remove the `## Startup Tools` section (~L6–7). +- `cleaner.prompt`: remove the "At startup, install the language mutation, CRAP, and DRY tools …" line (~L19). +- `hardender.prompt`: remove the `## Startup Tools` section + APS build line (~L7–10). + +- [ ] **Step 2: Verify** + +```bash +grep -rn "At startup" swarmforge/roles/ ; echo "--- (expect no startup-install directives remain)" +``` +Expected: no remaining "At startup, install/make-ready" directives. 
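The Step 2 check prints matches and relies on reading the output. A stricter variant that fails on any leftover match might look like the sketch below. `check_no_startup_directives` is a hypothetical name, and it flags every "At startup" occurrence, which is broader than the install directives the step targets, so treat a failure as a prompt to review rather than a verdict.

```bash
# Hypothetical check for illustration; stricter than the plan requires.
check_no_startup_directives() {
  local roles_dir="$1"
  # grep exits 0 when it finds a match, so a match means a leftover line.
  if grep -rn "At startup" "$roles_dir"; then
    echo "FAIL: startup directives remain" >&2
    return 1
  fi
  echo STARTUP_CLEAN
}
```

Usage: `check_no_startup_directives swarmforge/roles` before committing.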
+ +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/roles +git commit -m "refactor(roles): remove startup install directives — moved to setup-swarm (ADR 0003)" +``` + +--- + +## D3: ADR 0004 — back-routing rule + +No backup source — author fresh from ADR 0004. **Files:** Modify the rework-owning role prompts (coder, cleaner, architect, hardender, QA) + +- [ ] **Step 1: Read the ADR for the exact mechanic** + +Run: `cat docs/adr/0004-rework-routes-back.md` +Confirm: structural finding (re-opens an earlier stage's job) → routes to that origin stage, carried in the handoff; local work stays with the finder; a single finding bounces back at most once; a feature tolerates N=3 cycles total (routing count in the handoff trail); on exceeding, stop and ask the user. + +- [ ] **Step 2: Insert a `## Rework Routing` section before each role's Handoff** + +``` +## Rework Routing +- A structural finding — one that re-opens an earlier stage's decision (an ambiguous or missing spec, a weak or missing test, a design that cannot hold the required behavior) — routes back to the stage that owns that decision, carried in the handoff. +- Local work you can resolve without re-opening an earlier decision stays with you; fix it in place. +- A single finding bounces back at most once. A feature tolerates at most three back-route cycles total (N=3), tracked by the routing count in the handoff trail. On the fourth, stop and ask the user. +``` + +- [ ] **Step 3: Verify** + +```bash +for r in coder cleaner architect hardender QA; do grep -q "## Rework Routing" "swarmforge/roles/$r.prompt" || echo "MISSING: $r"; done; echo CHECKED +``` +Expected: only `CHECKED`. 
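The N=3 cap in the Rework Routing text is enforced by the roles reading the handoff trail, not by a script. Purely as an illustration of the arithmetic, the sketch below models the decision. `rework_decision` and its output labels are hypothetical, and it covers only the per-feature cap, not the once-per-finding rule.

```bash
# Hypothetical model of the per-feature cap; the real rule lives in the prompts.
# $1 = back-route cycles already recorded in the handoff trail
rework_decision() {
  local cycles="$1"
  if [ "$cycles" -lt 3 ]; then
    echo route-back   # cycles 1 through 3 are tolerated
  else
    echo ask-user     # a fourth cycle exceeds N=3: stop and ask
  fi
}
```

With zero, one, or two cycles recorded the finding still routes back; once three are recorded the next structural finding stops the feature and goes to the user.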
+ +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles +git commit -m "feat(roles): structural-finding back-routing with N=3 cap (ADR 0004)" +``` + +--- + +## D4: ADR 0009 — spec-header template + specifier wiring + +**Files:** Create `swarmforge/templates/feature.feature`; modify `swarmforge/roles/specifier.prompt` + +- [ ] **Step 1: Recover the template** + +```bash +mkdir -p swarmforge/templates +git show backup/six-pre-reset:swarmforge/templates/feature.feature > swarmforge/templates/feature.feature +``` +Confirm all eight comment sections: `TRACKING`, `CONTRACT`, `CONSTRAINTS`, `SEQUENCING`, `NFR`, `SIDE EFFECTS`, `SCOPE`, `UX INTENT`. + +- [ ] **Step 2: Wire the specifier** + +In Feature Workflow phase 1: start from the template and address all eight header sections (several may resolve to `none` — a deliberate answer) before scenarios. Change any "seven" header-count wording to **"eight"** / "all". + +- [ ] **Step 3: Verify** + +```bash +grep -c "^ # \(TRACKING\|CONTRACT\|CONSTRAINTS\|SEQUENCING\|NFR\|SIDE EFFECTS\|SCOPE\|UX INTENT\)" swarmforge/templates/feature.feature # 8 +grep -n "template\|eight" swarmforge/roles/specifier.prompt +grep -c "seven" swarmforge/roles/specifier.prompt # 0 +``` +Expected: 8 sections; specifier references the template + "eight"; no "seven". + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/templates/feature.feature swarmforge/roles/specifier.prompt +git commit -m "feat(spec): 8-section feature template; specifier starts from it (ADR 0009)" +``` + +--- + +## D5: ADR 0011 — fidelity manifest + specifier check + +**Files:** Create `swarmforge/dependency-manifest.prompt`; modify `swarmforge/roles/specifier.prompt` + +- [ ] **Step 1: Recover the manifest (with its Rules section)** + +```bash +git show feat/baseline-scenarios-six:swarmforge/dependency-manifest.prompt > swarmforge/dependency-manifest.prompt +``` +⚠ From `feat/baseline-scenarios-six`, NOT `obs-harness-six` (which over-deleted the Rules section). 
Confirm the 3 tier defs, a `Rules for every declared dependency:` section, and a `## Dependencies` body of `(none)`. + +- [ ] **Step 2: Wire the specifier** + +Add a `## Dependency Manifest` instruction before Feature Workflow: read the manifest before scenarios; on a scenario touching an undeclared external system → stop, propose name/tier/implementation/gaps, wait for approval before adding the entry; never write scenarios resting on an undeclared dependency or a declared gap. Recover exact wording from `backup/six-pre-reset:.../specifier.prompt` or `feat/issue-20-c:.../specifier.prompt` (NOT pipeline-order, which dropped it). + +- [ ] **Step 3: Verify** + +```bash +grep -ci "tier" swarmforge/dependency-manifest.prompt # >= 3 +grep -q "Rules for every declared dependency" swarmforge/dependency-manifest.prompt && echo RULES_OK +grep -q "dependency-manifest" swarmforge/roles/specifier.prompt && echo SPECIFIER_WIRED +``` +Expected: tiers present, `RULES_OK`, `SPECIFIER_WIRED`. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/dependency-manifest.prompt swarmforge/roles/specifier.prompt +git commit -m "feat(spec): dependency fidelity manifest + specifier propose-on-undeclared (ADR 0011)" +``` + +--- + +## D6: ADR 0010 — surface harness (engineering article + QA) + +**Files:** Modify `swarmforge/constitution/articles/engineering.prompt`, `swarmforge/roles/QA.prompt` + +- [ ] **Step 1: Add the surface-tool table to `engineering.prompt`** + +Recover the table + context-driven acquisition rule from `backup/six-pre-reset:swarmforge/constitution/articles/engineering.prompt` and merge onto current upstream (a `## Surface Tools` section: tmux/PTY · Playwright · HTTP client · ingress event-injection; live-verification roles pick the minimal sufficient tool per surface). + +- [ ] **Step 2: Edit QA for surface-harness verification** + +In `QA.prompt`: +- Replace "through the user interface only" → "through the project surface harness only". 
+- Add: every Expected bullet maps to a harness assertion, or is `NOT AUTOMATED — `; asserting constants/config never satisfies a behavioral assertion. +- Add: re-execute the committed `observation-harness/` scenarios before final verification; a user-facing surface with no scenarios routes back (per D3). +- Add the per-surface **baseline scenario** requirement (idle stability / no console errors / no-op event = no state change). + +- [ ] **Step 3: Verify** + +```bash +grep -qi "surface" swarmforge/constitution/articles/engineering.prompt && echo ENG_OK +grep -q "project surface harness only" swarmforge/roles/QA.prompt && echo QA_SURFACE_OK +grep -q "observation-harness" swarmforge/roles/QA.prompt && echo QA_OBS_OK +grep -c "user interface only" swarmforge/roles/QA.prompt # 0 +``` +Expected: `ENG_OK`, `QA_SURFACE_OK`, `QA_OBS_OK`, zero "user interface only". + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/constitution/articles/engineering.prompt swarmforge/roles/QA.prompt +git commit -m "feat(qa): declared surface-harness verification + baseline scenarios (ADR 0010)" +``` + +--- + +## D7: ADR 0005 — refuting QA posture + +No backup source for the refute posture — author fresh; merge with D6's surface wording. **Files:** Modify `swarmforge/roles/QA.prompt` + +- [ ] **Step 1: Replace the confirm posture with refute** + +Replace the "Fix bugs found by the QA suite or final verification." line and surrounding confirm framing with: + +``` +- Assume the build does not meet the spec and the acceptance tests are too weak to notice, until proven otherwise. Attack the specified contract — try to make it fail within the spec — rather than run a checklist and confirm. +- Stay bounded by the spec: a gap the spec never settled is not a QA pass/fail; route it back to the specifier (per Rework Routing). +- Enforce conversion fidelity: a QA procedure converted into an executable script must encode the procedure's full intent. 
A green script that asserts nothing is test theater and is itself a defect. +- A structural finding (weak/missing test, ambiguous spec) routes back; a local defect you can fix without re-opening an earlier stage you fix in place. +``` + +- [ ] **Step 2: Confirm against the ADR** + +Run: `cat docs/adr/0005-qa-refutes-not-confirms.md` +Ensure the text matches the ADR's intent (refute, spec-bounded, conversion fidelity / no test theater). + +- [ ] **Step 3: Verify** + +```bash +grep -qi "assume the build does not meet the spec" swarmforge/roles/QA.prompt && echo REFUTE_OK +grep -ci "test theater\|asserts nothing" swarmforge/roles/QA.prompt # >= 1 +grep -c "Fix bugs found by the QA suite" swarmforge/roles/QA.prompt # 0 +``` +Expected: `REFUTE_OK`, conversion-fidelity line present, old confirm line gone. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles/QA.prompt +git commit -m "feat(qa): refute posture — attack the contract, no test theater (ADR 0005)" +``` + +--- + +## D8: ADR 0007 — UX Engineer role + +**Files:** Create `swarmforge/roles/ux-engineer.prompt`; modify `swarmforge/roles/coder.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/swarmforge.conf` + +- [ ] **Step 1: Recover the ux-engineer role** + +```bash +git show backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt > swarmforge/roles/ux-engineer.prompt +``` +⚠ From `backup/six-pre-reset` (≡ `origin/feat/obs-harness-six`), NOT pipeline-order/baseline (they lack the `observation-harness/` commit step). **STRIP** DESIGN.md scaffold-on-absence + walk-up; make DESIGN.md fix-authority conditional on a feature-file reference, not tree discovery. Ensure it carries: the idle-gate line, the N=3 back-route to coder, the `observation-harness/` commit step, golden snapshots + rendering invariants, the `## Visual quality standards` block (WCAG 4.5:1 / 3:1), notify→cleaner. 
+ +- [ ] **Step 2: Wire coder + specifier** + +- `coder.prompt`: add a "read the feature's `## UX Intent` and implement from it alongside the Gherkin" line; change handoff `notify the cleaner` → `notify the ux-engineer`. +- `specifier.prompt`: add UX INTENT authoring (it authors the feature file's `## UX Intent` section — concrete observable statements across Visual Composition / Information Hierarchy / Interaction Feel / State Transitions). STRIP any DESIGN.md scaffold/walk-up here too (reference-from-feature-file only). + +- [ ] **Step 3: Add the conf window after coder** + +In `swarmforge.conf`, after the coder line: +``` +window ux-engineer codex ux-engineer +``` + +- [ ] **Step 4: Verify** + +```bash +grep -q "Wait for a handoff" swarmforge/roles/ux-engineer.prompt && echo UX_IDLE_OK +grep -q "observation-harness" swarmforge/roles/ux-engineer.prompt && echo UX_OBS_OK +grep -c "scaffold" swarmforge/roles/ux-engineer.prompt # 0 +grep -q "notify the ux-engineer" swarmforge/roles/coder.prompt && echo CODER_OK +grep -q "window ux-engineer" swarmforge/swarmforge.conf && echo CONF_OK +``` +Expected: `UX_IDLE_OK`, `UX_OBS_OK`, zero scaffold, `CODER_OK`, `CONF_OK`. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/roles/ux-engineer.prompt swarmforge/roles/coder.prompt swarmforge/roles/specifier.prompt swarmforge/swarmforge.conf +git commit -m "feat(roles): UX Engineer after coder; UX Intent authoring + read (ADR 0007)" +``` + +--- + +## D9: ADR 0008 — integrator role + specifier stops merging + +**Files:** Create `swarmforge/roles/integrator.prompt`; modify `swarmforge/roles/specifier.prompt`, `swarmforge/roles/QA.prompt`, `swarmforge/swarmforge.conf` + +- [ ] **Step 1: Recover the integrator role + apply the FIX** + +```bash +git show backup/six-pre-reset:swarmforge/roles/integrator.prompt > swarmforge/roles/integrator.prompt +``` +⚠ From `backup/six-pre-reset` (≡ `feat/issue-20-c`), NOT baseline-scenarios-six (still says "notify specifier"). 
**FIX step 7** to: `Notify the curator that the feature has landed. Include the specifier handoff name and the post-merge master commit hash.` Confirm: one PR/feature, autofix-lint-only, branch → `gh pr create` → watch CI → green `gh pr merge --squash --delete-branch` + post-merge gate, CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`), idle-gate line, agent-retro line. + +- [ ] **Step 2: Specifier stops merging + per-feature reset** + +In `specifier.prompt`: +- Drop the merge step (upstream's "merge the changes and ask the user", ~L36); replace the completion line with a placeholder D10 finalizes — for now: "When the work is landed, ask the user for the next feature to add." +- Add the per-feature worktree reset: on receiving a handoff, `git reset --hard "origin/$(git symbolic-ref refs/remotes/origin/HEAD | sed 's|refs/remotes/origin/||')"` in the specifier's own worktree (recover the exact form from `feat/six-pack-pipeline-order-and-scaffold`). STRIP any `git merge --ff-only origin/master` startup line. + +- [ ] **Step 3: QA hands off to integrator + conf windows** + +- `QA.prompt`: change the final handoff to `notify the integrator` (replacing the broadcast list). +- `swarmforge.conf`: change line 1 `window specifier codex master` → `window specifier codex specifier`; insert after QA: `window integrator codex integrator`. 
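
The per-feature reset in Step 2 resolves the remote's default branch instead of hardcoding `master`. A minimal sketch of the same idea, split so the string handling can be checked on its own; the function names `default_branch_from_ref` and `reset_to_default_branch` are hypothetical, and the exact wording to land in `specifier.prompt` should still be recovered from the branch named in the step:

```bash
# Hypothetical helper: strip the remote-tracking prefix from a symbolic ref,
# e.g. refs/remotes/origin/master -> master.
default_branch_from_ref() {
  printf '%s\n' "${1#refs/remotes/origin/}"
}

# Hypothetical wrapper around the reset described in Step 2.
# Defined only; nothing here runs until the function is called.
reset_to_default_branch() {
  local ref
  # Fails if origin/HEAD is not set locally.
  ref=$(git symbolic-ref refs/remotes/origin/HEAD) || return 1
  git reset --hard "origin/$(default_branch_from_ref "$ref")"
}
```

`git symbolic-ref refs/remotes/origin/HEAD` only works when the local clone has `origin/HEAD` recorded; if it is missing, `git remote set-head origin --auto` recreates it from the remote.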
+ +- [ ] **Step 4: Verify** + +```bash +grep -q "Notify the curator" swarmforge/roles/integrator.prompt && echo INT_FIX_OK +grep -q "post-merge master commit hash" swarmforge/roles/integrator.prompt && echo INT_HASH_OK +grep -q "notify the integrator" swarmforge/roles/QA.prompt && echo QA_INT_OK +grep -q "symbolic-ref" swarmforge/roles/specifier.prompt && echo SPEC_RESET_OK +grep -q "window specifier codex specifier" swarmforge/swarmforge.conf && echo CONF_SPEC_OK +grep -q "window integrator" swarmforge/swarmforge.conf && echo CONF_INT_OK +grep -c "codex master" swarmforge/swarmforge.conf # 0 +``` +Expected: all six `*_OK`, zero `codex master`. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/roles/integrator.prompt swarmforge/roles/specifier.prompt swarmforge/roles/QA.prompt swarmforge/swarmforge.conf +git commit -m "feat(roles): terminal integrator; specifier stops merging, runs own worktree (ADR 0008)" +``` + +--- + +## D10: ADR 0013 — curator role + chain rewiring + +Authoritative source = the locked spec's PR-C2 block (budgets **60/40**, NOT the stale 150/300 on artifact branches). **Files:** Create `swarmforge/roles/curator.prompt`; modify `swarmforge/roles/integrator.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/constitution/articles/workflow.prompt`, `swarmforge/swarmforge.conf` + +- [ ] **Step 1: Extract the curator from the locked spec** + +Run: `git show feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md` +Copy the **PR-C2 verbatim block** into `swarmforge/roles/curator.prompt`. 
Confirm: idle-gate; writes only `AGENTS.md` + `.agents/`; sources `~/.claude/worklog/retros/*.md`; the routing ladder (enforcement-gate backlog → AGENTS.md ≤60 → role files ≤40 → references → skills-on-2nd → upstream → ledger); ledger line `date | session-id | role | failure-class | verdict | summary`; lifecycle (empty-run pass-through, knowledge branch, self-merging PR with metric line, move retros to `processed/`, notify specifier); 9-check per-item algorithm. **Budgets must read 60 and 40.** + +- [ ] **Step 2: Rewire the chain** + +- `integrator.prompt`: confirm step 7 notifies the curator (done in D9); fix if drifted. +- `specifier.prompt`: change the wait line to "When the **curator** notifies you that the job is complete, run the per-feature reset, then ask the user for the next feature. The curator's handoff means the knowledge PR for the previous feature has already landed." +- `workflow.prompt`: append: "The landing chain is integrator → curator → specifier. The curator promotes retro knowledge before the specifier is released; an empty curation run notifies the specifier immediately — the pipeline never stalls on the curator." +- `swarmforge.conf`: append last: `window curator codex curator`. + +- [ ] **Step 3: Verify** + +```bash +grep -q "Wait for a handoff" swarmforge/roles/curator.prompt && echo CUR_IDLE_OK +grep -Eq "60" swarmforge/roles/curator.prompt && grep -Eq "40" swarmforge/roles/curator.prompt && echo BUDGETS_OK +grep -c "150\|300" swarmforge/roles/curator.prompt # 0 +grep -q "When the curator notifies you" swarmforge/roles/specifier.prompt && echo SPEC_CUR_OK +grep -qi "integrator.*curator.*specifier" swarmforge/constitution/articles/workflow.prompt && echo WF_OK +grep -c "^window" swarmforge/swarmforge.conf # 9 +``` +Expected: `CUR_IDLE_OK`, `BUDGETS_OK`, zero 150/300, `SPEC_CUR_OK`, `WF_OK`, and **9** windows (specifier, coder, ux-engineer, cleaner, architect, hardender, QA, integrator, curator). 
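
The budget greps in the verification above match digit substrings, so a stale value such as `160` or `600` would also satisfy `grep -Eq "60"`. A stricter check can be sketched as follows, assuming a hypothetical helper named `has_budget` that accepts the number only when it is not part of a longer digit run:

```bash
# Hypothetical helper (not in the plan): succeed only if FILE contains NUMBER
# as a standalone number, i.e. not embedded in a longer run of digits.
has_budget() {
  local file=$1 number=$2
  grep -Eq "(^|[^0-9])${number}([^0-9]|\$)" "$file"
}

# Possible use in the verify step (commented; paths are the plan's own):
# has_budget swarmforge/roles/curator.prompt 60 &&
#   has_budget swarmforge/roles/curator.prompt 40 && echo BUDGETS_OK
```

This still cannot tell which budget a number belongs to; it only narrows the false positives of the substring match.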
+ +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles/curator.prompt swarmforge/roles/specifier.prompt swarmforge/constitution/articles/workflow.prompt swarmforge/swarmforge.conf +git commit -m "feat(roles): terminal curator; integrator->curator->specifier chain (ADR 0013)" +``` + +--- + +## D11: ADR 0015 — platform-feasibility stop rule + +**Files:** Modify `swarmforge/constitution/articles/workflow.prompt` + +- [ ] **Step 1: Add the stop rule** + +Append to `workflow.prompt`: + +``` +## Platform Feasibility +- When the spec and the platform conflict — the spec calls for a capability the target platform does not provide — stop and report instead of working around it. A workaround comment ("we can't do X here, so we do Y") is a defect, not a resolution. Wait for a spec revision. +``` + +- [ ] **Step 2: Verify** + +```bash +grep -qi "platform" swarmforge/constitution/articles/workflow.prompt && grep -qi "workaround.*defect" swarmforge/constitution/articles/workflow.prompt && echo OK +``` +Expected: `OK`. + +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/constitution/articles/workflow.prompt +git commit -m "feat(workflow): platform-feasibility stop rule (ADR 0015)" +``` + +--- + +## D12: ADR 0016 — cleaner boundary-file scan + +**Files:** Modify `swarmforge/roles/cleaner.prompt` + +- [ ] **Step 1: Add the boundary-file rule** + +Recover the cleanest wording from `feat/baseline-scenarios-six:swarmforge/roles/cleaner.prompt`. After the ">100 mutation sites → split" rule, add: + +``` +- Also run the mutation scan/count mode on boundary files (the environmentally unsuitable modules excluded from the test tools). If a boundary file exceeds ~15 mutation sites, it holds implementation logic, not adaptation — extract that logic to a testable module before handoff. +- Treat a test that asserts only a stripped or simplified view of output (e.g. ANSI-stripped text when the real output carries escape codes) as not covering the un-stripped behavior. 
Add coverage for the full output. +``` + +- [ ] **Step 2: Verify** + +```bash +grep -qi "boundary" swarmforge/roles/cleaner.prompt && grep -q "15" swarmforge/roles/cleaner.prompt && echo BOUNDARY_OK +grep -qi "stripped" swarmforge/roles/cleaner.prompt && echo STRIPPED_OK +``` +Expected: `BOUNDARY_OK`, `STRIPPED_OK`. + +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/roles/cleaner.prompt +git commit -m "feat(cleaner): boundary-file mutation scan at ~15 sites; stripped-view anti-pattern (ADR 0016)" +``` + +--- + +## D13: hardener rendering-invariant property tests (manifest row, no ADR) + +Unmanifested divergence found in audit; consistent with ADR 0007/0010. **Files:** Modify `swarmforge/roles/hardender.prompt` + +- [ ] **Step 1: Add the rendering-invariant line** + +Recover the exact text from `backup/six-pre-reset:swarmforge/roles/hardender.prompt` (~L18) and merge it in (don't lift the whole file). Rule: for pure rendering functions (state → string, no side effects), add property tests asserting structural invariants — required elements present per state, character set bounded to the declared vocabulary, mutually exclusive states never co-rendered. Confirm D2 already stripped Startup Tools and the unauthorized "merge all queued architect handoffs together" line is absent (keep upstream's sorted-filename batch). + +- [ ] **Step 2: Verify** + +```bash +grep -qi "rendering" swarmforge/roles/hardender.prompt && grep -qi "property test\|invariant" swarmforge/roles/hardender.prompt && echo OK +grep -c "merge all queued architect handoffs" swarmforge/roles/hardender.prompt # 0 +``` +Expected: `OK`, zero unauthorized merge-all line. + +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/roles/hardender.prompt +git commit -m "feat(hardener): property tests for pure rendering functions (manifest row)" +``` + +--- + +## D14: ADR 0018 — root `swarm` upgrade subcommand + self-url + +The main script half (skill install) is C8. This is the runnable-branch half. 
**Files:** Modify the root `swarm` bootstrap (exists on `six-pack`) + +- [ ] **Step 1: Inspect current + recover the target deltas** + +Run: `git show origin/six-pack:swarm | head -60` +Run: `git show 8994322:swarm 2>/dev/null | head -120` (adds `upgrade`/`write_source_branch`/`download_from_main`) and `git show ded6019:swarm 2>/dev/null | head -40` (self-url). +Merge the minimal deltas onto the current six-pack root `swarm`: +- `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"` (self-referencing; replaces hardcoded `unclebob/swarm-forge`). +- `download_from_main` (refresh scripts + skills from `main`). +- `write_source_branch` (record the runnable source branch in `.swarmforge/source-branch`). +- The `upgrade` subcommand: refresh scripts(main) + prompts(`source-branch`) + force skill reinstall (clear `.swarmforge/skills-installed`). + +- [ ] **Step 2: Verify** + +```bash +grep -q "gabadi/swarm-forge" swarm && echo SELF_URL_OK +{ grep -q "upgrade)" swarm || grep -q '"upgrade"' swarm; } && echo UPGRADE_OK +{ zsh -n swarm 2>/dev/null || bash -n swarm; } && echo SYNTAX_OK +``` +Expected: `SELF_URL_OK`, `UPGRADE_OK`, `SYNTAX_OK`. + +- [ ] **Step 3: Commit** + +```bash +git add swarm +git commit -m "feat(swarm): self-url + upgrade subcommand with source-branch tracking (ADR 0018)" +``` + +--- + +## Finalize PR 2 (SIX-PACK) + +- [ ] **Step 1: Whole-track verification** + +```bash +grep -c "^window" swarmforge/swarmforge.conf # 9, in order +for r in specifier coder ux-engineer cleaner architect hardender QA integrator curator; do + test -f "swarmforge/roles/$r.prompt" || echo "MISSING role file: $r" +done; echo ROLES_CHECKED +git diff --stat origin/six-pack # review: only prompts/articles/templates/conf/swarm changed +``` +Expected: 9 windows; all 9 role files present; `ROLES_CHECKED`; the diff touches only six-pack-owned files. 
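
The `grep -c "^window"` check above counts windows but does not confirm their order. A minimal sketch of an order check, assuming the second whitespace-separated field of each `window` line is the window name (as in `window ux-engineer codex ux-engineer`); the helper name `window_order` is hypothetical:

```bash
# Hypothetical helper: print the window names from a conf file, in file order,
# joined by single spaces. Assumes field 2 of each "window" line is the name.
window_order() {
  awk '$1 == "window" { printf "%s%s", sep, $2; sep = " " } END { print "" }' "$1"
}

# Expected pipeline order, taken from the plan's nine-window list.
expected="specifier coder ux-engineer cleaner architect hardender QA integrator curator"

# Possible use in the whole-track verification (commented to avoid side effects):
# [ "$(window_order swarmforge/swarmforge.conf)" = "$expected" ] && echo ORDER_OK
```

Comparing one joined string keeps the check to a single line and reports a reordering that a bare count would miss.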
+ +- [ ] **Step 2: Push the branch** + +```bash +git push -u origin feat/fork-divergences-six-pack +``` + +- [ ] **Step 3: Open the single PR** + +```bash +gh pr create --base six-pack --repo gabadi/swarm-forge \ + --title "feat: fork divergences — six-pack prompts + constitution + conf" \ + --body "Re-applies the six-pack fork divergences on pristine upstream, one commit per ADR: 0002 idle-gate, 0003 startup-strip, 0004 back-routing, 0009 spec header, 0011 fidelity manifest, 0010 surface harness, 0005 refute QA, 0007 UX engineer, 0008 integrator, 0013 curator, 0015 platform-feasibility, 0016 cleaner boundary scan, hardener invariants, 0018 root swarm upgrade. Final pipeline: specifier→coder→ux-engineer→cleaner→architect→hardener→QA→integrator→curator (9 windows). DESIGN.md reference-only; curator budgets 60/40; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Section B)." +``` + +--- + +## Out of scope (explicitly NOT implemented) + +- **four-pack PR** — frozen (manifest 2026-06-14): pure merge-mirror of `upstream/four-pack`. The issue-20 spec's "PR D on four-pack" is **dropped**. +- **cmux multiplexer** (`swarm-mux.sh`, `write_deliver_script`/`write_notify_script`/`write_stop_hook`, `MUX_TARGETS`) — DROPPED; stay on upstream's tmux harness. +- **Ideas G, H, I** — genuinely rejected, no recovery. +- **DESIGN.md scaffolding** — ADR 0007 wins: reference-from-feature-file only; recovered roles STRIP scaffold-on-absence + walk-up. +- **curator budgets 150/300** — superseded by the locked spec's 60/40. 
+ +--- + +## Self-Review + +**Spec coverage** (manifest sections A/B/C + cross-cutting): +- Section A (main → PR 1): 0006→C6, 0012→C4, 0014→C3, 0013-skill→C9, 0003→C11 ✓ +- Section B (six-pack → PR 2): 0002→D1, 0009→D4, 0011→D5, 0010→D6, 0005→D7, 0004→D3, 0007→D8, 0008→D9, 0013→D10, 0015→D11, 0016→D12, hardener-row→D13 ✓ +- Section C (uncaptured): B/0017→C2, F/0020→C5, J→C9, N/0018→C8+D14, O→C11, auto-permission/0019→C1, executing-fields→C7, retro-triage/0021→C10, self-url→D14 ✓ +- Cross-cutting: observation-harness shared (D6 QA re-exec, D8 ux-engineer writes, D13 hardener honors); N=3 back-route (D3, carried by D8/D9); refute+surface QA merged across D6→D7; DESIGN.md reference-only (D8); curator chain order (D10) ✓ + +**Structure:** exactly two branches (`feat/fork-divergences-main`, `feat/fork-divergences-six-pack`), one PR each; per-divergence commits in a linear, dependency-correct order on each branch; no per-ADR branches, no four-pack PR. + +**Within-branch ordering:** MAIN — only hard dep is C3 after C2; all `swarmforge.sh` commits are linear so no in-file conflict. SIX-PACK — D1:` + specific STRIP/FIX deltas; verification commands are concrete with expected output. + +**Naming consistency:** `resolve_prompt_bundle`, `write_agent_instruction_file`, `write_worktree_advisor`, `write_worktree_permissions`, `ensure_skills_installed`, `install_skills` consistent across C2/C3/C4/C5/C8/C11; markers `.swarmforge/setup-complete` / `.swarmforge/skills-installed` consistent; conf window names match across D8/D9/D10. + +**Known soft spots to confirm during execution (not blockers):** +- C2/C3/C4 line numbers drift — locate by function name. +- C7 executing-fields: find the actual executing-entry write site in the upstream handoff scripts (NOT a `swarmforge.sh` heredoc as on the cmux lineage). +- C6 QA holdout path (`qa-e2e`) must match the specifier-authored path — keep the one `QA_HOLDOUT_PATH` constant as the single source of truth. 
+- D10 curator: budgets are 60/40 from the locked spec, not 150/300. From 5514154a506aeee3a191008979596313b8a04b53 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 13:43:05 -0300 Subject: [PATCH 16/67] docs(adr-0001): upstream baseline anchor + squash-divergence/merge-upstream policy Record the pristine-upstream baseline explicitly (manifest SHA line + fork-base/- tag) since a hard reset erases merge-history anchoring. Fork divergences are squash-merged (one clean commit each); upstream syncs are history-preserving merges (never squash/rebase). Resolves manifest still-open #5 (PR shape). Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/adr/0001-permanent-fork-synced-by-merge.md | 4 ++++ docs/fork-change-manifest.md | 4 +++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/adr/0001-permanent-fork-synced-by-merge.md b/docs/adr/0001-permanent-fork-synced-by-merge.md index 9895a6d..ef0d616 100644 --- a/docs/adr/0001-permanent-fork-synced-by-merge.md +++ b/docs/adr/0001-permanent-fork-synced-by-merge.md @@ -5,3 +5,7 @@ status: accepted # Permanent fork of unclebob/swarm-forge, synced by merge This repo is a permanent fork of `unclebob/swarm-forge` (remote `upstream`); nothing is contributed back. Upstream moves fast, so we keep current by **merging** `upstream/` into our branches — never rebasing — because the fork is published/shared and rebasing would rewrite shared history and re-surface every conflict on each sync. `git rerere` is enabled (`rerere.enabled`, `rerere.autoupdate`) so conflict resolutions replay automatically. Every divergence should be **additive** (a new file or an appended rule) and recorded as its own ADR in this directory; a non-additive edit to an upstream line is a conscious, documented cost. Two branches are maintained: `main` (shared scripts + these docs) and `six-pack` (runnable: role prompts, `swarmforge.conf`, templates). 
+ +Because the fork can be hard-reset back to a pristine upstream commit (see the `backup/*-pre-reset` branches), the merge history that would otherwise encode the integration point is not a dependable anchor. The upstream baseline each branch's fork layer is re-applied onto is therefore recorded **explicitly**: a SHA line in `docs/fork-change-manifest.md` and an annotated `fork-base/-` tag at each sync, both surviving a reset. + +Two merge styles, by source: **fork divergences are squash-merged** — every divergence PR lands as a single commit, so the fork layer reads as one clean, revertible, re-appliable commit per divergence. **Upstream is integrated by a history-preserving merge** — never squashed and never rebased, so upstream's commit story stays intact and `rerere`-replayable. A landed commit is never rewritten afterward. diff --git a/docs/fork-change-manifest.md b/docs/fork-change-manifest.md index d6fb0d7..6c61f45 100644 --- a/docs/fork-change-manifest.md +++ b/docs/fork-change-manifest.md @@ -4,6 +4,8 @@ Compact, permanent record of **every divergence to apply on top of a pristine `u ## Sync policy (ADR 0001) +- **Current upstream baseline (re-apply the fork layer onto this):** `main` ← `upstream/main` @ `d947f67` · `six-pack` ← `upstream/six-pack` @ `cbd1697` (2026-06-14). Bump these on every sync; an annotated `fork-base/-` tag pins the same commit so the anchor survives a hard reset (merge history alone does not — this fork has been reset before). +- **Merge style by source:** every **fork divergence is squash-merged** (one divergence PR → one clean commit on the delivery branch). **Upstream syncs are history-preserving merges** — never squashed, never rebased (keep upstream's story; `rerere` replays conflicts). The two initial re-implementation PRs follow the same squash rule (one squashed commit per branch). A landed commit is never rewritten. 
- `main`, `six-pack`, `four-pack` are kept **identical to `upstream/`** and advanced by **merge** (`git merge upstream/`), never rebase. `rerere` replays conflict resolutions. - **four-pack is frozen (decision 2026-06-14): no fork divergences are applied to it.** Only `main` and `six-pack` carry changes. (Open: whether four-pack is still resynced to upstream to honor "keep == upstream", or left as-is — see below.) - Every item below is **additive** (new file or appended rule) wherever possible; a non-additive edit to an upstream line is marked **[edit]** and is a conscious, documented conflict point. @@ -107,7 +109,7 @@ Also unimplemented draft, not a divergence: `backup/main-pre-reset:docs/proposal 2. **ADR 0002 clear-first on six-pack** — RESOLVED: the model column is **configuration** (governed by ADR 0012's per-role model), not an architectural decision. No codex-hook work is added. ADR 0002 stands as written — clear-first is claude-first; codex roles keep upstream delivery as a documented property. 3. *(resolved earlier)* cmux **DROPPED** (stay on upstream tmux harness); Idea-B bundle-inlining **KEPT** but disentangled — port `resolve_prompt_bundle` + XML envelope onto upstream's harness, re-base executing-fields/M3 on it. ADR 0012 `--advisor` resolved (`advisorModel` in `settings.local.json`). 4. **four-pack** — RESOLVED: kept as a **pure merge-mirror of `upstream/four-pack`** (no fork content ever) to honor ADR 0001's "all branches == upstream"; resync via merge-only. -5. **PR shape for implementation** — DEFERRED to implementation time (does not affect the ADR set). Note the one-difference-per-ADR rule; likely grouped by layer + dependency (B → 0014/M3 → executing-fields ordered). +5. **PR shape for implementation** — RESOLVED (2026-06-14): **one PR per delivery branch**, not one per ADR. Two PRs total — `main` (script + skill layer) and `six-pack` (prompts/constitution/conf/root-swarm); no four-pack PR (frozen). 
Each divergence is an **ordered commit** within its branch (dependency-linear), keeping the single PR tailored. These two initial PRs may be squash-merged (see Sync policy). Full task breakdown: `docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md`. **Overriding constraint (all items):** keep the diff vs upstream as small as possible — translate to the minimal additive form, do not lift the pre-reset implementation. See `feedback-minimize-upstream-diff` memory. From 11e352afd209b3c4dba68d399d6dd7b39bf0d6c3 Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 14 Jun 2026 13:46:03 -0300 Subject: [PATCH 17/67] docs: ADRs for fork divergences vs upstream (#28) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * docs: ADRs for fork divergences vs upstream Record only actual discrepancies vs unclebob/swarm-forge, one ADR each: - 0001 permanent fork, synced by merge (sync policy: merge not rebase, additive, rerere) - 0002 idle gate + clear-first Stop-hook delivery (the only engine discrepancy; presence side channel for the awake notification) Plus CONTEXT.md: minimal glossary of fork-specific terms only. 
Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0002): settle two-case delivery; universal re-injection; Claude Code first - idle/busy marker is the one new piece; reuse upstream's queue as the inbox - two cases: busy -> Stop hook delivers on next stop; idle -> deliver now (clear first either way) - /clear clears the session for any agent -> re-injection is universal (no backend split) - implement via Claude Code hooks first; codex/grok delivery pending Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr): capture five accepted pipeline divergences + one open item ADRs 0003-0008 define fork divergences vs upstream (definition only, not implementation): - 0003 setup is a one-time, stack-aware skill (additive file, no merge conflict); run path does no project setup - 0004 rework routes back to the stage whose decision it exposes; local fixes stay with finder; at most one bounce - 0005 QA refutes (assumes build fails spec + tests too weak) rather than confirms; includes conversion fidelity - 0006 harness-enforced holdout — status: proposed (open item) - 0007 UX Engineer role that fixes against UX Intent (six-pack) - 0008 terminal Integrator lands only behind a green CI gate, no local-merge fallback CONTEXT.md gains glossary terms: Setup skill, Integrator, UX Engineer, UX Intent, Refuting QA, Back-routing. Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0006): decide harness-enforced holdout via sparse-checkout Settle the open item as accepted. 
The end-to-end QA suite is held out mechanically, not by prompt instruction: - mechanism: git sparse-checkout excludes the suite's pinned path from each role worktree (absent from disk, still tracked in the commit) -- rejecting rm-from-worktree (commit would drop it) and separate-branch (more flow, no extra protection) - scope: hidden from all implementer roles; visible only to the specifier (authors it) and QA (runs it) - precondition: specifier writes the suite under a pinned path; existing coder "ignore it" line stays as defense-in-depth Backed by verification_loop.md: holdout leakage "must be enforced architecturally," not instructionally. Adds CONTEXT.md term "QA holdout". Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0009): feature files open with a structured spec header Record the feature-template divergence (idea L) from the real artifact on backup/six-pre-reset, not the idea-file summary: - structured comment header above the Gherkin: TRACKING, CONTRACT, CONSTRAINTS, SEQUENCING, NFR, SIDE EFFECTS, SCOPE (each with an Ask:/Format:); the spec-authoring layer the scenarios can't carry - six-pack adds an 8th section, UX INTENT (home for ADR 0007); four-pack has the 7 only, since the UX Engineer is six-pack-only - address every section; SEQUENCING/SIDE EFFECTS/UX INTENT default to `none` (a deliberate answer; `none` UX INTENT = UX Engineer skips) - impl note: six-pack specifier prompt still says "seven sections" but the template has eight -- fix to eight/all on landing Adds CONTEXT.md term "Spec header". 
Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0010,0011): split observation harness into surface + fidelity Idea P recorded from the real backup/six-pre-reset artifacts, split into two independent divergences (plus idea Q folded into 0010): - 0010 surface-harness doctrine: live-verification roles (QA both packs, UX Engineer six-pack) drive the running system through its real surface via a declared per-surface tool (surface tool table in engineering.prompt); every surface carries a mandatory idle baseline scenario -- the direct fix for the tetris idle-state defects. Replaces QA's "through the UI only" with "through the declared surface harness" + every Expected bullet -> assertion or NOT AUTOMATED (idea Q). No surface field in project.prompt (resolves the stale monolith table row) - 0011 dependency-fidelity-manifest: new dependency-manifest.prompt declaring deps by tier 1/2/3 with machine-readable gaps the specifier and QA refuse to build on; twin-authorship + post-interaction-state rules; specifier-owned, defaults (none) 0005 gains a cross-ref: conversion fidelity is audited by 0010's bullet->assertion-or-NOT-AUTOMATED rule. CONTEXT.md gains: Surface harness, Baseline scenario, Fidelity manifest, Dependency tier. Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0012): per-role model/effort/advisor in swarmforge.conf Record idea U from the backup design. Optional inline key=value tail on window lines (existing 4-field lines unchanged; upstream hard-rejects !=4 fields, so this is a real but backward-compatible parser change): - model (all backends), effort (claude/copilot/grok), advisor (claude only); unsupported keys silently ignored - per-role not per-backend; no pre-populated values (fully opt-in) - lands on main (swarmforge.sh); runnable configs stay topology-only - impl note: verify `claude --advisor` actually exists; treat as reserved-but-inert if not No new CONTEXT.md vocabulary. 
Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr): recapture lost role detail; lean ADR style (divergence + why only) Audit of the real backup/six-pre-reset prompts vs the intent-written ADRs surfaced lost decisions and contradictions; fix them and adopt the house style (no rejected-options sections, no historical/legacy notes): - 0007: add the universal visual-quality bar (AI-aesthetic anti-patterns, type hierarchy, WCAG contrast) the UX Engineer enforces regardless of project input; make the durable artifact concrete (observation-harness/, golden snapshots, rendering invariants); DESIGN.md is referenced-only (no scaffold, no walk-up) - 0010: add the observation-harness/ committed regression record that QA re-executes - 0008: integrator hands off to the curator; post-merge gate; N=3 cap - 0004: two-scope cap -- one bounce per finding, N=3 cycles per feature - 0005: conversion fidelity audited via 0010's bullet->assertion rule - strip "## Considered options" from every ADR; remove legacy mapping - CONTEXT.md: add Observation harness; update Back-routing for the caps Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0013,0014): curator role + .agents knowledge injection (idea V) Record the knowledge-promotion loop (issue #20), split into two divergences: - 0013 curator: terminal role after the integrator that promotes session retros to versioned repo knowledge via one self-merging PR, then releases the specifier. 
Capture-everything with scope tags; single discard gate (non-inferable check); routing ladder (gate > AGENTS.md > role file > reference > skill > upstream > ledger); append-only ledger; budgets 60/40; integrator->curator->specifier; empty run never stalls - 0014 .agents contract: AGENTS.md + .agents/ written only by the curator, versioned in the repo (not ~/.claude), injected into each role bundle at launch (AGENTS.md for all, role file role-scoped; references by pointer); missing files silently skipped CONTEXT.md: add Curator, Promoted knowledge, Knowledge ledger. Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0015,0016): platform-feasibility stop rule + boundary-logic detection Final pipeline divergences (ideas R, S): - 0015 R: constitution rule (workflow.prompt) -- a spec requirement that conflicts with real platform capability stops and reports to the user; a workaround comment in code is the smell the rule fired and was suppressed. Narrow to spec-vs-platform, binds all roles. - 0016 S: cleaner (six-pack) / refactorer (four-pack) also scans boundary files at a ~15-20 mutation-site threshold (vs 100 for testable files); above = logic in an adapter shell, extract before handoff. Plus the "tested only through a stripped view = untested" anti-pattern. Idea T (evidence-as-code) is covered by 0010's observation-harness loop; G/H/I are rejected (no divergence, no ADR). Co-Authored-By: Claude Opus 4.8 (1M context) * docs: fork change manifest + migration recovery docs (impl playbook) Permanent rebase index (one row per divergence: change + target + recover-from branch:path + ADR) plus per-row recovery detail for the main script layer, six-pack role prompts, and the setup-swarm skill. Records the section-C ADR assignments and the decisions settled in the what's-missing pass. 
Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0017,0018,0019,0020,0021): document section-C script-layer divergences Promote the uncaptured main-side script divergences to ADRs: inlined prompt bundle (idea B, translated onto the tmux harness), swarm self-pin/upgrade (idea N), autonomous permission mode (--permission-mode auto, verified real), worktree auto-compaction (idea F), and the retro-triage operator skill. Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0002,0003,0010,0013): extend ADRs + CONTEXT glossary 0002 session-restart executing {message,hash,sender}; 0013 session-retro transcript capture + before-idle trigger (idea J); 0003 setup-swarm rename, setup-first/guard model (idea-K dropped) + install scaffold (idea O); 0010 hardener rendering-invariant property tests. Glossary: setup-swarm, swarm-ready marker, session retro, retro-triage, prompt bundle. Co-Authored-By: Claude Opus 4.8 (1M context) * docs: fix cross-doc inconsistencies from code review Resolve six consistency/accuracy issues across the fork-divergence ADRs: - four-pack freeze: scrub "apply to four-pack" instructions from ADRs 0008/0009/0010/0011/0013/0015/0016 (frozen per ADR 0001 / manifest) - logbook.json -> logbook.jsonl (real runtime file; handoff-lib.sh:33) - ADR 0012 advisor: no --advisor flag; written as advisorModel into settings.local.json (per backup write_worktree_advisor) - manifest Section C: drop stale "16 ADRs / no ADR" (now 0017-0021 assigned) - CONTEXT.md: remove leaked memory-slug wikilink, keep ADR 0003 ref - ADR 0006 holdout: key sparse-checkout on specifier role, not the master worktree name (ADR 0008 renames it to specifier) Co-Authored-By: Claude Opus 4.8 (1M context) * docs: fork divergence implementation plan — 2 PRs, one per delivery branch Task-by-task plan (C1-C11 on main, D1-D14 on six-pack) re-applying every documented divergence onto pristine upstream. One PR per delivery branch, each divergence an ordered squashed commit. 
four-pack frozen; cmux dropped. Co-Authored-By: Claude Opus 4.8 (1M context) * docs(adr-0001): upstream baseline anchor + squash-divergence/merge-upstream policy Record the pristine-upstream baseline explicitly (manifest SHA line + fork-base/- tag) since a hard reset erases merge-history anchoring. Fork divergences are squash-merged (one clean commit each); upstream syncs are history-preserving merges (never squash/rebase). Resolves manifest still-open #5 (PR shape). Co-Authored-By: Claude Opus 4.8 (1M context) --------- Co-authored-by: Claude Opus 4.8 (1M context) --- CONTEXT.md | 97 ++ .../0001-permanent-fork-synced-by-merge.md | 11 + ...0002-idle-gate-and-clear-first-delivery.md | 29 + docs/adr/0003-setup-is-a-one-time-skill.md | 23 + docs/adr/0004-rework-routes-back.md | 18 + docs/adr/0005-qa-refutes-not-confirms.md | 19 + docs/adr/0006-harness-enforced-holdout.md | 23 + docs/adr/0007-ux-engineer-role.md | 25 + docs/adr/0008-integrator-role.md | 24 + docs/adr/0009-feature-file-spec-header.md | 20 + docs/adr/0010-surface-harness-doctrine.md | 28 + docs/adr/0011-dependency-fidelity-manifest.md | 20 + .../adr/0012-per-role-model-effort-advisor.md | 35 + docs/adr/0013-curator-knowledge-promotion.md | 28 + docs/adr/0014-agents-knowledge-injection.md | 17 + .../0015-platform-feasibility-stop-rule.md | 15 + docs/adr/0016-boundary-logic-detection.md | 15 + docs/adr/0017-inlined-prompt-bundle.md | 20 + .../0018-swarm-pins-and-upgrades-itself.md | 20 + docs/adr/0019-autonomous-permission-mode.md | 17 + docs/adr/0020-worktree-auto-compaction.md | 15 + docs/adr/0021-retro-triage-skill.md | 20 + docs/fork-change-manifest.md | 115 ++ docs/migrations/0003-setup-skill-sources.md | 55 + docs/migrations/main-script-layer.md | 34 + docs/migrations/six-pack-role-prompts.md | 56 + ...26-06-14-fork-divergence-implementation.md | 1232 +++++++++++++++++ 27 files changed, 2031 insertions(+) create mode 100644 CONTEXT.md create mode 100644 docs/adr/0001-permanent-fork-synced-by-merge.md 
create mode 100644 docs/adr/0002-idle-gate-and-clear-first-delivery.md create mode 100644 docs/adr/0003-setup-is-a-one-time-skill.md create mode 100644 docs/adr/0004-rework-routes-back.md create mode 100644 docs/adr/0005-qa-refutes-not-confirms.md create mode 100644 docs/adr/0006-harness-enforced-holdout.md create mode 100644 docs/adr/0007-ux-engineer-role.md create mode 100644 docs/adr/0008-integrator-role.md create mode 100644 docs/adr/0009-feature-file-spec-header.md create mode 100644 docs/adr/0010-surface-harness-doctrine.md create mode 100644 docs/adr/0011-dependency-fidelity-manifest.md create mode 100644 docs/adr/0012-per-role-model-effort-advisor.md create mode 100644 docs/adr/0013-curator-knowledge-promotion.md create mode 100644 docs/adr/0014-agents-knowledge-injection.md create mode 100644 docs/adr/0015-platform-feasibility-stop-rule.md create mode 100644 docs/adr/0016-boundary-logic-detection.md create mode 100644 docs/adr/0017-inlined-prompt-bundle.md create mode 100644 docs/adr/0018-swarm-pins-and-upgrades-itself.md create mode 100644 docs/adr/0019-autonomous-permission-mode.md create mode 100644 docs/adr/0020-worktree-auto-compaction.md create mode 100644 docs/adr/0021-retro-triage-skill.md create mode 100644 docs/fork-change-manifest.md create mode 100644 docs/migrations/0003-setup-skill-sources.md create mode 100644 docs/migrations/main-script-layer.md create mode 100644 docs/migrations/six-pack-role-prompts.md create mode 100644 docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md diff --git a/CONTEXT.md b/CONTEXT.md new file mode 100644 index 0000000..c5fe15f --- /dev/null +++ b/CONTEXT.md @@ -0,0 +1,97 @@ +# SwarmForge Fork + +A permanent fork of `unclebob/swarm-forge` (rationale in `docs/adr/`). This glossary holds only terms whose fork-specific meaning is already settled; terms are added as decisions are made, not in advance. 
+ +## Language + +**Idle gate**: +The rule that a role does nothing until it receives a handoff — no startup work, scanning, installing, or self-assigned tasks. The single line is "Wait for a handoff. Do not act without one." +_Avoid_: startup guard, wait condition + +**Ready notification** (presence signal): +The startup "I'm awake" message each role sends to the specifier. Informational only — it tells the operator the role launched. Stamped a distinct `presence` type and excluded from the _Delivery sequence_; in the fork's idle model readiness is implicit (a role at idle with an empty queue is ready). +_Avoid_: awake handoff, ready handoff + +**Delivery sequence**: +The steps that start a work handoff on a receiver: `/clear` → re-inject the role bundle → send the task message. Runs for work handoffs only, never for presence pings. Delivered immediately if the receiver is idle, or by its Stop hook when it next stops if busy. (Upstream instead types the message straight into the terminal with no clear.) +_Avoid_: inject, dispatch + +**Prompt bundle** (role bundle): +The single, structured context a role launches and re-launches with: its constitution and role prompt resolved into one deduplicated XML envelope, into which _promoted knowledge_ (`AGENTS.md` + the role's `.agents/` file) is also injected. It is the unit re-sent on every _delivery sequence_ after `/clear`, not just built once at launch. (Upstream concatenates the prompt files with a plain recursive read, with no dedup or structure.) +_Avoid_: context blob, prompt file, instruction file + +**setup-swarm** (the skill): +The one-time, stack-aware step that makes a project swarm-ready — installs the project's language quality tools, enables session tracking, grants the agents' permissions, pins skill versions, and emits the _swarm-ready marker_. Ships inside the swarm install. **It is the operator's first action on a project** (`/setup-swarm`), before the run path is ever invoked. 
The run path (`./swarm`) performs no **project provisioning** and never triggers the skill itself; it only **guards** — if the marker is absent it refuses and tells the operator to run `setup-swarm` first. (It still bootstraps the swarm's *own* runtime skills automatically — launcher infrastructure, distinct from project provisioning.) (Upstream instead installs tooling per-role at startup.) +_Avoid_: setup skill, preflight, bootstrap, onboarding + +**Swarm-ready marker**: +The file (`.swarmforge/setup-complete`) that `setup-swarm` writes to record that a project has been made swarm-ready. The run path guards on its presence; the operator deletes it to force a re-run. There is no `./swarm setup` subcommand. +_Avoid_: setup flag, ready file, lock + +**Integrator**: +The terminal role that lands finished work. From the QA-approved commit it opens a pull request, gates on CI, merges only on green, runs the post-merge verification, and notifies the specifier — one PR per feature. It never merges locally: CI is a hard precondition, so a project without CI is not swarm-ready (setup ensures CI; see ADR 0003). CI failures route to the owning role via [[back-routing]]. (Upstream has no integrator — the specifier merges ad hoc.) +_Avoid_: merger, releaser, deployer + +**UX Engineer** (six-pack only): +The role, immediately after the coder, that runs the built product and fixes visual/usability mismatches in rendering code (leaving a regression check behind) — an engineer that fixes, not a flag-only reviewer. Checks against the feature's _UX Intent_ and any optional design inputs the feature references. Skips (passes through) when the feature has no UX Intent. Routes back to the coder via [[back-routing]] when a fix needs a model-state change. Framework-agnostic; the visual-testing tool is named by the constitution. 
+_Avoid_: UX Reviewer, designer + +**UX Intent**: +The section the specifier authors inline in the feature file stating, in concrete observable terms, what a feature should look and feel like. Part of the swarm and the _UX Engineer_'s primary target. Distinct from optional project design inputs (DESIGN.md, EXPERIENCE.md, mockups) — those are not swarm-owned; the specifier merely references them from the feature file when they exist. +_Avoid_: design spec, UX requirements + +**Refuting QA**: +QA's posture in the fork: assume the build does not meet the spec and the acceptance tests are too weak to notice, until proven otherwise — attack the specified contract rather than run a checklist and confirm. Bounded by the spec (unspecified gaps route back to the specifier, they are not QA pass/fail). Includes _conversion fidelity_: a QA procedure converted into an executable script must encode the procedure's full intent, not a green version that asserts nothing (_test theater_). (Upstream QA confirms the spec is met and fixes what fails.) +_Avoid_: verification, acceptance check, confirm + +**Back-routing**: +Sending rework back to the stage whose decision it exposes as flawed, instead of resolving it where it was found. The trigger is any finding that an earlier stage's work must change — a bug, a refactor blocked by a bad earlier decision, or a design/spec revision. Applies only to _structural_ rework (re-opening an earlier stage's job: an ambiguous/missing spec, a weak/missing test, a design that can't hold the behavior); _local_ work the finder can resolve without re-opening an earlier decision stays with the finder. Two caps: a single finding bounces back at most once, and a feature tolerates at most three back-route cycles total (N=3, tracked by a routing count in the handoff) before the role stops and asks the user. (Upstream fixes everything in place.) 
+_Avoid_: rejection, escalation, bounce, defect back-routing + +**QA holdout**: +The end-to-end QA suite kept physically out of reach of every role that shapes the implementation, so it stays a blind test. The harness sparse-checks-out the suite's pinned path from each role worktree except the specifier's (which authors it) and QA's (which runs it) — present in the commit, absent from disk. Distinct from upstream's prompt-level "ignore it," which leaves the files in the coder's worktree. Covers only the end-to-end QA suite; the Gherkin acceptance tests stay visible because the coder builds and runs them. (Upstream walls it by instruction only.) +_Avoid_: hidden tests, secret suite, test isolation + +**Spec header**: +The structured block of comment sections the specifier fills in at the top of every feature file, above the Gherkin scenarios — the spec-authoring layer that states what the scenarios cannot: contract, constraints, sequencing, NFRs, side effects, scope (and, six-pack only, _UX Intent_). The scenarios are the contract by example; the spec header is the contract's surrounding intent. Every section is addressed; several default to `none` (a deliberate answer). Comments only, so the Gherkin parser ignores them. (Upstream feature files are pure Gherkin with no header.) +_Avoid_: preamble, comment block, feature description + +**Surface harness**: +The way the live-verification roles (QA always; the _UX Engineer_ on six-pack) drive the running system through its real production interface — a declared per-surface tool (tmux/PTY for a TUI, Playwright for web, an HTTP client for an API, event injection for a headless service) chosen from the constitution's surface tool table. Replaces upstream's mechanically-silent "through the user interface only," which let in-process function calls pass as interface verification. Every surface also carries a _baseline scenario_. The role identifies the surface from the codebase; nothing declares it in `project.prompt`. 
+_Avoid_: UI test, e2e harness, driver + +**Baseline scenario**: +The permanent idle/no-op scenario committed alongside a surface's flow scenarios, asserting the system is stable when nothing is happening — TUI: no input, identical consecutive captures, zero scrollback growth; web: idle load with no console errors; headless: a no-op event changes no state. It catches idle-state defects that flow scenarios never observe because flow scenarios only assert while the user is acting. +_Avoid_: smoke test, idle test, sanity check + +**Observation harness**: +The project `observation-harness/` directory holding the committed, re-runnable surface scenarios — the per-surface _baseline scenario_ plus one set per verified flow — that form the permanent regression record. Authored by the live-verification role (the _UX Engineer_ on six-pack) using the _surface harness_ tool, and re-executed by QA before final verification; a user-facing surface with no scenarios is a finding that routes back. (Upstream has no such artifact.) +_Avoid_: e2e folder, regression dir + +**Fidelity manifest**: +The constitution sub-file (`dependency-manifest.prompt`) declaring every dependency beyond the system itself by _dependency tier_, each as `name: tier N; implementation; gaps: `. A declared gap is binding: the specifier and QA refuse to write or accept any scenario that rests on it, so a known emulator limitation can never pass as covered behavior. Specifier-owned; defaults to `(none)`. +_Avoid_: mock list, dependency doc, services file + +**Dependency tier**: +The fidelity level at which a dependency is provided, declared in the _fidelity manifest_. Tier 1 — owned infrastructure run locally as the real engine (Postgres in Docker); tier 2 — stateful protocol-level emulation (vendor-official > third-party > swarm-built twin as last resort); tier 3 — external domain the swarm does not own, wire-level stubbed against a referenced contract. The system itself is always implicit, never a tier. 
+_Avoid_: mock level, fidelity grade + +**Curator**: +The terminal role, after the integrator, that turns a run's session retros into versioned repo knowledge via one self-merging PR, then releases the specifier for the next feature. Makes no code changes — writes only _promoted knowledge_. An empty run notifies the specifier immediately; the line never stalls on it. (Upstream has no such role; lessons live only in unread retros.) +_Avoid_: librarian, archivist, scribe + +**Promoted knowledge**: +The project-versioned knowledge contract the _curator_ writes and the launcher injects into role bundles: a root `AGENTS.md` (universal invariants + navigation) and `.agents/` (per-role files, references, skills, the enforcement-gate backlog, the _knowledge ledger_). Lives in the repo, not `~/.claude`, so a fresh clone carries every lesson. `AGENTS.md` and the role's file are injected into that role's bundle at launch; references load on demand by pointer. (Upstream bundles only the constitution and role prompt.) +_Avoid_: docs, memory, knowledge base + +**Knowledge ledger**: +`.agents/ledger.md` — the append-only audit the _curator_ writes, one never-pruned line per processed retro item (`date | session-id | role | failure-class | verdict`). Makes recurrence provable: an item rejected before and seen again has proven itself worth promoting. +_Avoid_: changelog, history, log + +**Session retro**: +The single per-role, per-session retrospective the `agent-retro` skill writes (automatically, as each role's last step before idle) to the shared retro pool. A symptom report from one role's one session under a keyhole view — its proposed fixes are hints, never findings. The shared input consumed independently by both the _curator_ and _retro-triage_; neither destroys what the other has not yet seen. 
+_Avoid_: retrospective, session log, postmortem + +**Retro-triage**: +The operator-invoked analysis (the `retro-triage` skill) that turns a *batch* of _session retros_ into a validated, cross-session **root-cause diagnosis** from which a human files issues. Distinct from the _curator_: the curator is autonomous and per-item and fixes "the swarm doesn't *know* X" by promoting agent-facing knowledge into the repo; retro-triage is manual and cross-batch and fixes "the swarm is *structurally doing* X wrong" by surfacing causes no single retro names (and that the per-item curator structurally cannot see) for pipeline/tooling/strategy changes a human must make. Diagnosis is the product, validated against transcripts and git artifacts — not the retros' own framing. +_Avoid_: triage, consolidation, retro processing diff --git a/docs/adr/0001-permanent-fork-synced-by-merge.md b/docs/adr/0001-permanent-fork-synced-by-merge.md new file mode 100644 index 0000000..ef0d616 --- /dev/null +++ b/docs/adr/0001-permanent-fork-synced-by-merge.md @@ -0,0 +1,11 @@ +--- +status: accepted +--- + +# Permanent fork of unclebob/swarm-forge, synced by merge + +This repo is a permanent fork of `unclebob/swarm-forge` (remote `upstream`); nothing is contributed back. Upstream moves fast, so we keep current by **merging** `upstream/` into our branches — never rebasing — because the fork is published/shared and rebasing would rewrite shared history and re-surface every conflict on each sync. `git rerere` is enabled (`rerere.enabled`, `rerere.autoupdate`) so conflict resolutions replay automatically. Every divergence should be **additive** (a new file or an appended rule) and recorded as its own ADR in this directory; a non-additive edit to an upstream line is a conscious, documented cost. Two branches are maintained: `main` (shared scripts + these docs) and `six-pack` (runnable: role prompts, `swarmforge.conf`, templates). 
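The sync policy in the paragraph above can be sketched as a short command sequence. This is a minimal illustration, not the fork's actual tooling: the helper name `upstream_sync_commands` and its `branch` argument are hypothetical, and the remote is assumed to be named `upstream` as the ADR states. The function only builds the git invocations so they can be inspected before anything is run.

```python
from typing import List


def upstream_sync_commands(branch: str) -> List[List[str]]:
    """Build the git commands for one merge-based upstream sync.

    Hypothetical helper: `branch` is whichever upstream branch is tracked.
    """
    return [
        # Record conflict resolutions and replay them on later syncs.
        ["git", "config", "rerere.enabled", "true"],
        ["git", "config", "rerere.autoupdate", "true"],
        # Bring in upstream's latest commits without touching local branches.
        ["git", "fetch", "upstream"],
        # Merge, never rebase: shared history is not rewritten.
        ["git", "merge", f"upstream/{branch}"],
    ]


if __name__ == "__main__":
    # "main" is only an example branch name for the printout.
    for argv in upstream_sync_commands("main"):
        print(" ".join(argv))
```

Keeping the sequence as data makes the "merge, not rebase" rule easy to check mechanically: no command in the list ever rewrites history.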
+ +Because the fork can be hard-reset back to a pristine upstream commit (see the `backup/*-pre-reset` branches), the merge history that would otherwise encode the integration point is not a dependable anchor. The upstream baseline each branch's fork layer is re-applied onto is therefore recorded **explicitly**: a SHA line in `docs/fork-change-manifest.md` and an annotated `fork-base/-` tag at each sync, both surviving a reset. + +Two merge styles, by source: **fork divergences are squash-merged** — every divergence PR lands as a single commit, so the fork layer reads as one clean, revertible, re-appliable commit per divergence. **Upstream is integrated by a history-preserving merge** — never squashed and never rebased, so upstream's commit story stays intact and `rerere`-replayable. A landed commit is never rewritten afterward. diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md new file mode 100644 index 0000000..c537fa0 --- /dev/null +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -0,0 +1,29 @@ +--- +status: accepted +--- + +# Idle gate and clear-first delivery + +The fork uses upstream's handoff harness as-is (queue, scripts, `logbook.jsonl`); the only engine discrepancy is **delivery**. Upstream does setup work at startup and never clears context between tasks — it types each handoff straight into the terminal and lets the terminal buffer it whether the agent is working or not. The fork instead requires every role to (1) do nothing until it receives a handoff and (2) start each task from a cleared session. + +**Idle gate** — a prompt rule ("Wait for a handoff. Do not act without one.") plus removal of the startup-install directives from role prompts (install work moves to a separate setup skill). Additive prompt edits. + +**Clear-first delivery** — `/clear` clears the session for **any** agent, so it cannot be sent to a working agent. 
Delivery therefore must know whether the receiver is idle or busy. Upstream tracks no such state, so the fork adds a minimal per-role **idle/busy marker**. Delivery then has two cases, both running `/clear` → re-inject the role bundle → send the task message: + +- receiver **busy** — the handoff waits in upstream's queue (`.swarmforge/handoffs/queue/`); the receiver's **Stop hook** delivers it when the agent next stops. +- receiver **idle** — deliver immediately, because no stop will occur to trigger the hook. + +The marker is set *busy* when a delivery starts and *idle* when the Stop hook finds the queue empty; the hook re-checks the queue before declaring idle to close the narrow "went idle just as a sender judged it busy" race. + +**Re-injection is universal.** `/clear` wipes the session regardless of backend, so the role bundle is always re-sent after `/clear`. + +**Claude Code first.** Both the marker and the delivery ride Claude Code's hook system (the Stop hook). The fork's delivery replaces upstream's immediate terminal-typing only for the roles it manages. The `claude` backend is supported now; roles on `codex`/`grok` keep upstream's delivery until their hook-based equivalent is built — **pending implementation**. + +Ready is implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. + +**Session-restart recovery.** The idle/busy marker records *whether* a role is working, not *what* it is working on. So the `executing` logbook entry carries the in-flight task itself — `{message, hash, sender}`: the handoff message being acted on, the commit hash it started from, and who sent it. If a role's session dies and is restarted mid-task, that is enough to resume the task rather than lose it along with the handoff. (Upstream's `executing` entry records no such context.) 
These fields live inside the delivery and Stop-hook scripts, so they re-base on the prompt bundle of ADR 0017. + +## Pending implementation + +- `codex`/`grok` hook-based delivery (Claude Code first). The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. The `claude`/`codex` choice is a per-role configuration knob (ADR 0012), not an architectural decision; no `codex` hook work is required for this ADR to stand. +- Add the `{message, hash, sender}` fields to the `executing` logbook entry written in the delivery script and the Stop hook, re-based onto the ADR 0017 bundle delivery. Source: `feat/main-executing-context-fields` commit `a133c71`. diff --git a/docs/adr/0003-setup-is-a-one-time-skill.md b/docs/adr/0003-setup-is-a-one-time-skill.md new file mode 100644 index 0000000..7d4f295 --- /dev/null +++ b/docs/adr/0003-setup-is-a-one-time-skill.md @@ -0,0 +1,23 @@ +--- +status: accepted +--- + +# Setup is a one-time skill, not in-execution work + +Adapting a project to the swarm — installing the project's language quality tools (mutation, CRAP, DRY, the Acceptance Pipeline commands), enabling session tracking, and granting the permissions the agents need — lives in a **`setup-swarm` skill** that ships inside the swarm install. It is the operator's *first* action on a project (`/setup-swarm`); the run path does no project provisioning. (Installing the swarm's *own* pinned `entire` skills is launcher bootstrap, not project setup — that belongs to ADR 0018.) + +**Setup runs first; the run path only guards.** `setup-swarm` writes a **swarm-ready marker** (`.swarmforge/setup-complete`) when it finishes. `./swarm` checks that marker before launching any role and, if it is absent, refuses and tells the operator to run `setup-swarm` first — it never runs setup itself. 
(An earlier design had `./swarm` auto-run setup on first launch; that is superseded — setup is an explicit operator step and the launcher merely verifies it happened.) `./swarm` still fetches its own code when missing, bootstraps its own pinned skills (ADR 0018), and does per-launch plumbing (worktrees, sessions, copying constitution files); it never adapts the *project* to its stack. The operator deletes the marker to force a re-run. + +**The only edits to upstream files are four role-prompt lines.** The "At startup, install the language tools" directives in `coder`, `QA`, `cleaner`, and `hardener` are removed; that install work moves into the setup skill and runs once. ADR 0002 already removes these same lines for the idle gate (a role does nothing until handed off); here they go for a second, complementary reason — tool install is a one-time setup step, not per-task startup work. The removal is the seam between the two decisions; neither owns it alone. + +**Why a skill rather than functions added to the launch script.** A skill is a new fork-owned file, so it adds zero upstream merge-conflict surface — exactly the additive divergence ADR 0001 asks for. Adding setup functions inside `swarmforge.sh` would instead edit an upstream-tracked file, a permanent conflict point on every sync. A skill also lets setup *reason about the stack* (which tools for Go vs Java vs Clojure, which gates matter), which a deterministic script cannot. + +**Why replace rather than overlay.** Setup is an explicit one-time step; the run path stays pure "start the agents." The accepted cost is that the swarm no longer self-installs project tooling on first run — the operator runs the setup skill once before the first `./swarm`. Any setup step this moves out of the run path is named and documented so the divergence stays auditable. 
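The guard this ADR describes (the run path checks `.swarmforge/setup-complete` and refuses when it is absent) can be sketched as below. This is an illustration under stated assumptions, not the launcher's code: the real `./swarm` is a shell script, and the function name `guard_swarm_ready` and the message wording are hypothetical. Only the marker path comes from the ADR.

```python
from pathlib import Path

# Marker path as given in the ADR; written by setup-swarm when it finishes.
MARKER = Path(".swarmforge") / "setup-complete"


def guard_swarm_ready(project_root: Path) -> None:
    """Refuse to launch unless setup-swarm has written its marker.

    Hypothetical sketch: the guard only checks for the file and never
    runs setup itself, matching the "run path only guards" rule.
    """
    if not (project_root / MARKER).is_file():
        raise SystemExit(
            "Project is not swarm-ready: run setup-swarm first "
            f"(missing {MARKER.as_posix()})."
        )


if __name__ == "__main__":
    guard_swarm_ready(Path.cwd())
    print("swarm-ready marker found; launching roles would proceed here")
```

Because the guard is a pure presence check, deleting the marker is all it takes to force the next launch to refuse until setup is re-run.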
+ +**Setup also lays down the project scaffold.** Beyond tooling, the skill writes the one-time repository scaffold the swarm assumes: a `.gitignore` covering the swarm's runtime artifacts (`logbook.jsonl`, `tmp/`, `.swarmforge/`), the project's default branch probed once (`git symbolic-ref refs/remotes/origin/HEAD`) and recorded in `swarmforge.conf`, and a small, targeted set of permission allow-rules in `.claude/settings.json` (for example `Bash(gh pr merge*)` for the integrator, `Bash(git reset --hard origin/)` for the specifier). Under autonomous permission mode (ADR 0019) those allow-rules are advisory hints rather than a load-bearing whitelist, so the set is kept deliberately small. + +## Pending implementation + +- The `setup-swarm` skill, shipped at `swarmforge/skills/setup-swarm/` (mirroring `agent-retro`): it reasons about the stack and writes the project tooling, session tracking (`entire enable …` plus `entire agent add ` per `swarmforge.conf` backend), the permission allow-rules, and the `.gitignore`/default-branch scaffold, then writes the marker. *How* it detects the stack is the skill's own domain — deliberately not prescribed here, since reasoning about the stack is the whole reason setup is a skill and not a script. +- `main`: `./swarm` checks `.swarmforge/setup-complete` before launching roles and refuses (with a message to run `setup-swarm`) if it is absent. +- The `entire` skill install is **not** part of this skill — it is launcher bootstrap (ADR 0018). diff --git a/docs/adr/0004-rework-routes-back.md b/docs/adr/0004-rework-routes-back.md new file mode 100644 index 0000000..6add5f7 --- /dev/null +++ b/docs/adr/0004-rework-routes-back.md @@ -0,0 +1,18 @@ +--- +status: accepted +--- + +# Rework routes back to its cause + +Upstream fixes a problem wherever it is found — the QA role's prompt says plainly "fix bugs found by the QA suite." 
That keeps the line moving but lets fixes pile up downstream of the stage that caused them, and the responsible stage never learns it did its job wrong. The fork instead sends the work **back to the stage whose decision it exposes as flawed**, so the fix lands at the cause. + +The trigger is not only a defect. Any finding that an earlier stage's work must change routes back — a failing behavior (a bug), a refactor blocked because the structure rests on a bad earlier decision, or a design/spec revision surfaced when a later stage tries to hold a behavior the specification can't carry. A defect is the most obvious case, not the only one. + +**Only structural rework routes back.** It routes back when resolving it means re-opening an earlier stage's job — an ambiguous or missing specification, a weak or missing acceptance test, a design that can't hold the behavior. The stage that owns that work gets it back and corrects the root cause. **Local** work — anything the finder can resolve without re-opening an earlier stage's decision — stays with the finder. Routing a contained, local change backward only adds a round trip and teaches no one. + +**Two caps, at two scopes.** A *single finding* routes back to its cause **at most once**: if it returns still unresolved, the finder resolves it in place and flags it, so two stages never volley the same item. Independently, a *feature* tolerates **at most three back-route cycles total** (depth cap N=3), tracked by a routing count carried in the handoff trail; after the third the routing role stops and asks the user rather than looping. The first cap stops ping-pong on one issue; the second stops a feature from churning through endless distinct bounces. (The role prompts — ux-engineer, integrator — carry the N=3 feature-level cap.) + +## Pending implementation + +- How a finding is attributed to an origin stage (the line must be able to trace it back to the spec, test, or design that owns it). 
+- Where the rule lives in the role prompts (runnable change, `six-pack`). diff --git a/docs/adr/0005-qa-refutes-not-confirms.md b/docs/adr/0005-qa-refutes-not-confirms.md new file mode 100644 index 0000000..cc5ab72 --- /dev/null +++ b/docs/adr/0005-qa-refutes-not-confirms.md @@ -0,0 +1,19 @@ +--- +status: accepted +--- + +# QA refutes rather than confirms + +Upstream QA verifies that the accepted specification is met and fixes what fails — a *confirm* posture. It converts the specifier's written QA procedures into executable scripts and runs them through the real user interface. The fork flips the posture: QA assumes the build does **not** meet the spec and the acceptance tests are too weak to notice, until it proves otherwise. Its job is to make the claim "this meets the spec and the tests prove it" *fail*. + +**Refute against the spec, not beyond it.** QA attacks the specified contract — it hunts specified-but-untested behavior, proves the acceptance tests too weak to catch a real violation, and throws inputs designed to break the specified behavior. It does **not** invent new requirements. A genuinely unspecified gap it stumbles on is not a QA pass/fail; it is a finding that routes back to the specifier. This keeps QA adversarial but bounded, so it never blocks the line on behavior no one agreed to. + +**Conversion fidelity.** When QA turns the specifier's written procedures into executable scripts, the script must encode the procedure's full intent — not a weakened version that passes. QA refutes its *own* conversion. This is the highest-leverage guard in the line because the QA end-to-end suite is the one suite the hardener's mutation testing explicitly does not cover: a weak conversion ("test theater" — a green test that asserts nothing real) that hides there is caught by nothing else. 
+ +**Findings route back; QA owns the attack, not the routing.** A structural weakness QA surfaces routes back to its cause (a weak acceptance test or an ambiguous spec → the specifier); a local defect QA fixes in place — per the back-routing decision. Refuting QA is the engine that *generates* structural findings; it needs no routing rule of its own. + +## Pending implementation + +- Prompt change on `six-pack`. +- Conversion fidelity is made auditable by the surface-harness conversion rule (ADR 0010): every Expected bullet maps to a harness assertion or is marked `NOT AUTOMATED — `, so a dropped bullet is visible rather than taken on QA's word. +- Whether QA's converted end-to-end suite should itself be mutation-tested (the hardener currently ignores it) — the objective way to detect a theatrical conversion rather than relying on QA's self-judgment. diff --git a/docs/adr/0006-harness-enforced-holdout.md b/docs/adr/0006-harness-enforced-holdout.md new file mode 100644 index 0000000..e6ff310 --- /dev/null +++ b/docs/adr/0006-harness-enforced-holdout.md @@ -0,0 +1,23 @@ +--- +status: accepted +--- + +# Harness-enforced holdout of the QA suite + +Upstream holds the end-to-end QA suite back from the coder by prompt instruction alone: the coder's prompt says "ignore the specifier's end-to-end QA suite," but the files sit in the coder's own worktree (every worktree is `git worktree add -B … HEAD`, a full checkout of the commit the specifier wrote the suite into). The wall is honor-system. The fork makes it **mechanical**: the QA suite is physically absent from the worktree of every role that shapes the implementation, so "ignore it" becomes "cannot reach it." 
+ +**Why mechanical, not instructional.** The verification-loop reference is explicit that the scenario suite is a *holdout* — "never visible to the code generation agent" — and names the failure mode directly: "holdout leakage … must be enforced architecturally (filesystem isolation, separate repos, access controls)," not by a prompt. A holdout the implementer can read is a holdout the implementer can quietly fit to; the suite then stops being a blind test and QA running it proves nothing. This is the prevention layer that the detection layers (mutation testing + refuting QA, ADR 0005) cannot supply: detection catches a gamed suite after the fact; the wall stops the gaming. + +**Mechanism: `git sparse-checkout`, not file deletion.** The worktree-prep step the harness already runs sets a sparse-checkout on each role worktree that excludes the QA-suite path. Sparse-checkout makes the file *absent from disk but still tracked in the commit* — so the role cannot read it, yet its commit cannot accidentally drop it downstream. Naive deletion (`rm` from the worktree) was rejected for exactly this reason: the role commits with `git add`, the deletion gets staged, and the suite vanishes for QA. A separate QA-only branch was rejected as more flow change for no extra protection. + +**Scope: hide from implementers, keep for author and verifier.** The exclusion applies to every worktree *except* the **specifier's** (it authors the suite) and **QA's** (it runs the suite — it is the verifier). Key the exclusion on the specifier *role*, not a fixed worktree name: it is the `master` worktree on upstream today, but ADR 0008 moves the specifier to its own `specifier` worktree, and this rule must follow it. Coder, UX Engineer, cleaner, architect, and hardener all touch the implementation before QA and so are walled. The integrator never touches implementation; its worktree is irrelevant either way. 
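The exclusion rule above can be sketched as a small helper that decides, per role, which `git sparse-checkout` arguments a worktree-prep step might pass to Git. This is an illustration under assumptions, not the fork's implementation: the role names as plain strings, the `qa/` path (the ADR gives it only as an example), and the helper name `sparse_checkout_args` are hypothetical, and the `--no-cone` form assumes a reasonably recent Git.

```python
from typing import List, Optional

# Hypothetical sketch: roles that author or run the suite keep a full checkout.
UNWALLED_ROLES = {"specifier", "qa"}


def sparse_checkout_args(role: str, qa_path: str = "qa/") -> Optional[List[str]]:
    """Return git arguments that hide the QA suite, or None for an unwalled role."""
    if role.lower() in UNWALLED_ROLES:
        return None  # author and verifier must still see the suite
    excluded = "/" + qa_path.strip("/") + "/"
    # Non-cone patterns use gitignore syntax: include everything at the
    # root, then negate the QA-suite directory (assumed path).
    return ["sparse-checkout", "set", "--no-cone", "/*", "!" + excluded]


if __name__ == "__main__":
    for role in ("specifier", "coder", "hardener", "qa"):
        print(role, sparse_checkout_args(role))
```

Keying the decision on the role rather than on a worktree name is what lets the rule follow the specifier when ADR 0008 moves it off the `master` worktree.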
+ +**Precondition: a fixed QA-suite path.** For the harness to exclude the suite it must live at a deterministic path; the specifier writes the end-to-end QA suite under a pinned location (e.g. `qa/`). This is the only added convention. The existing coder-prompt "ignore it" line stays as defense-in-depth. + +**Scope boundary: only the end-to-end QA suite.** The Gherkin acceptance tests and the acceptance pipeline stay fully visible — the coder builds and runs them. The holdout is the specifier's end-to-end QA suite alone. + +## Pending implementation + +- Add the sparse-checkout exclusion to the worktree-prep step (`six-pack`/scripts), keyed to skip the specifier's worktree (whatever its name — `master` today, `specifier` once ADR 0008 lands) and QA's. +- Pin the end-to-end QA-suite path in the specifier prompt. +- Confirm sparse-checkout interacts cleanly with the coder→cleaner→…→QA handoff commits (the excluded path must survive each role's commit untouched). diff --git a/docs/adr/0007-ux-engineer-role.md b/docs/adr/0007-ux-engineer-role.md new file mode 100644 index 0000000..1c8d1bf --- /dev/null +++ b/docs/adr/0007-ux-engineer-role.md @@ -0,0 +1,25 @@ +--- +status: accepted +--- + +# UX Engineer role and UX Intent + +Upstream has no UX role — nothing in the line owns whether the product is *usable*, only whether it is correct. The fork adds a **UX Engineer** (six-pack only) that runs the built product and **fixes** visual and usability mismatches in rendering code, leaving a regression check behind. It is an engineer, not a flag-only reviewer: the fork's pattern is that every stage fixes in place and leaves a durable artifact, so a report-only role is the anti-pattern it rejects. + +**It checks against UX Intent.** The specifier authors a **UX Intent** section inline in the feature file — concrete, observable statements of what the feature should look and feel like. UX Intent is part of the swarm and travels with the feature. 
A feature with no UX Intent is the signal to skip: the UX Engineer passes straight through to the next stage, the same "no work, no handoff" pattern used elsewhere. + +**Optional design inputs are referenced, not owned.** When a project supplies design artifacts — a DESIGN.md (visual system), an EXPERIENCE.md (interaction and feel), mockups (concrete visual targets) — the specifier **references** them from the feature file, and the UX Engineer consults them alongside UX Intent. These are optional project inputs; the swarm neither defines, scaffolds, nor requires them, and does not walk the directory tree to discover them — the only link is an explicit reference from the feature file. + +**It also enforces a universal visual-quality bar, independent of any project input.** Beyond UX Intent and any DESIGN.md, the role applies a fixed standard the prompt enumerates: AI-aesthetic anti-patterns (unjustified default purple/indigo, gradient noise, uniformly maximal rounding, oversized equal padding, shadow-heavy chrome, missing loading/error/empty states), type-hierarchy rules (primary content must dominate; no skipped heading levels), and colour rules including **WCAG contrast minimums (4.5:1 normal, 3:1 large) and "colour is never the sole state indicator."** These hold even when a feature has no UX Intent and the project has no DESIGN.md — they are the floor, not project preferences. + +**It leaves a durable, re-runnable artifact.** "Fixes, leaves a regression check behind" is concrete: per verified flow the role commits re-runnable scenarios to `observation-harness/` using the project's surface tool (ADR 0010), plus golden-file snapshots per state and rendering invariants for structural properties. These are the permanent regression record and must pass against the committed code — and QA re-executes them downstream (ADR 0010), routing back here if a user-facing surface has none. 
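The WCAG contrast minimums named in the visual-quality bar are mechanically checkable. A minimal sketch of the WCAG 2.x contrast-ratio calculation follows; the function names are illustrative only, and the ADR leaves the actual visual-testing tool to the project's constitution.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    """Ratio of the lighter to the darker luminance, each offset by 0.05."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(fg: RGB, bg: RGB, large_text: bool = False) -> bool:
    """Apply the 4.5:1 (normal) or 3:1 (large) floor from the ADR."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

A check like this covers only the contrast floor; the "colour is never the sole state indicator" rule still needs a structural assertion of its own.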
+ +**Framework-agnostic.** The role defines the *class* of check — the running product matches its stated UX — and leaves the specific visual-testing tool to the project's constitution. No terminal-UI assumptions live in the role. + +**Placement and routing.** The UX Engineer sits immediately after the coder, so the downstream roles (cleaner, architect, hardener, QA) see implementation and rendering code together in one pass rather than running twice. When a mismatch cannot be fixed in rendering alone and needs a model-state change, it routes back to the coder — using the back-routing rule already decided (`0004`), not a separate mechanism. The back-route message carries what UX Intent says, what the implementation does, what must change, and the current routing count; the role observes the N=3 feature-level cap (`0004`) and stops to ask the user after the third cycle. + +## Pending implementation + +- Six-pack only: new `ux-engineer` role prompt; UX Intent authoring in the specifier and the feature template; coder reads UX Intent; `swarmforge.conf` adds the window after the coder. +- Routing follows `0004`; durable artifact (`observation-harness/`, snapshots, rendering invariants) follows `0010`. +- DESIGN.md is referenced from the feature file only — the specifier does not scaffold it and the ux-engineer does not walk the tree to find it. diff --git a/docs/adr/0008-integrator-role.md b/docs/adr/0008-integrator-role.md new file mode 100644 index 0000000..2a83ae2 --- /dev/null +++ b/docs/adr/0008-integrator-role.md @@ -0,0 +1,24 @@ +--- +status: accepted +--- + +# Integrator role lands work behind a CI gate + +Upstream has no integrator: when QA signals done, the **specifier** merges the work ad hoc (a local `git merge`) and asks for the next feature. There is no gate between "QA passed" and "landed on the main branch." 
The fork adds a dedicated **integrator** as the terminal stage of the line that owns *landing* the work — and nothing lands except through a green CI gate. + +**Landing is PR + CI, with no fallback.** From the QA-approved commit the integrator opens a pull request, watches CI, and merges only when CI is green; then it runs a **post-merge gate** — it watches the resulting main-branch CI run and, if the project defines a full verification suite, runs that on green too — before handing off. It never merges locally — a local merge is exactly what the specifier already did, so the integrator's whole value is that the main branch only ever receives green-CI'd work. **CI is therefore a hard precondition, not optional:** a project without CI is not swarm-ready, and ensuring CI is in place belongs to project setup (`0003`). + +**It hands off to the curator.** The integrator is the last *code* stage, but not the last stage: on a green landing it notifies the **curator** (ADR 0013), which promotes the run's retro knowledge and only then releases the specifier for the next feature. + +**One PR per feature.** Rework updates the same PR; a second PR is never opened for the same feature. + +**Failure routing reuses back-routing.** A CI failure routes to the role that owns it — a failing test to the coder, a failing cleanliness gate to the cleaner, a failing architecture check to the architect; a trivially autofixable failure (lint/format) the integrator fixes in place on the PR branch and re-runs. This is the back-routing rule already decided (`0004`) with the integrator as the finder, capped at N=3 (`0004`): it tracks the cycle depth by counting its own failure comments on the PR, and after three it posts a final `FAILED: depth cap reached` comment and stops rather than looping. The post-merge gate's CI-red is routed the same way as pre-merge. 
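The routing rule above reduces to a lookup plus a depth check. The sketch below is one possible reading, with assumptions labeled: the failure-kind labels (`test`, `cleanliness`, `architecture`, `lint`, `format`) and the function name are hypothetical, and checking the cap before the autofix path is a choice the ADR does not spell out.

```python
DEPTH_CAP = 3  # N=3, per ADR 0004

# Hypothetical failure-kind labels mapped to the role that owns the fix.
OWNERS = {
    "test": "coder",
    "cleanliness": "cleaner",
    "architecture": "architect",
}
# Trivially autofixable kinds the integrator repairs on the PR branch.
AUTOFIXABLE = {"lint", "format"}


def route_ci_failure(kind: str, prior_failure_comments: int) -> str:
    """Return the role to route to, or the final comment once the cap is hit."""
    if prior_failure_comments >= DEPTH_CAP:
        return "FAILED: depth cap reached"
    if kind in AUTOFIXABLE:
        return "integrator"  # fix in place, then re-run CI
    if kind not in OWNERS:
        raise ValueError(f"no owner known for failure kind: {kind!r}")
    return OWNERS[kind]
```

Deriving the depth from the integrator's own PR comments keeps the counter in the PR itself, so it survives a restarted session.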
+ +**The specifier stops merging.** Merging moves entirely to the integrator, so the specifier no longer needs the main checkout — it moves from the `master` worktree to its own worktree and starts each feature from a clean reset to the default branch. + +## Pending implementation + +- Runnable branch (`six-pack`): new terminal `integrator` role; `swarmforge.conf` window; specifier worktree change and removal of its merge step. (four-pack is frozen per ADR 0001 / the change manifest.) +- The PR/CI mechanism (platform, e.g. `gh`) named at implementation. +- CI-in-place enforced as a setup precondition (`0003`); routing per `0004`. +- Terminal handoff target is the curator (`0013`), not the specifier; autofixable lint/format is the integrator's only allowed code change. diff --git a/docs/adr/0009-feature-file-spec-header.md b/docs/adr/0009-feature-file-spec-header.md new file mode 100644 index 0000000..2262dd6 --- /dev/null +++ b/docs/adr/0009-feature-file-spec-header.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# Feature files open with a structured spec header + +Upstream feature files are pure Gherkin: a `Feature:` line, then scenarios. The fork prepends a **structured spec header** — a block of comment sections the specifier fills in before writing any scenario, captured in a template (`swarmforge/templates/feature.feature`) that the specifier starts every feature from. + +The header is the **spec-authoring layer** the reference verification loop puts ahead of the scenarios (its Step 1): the Gherkin scenarios are the contract *by example*, but they cannot state what is out of scope, what was assumed, what non-functional targets apply, or what side effects must be observed. The header carries exactly that — the WHAT/WHY around the examples — so those concerns are stated once, up front, where every downstream role reads them. 
+ +**Sections (seven base):** `TRACKING` (traceability to an issue), `CONTRACT` (every input, every response shape and status, fields deliberately absent), `CONSTRAINTS` (dataset bounds, validation, exclusions), `SEQUENCING` (ordering / async dependencies, defaults `none`), `NFR` (latency, idempotency key+window, in-flight UI, error distinguishability), `SIDE EFFECTS` (public-contract changes, derived artifacts to regenerate, defaults `none`), `SCOPE` (`Does NOT:` exclusions and `ASSUMED:` assumptions). Each section pairs an `Ask:` (the questions that elicit it) with a `Format:` (how to write the answer). + +**Six-pack adds an eighth section, `UX INTENT`**, with four dimensions — Visual Composition, Information Hierarchy, Interaction Feel, State Transitions — written as concrete observable statements. Its content and semantics are owned by ADR 0007; the header is merely its home in the feature file. It is six-pack-only because the UX Engineer that consumes it is six-pack-only. + +**Address every section; do not fill every section.** `SEQUENCING`, `SIDE EFFECTS`, and (six-pack) `UX INTENT` default to `none`. `none` is a deliberate answer, not a skipped one — and for `UX INTENT`, `none` is the signal that tells the UX Engineer to pass through (ADR 0007). The sections are comments (`#`), so the Gherkin parser ignores them and the acceptance pipeline is unaffected. + +## Pending implementation + +- Template already drafted on `six-pack` (8 sections, with `UX INTENT`); land it. (four-pack is frozen per ADR 0001 / the change manifest — it keeps pure Gherkin, no header.) +- Specifier phase 1 starts from the template and addresses all header sections before scenarios. Fix the stale count in the **six-pack** specifier prompt: it says "complete all seven header sections" but the six-pack template has eight — change to "eight" (or "all"). 
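The "address every section" rule lends itself to a simple lint. The sketch below assumes, purely for illustration, that each section is introduced by a comment line of the form `# NAME` or `# NAME: ...`; the real template layout in `swarmforge/templates/feature.feature` may differ, and `missing_sections` is a hypothetical helper name.

```python
from typing import List, Sequence

BASE_SECTIONS = [
    "TRACKING", "CONTRACT", "CONSTRAINTS", "SEQUENCING",
    "NFR", "SIDE EFFECTS", "SCOPE",
]
SIX_PACK_SECTIONS = BASE_SECTIONS + ["UX INTENT"]


def missing_sections(
    feature_text: str, required: Sequence[str] = tuple(SIX_PACK_SECTIONS)
) -> List[str]:
    """List required header sections that no comment line introduces."""
    seen = set()
    for line in feature_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # only comment lines can carry header sections
        body = stripped.lstrip("#").strip()
        for name in required:
            if body == name or body.startswith(name + ":"):
                seen.add(name)
    return [name for name in required if name not in seen]
```

A section answered with `none` still counts as addressed here, which matches the rule that `none` is a deliberate answer rather than a skipped one.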
diff --git a/docs/adr/0010-surface-harness-doctrine.md b/docs/adr/0010-surface-harness-doctrine.md new file mode 100644 index 0000000..a34c4be --- /dev/null +++ b/docs/adr/0010-surface-harness-doctrine.md @@ -0,0 +1,28 @@ +--- +status: accepted +--- + +# Live verification runs through a declared surface harness + +Two defects (a screen blink and a runaway key-repeat) once survived a 250-scenario, eight-role pipeline. The cause was structural: no gate ever drove the *running* system through its real production interface — every check ran below the surface, against functions and return values. The fork closes this with a **surface harness doctrine**: the roles that own live verification drive the running system through its actual surface, using a declared tool, and every surface carries a permanent idle baseline. + +This is the reference verification loop's execute-and-observe layer (its Steps 5–7) made concrete: build the real thing, drive it through its surface, assert on what comes out. + +**Surface tool table (in `engineering.prompt`).** Following the existing language-tool-table pattern, the constitution declares the harness tool per surface type: tmux/PTY for a TUI (`send-keys -l` for raw input at controlled timing, `capture-pane` for screen state over time), Playwright for web, an HTTP client for HTTP APIs, event-injection-at-ingress for headless services. Roles owning live verification — **QA** and the **UX Engineer** (six-pack, ADR 0007) — identify the project's surface *from the codebase* and acquire the matching tool before their first harness run, exactly as they acquire language tools. + +**No surface field in `project.prompt`.** Roles read the code to know the surface; an explicit declaration would be a meaningless placeholder until the project is customised. 
+ +**Every surface carries a mandatory baseline scenario**, committed alongside the flow scenarios: TUI → idle stability (no input, consecutive captures identical, zero scrollback growth); web → idle page loads with no console errors; headless → a no-op event produces no state change. The baseline is what the tetris defects would have hit — they were *idle-state* failures invisible to any flow test, because flow tests only assert while the user is acting. + +**The harness scenarios are committed and re-run, not throwaway.** Per verified flow, the live-verification role commits re-runnable scenarios to a project `observation-harness/` directory using the surface tool — alongside the per-surface baseline — as a permanent regression record that must pass against the committed code (on six-pack the UX Engineer authors these, ADR 0007; it also adds golden-file snapshots per state and rendering invariants for structural properties). **QA re-executes the committed `observation-harness/` scenarios before its own final verification**, and routes back (ADR 0004) if a user-facing surface exists but has no scenarios. This is what makes the surface check durable: a defect fixed once stays fixed because its scenario re-runs every cycle. + +**QA verifies through the declared surface harness, not "the UI" (idea Q).** Upstream QA's "operate through the user interface only" was right in intent but mechanically silent — it let in-process function calls masquerade as UI verification. The fork replaces the phrase with "through the declared surface harness," and adds an auditable conversion rule: **every Expected bullet maps to a harness assertion, or is explicitly marked `NOT AUTOMATED — `.** This is the mechanism that makes the conversion-fidelity guard of ADR 0005 checkable rather than a matter of QA's word — a silently dropped bullet becomes a visible marker. Findings route back per ADR 0004. 
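The conversion rule is auditable because it is a set comparison. A minimal sketch, assuming hypothetical data shapes (a list of Expected bullets, a map from bullet to harness assertion, and a map from bullet to its `NOT AUTOMATED` reason), could look like this:

```python
from typing import Dict, List


def audit_conversion(
    expected: List[str],
    assertions: Dict[str, str],
    not_automated: Dict[str, str],
) -> List[str]:
    """Return Expected bullets that were silently dropped in conversion.

    A bullet is accounted for when it maps to a harness assertion, or when
    it carries a NOT AUTOMATED marker with a non-empty reason.
    """
    dropped = []
    for bullet in expected:
        if bullet in assertions:
            continue
        if not_automated.get(bullet, "").strip():
            continue
        dropped.append(bullet)
    return dropped
```

An empty result means every bullet is either asserted or visibly marked; anything returned is a bullet that was dropped without a marker.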
+ +**The hardener pins pure rendering with property tests.** Where rendering is a pure function of state (`state → string`), the hardener writes property-based tests over that function — the structural complement to the UX Engineer's golden snapshots and rendering invariants. A snapshot pins one concrete state's output; a property pins the rule across the input space (every state renders without truncation, every cell stays within bounds), catching the rendering defects that no single captured frame happens to exhibit. + +## Pending implementation + +- Add the surface tool table + context-driven acquisition rule to `engineering.prompt` on `six-pack`. (four-pack is frozen per ADR 0001 / the change manifest; all `six-pack`-only below for the same reason.) +- Change QA's "through the UI only" to "through the declared surface harness" and add the Expected-bullet → assertion / `NOT AUTOMATED` rule in `QA.prompt` (`six-pack`). +- Require the per-surface baseline scenario to be committed with every feature's flow scenarios. +- `six-pack`: add the rendering-invariant property-test rule for pure rendering functions to `hardender.prompt`. Source: recover from `backup/six-pre-reset`. diff --git a/docs/adr/0011-dependency-fidelity-manifest.md b/docs/adr/0011-dependency-fidelity-manifest.md new file mode 100644 index 0000000..f967718 --- /dev/null +++ b/docs/adr/0011-dependency-fidelity-manifest.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# Dependencies are declared by fidelity tier in a manifest + +A scenario that rests on an emulated dependency the emulator does not actually implement passes green and proves nothing — the system was never exercised against the behavior the scenario claims to cover. The fork makes dependency fidelity **explicit and refusable** through a new constitution sub-file, `swarmforge/dependency-manifest.prompt`, that declares every dependency beyond the system itself by fidelity tier. 
This is the reference loop's Digital-Twin discipline: a twin is only trustworthy if its fidelity — and its gaps — are stated. + +**A separate constitution file, not `project.prompt`.** The manifest holds project-specific dependency data that would clutter `project.prompt`; it lives in its own file, auto-resolved by the same bundle resolver as the other constitution sub-files. It ships on `six-pack` (four-pack is frozen) and defaults to `(none)` — a project with no external dependencies declares nothing. + +**Three tiers (the system itself is always implicit).** Tier 1 — owned infrastructure run locally as the real engine (e.g. Postgres in Docker). Tier 2 — stateful, protocol-level emulation (preference order: vendor-official emulator > established third-party > a swarm-built twin only as last resort). Tier 3 — external domain the swarm does not own (third-party APIs, other teams' services), wire-level stubbed against a referenced contract. Entry format: `name: tier N; implementation; gaps: `. + +**Declared gaps are machine-readable and binding.** The specifier and QA must not write or accept scenarios that rest on a declared gap — so a known emulator limitation can never masquerade as covered behavior. Supporting rules: every harness scenario starts from a declared seed state and resets dependency state between scenarios; tier-2/3 dependencies must expose post-interaction state for assertion (the message landed in the emulator's outbox), so scenarios assert *effects*, not only the system's own surface; and a swarm-built twin must not be authored by the role that wrote the system code it emulates, and must be validated against recorded real traffic or the vendor's official SDK tests. 
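To show how a declared gap can be made binding, the sketch below parses one manifest entry and refuses a scenario that rests on a gap. Several things here are assumptions: the ADR elides what follows `gaps:`, so a comma-separated list with `none` for "no gaps" is a guess, and the entry used in the usage note is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dependency:
    name: str
    tier: int
    implementation: str
    gaps: List[str]


def parse_entry(line: str) -> Dependency:
    """Parse `name: tier N; implementation; gaps: ...` (gap syntax assumed)."""
    name, rest = line.split(":", 1)
    tier_part, implementation, gaps_part = (
        part.strip() for part in rest.split(";", 2)
    )
    tier = int(tier_part.split()[1])
    if tier not in (1, 2, 3):
        raise ValueError(f"unknown tier: {tier}")
    gaps_text = gaps_part.split(":", 1)[1].strip()
    gaps = [] if gaps_text in ("", "none") else [
        gap.strip() for gap in gaps_text.split(",")
    ]
    return Dependency(name.strip(), tier, implementation, gaps)


def rests_on_gap(scenario_needs: List[str], dep: Dependency) -> bool:
    """True when a scenario relies on behavior the dependency declares missing."""
    return any(need in dep.gaps for need in scenario_needs)
```

For an invented entry such as `mailer: tier 2; vendor emulator; gaps: attachments, rate limits`, a scenario that needs `attachments` would be refused, while one that needs only `delivery` would pass the check.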
+ +**The specifier owns the manifest.** Before writing scenarios it reads the manifest; if a feature touches an external system not yet declared, it stops, proposes name/tier/implementation/gaps to the user, and waits for approval before adding the entry — tier assignment is an architectural decision the user must own, mirroring the other specifier approval gates. + +## Pending implementation + +- Add `swarmforge/dependency-manifest.prompt` (tier definitions inline, body `(none)`) on `six-pack`. (four-pack is frozen per ADR 0001 / the change manifest.) +- Add the read-manifest / propose-on-undeclared rule to `specifier.prompt` (`six-pack`); QA's refusal of gap-resting scenarios is part of refuting QA (ADR 0005). diff --git a/docs/adr/0012-per-role-model-effort-advisor.md b/docs/adr/0012-per-role-model-effort-advisor.md new file mode 100644 index 0000000..758e19d --- /dev/null +++ b/docs/adr/0012-per-role-model-effort-advisor.md @@ -0,0 +1,35 @@ +--- +status: accepted +--- + +# Per-role model, effort, and advisor in `swarmforge.conf` + +Different roles have different compute needs — the architect reasoning about design warrants a more capable model than the coder grinding through an implementation slice. Upstream's only per-role knob is the agent backend (`window `); model, effort, and advisor are absent. The fork adds **optional per-role overrides** without breaking any existing config. + +**Syntax: an inline `key=value` tail on the window line.** The existing four fields parse exactly as before; any fields beyond position four are read as `key=value` pairs stored per role. Upstream rejects lines that are not exactly four fields, so this is a genuine parser change, but it is backward compatible — a four-field line still works untouched. 
+
+```conf
+# before (still valid)
+window coder claude coder
+
+# after (opt-in per role)
+window specifier claude specifier model=opus effort=xhigh advisor=sonnet
+window coder claude coder model=sonnet effort=high
+window architect codex architect model=o3
+```
+
+**Three keys, mapped to CLI flags per backend; unsupported keys are silently ignored:**
+
+| Key | Applies to | Mapping |
+|-----|-----------|---------|
+| `model` | all backends | `claude`/`copilot`/`grok`: `--model ` · `codex`: `-c model=""` |
+| `effort` | claude, copilot, grok | `--effort ` (codex has no effort flag — skipped) |
+| `advisor` | claude only | written as `advisorModel` into the worktree's `.claude/settings.local.json` — there is **no** `--advisor` CLI flag (ignored for other backends) |
+
+**Per-role granularity, not per-backend.** Two `claude` roles can run different models; a global per-backend setting would throw away the value of the role abstraction. **No pre-populated values** ship in the runnable configs — those express topology (roles + worktrees), not opinions about model cost. The feature is fully opt-in: operators add keys only to the lines they care about.
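The grammar and the flag mapping can be sketched together. The real change belongs in the shell script's `parse_config` and `launch_role`; the Python below only illustrates the logic. It reads the third field as the backend (inferred from the example lines), takes the flag spellings from the table without checking them against each CLI, and uses hypothetical function names.

```python
from typing import Dict, List, Tuple


def parse_window_line(line: str) -> Tuple[List[str], Dict[str, str]]:
    """Split a window line into its four positional fields and key=value tail."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "window":
        raise ValueError(f"not a window line: {line!r}")
    positional, tail = fields[:4], fields[4:]
    overrides: Dict[str, str] = {}
    for item in tail:
        key, sep, value = item.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {item!r}")
        overrides[key] = value
    return positional, overrides


def cli_flags(backend: str, overrides: Dict[str, str]) -> List[str]:
    """Map overrides to launch flags; keys a backend lacks are skipped."""
    flags: List[str] = []
    model = overrides.get("model")
    if model:
        if backend == "codex":
            flags += ["-c", f'model="{model}"']
        else:
            flags += ["--model", model]
    effort = overrides.get("effort")
    if effort and backend in {"claude", "copilot", "grok"}:
        flags += ["--effort", effort]
    # advisor is deliberately absent: it is written to a settings file,
    # not passed as a CLI flag.
    return flags
```

A plain four-field line yields an empty override map and therefore no extra flags, which is the backward-compatibility property the syntax is designed around.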
diff --git a/docs/adr/0013-curator-knowledge-promotion.md b/docs/adr/0013-curator-knowledge-promotion.md new file mode 100644 index 0000000..12e7908 --- /dev/null +++ b/docs/adr/0013-curator-knowledge-promotion.md @@ -0,0 +1,28 @@ +--- +status: accepted +--- + +# Curator role and the knowledge-promotion loop + +Upstream ends the line at QA: the specifier merges and asks for the next feature, and whatever the run *learned* — a wrong path taken, a convention discovered, a gate that should have existed — lives only in a session retro that no one reads again. The fork adds a terminal **curator** role, after the integrator, that turns those retros into **versioned repo knowledge** via one self-merging PR per run, then releases the specifier for the next feature. + +**Pipeline position: integrator → curator → specifier.** The integrator notifies the curator on a green landing; the curator promotes the run's knowledge and only then notifies the specifier. An empty run (no unprocessed retros) notifies the specifier immediately with no PR — the line never stalls on the curator. The curator makes no code changes; it may only write `AGENTS.md` and files under `.agents/` (ADR 0014). + +**Capture everything; discard once, at the curator.** The retro skill tags every action with a scope — `project | swarmforge | skill | ephemeral` — and captures all of them without filtering for "obviousness." The single discard gate is the curator's **non-inferable check**: could a future agent reach this fix from the error output and the files it names, with no foreknowledge? If yes, it is not worth promoting. Putting the one filter here, not at capture, means nothing is lost before a consistent judge sees it. + +**Promote to the highest rung that fits (the routing ladder).** A mechanical fix (config line, CI gate, script guard) goes to the enforcement-gate backlog — a gate beats documentation. 
Otherwise: `AGENTS.md` for universal invariants, `.agents/roles/.md` for one role's operational knowledge, `.agents/references/.md` for deep dives (each needs a pointer line or it never loads), `.agents/skills//` only on the second occurrence of a need, `.agents/upstream/.md` for `swarmforge`-scoped items, ledger-only for ephemeral and rejected. A learning whose fix is global routes *up* the ladder, never into `AGENTS.md`, and is discarded only when the gap is already mechanically closed. Every item is rewritten from a phenomenon ("X can fail because Y") into a rule ("every X MUST Z because Y") before it is promoted. + +**The ledger is the append-only audit.** `.agents/ledger.md` records one never-pruned line per processed item — `date | session-id | role | failure-class | verdict` — so recurrence is provable: an item rejected before and now recurring has proven itself non-trivial and is promoted rather than rejected again. + +**The curator self-merges from day one.** The knowledge PR is merged in-role with no user confirmation; the PR body (a metric line plus one verbatim bullet per promoted rule) and the ledger are the asynchronous review surface. Budgets hold the knowledge small: `AGENTS.md` ≤ 60 lines, each role file ≤ 40 — over budget, the stalest or now-inferable lines are pruned and ledgered. + +**Loop health is self-reported.** Each PR body carries running totals (`promoted | rejected | upstream | ephemeral`). Kill criterion: fewer than three promotions that survive contact with later sessions over 90 days → disable the curator window; the ledger and promoted docs stay. + +**Retros are captured automatically, from the transcript, before idle.** The loop only has something to promote because every role runs `agent-retro` as its last step before going idle — a line added to every role prompt — so a retro is produced for each role-session with no one asking. 
The skill reconstructs the session from its transcript rather than the role's from-memory account: it extracts via the `entire` CLI (`entire session current` → `session info --transcript`), falling back to Claude Code's `~/.claude/projects/` transcript path when `entire` is absent. Grounding the retro in the transcript is what lets the curator (and `retro-triage`, ADR 0021) judge against what actually happened, not what the role remembers happening. + +## Pending implementation + +- `six-pack` (four-pack is frozen per ADR 0001 / the change manifest): new `curator` role prompt; `swarmforge.conf` gains the curator window (last); rewire — integrator notifies the curator, specifier waits on the curator before the next feature, `workflow.prompt` documents the integrator→curator→specifier chain. +- `main`: upgrade the `agent-retro` skill — scope tag on every action, capture-first (no pre-filter), and an autonomous mode that marks actions `pending-curation` without prompting a human. +- `main`: `agent-retro` transcript capture (`entire session current` → `session info --transcript`, with the `~/.claude/projects/` fallback); add the "run `agent-retro` before going idle" line to every role prompt. Source: `feat/issue-20-a-retro-skill-upgrade`. +- Pairs with ADR 0014 (the `.agents/` contract the curator writes and the launcher injects). diff --git a/docs/adr/0014-agents-knowledge-injection.md b/docs/adr/0014-agents-knowledge-injection.md new file mode 100644 index 0000000..eea76c3 --- /dev/null +++ b/docs/adr/0014-agents-knowledge-injection.md @@ -0,0 +1,17 @@ +--- +status: accepted +--- + +# `.agents/` knowledge contract injected into every bundle + +Promoted knowledge is worthless if it never reaches the agent that needs it. Upstream bundles only the constitution and the role prompt into an agent's context, so there is no channel for project-specific, accumulated knowledge. 
The fork defines a versioned knowledge contract in the project repo and **injects it into every role bundle at launch**, closing the loop the curator (ADR 0013) feeds. + +**The contract lives in the project repo, under `.agents/` plus a root `AGENTS.md`.** `AGENTS.md` is the navigation map and universal invariants (≤ 60 lines); `.agents/roles/.md` is one role's operational knowledge (≤ 40 lines); `.agents/references/.md` holds deep dives reached by pointer; `.agents/skills//` holds promoted procedures; `.agents/backlog.md` is the enforcement-gate backlog; `.agents/ledger.md` is the append-only audit. All of it is written only by the curator and **versioned in the project**, not in `~/.claude` — so a fresh clone carries every promoted lesson and nothing depends on a machine's local memory. + +**Injection is automatic and role-scoped.** When the launcher builds a role's bundle it appends, when the files exist, the root `AGENTS.md` (so every role gets the universal invariants) and that role's `.agents/roles/.md` (so a role gets only its own operational knowledge). References are not injected — they load on demand when an included line points to them, which is why every reference must be pointed at from `AGENTS.md` or a role file. Missing files are silently skipped: a project that has not bootstrapped its knowledge yet launches cleanly with no knowledge blocks. + +## Pending implementation + +- `main`: extend the bundle generator (`write_agent_instruction_file` in `swarmforge.sh`) to append `AGENTS.md` and `.agents/roles/.md` from the project root when present, and add the preamble sentence telling the agent these knowledge files (and on-demand references) are included. +- Acceptance: a scratch project with an `AGENTS.md` → every generated bundle carries it; adding `.agents/roles/coder.md` → only the coder's bundle gains it; removing both → bundles generate with no knowledge blocks and no errors. +- Pairs with ADR 0013 (the curator is the only writer of this contract). 
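The acceptance criteria above describe a small, testable function. The actual change targets `write_agent_instruction_file` in `swarmforge.sh`; the Python sketch below only mirrors the intended behavior, and the helper names and the blank-line separator between blocks are assumptions.

```python
from pathlib import Path
from typing import List


def knowledge_files(project_root: Path, role: str) -> List[Path]:
    """Return the knowledge files that exist for this role, in bundle order."""
    candidates = [
        project_root / "AGENTS.md",                       # universal invariants
        project_root / ".agents" / "roles" / f"{role}.md",  # this role only
    ]
    return [path for path in candidates if path.is_file()]


def append_knowledge(bundle: str, project_root: Path, role: str) -> str:
    """Append existing knowledge files to a bundle; missing files are skipped."""
    blocks = [
        path.read_text(encoding="utf-8")
        for path in knowledge_files(project_root, role)
    ]
    return "\n\n".join([bundle, *blocks]) if blocks else bundle
```

References under `.agents/references/` are intentionally not read here, since the ADR loads them on demand through pointer lines rather than injecting them.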
diff --git a/docs/adr/0015-platform-feasibility-stop-rule.md b/docs/adr/0015-platform-feasibility-stop-rule.md new file mode 100644 index 0000000..f969ca6 --- /dev/null +++ b/docs/adr/0015-platform-feasibility-stop-rule.md @@ -0,0 +1,15 @@ +--- +status: accepted +--- + +# Platform-feasibility stop rule + +Upstream has no rule for what a role does when a spec requirement conflicts with what the platform can actually deliver. So the role improvises — it ships a silent workaround and leaves a code comment acknowledging the conflict, and behavior diverges from the spec with no one having decided that trade-off. The fork adds a constitution rule: **when a spec requirement conflicts with a real platform capability, stop and report to the user before proceeding.** + +**The workaround comment is the smell.** A comment in the code acknowledging a spec-vs-platform conflict is the signal that this rule fired and was suppressed — it is treated as a defect, not an accepted note. + +**Narrow on purpose.** This is not a general "stop when confused" rule; it fires specifically on spec-versus-platform-capability conflicts. It lives in the constitution (`workflow.prompt`), so it binds every role that reads the constitution rather than being repeated per role. + +## Pending implementation + +- `six-pack`: add the rule to `swarmforge/constitution/workflow.prompt`. (four-pack is frozen per ADR 0001 / the change manifest.) diff --git a/docs/adr/0016-boundary-logic-detection.md b/docs/adr/0016-boundary-logic-detection.md new file mode 100644 index 0000000..64ee55e --- /dev/null +++ b/docs/adr/0016-boundary-logic-detection.md @@ -0,0 +1,15 @@ +--- +status: accepted +--- + +# Boundary-logic detection + +Boundary files — environmentally-unsuitable adapter shells like TUI drivers, OS input handlers, and environment adapters — are excluded by design from every quality tool's worklist, because they can't run under test. 
Upstream leaves it there, so pure logic that gets embedded in a boundary file is invisible to mutation, CRAP, and coverage alike. The fork closes that hole: the cleaner (six-pack) / refactorer (four-pack) **also scans boundary files**, at a lower threshold, and extracts the logic when it finds too much. + +**A lower threshold, because boundary files should be thin.** Testable source keeps the existing 100-mutation-site split threshold; boundary files trigger at ~15–20 sites — above that, the file holds implementation, not adaptation, and the logic is extracted to a testable module before handoff. Extraction funnels that logic into the normal mutation/CRAP/coverage loops automatically, so no new test type is needed. + +**"Tested only through a stripped view" counts as untested.** A test that asserts a simplified projection of the output — ANSI-stripped text when the real output includes the escape codes and newline placement the function exists to add — does not cover the behavior. This is an explicit anti-pattern the cleaner/refactorer treats as missing coverage. + +## Pending implementation + +- `six-pack`: extend `swarmforge/roles/cleaner.prompt` to scan boundary files at the ~15–20 site threshold and add the stripped-view anti-pattern. (four-pack — whose equivalent role is `refactorer` — is frozen per ADR 0001 / the change manifest; no change there.) diff --git a/docs/adr/0017-inlined-prompt-bundle.md b/docs/adr/0017-inlined-prompt-bundle.md new file mode 100644 index 0000000..14fff03 --- /dev/null +++ b/docs/adr/0017-inlined-prompt-bundle.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# The agent context is one inlined, deduplicated prompt bundle + +Upstream builds a role's launch context by concatenating its constitution and role prompt, following `*.prompt` references with a simple recursive read — no deduplication and no structure, just text appended to text. 
The fork replaces this with a **resolved prompt bundle**: a breadth-first walk over the `*.prompt` reference graph that visits each file once (dedup by resolved path, already-visited references skipped so a cycle cannot loop), emitted as a single XML envelope `` with each source file in its own `` block. + +**The bundle is the unit of delivery, not just of launch.** Clear-first delivery (ADR 0002) wipes the session with `/clear` and then *re-injects the role bundle* before every task. That re-injection needs a single, complete, deduplicated context to re-send — which is exactly what the resolver produces. A naive recursive concatenation is fine to build once at launch but is the wrong shape to re-send reliably on every handoff. + +**It is the prerequisite for knowledge injection.** ADR 0014 appends the project's `AGENTS.md` and the role's `.agents/` file into this same envelope. There is nowhere to append them, and no well-defined boundary to append them at, until the context is a structured bundle rather than flat concatenated text. 0014 and the session-restart `executing` fields (ADR 0002) both build on top of the bundle. + +**Why an XML envelope.** Explicit `` boundaries let the agent tell its constitution from its role prompt from its promoted knowledge, instead of inferring breaks in a wall of concatenated text; and the BFS dedup keeps a cross-referenced constitution (articles, the dependency manifest) from appearing two or three times. + +This divergence is taken in its **minimal translated form**: the resolver and envelope are ported onto upstream's current tmux delivery harness, not lifted from the pre-reset implementation where they were entangled with the dropped cmux backend. + +## Pending implementation + +- `main`: replace the recursive-read heredoc in `write_agent_instruction_file` with `resolve_prompt_bundle` (BFS, dedup by resolved path) emitting the `` envelope; wire the resolved bundle into upstream's delivery path. 
Source: `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` — re-base onto current upstream, do not copy. +- Prerequisite for ADR 0014 (`.agents` injection) and the ADR 0002 `executing`-field recovery; both re-base on the bundle. diff --git a/docs/adr/0018-swarm-pins-and-upgrades-itself.md b/docs/adr/0018-swarm-pins-and-upgrades-itself.md new file mode 100644 index 0000000..208053a --- /dev/null +++ b/docs/adr/0018-swarm-pins-and-upgrades-itself.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# The swarm pins and upgrades its own dependencies + +The swarm depends on an external skill set (the `entire` skills) that it installs into the target project's `.claude/skills/`. The fork makes that dependency **pinned and upgradable**: a SHA recorded in `swarmforge/scripts/install-pins.conf`, installed automatically at launch, and refreshable through an explicit `./swarm upgrade`. + +**Pinned, not floating.** `install-pins.conf` records `ENTIRE_SKILLS_SHA`; the swarm installs exactly that SHA and writes it to `.swarmforge/skills-installed`. Moving versions means bumping the pin and committing it on `main` — so two runs weeks apart install identical skills, and an upstream skill change can never alter a run mid-flight. + +**Auto-install is launcher bootstrap, not project setup.** `ensure_skills_installed` runs at launch: if the recorded sentinel matches the pin it does nothing, otherwise it (re)installs. This is the program fetching its own dependencies — the same category as `./swarm` self-fetching its scripts — and is deliberately kept separate from the two things that are *not* automatic: project provisioning (the `setup-swarm` skill, ADR 0003) and role work (the idle gate, ADR 0002). It does not contradict "roles do nothing at startup": the launcher, not any role, installs the skills, and it does so before a single role starts. Skill installation therefore lives here, not in `setup-swarm`. 
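The sentinel check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the pins file is assumed to hold an unquoted `ENTIRE_SKILLS_SHA=<sha>` line, the function takes the two file paths as arguments for the sake of the example, and the installer is passed in as a command because the real install step is not described in this ADR.

```shell
#!/usr/bin/env bash
# Sketch of a pin-aware, idempotent skill bootstrap.
# Args: <pins_file> <sentinel_file> <install_command...>
# The install command is a stand-in supplied by the caller.
ensure_skills_installed() {
  local pins_file="$1" sentinel_file="$2"
  shift 2
  local pinned installed=""
  # Read the pinned SHA (assumed format: ENTIRE_SKILLS_SHA=<sha>, unquoted).
  pinned="$(sed -n 's/^ENTIRE_SKILLS_SHA=//p' "$pins_file" | head -n 1)"
  [ -n "$pinned" ] || { echo "no ENTIRE_SKILLS_SHA in $pins_file" >&2; return 1; }
  [ -f "$sentinel_file" ] && installed="$(cat "$sentinel_file")"
  # Sentinel matches the pin: nothing to do.
  [ "$installed" = "$pinned" ] && return 0
  # Otherwise (re)install exactly the pinned SHA, then record it.
  "$@" "$pinned" || return 1
  mkdir -p "$(dirname "$sentinel_file")"
  printf '%s\n' "$pinned" > "$sentinel_file"
}
```

Writing the sentinel only after the installer succeeds means a failed install is retried on the next launch instead of being recorded as done.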
+ +**`./swarm upgrade` refreshes the installation.** It re-pulls the scripts (from `main`) and the role prompts (from the branch recorded in `.swarmforge/source-branch`) and forces a skill reinstall. `source-branch` is written on first run so `upgrade` knows whether a checkout's prompts came from `six-pack` or `four-pack`. + +**Why the swarm needs this at all.** A tool whose job is to adapt arbitrary projects must itself be reproducible and updatable in place; without a pin, runs drift; without `upgrade`, an operator's only way to take a fix is to re-clone. + +## Pending implementation + +- `main`: `install_skills` + `ensure_skills_installed` (pin-aware, idempotent) in `swarmforge.sh`, plus the new `swarmforge/scripts/install-pins.conf`. Source: `backup/main-pre-reset` (`~L946`). +- Runnable branches (`six-pack`/`four-pack`, root `swarm` bootstrap — not `main`): the `upgrade` subcommand, `download_from_main`, `write_source_branch`, and `.swarmforge/source-branch` tracking. Source: `swarm` bootstrap commit `8994322`. diff --git a/docs/adr/0019-autonomous-permission-mode.md b/docs/adr/0019-autonomous-permission-mode.md new file mode 100644 index 0000000..28934eb --- /dev/null +++ b/docs/adr/0019-autonomous-permission-mode.md @@ -0,0 +1,17 @@ +--- +status: accepted +--- + +# Roles run unattended in autonomous permission mode + +Upstream launches the `claude` and `grok` roles with `--permission-mode acceptEdits`, which auto-approves file edits but still raises an interactive permission prompt on every bash/tool call. The fork's roles run **fully unattended** in isolated worktrees — there is no human present to answer that prompt, so for the fork the prompt is not a safety net, it is a silent hang. The fork launches with `--permission-mode auto`. + +**Why `auto` and not the other never-prompt modes.** Claude Code offers three modes that never block on a prompt: `auto`, `dontAsk`, and `bypassPermissions`. 
`bypassPermissions` ignores all allow/deny rules and ships no safety checks — unacceptable for worktrees that touch a real repository and the network. `dontAsk` is deterministic but runs only an explicit allow-list and denies everything else, which would mean building and maintaining an exhaustive command allow-list spanning every language and tool the swarm drives — ongoing complexity the fork chooses not to take on. `auto` keeps roles moving with near-zero configuration while retaining built-in guardrails (it still refuses force-pushes to the main branch, mass deletion, and similar high-blast-radius actions). Because `auto` is in force, the permission allow-rules that `setup-swarm` writes (ADR 0003) stay a small, targeted, advisory set rather than a load-bearing whitelist. + +**This is a real mode, deliberately verified.** `auto` is one of Claude Code's documented `--permission-mode` values — unlike the per-role advisor knob of ADR 0012, which turned out to have no CLI flag and had to be written to settings instead. The lesson there was applied here before committing to the divergence. + +The `codex` backend launches with no permission-mode flag at all, so this change touches only the `claude` and `grok` launch lines. + +## Pending implementation + +- `main`: change `--permission-mode acceptEdits` → `auto` on the `claude` and `grok` lines in `launch_role` (a one-word change on each line; reapply on every upstream sync). Source: `backup/main-pre-reset` commit `1097233`. diff --git a/docs/adr/0020-worktree-auto-compaction.md b/docs/adr/0020-worktree-auto-compaction.md new file mode 100644 index 0000000..0aa26b7 --- /dev/null +++ b/docs/adr/0020-worktree-auto-compaction.md @@ -0,0 +1,15 @@ +--- +status: accepted +--- + +# Role worktrees auto-compact before context overflow + +A swarm role can run a long, many-turn session — build, run the suite, read failures, fix, re-verify — that walks its context toward the model's window limit. 
Upstream leaves context management to the client's defaults. The fork provisions each role worktree so the role **compacts its own context before it overflows** rather than failing partway through a task. + +**The settings.** Each worktree's `.claude/settings.local.json` is given `autoCompactEnabled: true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE: "88"` (compact at 88% of the window) and `CLAUDE_CODE_AUTO_COMPACT_WINDOW: "200000"`. The thresholds are tunable; these are the fork's current defaults, set to leave headroom ahead of a hard limit so compaction happens on the role's terms, not as a crash. + +**Why per-worktree `settings.local.json`.** The file is fork-owned and not upstream-tracked, so writing to it adds no merge-conflict surface — the additive divergence ADR 0001 asks for. It is also the same provisioning seam already used to write the per-role advisor (ADR 0012); both perform a read-modify-write into this one file, so they share a single mechanism rather than each inventing its own. + +## Pending implementation + +- `main`: write the three keys into each worktree's `.claude/settings.local.json` (a `write_worktree_permissions` step, or folded into the existing advisor writer), called from `prepare_worktrees`; share the read-modify-write with the ADR 0012 advisor writer. Source: `backup/main-pre-reset` (`write_worktree_permissions`, ~L679; commit `93f8c5d`). diff --git a/docs/adr/0021-retro-triage-skill.md b/docs/adr/0021-retro-triage-skill.md new file mode 100644 index 0000000..146bd9d --- /dev/null +++ b/docs/adr/0021-retro-triage-skill.md @@ -0,0 +1,20 @@ +--- +status: accepted +--- + +# retro-triage: operator root-cause diagnosis, distinct from the curator + +The fork keeps a `retro-triage` skill: an operator-invoked tool that turns a *batch* of session retros into a validated, cross-session **root-cause diagnosis** from which a human files issues. 
It lives in `.claude/skills/` (an operator tool), not `swarmforge/skills/` (the skills the swarm's own roles run). + +**Why it exists alongside the curator.** The curator (ADR 0013) already consumes session retros — but autonomously, one item at a time, to promote agent-facing knowledge into the repo. retro-triage is its complement, not a duplicate. The curator fixes *"the swarm doesn't **know** X"* — a missing rule becomes repo knowledge. retro-triage fixes *"the swarm is **structurally doing** X wrong"* — a pipeline, tooling, or strategy defect becomes a filed issue for a human to act on. The structural causes it hunts (one upstream decision surfacing as different pains across five roles) are precisely what a per-item consumer like the curator cannot see, because they live *across* retros and below any single retro's notice. + +**Diagnosis is the product, not sorting.** The skill exists to prevent two failure modes that occurred in real runs: codifying a workaround as a win (a slick technique that only exists to cope with a self-inflicted problem is evidence of cost, not a pattern), and inheriting a retro's own proposed fix (the retro reports a symptom; its suggested fix is a hypothesis, not a finding). Every root cause is recorded with re-pullable receipts — a transcript quote by session id, git output, a `file:line` — and validated against the artifacts; an unvalidated cause is not a finding and cannot be filed. + +**Why `.claude/skills/`, not `swarmforge/skills/`.** It is a human's meta-analysis tool, not a step any swarm role executes. Keeping it with the operator skills leaves the swarm's own skill set to the things the swarm itself runs (`agent-retro`, `setup-swarm`). + +**Sharing the retro pool without starvation.** Both the curator and retro-triage read `~/.claude/worklog/retros/`. 
They must not consume each other's unseen retros: the curator processes and archives retros to `processed/` each run (ADR 0013), and retro-triage reads the full history — live pool plus archive — while tracking its own consolidation independently of the curator's mark. Neither destroys what the other has not yet seen. + +## Pending implementation + +- `main`: restore `.claude/skills/retro-triage/SKILL.md` as-is (byte-identical across branches). Source: `feat/issue-20-a-retro-skill-upgrade`. +- Make retro-triage's retro detector glob the curator's `processed/` archive in addition to the live `~/.claude/worklog/retros/` directory, so curated retros remain visible to a later diagnosis. diff --git a/docs/fork-change-manifest.md b/docs/fork-change-manifest.md new file mode 100644 index 0000000..6c61f45 --- /dev/null +++ b/docs/fork-change-manifest.md @@ -0,0 +1,115 @@ +# Fork change manifest + +Compact, permanent record of **every divergence to apply on top of a pristine `upstream`**, one line per change. Rationale lives in the ADRs (`docs/adr/`) — this file is *where + what + source*, not *why*. Use it to (re)apply the fork after any upstream sync. + +## Sync policy (ADR 0001) + +- **Current upstream baseline (re-apply the fork layer onto this):** `main` ← `upstream/main` @ `d947f67` · `six-pack` ← `upstream/six-pack` @ `cbd1697` (2026-06-14). Bump these on every sync; an annotated `fork-base/-` tag pins the same commit so the anchor survives a hard reset (merge history alone does not — this fork has been reset before). +- **Merge style by source:** every **fork divergence is squash-merged** (one divergence PR → one clean commit on the delivery branch). **Upstream syncs are history-preserving merges** — never squashed, never rebased (keep upstream's story; `rerere` replays conflicts). The two initial re-implementation PRs follow the same squash rule (one squashed commit per branch). A landed commit is never rewritten. 
+- `main`, `six-pack`, `four-pack` are kept **identical to `upstream/`** and advanced by **merge** (`git merge upstream/`), never rebase. `rerere` replays conflict resolutions. +- **four-pack is frozen (decision 2026-06-14): no fork divergences are applied to it.** Only `main` and `six-pack` carry changes. (Open: whether four-pack is still resynced to upstream to honor "keep == upstream", or left as-is — see below.) +- Every item below is **additive** (new file or appended rule) wherever possible; a non-additive edit to an upstream line is marked **[edit]** and is a conscious, documented conflict point. +- **Delivery routing:** `main` ← scripts + skills + docs/ADRs · `six-pack` ← role prompts, constitution articles, templates, manifest, `swarmforge.conf`. +- Never push `main` without explicit request; **never** push `upstream` (`gh` defaults to upstream → always `--repo gabadi/swarm-forge`). + +## Source legend + +- **ADR** — `docs/adr/NNNN-*.md` (decision + rationale + `## Pending implementation`). +- **B6** — `backup/six-pre-reset` (real pre-reset six-pack artifacts: prompts, manifest, template, conf). Re-merge onto *current* prompts; do **not** copy whole files (they predate current upstream; some carry behavior the ADRs removed). +- **I20A** — `feat/issue-20-a-retro-skill-upgrade` (`swarmforge/skills/agent-retro/`, `AGENTS.md`). +- **I20B** — `feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md` (locked curator-loop spec, PRs A→B→C→D; **spec wins** over issue #20; budgets AGENTS.md ≤60 / role files ≤40). + +## Per-row recovery docs (exact recover-from `branch:path` + delta + STRIP per item) + +- `docs/migrations/main-script-layer.md` — all Section A + Section C `swarmforge.sh`/scripts rows. 
**⚠ Idea B + cmux + M3 + executing-fields are one entangled ~400-line restructure — gating decision: keep the full cmux delivery model or rebuild lean on upstream's harness.** +- `docs/migrations/six-pack-role-prompts.md` — all Section B/C role-prompt rows + the 3 new roles + final conf window order + the STRIP table (backup content ADRs reversed). +- `docs/migrations/0003-setup-skill-sources.md` — setup skill design recovery (net-new, no code). + +--- + +## A. `main` — scripts / skills / docs + +Script path: `swarmforge/scripts/swarmforge.sh`. Skills path: `swarmforge/skills/`. + +| ADR | Change (one line) | Where | Source | +|-----|-------------------|-------|--------| +| 0006 | In `prepare_worktrees` (`git worktree add`, ~L331) add `git sparse-checkout` excluding the pinned QA-suite path for **every worktree except the specifier's and QA's** (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify the path survives each role's handoff commit. | `swarmforge.sh` `prepare_worktrees` | ADR 0006 · **NET-NEW (no impl)** | +| 0012 | `parse_config` (~L182, today rejects ≠4 fields) → accept **≥4 fields**, parse `key=value` tail into a per-role map; `launch_role` (~L414) → append mapped flags per backend. **[edit]** | `swarmforge.sh` | ADR 0012 · recover `backup/main-pre-reset` · **advisor = `advisorModel` in settings.local.json, not `--advisor`** ✅ | +| 0014 | `write_agent_instruction_file` (~L389) → append project-root `AGENTS.md` + `.agents/roles/.md` when present, plus a preamble sentence; missing files silently skipped. | `swarmforge.sh` | ADR 0014 + I20B(PR-B) · **needs Idea B first** | +| 0013 | Upgrade `agent-retro` skill: per-action **scope tag** (`project\|swarmforge\|skill\|ephemeral`), **capture-first** (no pre-filter), **autonomous** mode marking actions `pending-curation` without a human prompt. 
| `swarmforge/skills/agent-retro/` | ADR 0013 + I20A + I20B(PR-A) | +| 0003 | New **`setup-swarm` skill** (stack detection; writes tooling/permissions/skill-pins/session-tracking; emits the **swarm-ready marker** `.swarmforge/setup-complete`); **setup-first** — operator runs `/setup-swarm` as step one, `./swarm` only **guards** on the marker and refuses if unset (never auto-runs setup). Absorbs Idea O scaffold. *Impl details open: marker format, stack detection (no backup artifact).* | `swarmforge/skills/setup-swarm/` (new) | ADR 0003 | + +--- + +## B. `six-pack` — prompts / constitution / templates / conf + +Roles: `swarmforge/roles/*.prompt` · constitution: `swarmforge/constitution/articles/*.prompt` · `swarmforge/swarmforge.conf`. + +| ADR | Change (one line) | Where | Source | +|-----|-------------------|-------|--------| +| 0002/0003 | Remove the `At startup, install/make-ready …` directive(s): `coder`:9, `QA`:7, `cleaner`:19, `hardender`:8–9. **[edit]** | `roles/*.prompt` | ADR 0002, 0003 | +| 0002 | Add idle-gate rule to each role prompt: "Wait for a handoff. Do not act without one." | `roles/*.prompt` | ADR 0002 | +| 0009 | Add `swarmforge/templates/feature.feature` — **8-section** spec header (TRACKING/CONTRACT/CONSTRAINTS/SEQUENCING/NFR/SIDE EFFECTS/SCOPE + UX INTENT). | `templates/feature.feature` (new) | ADR 0009 + B6 | +| 0009 | Specifier phase 1 starts from the template, addresses **all** sections before scenarios; fix stale count "seven" → **"eight"/"all"**. | `roles/specifier.prompt` | ADR 0009 + B6 | +| 0011 | Add `swarmforge/dependency-manifest.prompt` (3 tier defs inline + Rules section, body `(none)`); auto-resolved by the bundle resolver. | `dependency-manifest.prompt` (new) | ADR 0011 · recover `feat/baseline-scenarios-six` (**obs-harness-six over-deleted the Rules section**) | +| 0011 | Specifier reads the manifest before scenarios; on an undeclared external system → stop, propose name/tier/impl/gaps, wait for approval. 
| `roles/specifier.prompt` | ADR 0011 + B6 | +| 0010 | Add **surface-tool table** + context-driven acquisition rule (tmux/PTY · Playwright · HTTP client · ingress event-injection) to `engineering.prompt`. | `constitution/articles/engineering.prompt` | ADR 0010 + B6 | +| 0010 | Require a per-surface **baseline scenario** committed with every feature's flow scenarios (idle stability / no console errors / no-op event = no state change). | spec-header + role prompts | ADR 0010 | +| 0015 | Add platform-feasibility **stop rule** to `workflow.prompt` (spec-vs-platform conflict → stop & report; a workaround comment is a defect). | `constitution/articles/workflow.prompt` | ADR 0015 | +| 0005 | Rewrite QA to a **refute** posture (assume build fails spec & tests are weak; attack within the spec; conversion fidelity); replace "Fix bugs found by the QA suite…" (`QA`:14) — local fix in place, structural routes back. **[edit]** | `roles/QA.prompt` | ADR 0005 + B6 | +| 0010 | QA: replace "through the user interface only" (`QA`:13) with "**through the declared surface harness**"; add **every Expected bullet → a harness assertion or `NOT AUTOMATED — `**; QA re-executes committed `observation-harness/`, routes back if a user-facing surface has none. **[edit]** | `roles/QA.prompt` | ADR 0010 + B6 | +| 0004 | Add back-routing rule to role prompts: structural finding routes to its origin stage; local stays with finder; single-finding cap (back **once**) + feature cap **N=3** via routing count in the handoff trail (ux-engineer & integrator carry N=3). | `roles/*.prompt` | ADR 0004 | +| 0007 | Add **UX Engineer** role after coder (runs product, fixes rendering vs UX Intent, universal visual-quality bar incl. WCAG 4.5:1/3:1, writes `observation-harness/` + snapshots + rendering invariants; routes back per 0004 N=3); add conf window after coder. **Strip** DESIGN.md scaffold/walk-up from B6 draft. 
| `roles/ux-engineer.prompt` (new) + `swarmforge.conf` | ADR 0007 + B6 | +| 0007 | Coder reads UX Intent; specifier authors the UX INTENT section. | `roles/coder.prompt`, `roles/specifier.prompt` | ADR 0007 | +| 0008 | Add terminal **integrator** role (PR + green CI, post-merge gate, one PR/feature, autofix lint only, **hands off to curator**); add conf window. | `roles/integrator.prompt` (new) + `swarmforge.conf` | ADR 0008 + B6 | +| 0008 | Specifier **stops merging**: drop merge step (specifier:36), move specifier off `master` to its own worktree, reset to default branch per feature. **[edit]** | `roles/specifier.prompt` + `swarmforge.conf` | ADR 0008 | +| 0013 | Add terminal **curator** role (promotes retros → `.agents/`+`AGENTS.md` via one self-merging PR, then releases specifier; empty run = pass-through); rewire **integrator→curator→specifier**; conf curator window last; document chain in `workflow.prompt`. | `roles/curator.prompt` (new) + `swarmforge.conf` + `workflow.prompt` | ADR 0013 + B6 + I20B(PR-C) | +| — | **hardener** rendering-invariant property tests for pure rendering fns (state→string) — **unmanifested divergence found in audit**; consistent w/ 0007/0010. | `roles/hardender.prompt:18` | recover `backup/six-pre-reset` | +| 0016 | `cleaner` also scans **boundary files** at ~15–20-site threshold (vs 100 for testable source), extracts logic to a testable module; add the "stripped-view = untested" anti-pattern. | `roles/cleaner.prompt` | ADR 0016 + B6 | + +--- + +## C. Uncaptured implemented divergences — NO ADR (recover from backup, else lost on rebase) + +The behavioral/prompt-layer ADRs (0002–0016) did not originally cover the **`main`-side script infrastructure**. 
The items below were uncaptured implemented divergences living only in the monolith ADR (`backup/main-pre-reset:docs/adr/0001-fork-divergence.md`, "§Idea X") + the backup/feat branches — **each since dispositioned** (right-hand `ADR?` column): most now have their own ADR (0017–0021), the rest extend an existing ADR, fold into one, or stay a row here. **Each verified as still a divergence vs current `upstream/main` (2026-06-14).** They are prerequisites/peers of Section A — a clean rebase that follows only the original ADRs would drop them. + +| Idea | Divergence (one line) | Verified vs upstream | Source artifact | ADR? | +|------|----------------------|----------------------|-----------------|------| +| B | **Prompt-bundle inlining** — `write_agent_instruction_file` emits XML envelope `` + `resolve_prompt_bundle` (BFS over `*.prompt` refs, dedup). **KEEP (decision 2026-06-14).** Must be **disentangled from cmux**: port the resolver + envelope onto upstream's tmux harness and wire the bundle into upstream's delivery (NOT cmux's `write_deliver_script`). Prerequisite for M3/0014. | upstream has the naive read-recursively form only | `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` (`resolve_prompt_bundle`, `write_agent_instruction_file`); re-base, don't lift | **0017** | +| F | **Auto-compaction on role worktrees** — `write_worktree_permissions` merges into `.claude/settings.local.json`: `autoCompactEnabled:true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE:"88"`, `CLAUDE_CODE_AUTO_COMPACT_WINDOW:"200000"`. | absent upstream | `backup/main-pre-reset` (commit 08e7f25); `mono §Idea F:207` | **0020** | +| J | **Session-retro plumbing** — `agent-retro` uses `entire session current`→`session info --transcript >/tmp`; fallback `~/.claude/projects/`; Codex-schema risk accepted; `agent-retro before idle` line in every role prompt. 
| absent upstream | `feat/issue-20-a…:swarmforge/skills/agent-retro/`; `mono §Idea J:189` | extend **0013** | +| N | **`./swarm upgrade`** — refresh scripts(main)+prompts(source branch)+skills; `install-pins.conf` SHA pinning; `.swarmforge/source-branch` tracking; auto-install skills on first launch via `.swarmforge/skills-installed`. | absent upstream | `mono §Idea N:88` | **0018** | +| O | **Install scaffold** — `.gitignore` gen (`logbook.jsonl`,`tmp/`,`.swarmforge/`); default-branch probe→`swarmforge.conf`; permission allow-rules. **Overlaps setup-swarm skill (0003).** | absent upstream | `mono §Idea O:326` | folds into **0003** | +| — | **Autonomous permission mode** — `--permission-mode auto` (not `acceptEdits`) in `launch_role`. | upstream = `acceptEdits` (L433/442) | `backup/main-pre-reset` (commit 1097233) | **0019** | +| — | **cmux multiplexer backend** — `swarm-mux.sh`. **DROP — not wanted in the new fork (decision 2026-06-14).** Stay on upstream's tmux harness. Dropping this is what un-tangles Idea B / executing-fields / M3. | no mux file upstream | n/a — not reapplied | **DROP** | +| — | **`executing` logbook entry carries `{message,hash,sender}`** for session-restart recovery (ADR 0002 names only the idle/busy marker). | absent upstream | `feat/main-executing-context-fields:swarmforge/scripts/swarmforge.sh` | extend **0002** | +| — | **retro-triage skill** — `.claude/skills/retro-triage/` (~219 lines), diagnosis-first batch retro. **KEEP — restore (decision 2026-06-14).** Byte-identical on all branches; recover as-is. | absent upstream | `feat/issue-20-a…:.claude/skills/retro-triage/SKILL.md` | **0021** | +| — | **Self-referencing fork URL** — `./swarm` self-fetch points at the fork. | upstream points at unclebob | `backup/main-pre-reset` (commit ded6019) | row-only | +| — | **Richer `CONTEXT.md` glossary** — Task / Logbook / Prompt bundle / Bundle cache / Landing / Depth cap / full logbook-status spec; leaner than the backup version. 
| n/a (docs) | `backup/main-pre-reset:CONTEXT.md` | doc-merge | + +Not-lost / already consistent (no action): curator budget **60/40** (ADR 0013 + I20B spec win over backup prompts' stale 150/300); DESIGN.md **not scaffolded** (ADR 0007 wins over `mono §Idea M`); back-routing **to owning stage** (ADR 0004 wins over `mono §Idea E` "always to coder"). Genuinely rejected (no recover): ideas **G, H, I**. +Also unimplemented draft, not a divergence: `backup/main-pre-reset:docs/proposals/2026-06-11-factory-line-refactor.md` (architecture audit; status draft). + +--- + +## Cross-cutting invariants (do not break while applying) + +- **observation-harness/** is shared: ux-engineer writes (0007), doctrine (0010), QA re-executes (0010), hardener honors rendering invariants — keep consistent. +- **Back-route N=3** (0004) referenced by ux-engineer & integrator — keep the routing-count-in-handoff mechanic. +- **Refuting QA (0005)** is *new*; the B6 QA draft already has the 0010 surface-harness wording — **merge both** when writing QA.prompt. +- **DESIGN.md** is referenced-from-feature-file only (0007) — when porting B6 specifier/ux-engineer, delete scaffold-on-absence and nearest-file walk-up. +- **Curator PRs land in order** A→B→C→D (I20B); everything else is independently landable. + +## Still open (decisions / unknowns) + +*(resolved 2026-06-14, grilling session — what's-missing pass)* + +0. **Section C scope** — RESOLVED. All Section-C items kept (cmux already dropped). ADR assignments: B→**0017**, N→**0018**, auto-permission→**0019**, F→**0020**, retro-triage→**0021**; J→extend **0013**, executing-fields→extend **0002**, O→folds into **0003**; self-url→row-only; CONTEXT glossary→doc-merge. **Idea B remains a hard prerequisite for M3/ADR 0014.** +1. 
**ADR 0003 setup-swarm skill** — idea-K conflict RESOLVED: setup is **setup-first** (operator runs `/setup-swarm` as step one); `./swarm` **guards** on the `.swarmforge/setup-complete` marker and refuses if absent — it never auto-runs setup. Skill **renamed `setup` → `setup-swarm`**. Idea O folds in. *Remaining impl details (not blockers): marker content format, stack-detection mechanism, per-language tool selection — captured in `docs/migrations/0003-setup-skill-sources.md`.* +2. **ADR 0002 clear-first on six-pack** — RESOLVED: the model column is **configuration** (governed by ADR 0012's per-role model), not an architectural decision. No codex-hook work is added. ADR 0002 stands as written — clear-first is claude-first; codex roles keep upstream delivery as a documented property. +3. *(resolved earlier)* cmux **DROPPED** (stay on upstream tmux harness); Idea-B bundle-inlining **KEPT** but disentangled — port `resolve_prompt_bundle` + XML envelope onto upstream's harness, re-base executing-fields/M3 on it. ADR 0012 `--advisor` resolved (`advisorModel` in `settings.local.json`). +4. **four-pack** — RESOLVED: kept as a **pure merge-mirror of `upstream/four-pack`** (no fork content ever) to honor ADR 0001's "all branches == upstream"; resync via merge-only. +5. **PR shape for implementation** — RESOLVED (2026-06-14): **one PR per delivery branch**, not one per ADR. Two PRs total — `main` (script + skill layer) and `six-pack` (prompts/constitution/conf/root-swarm); no four-pack PR (frozen). Each divergence is an **ordered commit** within its branch (dependency-linear), keeping the single PR tailored. These two initial PRs may be squash-merged (see Sync policy). Full task breakdown: `docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md`. + +**Overriding constraint (all items):** keep the diff vs upstream as small as possible — translate to the minimal additive form, do not lift the pre-reset implementation. See `feedback-minimize-upstream-diff` memory. 
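A minimal sketch of the setup-first guard resolved in item 1 above. The helper name `require_setup_complete` and the refusal wording are hypothetical; only the marker path `.swarmforge/setup-complete` and the `/setup-swarm` command come from the decisions recorded here. It illustrates "guard only, never auto-run" and is not the fork's actual implementation.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the setup-first guard (assumed helper name and message text).

# require_setup_complete PROJECT_DIR
# Returns 0 when the swarm-ready marker exists.
# Returns 1 and prints a refusal on stderr when it is absent.
require_setup_complete() {
  local project_dir="${1:-.}"
  local marker="$project_dir/.swarmforge/setup-complete"

  if [ -f "$marker" ]; then
    return 0
  fi

  # Guard only: setup is never started from here, the operator is told to run it.
  printf 'swarm: %s is missing. Run /setup-swarm first, then re-run ./swarm.\n' "$marker" >&2
  return 1
}
```

A launcher would presumably call this before any role launch and stop on a non-zero return; forcing a re-run stays a manual marker delete.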
+ diff --git a/docs/migrations/0003-setup-skill-sources.md b/docs/migrations/0003-setup-skill-sources.md new file mode 100644 index 0000000..8e09a11 --- /dev/null +++ b/docs/migrations/0003-setup-skill-sources.md @@ -0,0 +1,55 @@ +# Migration source list — ADR 0003 setup skill + +Working source list to implement the **setup skill** (ADR 0003) without losing decisions already made in the pre-reset work. ADR 0003 decided *that* setup becomes a one-time skill; the *how* lives scattered across idea-K, the monolith ADR, ideas N/O, and the "At startup, install…" lines being removed. There is **no implemented setup skill in any branch** (confirmed) — this is design recovery, not code recovery. + +Refs: `idea-K` = `origin/docs/ideas-backlog:docs/ideas/idea-K-setup-preflight.md` · `mono` = `backup/main-pre-reset:docs/adr/0001-fork-divergence.md` · `ADR` = `docs/adr/0003-setup-is-a-one-time-skill.md`. + +## ✅ Resolved (2026-06-14): setup-first, guard-only; skill renamed `setup-swarm` + +- **idea-K** (auto-run on first launch) is **superseded.** `./swarm` never runs setup; the auto-run + stale `backup/main-pre-reset:CLAUDE.md:12` line are dead. +- **ADR 0003 form wins:** setup is **setup-first** — the operator runs `/setup-swarm` as the project's *first* action. `./swarm` is the *second* action and only **guards**: if `.swarmforge/setup-complete` is absent it refuses and tells the operator to run `setup-swarm` first. +- **Skill renamed `setup` → `setup-swarm`** (operator-facing `/setup-swarm`). Glossary updated (`CONTEXT.md`: `setup-swarm`, `swarm-ready marker`). Skill path: `swarmforge/skills/setup-swarm/`. + +## Decisions already made (cite before re-deciding) + +- Setup is a **skill** (fork-owned file, zero upstream conflict), not a `swarmforge.sh` function — `ADR`. 
+- Run path installs **no project tooling**; `./swarm` still self-fetches scripts, does worktree/session plumbing, **and auto-installs the swarm's own `entire` skills (pin-aware `ensure_skills_installed`, owned by ADR 0018)**; stops if the project isn't set up — `ADR`. *(Decision 2026-06-14: launcher infra-bootstrap stays automatic; only project provisioning is gated by the setup-swarm marker. See `main-script-layer.md` Idea N row.)* +- Skill **reasons about the stack** (Go vs Java vs Clojure → which tools/gates) — that's the point of a skill over a script — `ADR`. +- `entire enable --no-github --telemetry=false` (no `--agent`; hooks added separately) — `idea-K`, `mono §Idea K`. +- Backends derived from `swarmforge.conf` col 3 → `entire agent add ` per unique value; no user input — `idea-K`, `mono §K:178`. +- Warn-and-continue if `entire` absent (setup never blocks the swarm) — `idea-K`, `mono §K:182`. +- No `./swarm setup` subcommand; force re-run = operator deletes the marker — `idea-K`, `mono §K:180`. +- Idea G (per-tech engineering template system) **rejected** — adding a language is 2–3 lines in the shared table — `idea-G`, `mono:69`. 
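The "backends derived from `swarmforge.conf` col 3" decision above can be sketched as follows. This assumes the conf layout shown elsewhere in these notes (`window <role> <backend> <worktree>`, with optional `#` comments); the helper name `list_backends` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: list each distinct backend named in swarmforge.conf.
# Assumes lines of the form: window <role> <backend> <worktree>  # optional comment

# list_backends CONF_FILE
# Prints column 3 of every `window` line, one value per line, deduplicated and sorted.
list_backends() {
  local conf="${1:-swarmforge.conf}"
  # Drop comments first so a commented-out window line is not counted.
  sed 's/#.*$//' "$conf" | awk '$1 == "window" && NF >= 3 { print $3 }' | sort -u
}
```

Each printed value would then feed the per-backend `entire agent add` step, with a warning and no failure when `entire` is not on `PATH`.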
+ +## What the setup skill must take over (from the removed "At startup, install…" lines) + +| Category | Detail | Removed-line source | +|----------|--------|---------------------| +| Mutation/CRAP/DRY tools | language mutation + CRAP + DRY, from `engineering.prompt` | `upstream/six-pack:roles/cleaner.prompt:19`, `hardender.prompt:8`, `QA.prompt:7` | +| Acceptance Pipeline (APS) | ensure pipeline in place; build `gherkin-parser` + `gherkin-mutator` from `github.com/unclebob/Acceptance-Pipeline-Specification` | `upstream/six-pack:roles/coder.prompt:9`, `hardender.prompt:9` | +| Session tracking | `entire enable …` + `entire agent add ` per conf backend | `idea-K`, `mono §K` | +| ~~Skill pins~~ → **ADR 0018, not setup-swarm** | `entire` skills at pinned SHA (`install-pins.conf` `ENTIRE_SKILLS_SHA`); 11 skills + `agent-retro` to `.claude/skills/`. **Moved out of setup-swarm (decision 2026-06-14):** this is launcher infra-bootstrap, auto-installed by `./swarm` (`ensure_skills_installed`, pin-aware). Documented in **ADR 0018 (Idea N)**. | `mono §Idea N:100` | +| Permissions | write to `.claude/settings.json`: `Bash(gh pr merge*)` (integrator), `Bash(git reset --hard origin/)` (specifier) | `mono §Idea O:334` | +| Install scaffold | `.gitignore` ← `logbook.jsonl`, `tmp/`, `.swarmforge/`; default-branch probe `git symbolic-ref refs/remotes/origin/HEAD` → `swarmforge.conf` | `mono §Idea O:330-332` | + +Note four-pack equivalents exist (architect/refactorer/coder) but four-pack is **frozen** — six-pack rows above are what matters. + +## Swarm-ready marker + +- Path **`.swarmforge/setup-complete`**; `./swarm` checks it before role launch; absent → refuse (ADR 0003 form). Operator deletes to force re-run. — `idea-K`, `mono §K:180`, `ADR`. +- **Marker content (defaulted 2026-06-14, impl detail):** timestamp + swarmforge SHA (debuggable); refusal message text is impl-level. Not an ADR decision. + +## Open design questions — resolved 2026-06-14 + +1. 
**Stack detection mechanism** — **RESOLVED: the skill's own domain, not an ADR decision.** setup-swarm is a *skill* precisely because it *reasons* about the stack; the ADR must not prescribe a rigid probe list (that would contradict why it's a skill). The `SKILL.md` reads the repo, infers the stack, and asks the operator only when genuinely ambiguous. +2. **Marker format** — defaulted (see above): timestamp + swarmforge SHA. Impl detail. +3. **How the skill ships** — path `swarmforge/skills/setup-swarm/SKILL.md`, mirroring `agent-retro`. Settled. +4. **Re-run / staleness trigger** — RESOLVED: *project* re-setup = operator deletes the marker (manual, by design). *Skill* staleness = `./swarm` auto-(re)installs pin-aware at launch (ADR 0018), no manual trigger needed. +5. **Idea O scope boundary** — RESOLVED: setup-swarm absorbs `.gitignore`/default-branch/permissions (Idea O); the **`entire` skill install moved OUT to ADR 0018** (launcher bootstrap). No `./swarm install` subcommand. +6. **Per-language tool selection** — **RESOLVED: the skill's domain (same as #1).** The skill reasons from `engineering.prompt`'s tool table; behavior on no-match is skill-level judgment (ask the operator), not an ADR rule. + +## Cross-references + +- Pairs with **Idea N (install/upgrade)** and **Idea O (install scaffold)** — both implemented pre-reset, both **without an ADR**; see the manifest's "Uncaptured implemented divergences" section. The setup skill overlaps their territory and must be designed jointly. +- The removed "At startup" lines are also removed for the idle-gate reason (ADR 0002) — shared seam. + diff --git a/docs/migrations/main-script-layer.md b/docs/migrations/main-script-layer.md new file mode 100644 index 0000000..0222c3d --- /dev/null +++ b/docs/migrations/main-script-layer.md @@ -0,0 +1,34 @@ +# Migration recovery — `main` script layer (`swarmforge/scripts/`) + +Per-divergence recovery for everything that touches the launch script on `main`. 
Base = pristine `upstream/main` (`swarmforge/scripts/swarmforge.sh`, ~554 lines, naive form). Primary source = `backup/main-pre-reset` (~1109 lines — all script divergences stacked linearly). **Re-merge onto current upstream; do not copy the whole file.** + +## ⚠ The entanglement (read first) + +**Idea B (bundle inlining) + cmux backend + M3 (0014) + executing-fields are NOT independent patches.** In `backup/main-pre-reset` they are one ~400-line restructure of the handoff/delivery model: +- cmux refactor introduces `write_deliver_script`, `write_notify_script`, `write_stop_hook`, `write_worktree_notify_wrapper` and **deletes** upstream's `install_shared_constitution_articles`, `sync_worktree_scripts`, `write_tmux_env_file`. +- Idea B's `resolve_prompt_bundle` + rewritten `write_agent_instruction_file` (XML envelope) produce the bundle that `write_deliver_script` passes via `$BUNDLE_PATH`. +- M3 (0014) is a 7-line addendum (commit `1b84895`) **inside** the rewritten `write_agent_instruction_file`. +- executing-fields (commit `a133c71`) live **inside** `write_deliver_script` and `write_stop_hook` heredocs. + +**DECIDED (2026-06-14): cmux is DROPPED; Idea B is KEPT.** So do NOT lift the cmux commit. Instead **disentangle**: port `resolve_prompt_bundle` + the XML-envelope `write_agent_instruction_file` onto upstream's current tmux harness, wire the resolved bundle into upstream's delivery path (not cmux's `write_deliver_script`), then layer M3/0014 and re-base executing-fields onto that. Idea F (auto-compaction) wiring is independent. Skip everything cmux: `swarm-mux.sh`, `swarm-stop.sh`, the `write_deliver_script`/`write_notify_script`/`write_stop_hook` family, `MUX_TARGETS`. 
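One way to check that the disentangled port stayed clean is to scan the ported script for the cmux-only names listed above. This is a sketch under assumptions: the helper name `check_no_cmux_symbols` is hypothetical, and the pattern covers only the names this section explicitly says to skip.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: fail when cmux-only names leak into a ported script.

# check_no_cmux_symbols FILE
# Returns 0 when FILE contains none of the skipped cmux names,
# 1 when at least one is present (matching lines are printed),
# 2 when FILE does not exist.
check_no_cmux_symbols() {
  local file="$1"
  local pattern='write_deliver_script|write_notify_script|write_stop_hook|MUX_TARGETS|swarm-mux\.sh|swarm-stop\.sh'

  if [ ! -f "$file" ]; then
    printf 'check_no_cmux_symbols: no such file: %s\n' "$file" >&2
    return 2
  fi

  if grep -nE "$pattern" "$file"; then
    printf 'cmux-only names found in %s (lines above)\n' "$file" >&2
    return 1
  fi
  return 0
}
```

Typical use would be `check_no_cmux_symbols swarmforge/scripts/swarmforge.sh` after the Idea B port and again after executing-fields is re-based.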
+ +## Recovery table + +| Row | Recover from | Delta vs upstream / notes | +|-----|-------------|---------------------------| +| **M1 / 0006** sparse-checkout in `prepare_worktrees` | **NET-NEW — no source anywhere** | Write fresh: `git sparse-checkout` excluding the pinned QA path on every worktree except the specifier's + QA's (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify path survives handoff commits. Prereq: QA path pinned in specifier prompt. | +| **M2 / 0012** per-role model/effort/advisor | `backup/main-pre-reset:swarmforge.sh` `parse_config`(~L212), `launch_role`(~L870), arrays `ROLE_MODELS/EFFORTS/ADVISORS`(~L42); commits `93f8c5d`, `d467ab7` | `!= 4`→`< 4` + `key=value` loop; per-backend flag locals. **Advisor is NOT a `--advisor` flag** — `write_worktree_advisor` writes `advisorModel` to `.claude/settings.local.json`. ✅ resolves the "does claude --advisor exist" open item. | +| **Idea B** bundle inlining | `backup/main-pre-reset:swarmforge.sh` `resolve_prompt_bundle`(~L797)+`write_agent_instruction_file`(~L825); same on `feat/main-executing-context-fields`, `feat/issue-20-b` | Replace upstream's 2-line "read constitution recursively" heredoc with BFS resolver + XML `` envelope. **Prereq for M3.** Entangled with cmux (see above). | +| **M3 / 0014** append AGENTS.md + .agents/roles | `backup/main-pre-reset:swarmforge.sh` (commit `1b84895`, inside `write_agent_instruction_file`) | 7-line loop appending `AGENTS.md` + `.agents/roles/.md` `` blocks before envelope close + preamble sentence. Cannot land without Idea B. | +| **Idea F** auto-compaction | `backup/main-pre-reset:swarmforge.sh` `write_worktree_permissions`(~L679); commit `93f8c5d` | New fn → `.claude/settings.local.json`: `autoCompactEnabled:true`, `PCT_OVERRIDE:"88"`, `WINDOW:"200000"`; called from `prepare_worktrees`. Shares the file with `write_worktree_advisor` (M2) — both use read-modify-write python3. 
| +| **executing-fields** | `feat/main-executing-context-fields:swarmforge.sh` (commit `a133c71`, clean +25/-10) | `executing` logbook entry carries `{message,hash,sender}`. In the source branch the change sits inside the cmux `write_deliver_script` + `write_stop_hook` heredocs, so with cmux dropped (decision 2026-06-14) do not cherry-pick `a133c71` as-is; re-base it onto upstream's delivery path after Idea B lands (see the entanglement section). Partially fills ADR 0002. | +| **Idea N** upgrade/skills (ADR 0018) | `backup/main-pre-reset:swarmforge.sh` `install_skills`+`ensure_skills_installed`(~L946) + new file `swarmforge/scripts/install-pins.conf`; **`swarm` bootstrap** (root, runnable branches) commit `8994322` adds `upgrade`/`write_source_branch`/`download_from_main` | `swarmforge.sh` part lands on `main`; the `upgrade` subcommand + `.swarmforge/source-branch` live in the root `swarm` file which is **on six-pack/four-pack, not main**. **Decision (2026-06-14): `ensure_skills_installed` STAYS at launch — auto-(re)install the `entire` skills pin-aware, as before.** This is launcher infra-bootstrap (peer of self-fetch + worktree/session plumbing), explicitly allowed; it does NOT violate idle-gate/setup-first, which govern *role* behavior and *project* provisioning, not the launcher bootstrapping its own deps. Skill install is **owned by ADR 0018, not setup-swarm (0003)**. `./swarm upgrade` = explicit refresh of scripts(main) + prompts(`source-branch`) + forced skill reinstall (clears `skills-installed`). | +| **Idea O** install scaffold | `backup/main-pre-reset:swarmforge.sh` `ensure_initial_gitignore`(~L105)+`ensure_runtime_git_excludes`(~L152)+`remove_nonessential_clone_files`(~L165) | `.gitignore`/excludes expansion is implemented (additive). **default-branch probe + permission allow-rules are NET-NEW** → fold into ADR 0003 setup skill. | +| **auto-permission** (ADR 0019) | `backup/main-pre-reset:swarmforge.sh` `launch_role` (commit `1097233`) | `--permission-mode acceptEdits`→`auto` for claude+grok (upstream L433/442).
**`auto` verified a real flag value** (Claude Code v2.1.177 choices: acceptEdits, auto, bypassPermissions, default, dontAsk, plan — unlike the phantom `--advisor`). **Decision (2026-06-14): keep `auto`.** Rationale for the ADR: roles run unattended → any prompt is a silent hang; `acceptEdits` still prompts on bash/tool calls. `dontAsk` (deterministic, allow-list-only) was considered and **rejected for allow-list maintenance burden** across every language/tool the swarm runs; `bypassPermissions` rejected (ignores all safety, worktrees aren't sandboxed). `auto` needs ~no config and ships safety rails (blocks force-push-to-main, mass-delete). Consequence: setup-swarm's allow-rules (Idea O) stay a **small, targeted, advisory** set, not a load-bearing whitelist. | +| **cmux** | `backup/main-pre-reset:swarmforge/scripts/swarm-mux.sh` (175 lines, net-new) + `swarm-stop.sh`(66) + `swarmlog.sh`(16); + ~400 lines of `swarmforge.sh` restructure | **DROPPED (decision 2026-06-14); listed for reference only, do not reapply.** Largest divergence; see entanglement. Source `swarm-mux.sh` at ~L169; `MUX_TARGETS` array; new write_* fns. | +| **self-url** | root `swarm` bootstrap (commit `ded6019`, runnable branches) | `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"`. Not on `main`. | + +## What lands where +- **`main` rebase needs:** M1, M2, M3, Idea B, F, O (gitignore part), auto-permission, executing-fields, ~~cmux (`swarmforge.sh` + `swarm-mux.sh`/`swarm-stop.sh`/`swarmlog.sh`)~~ (dropped 2026-06-14, not reapplied), `install-pins.conf`, `install_skills`. +- **Root `swarm` bootstrap (six-pack/four-pack, NOT main):** Idea N `upgrade` subcommand, `source-branch`, self-url. + diff --git a/docs/migrations/six-pack-role-prompts.md b/docs/migrations/six-pack-role-prompts.md new file mode 100644 index 0000000..6ab10b1 --- /dev/null +++ b/docs/migrations/six-pack-role-prompts.md @@ -0,0 +1,56 @@ +# Migration recovery — six-pack role prompts + +Per-role recovery for `swarmforge/roles/*.prompt`. Base = `upstream/six-pack`.
**Re-merge deltas onto current upstream prompts; do not copy whole backup files** (they predate upstream and carry content ADRs reversed — see STRIP table). Primary source = `backup/six-pre-reset` unless noted. + +Universal add to **every** role prompt: idle-gate line `"Wait for a handoff. Do not act without one."` (0002) and `"Run agent-retro before going idle."` Back-routing (0004) general rule has **no backup source** — author fresh from ADR 0004 wherever a role needs it (structural finding → origin stage once; local → fix in place; single-finding back-once cap). + +## Existing roles — deltas + +| Role | Re-merge (recover-from `backup/six-pre-reset` unless noted) | STRIP / fix | +|------|------------------------------------------------------------|-------------| +| **coder** | idle-gate; UX-Intent read line (0007); handoff `notify cleaner`→`notify ux-engineer` (0007) | STRIP `## Acceptance Pipeline` block (upstream L8–11, the "At startup… APS" bullets) (0003) | +| **QA** ⚠ | idle-gate; **0010** surface-harness: L13 "through the user interface only"→"through the project surface harness only" + Expected-bullet→assertion/`NOT AUTOMATED` rule + re-execute `observation-harness/` + route-back-if-missing; handoff →`notify integrator` (0008) | STRIP `## Startup Tools` (L7) (0003); `logbook.json`→keep upstream `logbook.jsonl`. **0005 refute posture has NO backup source — author fresh**, replacing L14 "Fix bugs found by the QA suite…" with structural→route-back / local→fix-in-place. Merge 0005 (new) + 0010 (backup) into one prompt. | +| **cleaner** | idle-gate; **0016** boundary-file scan (>15 mutation sites → extract) + stripped-view-as-untested anti-pattern (cleanest source: `feat/baseline-scenarios-six`) | STRIP `At startup, install…` (L19) (0003) | +| **hardender** | idle-gate; rendering-invariant property-test line (L18 — **unmanifested divergence**, see note) | STRIP `## Startup Tools` (L8–9) (0003). 
STRIP backup's `"merge all queued architect handoffs together"` — **unauthorized, no ADR**; keep upstream's "batch in sorted filename order". | +| **specifier** ⚠ | idle-gate; **0008** worktree reset `git reset --hard origin/` via `git symbolic-ref` (recover from `feat/six-pack-pipeline-order-and-scaffold`, NOT backup); **0008** handoff L36 "merge the changes and ask the user"→"When the curator notifies you… ask the user for the next feature"; **0007** UX-Intent authoring; **0009** start from template + "seven"→**"eight"**; **0011** read dependency-manifest + propose-on-undeclared (recover from `backup`/`feat/issue-20-c`, NOT pipeline-order which dropped it) | STRIP DESIGN.md walk-up + scaffold-on-absence (0007); STRIP backup's `git merge --ff-only origin/master` startup (0008, also hardcodes `master`) | + +⚠ **QA and specifier are the complex merges** — multiple overlapping layers, several from different branches. Apply carefully. + +## STRIP / STALE table (backup content ADRs reversed) +| Stale content | In | Reversed by | +|---------------|-----|-------------| +| DESIGN.md walk-up + scaffold | specifier, ux-engineer | ADR 0007 (reference-from-feature-file only) | +| "seven header sections" | specifier | ADR 0009 (six-pack = eight) | +| `git merge --ff-only origin/master` startup | specifier | ADR 0008 (specifier stops merging; `master` stale) | +| "merge all queued architect handoffs together" | hardender | no ADR — keep upstream sorted-batch | +| `logbook.json` | QA | upstream renamed → `logbook.jsonl` | +| curator budgets 150/300 | curator | ADR 0013 + locked spec = 60/40 | + +## New roles (net-new files) + +### ux-engineer (ADR 0007) — recover `backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt` (≡ `origin/feat/obs-harness-six`; NOT pipeline-order/baseline which lack the `observation-harness/` commit step) +Outline: identity+idle · skip if no `## UX Intent` (→notify cleaner) · UX-Intent verification across Visual Composition/Information 
Hierarchy/Interaction Feel/State Transitions by running the binary · fix rendering only (back-route to coder for model-state, N=3) · durable artifacts: golden snapshots + rendering invariants + `observation-harness/` scenarios via surface tool · run test suite · `## Visual quality standards` (AI-aesthetic anti-patterns, type hierarchy, WCAG 4.5:1/3:1) · notify cleaner. +**STRIP:** DESIGN.md walk-up; make DESIGN.md fix-authority conditional on a feature-file reference (not tree discovery). + +### integrator (ADR 0008) — recover `backup/six-pre-reset:swarmforge/roles/integrator.prompt` (≡ `feat/issue-20-c`; NOT baseline-scenarios-six which still says "notify specifier") +Outline: identity+idle · own landing, one PR/feature, autofix-lint-only · steps: receive from QA → branch `feat/` → `gh pr create` → watch CI → green: `gh pr merge --squash --delete-branch` + post-merge gate → **notify curator** → CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`) → agent-retro. +**FIX (locked spec wins):** step 7 must add "Include the specifier handoff name and the post-merge master commit hash." + +### curator (ADR 0013/0014) — authoritative source = `feat/issue-20-b:docs/specs/issue-20-knowledge-promotion-loop.md` **PR C2 verbatim block** (branch `curator.prompt` artifacts have STALE 150/300 budgets — do not cargo-cult) +Outline: identity+idle · only writes `AGENTS.md`+`.agents/` · sources `~/.claude/worklog/retros/*.md` · routing ladder (backlog→AGENTS.md≤60→roles≤40→references→skills-on-2nd→upstream→ledger) · ledger `date|session-id|role|failure-class|verdict|summary` · lifecycle (empty-run→pass-through, knowledge branch, self-merge PR with metric line, move retros to processed/, notify specifier) · 9-check per-item algorithm (scope→recurrence→non-inferable→rule-not-phenomenon→dup/contradiction→global-fix-routing→trigger-load-fit→evidence-pull→sizing). 
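The ledger format named in the outline (`date|session-id|role|failure-class|verdict|summary`) can be illustrated with a small formatter. This is a sketch under assumptions: the helper name `ledger_line` is hypothetical, and rejecting fields that contain `|` or a newline is an assumed safeguard, not something the spec states.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: build one pipe-delimited ledger line from six fields.

# ledger_line DATE SESSION_ID ROLE FAILURE_CLASS VERDICT SUMMARY
# Prints the six fields joined by '|'.
# Returns 2 on a wrong field count, 1 when a field would corrupt the line format.
ledger_line() {
  if [ "$#" -ne 6 ]; then
    printf 'ledger_line: expected 6 fields, got %s\n' "$#" >&2
    return 2
  fi

  local field
  for field in "$@"; do
    case "$field" in
      *'|'*|*$'\n'*)
        printf 'ledger_line: field contains a pipe or newline: %s\n' "$field" >&2
        return 1
        ;;
    esac
  done

  # "$*" joins the positional parameters with the first character of IFS.
  local IFS='|'
  printf '%s\n' "$*"
}
```

The ledger file location is left out on purpose, since these notes do not name it.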
+**Companion changes (locked spec, not on any branch):** specifier wait-on-curator (PR C4); `workflow.prompt` integrator→curator→specifier chain bullet (PR C5). + +## Final `swarmforge.conf` window order (recover `feat/issue-20-c` for 8 windows + curator from `backup/six-pre-reset`) +``` +window specifier codex specifier # was: codex master (0008 moves specifier off master) +window coder codex coder +window ux-engineer codex ux-engineer # 0007: after coder +window cleaner codex cleaner +window architect codex architect +window hardender codex hardender +window QA codex QA +window integrator codex integrator # 0008: after QA +window curator codex curator # 0013: last (only in backup/six-pre-reset) +``` +Note: all roles still on `codex` → clear-first (0002) inert until roles move to `claude` or codex hooks built (open item). `default_branch` is per-feature specifier logic, not a conf field. + diff --git a/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md b/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md new file mode 100644 index 0000000..188b245 --- /dev/null +++ b/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md @@ -0,0 +1,1232 @@ +# Fork Divergence Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Re-apply every documented SwarmForge fork divergence (ADRs 0001–0021 + manifest rows) on top of pristine `upstream`, as **two pull requests — one per delivery branch**: one PR on `main` (scripts + skills), one PR on `six-pack` (prompts + constitution + conf + root swarm). Each PR is the minimal additive diff vs upstream, built from ordered, per-divergence commits. + +**Architecture:** Two delivery branches. 
`main` carries scripts + skills + docs/ADRs; `six-pack` carries role prompts, constitution articles, templates, the fidelity manifest, `swarmforge.conf`, and the root `swarm` bootstrap. Every branch is kept identical to its `upstream/` and advanced by **merge**, never rebase (ADR 0001). **four-pack is frozen** — no fork content is ever applied to it (manifest decision 2026-06-14); it stays a pure merge-mirror of `upstream/four-pack`. + +**Tech Stack:** zsh (`swarmforge.sh` and the handoff scripts run under zsh — note `${=var}` word-splitting, `typeset -a/-A`, `${var:h}`/`${var:t}` modifiers), Python 3 (settings.local.json read-modify-write), Markdown skills (`SKILL.md`), `*.prompt` plain-text role/constitution files, Gherkin `.feature` templates, `gh` CLI. + +--- + +## Conventions (read before any task) + +**Two PRs, two branches.** Exactly one branch and one PR per delivery branch: +- **PR 1 (MAIN)** — branch `feat/fork-divergences-main` off `origin/main`; all of the MAIN TRACK commits below; PR opened `--base main`. +- **PR 2 (SIX-PACK)** — branch `feat/fork-divergences-six-pack` off `origin/six-pack`; all of the SIX-PACK TRACK commits below; PR opened `--base six-pack`. + +There is **no four-pack PR** (frozen). The two PRs are independent of each other and can proceed in parallel. + +**Commits.** Each divergence is one commit on its track branch, applied in the listed order (the order encodes the within-branch dependencies — e.g. the bundle commit precedes the knowledge-injection commit that extends it). One commit per divergence keeps the single PR reviewable and tailored. Do **not** create extra branches or PRs. + +**Baseline anchor.** The fork layer is re-applied onto a recorded pristine-upstream baseline (ADR 0001): `main` @ `d947f67` (tag `fork-base/2026-06-14-main`) and `six-pack` @ `cbd1697` (tag `fork-base/2026-06-14-six-pack`). As of 2026-06-14 `origin/main`/`origin/six-pack` equal these exactly, so branching off `origin/` == branching off the tag. 
The two implementation branches come off the real delivery branches, **not** off this docs branch. If `origin` has since advanced, branch off the tag instead so the diff stays measured against the recorded baseline. + +**Merge style.** Fork divergences are **squash-merged** (ADR 0001), so each of these two PRs lands as one clean commit on its delivery branch. Upstream syncs, by contrast, are history-preserving merges (never squashed/rebased — keep upstream's story). A landed commit is never rewritten. + +**Pushing.** **Never** push `main`, `six-pack`, or `upstream` directly without explicit request — push only the two feature branches. `gh` defaults to the `unclebob` upstream remote — always pass `--repo gabadi/swarm-forge`. + +**Minimize-diff rule (overriding constraint).** Translate each divergence to its smallest additive form vs current upstream. Do **not** lift whole files from the backup branches for existing files — re-merge the delta onto the *current* upstream file. Net-new files (new roles, templates, skills) may be recovered whole, but you MUST apply the STRIP/FIX edits called out per commit (the backup artifacts predate upstream and carry behavior the ADRs reversed). + +**Recovery sources.** Recover exact prior content with `git show :`. Key sources: `backup/main-pre-reset` (main script layer), `backup/six-pre-reset` (six-pack prompts/templates), `feat/issue-20-a-retro-skill-upgrade` (agent-retro + retro-triage skills), `feat/issue-20-b-bundle-knowledge-injection` (knowledge-promotion spec / curator), `feat/baseline-scenarios-six` (dependency-manifest, cleaner boundary scan), `feat/six-pack-pipeline-order-and-scaffold` (specifier worktree reset), `feat/issue-20-c-curator-six-pack` (8-window conf, integrator). Line numbers are approximate (`~L###`) — they drift; locate by function/section name, not by line. + +**Verification approach.** There is no bash unit-test harness in this repo. 
"Tests" are: (a) `shellcheck` on changed shell files where available, (b) `zsh -n ` syntax check, (c) a scratch-project smoke run of the generated artifact (e.g. inspect the bundle `write_agent_instruction_file` produces), and (d) `grep` assertions on prompt/skill text. Each commit states the concrete verification command and expected result. Verify after each commit; a whole-track verification runs before each PR is opened. + +**Commit message footer.** End every commit body with: +``` +Co-Authored-By: Claude Opus 4.8 (1M context) +``` + +--- + +## Commit order (within each branch) + +**MAIN branch** (`feat/fork-divergences-main`) — commit in this order; the only hard dependency is C3→C2 (knowledge injection extends the bundle envelope). C1–C6, C8, C11 all edit `swarmforge.sh`, so a linear commit order avoids any in-file conflict: + +| # | ADR | What | `swarmforge.sh` region / new file | +|---|-----|------|-----------------------------------| +| C1 | 0019 | auto-permission | `launch_role` | +| C2 | 0017 | bundle inlining | `write_agent_instruction_file` + new `resolve_prompt_bundle` | +| C3 | 0014 | knowledge injection (**after C2**) | `write_agent_instruction_file` | +| C4 | 0012 | per-role model/effort/advisor | `parse_config`, `launch_role` + new `write_worktree_advisor` | +| C5 | 0020 | auto-compaction | `prepare_worktrees` + new `write_worktree_permissions` | +| C6 | 0006 | QA holdout sparse-checkout | `prepare_worktrees` | +| C7 | 0002-ext | executing-entry fields | handoff scripts (`swarmforge/scripts/*.sh`) | +| C8 | 0018 | pinned skill install | new `install_skills`/`ensure_skills_installed` + new `install-pins.conf` | +| C9 | 0013/J | agent-retro skill | new `swarmforge/skills/agent-retro/` | +| C10 | 0021 | retro-triage skill | new `.claude/skills/retro-triage/` | +| C11 | 0003 + O | setup-swarm skill + marker guard + scaffold | new `swarmforge/skills/setup-swarm/` + `swarmforge.sh` guard/gitignore | + +**SIX-PACK branch** 
(`feat/fork-divergences-six-pack`) — commit in this order; the order resolves the shared-file sequencing (`specifier.prompt`: D1,D3,D4,D5,D8,D9,D10 · `QA.prompt`: D1,D2,D3,D6,D7,D9 · `swarmforge.conf`: D8,D9,D10 · `workflow.prompt`: D10,D11): + +| # | ADR | What | Touches | +|---|-----|------|---------| +| D1 | 0002 | idle-gate + agent-retro line | all 6 role prompts | +| D2 | 0003 | strip startup-install directives | coder, QA, cleaner, hardener | +| D3 | 0004 | back-routing rule | role prompts | +| D4 | 0009 | spec-header template + specifier | new `templates/feature.feature`, specifier | +| D5 | 0011 | fidelity manifest + specifier | new `dependency-manifest.prompt`, specifier | +| D6 | 0010 | surface harness | `engineering.prompt`, QA | +| D7 | 0005 | refute QA | QA | +| D8 | 0007 | UX engineer | new `ux-engineer.prompt`, coder, specifier, `swarmforge.conf` | +| D9 | 0008 | integrator + specifier stops merging | new `integrator.prompt`, specifier, QA, `swarmforge.conf` | +| D10 | 0013 | curator + chain rewiring | new `curator.prompt`, integrator, specifier, `workflow.prompt`, `swarmforge.conf` | +| D11 | 0015 | platform-feasibility stop rule | `workflow.prompt` | +| D12 | 0016 | cleaner boundary scan | cleaner | +| D13 | — | hardener rendering invariants | hardener | +| D14 | 0018 | root swarm upgrade + self-url | root `swarm` | + +--- + +# MAIN TRACK → PR 1 + +## Setup: create the main branch + +- [ ] **Create the single branch for all MAIN commits** + +```bash +git fetch origin && git switch -c feat/fork-divergences-main origin/main +# If origin/main has advanced past the recorded baseline, branch off the tag instead: +# git switch -c feat/fork-divergences-main fork-base/2026-06-14-main +``` +All C1–C11 commits land on this one branch. Do not create per-commit branches. This PR is squash-merged (fork-divergence policy, ADR 0001). 
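The "branch off the tag if `origin/main` has advanced" fallback in the setup step can be decided mechanically. A minimal sketch, assuming both refs exist locally after `git fetch`; the helper name `pick_base_ref` is hypothetical, while the ref names in the usage note are the ones recorded in this plan.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: choose the branch point for a fork-divergence branch.

# pick_base_ref BRANCH_REF TAG_REF
# Prints BRANCH_REF when it still resolves to the same commit as TAG_REF,
# otherwise prints TAG_REF so the diff stays measured against the recorded baseline.
# Returns 1 when either ref cannot be resolved to a commit.
pick_base_ref() {
  local branch_ref="$1" tag_ref="$2" branch_sha tag_sha

  branch_sha="$(git rev-parse --verify --quiet "${branch_ref}^{commit}")" || return 1
  tag_sha="$(git rev-parse --verify --quiet "${tag_ref}^{commit}")" || return 1

  if [ "$branch_sha" = "$tag_sha" ]; then
    printf '%s\n' "$branch_ref"
  else
    printf '%s\n' "$tag_ref"
  fi
}
```

For PR 1 this could be used as `git switch -c feat/fork-divergences-main "$(pick_base_ref origin/main fork-base/2026-06-14-main)"`.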
+ +--- + +## C1: ADR 0019 — autonomous permission mode + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`launch_role`, the `claude)` and `grok)` arms, ~L433 / ~L442) + +- [ ] **Step 1: Locate the two launch arms** + +Run: `grep -n "permission-mode acceptEdits" swarmforge/scripts/swarmforge.sh` +Expected: two hits inside `launch_role` — the `claude)` arm and the `grok)` arm. + +- [ ] **Step 2: Apply the edit** + +Replace `--permission-mode acceptEdits` with `--permission-mode auto` in both arms. (`auto` is a real Claude Code flag value — verified, unlike the phantom `--advisor`. Roles run unattended, so `acceptEdits` bash/tool prompts hang silently; `auto` ships rails — blocks force-push-to-main and mass-delete.) The command below uses the BSD/macOS form `sed -i ''`; with GNU sed, drop the empty `''` argument. + +```bash +sed -i '' 's/--permission-mode acceptEdits/--permission-mode auto/g' swarmforge/scripts/swarmforge.sh +``` + +- [ ] **Step 3: Verify** + +Run: `grep -c "permission-mode auto" swarmforge/scripts/swarmforge.sh; grep -c "acceptEdits" swarmforge/scripts/swarmforge.sh; zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Expected: `2`, `0`, `SYNTAX_OK`. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): autonomous permission mode for unattended roles (ADR 0019)" +``` + +--- + +## C2: ADR 0017 — prompt-bundle inlining + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (replace `write_agent_instruction_file` ~L389–413; add `resolve_prompt_bundle`) + +Upstream emits two naive "read recursively" lines. The fork pre-resolves the constitution + role prompt into one deduplicated XML envelope. **Disentangle from cmux:** port ONLY `resolve_prompt_bundle` + the envelope `write_agent_instruction_file`. Do NOT port `write_deliver_script`/`write_notify_script`/`write_stop_hook`/`MUX_TARGETS`.
+ +- [ ] **Step 1: Read the current naive function** + +Run: `grep -n "write_agent_instruction_file" swarmforge/scripts/swarmforge.sh` +Confirm it emits `Read swarmforge/constitution.prompt, then read every file it refers to recursively...` and uses globals `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (all set upstream). + +- [ ] **Step 2: Add `resolve_prompt_bundle` above `write_agent_instruction_file`** + +```zsh +resolve_prompt_bundle() { + local role="$1" + typeset -a bundle=() + typeset -A seen=() + typeset -a queue=("$CONSTITUTION_FILE" "$ROLES_DIR/${role}.prompt") + local file rel_path ref ref_abs + + while (( ${#queue[@]} > 0 )); do + file="${queue[1]}" + shift queue + + rel_path="${file#${WORKING_DIR}/}" + [[ ${+seen[$rel_path]} -eq 1 ]] && continue + [[ ! -f "$file" ]] && continue + + seen[$rel_path]=1 + bundle+=("$rel_path") + + while IFS= read -r ref; do + [[ -z "$ref" ]] && continue + ref_abs="$WORKING_DIR/$ref" + [[ ${+seen[$ref]} -eq 0 ]] && queue+=("$ref_abs") + done < <(grep -oE 'swarmforge/[A-Za-z0-9_./-]+\.prompt' "$file" 2>/dev/null || true) + done + + printf '%s\n' "${bundle[@]}" +} +``` + +- [ ] **Step 3: Replace `write_agent_instruction_file` with the envelope form** + +```zsh +write_agent_instruction_file() { + local role="$1" + local prompt_file="$2" + typeset -a bundle_files=() + local rel abs_path + + while IFS= read -r rel; do + [[ -n "$rel" ]] && bundle_files+=("$rel") + done < <(resolve_prompt_bundle "$role") + + { + printf '\n' "$role" + printf '\n' + printf 'This prompt bundle is pre-resolved. 
Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below.\n' + printf '\n' + for rel in "${bundle_files[@]}"; do + abs_path="$WORKING_DIR/$rel" + [[ -f "$abs_path" ]] || continue + printf '\n' "$rel" + cat "$abs_path" + printf '\n\n' + done + printf '\n' + } > "$prompt_file" +} +``` + +- [ ] **Step 4: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Then confirm the function references only `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (set in upstream's init/`parse_config`). For a live check, run the swarm in a scratch dir and inspect a generated `$PROMPTS_DIR/.md` — it should be a single `` envelope with deduped `` blocks, no "read recursively" lines. +Expected: `SYNTAX_OK` + a well-formed envelope. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): pre-resolve role prompt bundle into XML envelope (ADR 0017)" +``` + +--- + +## C3: ADR 0014 — `.agents/` knowledge injection (after C2) + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`write_agent_instruction_file`, as written by C2) + +- [ ] **Step 1: Update the preamble line** + +In `write_agent_instruction_file`, change the `` printf to: + +```zsh + printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below. Project knowledge files (AGENTS.md and your role file under .agents/roles/) are included below when present.\n' +``` + +- [ ] **Step 2: Add the knowledge loop** + +Add `knowledge` to the locals (`local rel abs_path knowledge`). 
Insert **after** the bundle-files `for` loop and **before** `printf '\n'`: + +```zsh + for knowledge in "AGENTS.md" ".agents/roles/${role}.md"; do + abs_path="$WORKING_DIR/$knowledge" + [[ -f "$abs_path" ]] || continue + printf '\n' "$knowledge" + cat "$abs_path" + printf '\n\n' + done +``` + +- [ ] **Step 3: Acceptance** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +In a scratch project with `AGENTS.md` and `.agents/roles/coder.md`: every role's generated bundle carries `AGENTS.md`; only the coder's carries `.agents/roles/coder.md`; removing both produces bundles with no knowledge blocks and no errors. +Expected: `SYNTAX_OK` + the per-role assertions hold. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): inject AGENTS.md + .agents/roles into role bundle (ADR 0014)" +``` + +--- + +## C4: ADR 0012 — per-role model / effort / advisor + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`parse_config`, `launch_role`; add arrays + `write_worktree_advisor`) + +- [ ] **Step 1: Declare the three arrays** + +Next to the existing `ROLES`/`AGENTS`/`SESSIONS` declarations, add: + +```zsh +typeset -a ROLE_MODELS=() +typeset -a ROLE_EFFORTS=() +typeset -a ROLE_ADVISORS=() +``` + +- [ ] **Step 2: Relax field count + parse the kv tail in `parse_config`** + +Change `if (( ${#fields[@]} != 4 )); then` → `if (( ${#fields[@]} < 4 )); then`. 
After the `keyword/role/agent/worktree` assignments, add: + +```zsh + local role_model="" role_effort="" role_advisor="" kv key val kv_i + for (( kv_i = 5; kv_i <= ${#fields[@]}; kv_i++ )); do + kv="${fields[$kv_i]}" + key="${kv%%=*}" + val="${kv#*=}" + case "$key" in + model) role_model="$val" ;; + effort) role_effort="$val" ;; + advisor) role_advisor="$val" ;; + esac + done +``` + +Where the existing arrays are appended, add the parallel appends: + +```zsh + ROLE_MODELS+=("$role_model") + ROLE_EFFORTS+=("$role_effort") + ROLE_ADVISORS+=("$role_advisor") +``` + +- [ ] **Step 3: Add `write_worktree_advisor`** + +```zsh +write_worktree_advisor() { + local worktree_path="$1" + local advisor_model="$2" + local settings_dir="$worktree_path/.claude" + local settings_file="$settings_dir/settings.local.json" + + mkdir -p "$settings_dir" + SETTINGS_FILE="$settings_file" ADVISOR_MODEL="$advisor_model" python3 -c ' +import json, os +p = os.environ["SETTINGS_FILE"] +cfg = {} +try: + with open(p) as f: cfg = json.load(f) +except (OSError, ValueError): pass # missing, unreadable, or invalid JSON: start from {} +cfg["advisorModel"] = os.environ["ADVISOR_MODEL"] +with open(p, "w") as f: json.dump(cfg, f, indent=2) + ' +} +``` + +- [ ] **Step 4: Wire flags into `launch_role`** + +After the existing locals, add: + +```zsh + local role_model="${ROLE_MODELS[$index]}" + local role_effort="${ROLE_EFFORTS[$index]}" + local role_advisor="${ROLE_ADVISORS[$index]}" +``` + +After `write_agent_instruction_file "$role" "$prompt_file"`, add: + +```zsh + [[ -n "$role_advisor" ]] && write_worktree_advisor "$role_worktree" "$role_advisor" +``` + +In the `claude)` arm: + +```zsh + local claude_flags="" + [[ -n "$role_model" ]] && claude_flags+=" --model '$role_model'" + [[ -n "$role_effort" ]] && claude_flags+=" --effort '$role_effort'" +``` +then insert `${claude_flags}` immediately after `claude` in `launch_cmd`. Apply the analogue for `copilot)` (`--model`/`--effort`) and `grok)` (`--model`/`--effort`); for `codex)` use `-c model="$role_model"` only when set. 
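The key=value tail can be exercised outside the launcher. This is a sketch in portable shell rather than the zsh array form used in `parse_config`; `parse_role_tail` is a hypothetical name, and the function only echoes what it parsed.

```bash
# Hypothetical stand-alone version of the kv-tail loop: takes the fields after
# the first four as arguments and prints model/effort/advisor (empty if unset).
parse_role_tail() {
  role_model="" role_effort="" role_advisor=""
  for kv in "$@"; do
    key="${kv%%=*}"   # text before the first '='
    val="${kv#*=}"    # text after the first '='
    case "$key" in
      model)   role_model="$val" ;;
      effort)  role_effort="$val" ;;
      advisor) role_advisor="$val" ;;
    esac
  done
  printf 'model=%s\neffort=%s\nadvisor=%s\n' "$role_model" "$role_effort" "$role_advisor"
}

parse_role_tail model=opus effort=high advisor=sonnet
parse_role_tail   # a plain 4-field line: all three stay empty
```

Unknown keys fall through the `case` untouched, so extra fields in a conf line are ignored rather than rejected.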
+ +- [ ] **Step 5: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Add a temporary conf line `window coder claude coder model=opus effort=high advisor=sonnet` and confirm `parse_config` accepts it; the existing 4-field lines still parse; `advisorModel` lands in the role worktree's `settings.local.json`. +Expected: `SYNTAX_OK` + both 4-field and 7-field lines parse. + +- [ ] **Step 6: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): per-role model/effort/advisor in swarmforge.conf (ADR 0012)" +``` + +--- + +## C5: ADR 0020 — auto-compaction on role worktrees + +**Files:** Modify `swarmforge/scripts/swarmforge.sh` (add `write_worktree_permissions`; call in `prepare_worktrees`) + +- [ ] **Step 1: Add `write_worktree_permissions`** + +```zsh +write_worktree_permissions() { + local worktree_path="$1" + local settings_dir="$worktree_path/.claude" + local settings_file="$settings_dir/settings.local.json" + + mkdir -p "$settings_dir" + SETTINGS_FILE="$settings_file" python3 -c ' +import json, os +p = os.environ["SETTINGS_FILE"] +cfg = {} +try: + with open(p) as f: cfg = json.load(f) +except (OSError, ValueError): pass # missing, unreadable, or invalid JSON: start from {} +cfg["autoCompactEnabled"] = True +cfg.setdefault("env", {}) +cfg["env"]["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = "88" +cfg["env"]["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = "200000" +with open(p, "w") as f: json.dump(cfg, f, indent=2) + ' +} +``` + +- [ ] **Step 2: Call it from `prepare_worktrees`** + +Inside the per-role loop, after the `git worktree add` block (and after C4's advisor call site), add: + +```zsh + write_worktree_permissions "$worktree_path" +``` +Both writers JSON-merge `settings.local.json`, so calling both is safe and order-independent. + +- [ ] **Step 3: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +After a scratch run, a role worktree's `.claude/settings.local.json` contains `"autoCompactEnabled": true` and the two `env` overrides (alongside any `advisorModel`). 
+Expected: `SYNTAX_OK` + merged JSON. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): enable auto-compaction on role worktrees (ADR 0020)" +``` + +--- + +## C6: ADR 0006 — harness-enforced QA holdout (sparse-checkout) + +**NET-NEW, no source artifact.** Write fresh. **Files:** Modify `swarmforge/scripts/swarmforge.sh` (`prepare_worktrees`) + +- [ ] **Step 1: Identify the loop variables** + +Run: `grep -n "worktree add\|WORKTREE_NAMES\|ROLES\[" swarmforge/scripts/swarmforge.sh` +Confirm the role variable in `prepare_worktrees`, the specifier worktree (`specifier`, not `master`; see ADR 0008), and the QA role name. + +- [ ] **Step 2: Add a pinned QA-path constant** + +Near the top config constants (single source of truth, matches the specifier-authored path): + +```zsh +QA_HOLDOUT_PATH="${SWARMFORGE_QA_HOLDOUT_PATH:-qa-e2e}" +``` + +- [ ] **Step 3: Add conditional sparse-checkout after `git worktree add`** + +Key on the **role** (not worktree name); exclude the holdout from every worktree except specifier's and QA's: + +```zsh + if [[ "$role" != "specifier" && "$role" != "QA" ]]; then + # A linked worktree's .git is a file, not a directory, so let git locate and + # write the sparse-checkout patterns instead of redirecting into .git/info/. + git -C "$worktree_path" sparse-checkout init --no-cone >/dev/null 2>&1 + git -C "$worktree_path" sparse-checkout set --no-cone '/*' "!/${QA_HOLDOUT_PATH}/" >/dev/null 2>&1 + git -C "$worktree_path" read-tree -mu HEAD >/dev/null 2>&1 || true + fi +``` +(Substitute the real role-variable name from Step 1 for `$role`. The holdout stays in the commit/tree and is only absent from disk, so it survives each role's handoff commit.) + +- [ ] **Step 4: Verify holdout invisibility + commit survival** + +In a scratch run with a committed `qa-e2e/`: coder/cleaner/architect/hardener worktrees have **no** `qa-e2e/` on disk; specifier + QA **do**; after a role's handoff commit, `git show HEAD:qa-e2e/` still resolves. 
+Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Expected: `SYNTAX_OK` + invisibility/survival hold. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): sparse-checkout the QA holdout from shaping roles (ADR 0006)" +``` + +--- + +## C7: ADR 0002 (extend) — executing-entry context fields + +**Files:** Modify upstream's handoff scripts under `swarmforge/scripts/` (the script that writes the `executing` logbook entry + the notify + stop-hook paths) + +> ⚠ Reference commit `a133c71` is on the **cmux lineage** (its diff is inside `swarmforge.sh` heredocs that don't exist on pristine upstream). Do **not** cherry-pick — re-author the same field semantics onto upstream's separate handoff scripts. + +- [ ] **Step 1: Find the `executing` entry write site** + +Run: `grep -rn '"executing"\|status.*executing\|executing' swarmforge/scripts/` +The write site is one of `receive-handoff.sh` / `complete-handoff.sh` / `handoff-lib.sh` / the deliver step. Read the intended semantics: +Run: `git show a133c71` +Expected: the entry must carry `{status, timestamp, message, hash, sender}` instead of `{status, timestamp}`. + +- [ ] **Step 2: Add the three fields** + +At the write site, extend the JSON object with: `message` (the task message text the delivery already passes), `hash` (the handoff commit hash in scope), `sender` (the sender role resolved from `sessions.tsv` by matching the sender worktree — mirror `notify-agent.sh`'s existing role resolution). Thread `sender` from `notify-agent.sh` → deliver step → stop-hook re-queue path, following upstream's existing argument-passing convention. + +- [ ] **Step 3: Verify** + +Run: `for f in swarmforge/scripts/*.sh; do zsh -n "$f" || echo "BAD: $f"; done; echo CHECKED` +In a scratch run, trigger a delivery and inspect the `executing` line in `logbook.jsonl`. +Expected: `CHECKED` + the entry carries non-empty `message`, `hash`, `sender`. 
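The inspection in Step 3 can be scripted. This is a rough sketch, assuming each logbook line is a flat JSON object whose values are strings with no escaped quotes inside them; `entry_has_context` is a hypothetical helper, and the sample line is invented for illustration.

```bash
# Hypothetical check: succeed only if the JSON line on stdin has non-empty
# string values for message, hash, and sender.
entry_has_context() {
  line=$(cat)
  for field in message hash sender; do
    # require "field": "<at least one non-quote character>"
    printf '%s\n' "$line" \
      | grep -Eq "\"$field\"[[:space:]]*:[[:space:]]*\"[^\"]+\"" || return 1
  done
}

sample='{"status":"executing","timestamp":"t","message":"build it","hash":"abc123","sender":"coder"}'
printf '%s\n' "$sample" | entry_has_context && echo CONTEXT_OK
```

Against a scratch run, the latest entry could be piped in with `grep executing logbook.jsonl | tail -n 1 | entry_has_context`. A real JSON parser would be more robust if messages can contain quotes.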
+ +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/scripts +git commit -m "feat(swarmforge): carry {message,hash,sender} in executing logbook entry (ADR 0002)" +``` + +--- + +## C8: ADR 0018 — pinned skill install (main half) + +The `upgrade` subcommand + `source-branch` + self-url live in the root `swarm` (six-pack, D14). This is the `main` script half: pin-aware, idempotent skill install at launch (launcher infra-bootstrap — allowed; does not violate idle-gate/setup-first). + +**Files:** Create `swarmforge/scripts/install-pins.conf`; modify `swarmforge/scripts/swarmforge.sh` + +- [ ] **Step 1: Create `install-pins.conf`** + +```bash +cat > swarmforge/scripts/install-pins.conf <<'EOF' +# Pinned external dependency versions for swarm install/upgrade. +# Bump a SHA here and commit on main to pull in a newer version. + +# entireio/skills — installed to .claude/skills/ in the target project +ENTIRE_SKILLS_SHA=4c9a02513c3ec6ebabd9a9dc6bd8240854a218ac +EOF +``` +Confirm the SHA against `backup/main-pre-reset:swarmforge/scripts/install-pins.conf` and bump if it has moved. + +- [ ] **Step 2: Add `install_skills` + `ensure_skills_installed`** + +Run: `git show backup/main-pre-reset:swarmforge/scripts/swarmforge.sh | grep -n "install_skills\|ensure_skills_installed"` +Add `install_skills()` (sources `install-pins.conf`; copies the in-repo `agent-retro` skill into `.claude/skills/`; fetches entire's skills tarball at `$ENTIRE_SKILLS_SHA` into `.claude/skills/`; writes the SHA to `$STATE_DIR/skills-installed`; warns and continues if offline) and `ensure_skills_installed()` (returns early if the sentinel matches the pinned SHA, else calls `install_skills`). Use the canonical bodies from `backup/main-pre-reset` (`~L946`), kept additive. 
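The sentinel comparison that makes a second launch a no-op can be sketched as follows. This is an illustration only, not the canonical body from `backup/main-pre-reset`: `skills_up_to_date` is a hypothetical helper, the pin value is made up, and the sentinel is assumed to hold the bare SHA on a single line.

```bash
# Hypothetical helper: succeed when the sentinel file records the pinned SHA.
# $1 = sentinel path (e.g. "$STATE_DIR/skills-installed"), $2 = pinned SHA
skills_up_to_date() {
  [ -f "$1" ] && [ "$(cat "$1")" = "$2" ]
}

# Demo in a throwaway directory with a made-up pin.
demo_dir=$(mktemp -d)
pin=0123abc
skills_up_to_date "$demo_dir/skills-installed" "$pin" || echo "install needed"
printf '%s\n' "$pin" > "$demo_dir/skills-installed"   # what a successful install would record
skills_up_to_date "$demo_dir/skills-installed" "$pin" && echo "already installed"
rm -rf "$demo_dir"
```

Bumping the SHA in `install-pins.conf` makes the comparison fail, so the next launch reinstalls without any extra bookkeeping.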
+ +- [ ] **Step 3: Call it in the launch flow** + +After config is parsed and `$STATE_DIR` is known, add: + +```zsh +ensure_skills_installed +``` + +- [ ] **Step 4: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +A second launch is a no-op (sentinel matches); an offline launch warns rather than failing. +Expected: `SYNTAX_OK` + idempotent re-run. + +- [ ] **Step 5: Commit** + +```bash +git add swarmforge/scripts/swarmforge.sh swarmforge/scripts/install-pins.conf +git commit -m "feat(swarmforge): pin-aware idempotent skill install at launch (ADR 0018)" +``` + +--- + +## C9: ADR 0013 / Idea J — agent-retro skill (net-new) + +upstream/main has no `skills/` dir — this is a net-new add. Source = `feat/issue-20-a-retro-skill-upgrade:swarmforge/skills/agent-retro/`. + +**Files:** Create `swarmforge/skills/agent-retro/` + +- [ ] **Step 1: Recover the skill files** + +```bash +for f in $(git ls-tree -r --name-only feat/issue-20-a-retro-skill-upgrade -- swarmforge/skills/agent-retro); do + mkdir -p "$(dirname "$f")" + git show "feat/issue-20-a-retro-skill-upgrade:$f" > "$f" +done +``` + +- [ ] **Step 2: Verify the four locked behaviors** + +```bash +grep -c "pending-curation" swarmforge/skills/agent-retro/SKILL.md # >= 1 +grep -ci "scope" swarmforge/skills/agent-retro/SKILL.md # >= 2 (tag + table column) +grep -ci "capture" swarmforge/skills/agent-retro/SKILL.md # >= 1 +grep -c "session info --transcript\|.claude/projects" swarmforge/skills/agent-retro/SKILL.md # >= 1 +``` +Expected: all thresholds met. If any is 0, re-check the source branch. + +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/skills/agent-retro +git commit -m "feat(swarmforge): add agent-retro skill — scoped, capture-first, autonomous (ADR 0013)" +``` + +--- + +## C10: ADR 0021 — retro-triage skill (net-new, byte-identical) + +Lives under `.claude/skills/` (operator-invoked), distinct from `swarmforge/skills/`. 
**Files:** Create `.claude/skills/retro-triage/SKILL.md` + +- [ ] **Step 1: Recover byte-identical** + +```bash +mkdir -p .claude/skills/retro-triage +git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md > .claude/skills/retro-triage/SKILL.md +``` + +- [ ] **Step 2: Verify** + +```bash +git diff --no-index <(git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md) .claude/skills/retro-triage/SKILL.md && echo IDENTICAL +wc -l .claude/skills/retro-triage/SKILL.md +``` +Expected: `IDENTICAL`, ~219 lines. + +- [ ] **Step 3: Commit** + +```bash +git add .claude/skills/retro-triage +git commit -m "feat: restore retro-triage skill (ADR 0021)" +``` + +--- + +## C11: ADR 0003 + Idea O — setup-swarm skill, marker guard, scaffold + +NET-NEW skill design (no backup artifact). **Files:** Create `swarmforge/skills/setup-swarm/SKILL.md`; modify `swarmforge/scripts/swarmforge.sh` + +- [ ] **Step 1: Read the design recovery doc** + +Run: `cat docs/migrations/0003-setup-skill-sources.md` +Confirm: setup is **setup-first** (operator runs `/setup-swarm` first); `./swarm` only **guards** on `.swarmforge/setup-complete` and refuses if absent (never auto-runs setup); skill named `setup-swarm`; Idea O folds in; the `entire` skill pins are NOT here (that is C8). + +- [ ] **Step 2: Author `setup-swarm/SKILL.md`** + +Mirror `agent-retro`'s SKILL.md shape; cover, per the design doc: +- **Stack detection** (reason about the language → which quality tools/gates to install — *why* setup is a skill, not a script; don't over-prescribe the mechanism). +- Install the project's mutation/CRAP/DRY tools (those stripped from cleaner/hardener/QA) and APS `gherkin-parser`/`gherkin-mutator` (stripped from coder/hardener). +- Session tracking: `entire enable --no-github --telemetry=false`, then `entire agent add ` per unique backend in `swarmforge.conf` column 3; warn-and-continue if `entire` absent. 
+- Permission allow-rules to `.claude/settings.json` (`Bash(gh pr merge*)` for integrator, `Bash(git reset --hard origin/*)` for specifier) — a small, advisory set, not a load-bearing whitelist (ADR 0019 `auto` already ships rails). +- Scaffold: ensure `.gitignore` covers `logbook.jsonl`, `tmp/`, `.swarmforge/`; probe the default branch (`git symbolic-ref refs/remotes/origin/HEAD`) and record it for the specifier's per-feature reset. +- Emit the swarm-ready marker `.swarmforge/setup-complete` (content: timestamp + swarmforge SHA — impl detail). + +- [ ] **Step 3: Add the marker guard to `swarmforge.sh`** + +Early in the launch flow (before any role launch; distinct from the `ensure_skills_installed` launcher bootstrap), add: + +```zsh +if [[ ! -f "$STATE_DIR/setup-complete" ]]; then + echo -e "${RED}Error:${RESET} project is not swarm-ready. Run /setup-swarm first." >&2 + exit 1 +fi +``` +The guard never runs setup; it only refuses. + +- [ ] **Step 4: Expand the gitignore/excludes scaffold (Idea O)** + +In `ensure_initial_gitignore`, add `logbook.jsonl`, `tmp/` (plus backup's `swarmtools/`/`logs/`/`agent_context/` if still relevant) — each as an idempotent `grep -qx || append` block and in the initial-creation heredoc. In `ensure_runtime_git_excludes`, expand the `for pattern in ...` loop to the same set. Add `remove_nonessential_clone_files` (recover from `backup/main-pre-reset`) and call it once in the init flow. + +- [ ] **Step 5: Verify** + +Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` +Launching without the marker exits with "Run /setup-swarm first"; creating `.swarmforge/setup-complete` lets launch proceed; running twice doesn't duplicate `.gitignore` lines. +Expected: `SYNTAX_OK` + guard + idempotent gitignore. 
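The `grep -qx || append` idiom named in Step 4 looks like this in isolation. This is a minimal sketch; `ensure_ignored` is a hypothetical helper, not the body of `ensure_initial_gitignore`, and it adds `-F` so that dots in patterns such as `.swarmforge/` are matched literally.

```bash
# Hypothetical helper: append a pattern to an ignore file only if no line
# already matches it exactly (-x whole line, -F literal text).
ensure_ignored() {
  file="$1" pattern="$2"
  touch "$file"
  grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}

demo=$(mktemp)
for p in logbook.jsonl tmp/ .swarmforge/ tmp/; do   # tmp/ listed twice on purpose
  ensure_ignored "$demo" "$p"
done
cat "$demo"   # each pattern appears once
rm -f "$demo"
```

Because the check runs before every append, calling the helper on each launch leaves the file unchanged after the first run.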
+ +- [ ] **Step 6: Commit** + +```bash +git add swarmforge/skills/setup-swarm swarmforge/scripts/swarmforge.sh +git commit -m "feat(swarmforge): setup-swarm skill + swarm-ready marker guard + scaffold (ADR 0003, Idea O)" +``` + +--- + +## Finalize PR 1 (MAIN) + +- [ ] **Step 1: Whole-track verification** + +```bash +zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK +git diff --stat origin/main # review: only intended files changed, all additive +``` +Expected: `SYNTAX_OK`; the diff touches only `swarmforge/scripts/*`, `swarmforge/skills/*`, `.claude/skills/retro-triage/*` — no role prompts, no conf (those are PR 2). + +- [ ] **Step 2: Push the branch** + +```bash +git push -u origin feat/fork-divergences-main +``` + +- [ ] **Step 3: Open the single PR** + +```bash +gh pr create --base main --repo gabadi/swarm-forge \ + --title "feat: fork divergences — main script + skill layer" \ + --body "Re-applies the main-side fork divergences on pristine upstream, one commit per ADR: 0019 auto-permission, 0017 bundle inlining, 0014 knowledge injection, 0012 per-role config, 0020 auto-compaction, 0006 QA holdout, 0002 executing-fields, 0018 skill install, 0013 agent-retro, 0021 retro-triage, 0003 setup-swarm + Idea O. cmux dropped; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Sections A + C)." +``` + +--- + +# SIX-PACK TRACK → PR 2 + +## Setup: create the six-pack branch + +- [ ] **Create the single branch for all SIX-PACK commits** + +```bash +git fetch origin && git switch -c feat/fork-divergences-six-pack origin/six-pack +# If origin/six-pack has advanced past the recorded baseline, branch off the tag instead: +# git switch -c feat/fork-divergences-six-pack fork-base/2026-06-14-six-pack +``` +All D1–D14 commits land on this one branch. This PR is squash-merged (fork-divergence policy, ADR 0001). 
+ +--- + +## D1: ADR 0002 — idle-gate + agent-retro line (all roles) + +**Files:** Modify `swarmforge/roles/{specifier,coder,cleaner,architect,hardender,QA}.prompt` + +- [ ] **Step 1: Add the idle-gate line** + +After the `You are the .` opening of each of the six prompts, insert a blank line then: + +``` +Wait for a handoff. Do not act without one. +``` + +- [ ] **Step 2: Add the agent-retro line** + +As the last bullet of each role's Handoff section: + +``` +- Run `agent-retro` before going idle. +``` + +- [ ] **Step 3: Verify** + +```bash +for r in specifier coder cleaner architect hardender QA; do + grep -q "Wait for a handoff. Do not act without one." "swarmforge/roles/$r.prompt" || echo "MISSING idle-gate: $r" + grep -q "agent-retro\` before going idle" "swarmforge/roles/$r.prompt" || echo "MISSING retro: $r" +done; echo CHECKED +``` +Expected: only `CHECKED`. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles +git commit -m "feat(roles): idle-gate + agent-retro-before-idle on every role (ADR 0002)" +``` + +--- + +## D2: ADR 0003 — strip startup-install directives + +Install work moves to the setup-swarm skill (C11). **Files:** Modify `swarmforge/roles/{coder,QA,cleaner,hardender}.prompt` + +- [ ] **Step 1: Strip the directives** + +- `coder.prompt`: remove the entire `## Acceptance Pipeline` block (the "At startup, make sure the normal acceptance pipeline …" bullets, ~L8–14). +- `QA.prompt`: remove the `## Startup Tools` section (~L6–7). +- `cleaner.prompt`: remove the "At startup, install the language mutation, CRAP, and DRY tools …" line (~L19). +- `hardender.prompt`: remove the `## Startup Tools` section + APS build line (~L7–10). + +- [ ] **Step 2: Verify** + +```bash +grep -rn "At startup" swarmforge/roles/ ; echo "--- (expect no startup-install directives remain)" +``` +Expected: no remaining "At startup, install/make-ready" directives. 
+ +- [ ] **Step 3: Commit** + +```bash +git add swarmforge/roles +git commit -m "refactor(roles): remove startup install directives — moved to setup-swarm (ADR 0003)" +``` + +--- + +## D3: ADR 0004 — back-routing rule + +No backup source — author fresh from ADR 0004. **Files:** Modify the rework-owning role prompts (coder, cleaner, architect, hardender, QA) + +- [ ] **Step 1: Read the ADR for the exact mechanic** + +Run: `cat docs/adr/0004-rework-routes-back.md` +Confirm: structural finding (re-opens an earlier stage's job) → routes to that origin stage, carried in the handoff; local work stays with the finder; a single finding bounces back at most once; a feature tolerates N=3 cycles total (routing count in the handoff trail); on exceeding, stop and ask the user. + +- [ ] **Step 2: Insert a `## Rework Routing` section before each role's Handoff** + +``` +## Rework Routing +- A structural finding — one that re-opens an earlier stage's decision (an ambiguous or missing spec, a weak or missing test, a design that cannot hold the required behavior) — routes back to the stage that owns that decision, carried in the handoff. +- Local work you can resolve without re-opening an earlier decision stays with you; fix it in place. +- A single finding bounces back at most once. A feature tolerates at most three back-route cycles total (N=3), tracked by the routing count in the handoff trail. On the fourth, stop and ask the user. +``` + +- [ ] **Step 3: Verify** + +```bash +for r in coder cleaner architect hardender QA; do grep -q "## Rework Routing" "swarmforge/roles/$r.prompt" || echo "MISSING: $r"; done; echo CHECKED +``` +Expected: only `CHECKED`. 
+ +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles +git commit -m "feat(roles): structural-finding back-routing with N=3 cap (ADR 0004)" +``` + +--- + +## D4: ADR 0009 — spec-header template + specifier wiring + +**Files:** Create `swarmforge/templates/feature.feature`; modify `swarmforge/roles/specifier.prompt` + +- [ ] **Step 1: Recover the template** + +```bash +mkdir -p swarmforge/templates +git show backup/six-pre-reset:swarmforge/templates/feature.feature > swarmforge/templates/feature.feature +``` +Confirm all eight comment sections: `TRACKING`, `CONTRACT`, `CONSTRAINTS`, `SEQUENCING`, `NFR`, `SIDE EFFECTS`, `SCOPE`, `UX INTENT`. + +- [ ] **Step 2: Wire the specifier** + +In Feature Workflow phase 1: start from the template and address all eight header sections (several may resolve to `none` — a deliberate answer) before scenarios. Change any "seven" header-count wording to **"eight"** / "all". + +- [ ] **Step 3: Verify** + +```bash +grep -c "^ # \(TRACKING\|CONTRACT\|CONSTRAINTS\|SEQUENCING\|NFR\|SIDE EFFECTS\|SCOPE\|UX INTENT\)" swarmforge/templates/feature.feature # 8 +grep -n "template\|eight" swarmforge/roles/specifier.prompt +grep -c "seven" swarmforge/roles/specifier.prompt # 0 +``` +Expected: 8 sections; specifier references the template + "eight"; no "seven". + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/templates/feature.feature swarmforge/roles/specifier.prompt +git commit -m "feat(spec): 8-section feature template; specifier starts from it (ADR 0009)" +``` + +--- + +## D5: ADR 0011 — fidelity manifest + specifier check + +**Files:** Create `swarmforge/dependency-manifest.prompt`; modify `swarmforge/roles/specifier.prompt` + +- [ ] **Step 1: Recover the manifest (with its Rules section)** + +```bash +git show feat/baseline-scenarios-six:swarmforge/dependency-manifest.prompt > swarmforge/dependency-manifest.prompt +``` +⚠ From `feat/baseline-scenarios-six`, NOT `obs-harness-six` (which over-deleted the Rules section). 
Confirm the 3 tier defs, a `Rules for every declared dependency:` section, and a `## Dependencies` body of `(none)`. + +- [ ] **Step 2: Wire the specifier** + +Add a `## Dependency Manifest` instruction before Feature Workflow: read the manifest before scenarios; on a scenario touching an undeclared external system → stop, propose name/tier/implementation/gaps, wait for approval before adding the entry; never write scenarios resting on an undeclared dependency or a declared gap. Recover exact wording from `backup/six-pre-reset:.../specifier.prompt` or `feat/issue-20-c:.../specifier.prompt` (NOT pipeline-order, which dropped it). + +- [ ] **Step 3: Verify** + +```bash +grep -ci "tier" swarmforge/dependency-manifest.prompt # >= 3 +grep -q "Rules for every declared dependency" swarmforge/dependency-manifest.prompt && echo RULES_OK +grep -q "dependency-manifest" swarmforge/roles/specifier.prompt && echo SPECIFIER_WIRED +``` +Expected: tiers present, `RULES_OK`, `SPECIFIER_WIRED`. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/dependency-manifest.prompt swarmforge/roles/specifier.prompt +git commit -m "feat(spec): dependency fidelity manifest + specifier propose-on-undeclared (ADR 0011)" +``` + +--- + +## D6: ADR 0010 — surface harness (engineering article + QA) + +**Files:** Modify `swarmforge/constitution/articles/engineering.prompt`, `swarmforge/roles/QA.prompt` + +- [ ] **Step 1: Add the surface-tool table to `engineering.prompt`** + +Recover the table + context-driven acquisition rule from `backup/six-pre-reset:swarmforge/constitution/articles/engineering.prompt` and merge onto current upstream (a `## Surface Tools` section: tmux/PTY · Playwright · HTTP client · ingress event-injection; live-verification roles pick the minimal sufficient tool per surface). + +- [ ] **Step 2: Edit QA for surface-harness verification** + +In `QA.prompt`: +- Replace "through the user interface only" → "through the project surface harness only". 
+- Add: every Expected bullet maps to a harness assertion, or is `NOT AUTOMATED — `; asserting constants/config never satisfies a behavioral assertion. +- Add: re-execute the committed `observation-harness/` scenarios before final verification; a user-facing surface with no scenarios routes back (per D3). +- Add the per-surface **baseline scenario** requirement (idle stability / no console errors / no-op event = no state change). + +- [ ] **Step 3: Verify** + +```bash +grep -qi "surface" swarmforge/constitution/articles/engineering.prompt && echo ENG_OK +grep -q "project surface harness only" swarmforge/roles/QA.prompt && echo QA_SURFACE_OK +grep -q "observation-harness" swarmforge/roles/QA.prompt && echo QA_OBS_OK +grep -c "user interface only" swarmforge/roles/QA.prompt # 0 +``` +Expected: `ENG_OK`, `QA_SURFACE_OK`, `QA_OBS_OK`, zero "user interface only". + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/constitution/articles/engineering.prompt swarmforge/roles/QA.prompt +git commit -m "feat(qa): declared surface-harness verification + baseline scenarios (ADR 0010)" +``` + +--- + +## D7: ADR 0005 — refuting QA posture + +No backup source for the refute posture — author fresh; merge with D6's surface wording. **Files:** Modify `swarmforge/roles/QA.prompt` + +- [ ] **Step 1: Replace the confirm posture with refute** + +Replace the "Fix bugs found by the QA suite or final verification." line and surrounding confirm framing with: + +``` +- Assume the build does not meet the spec and the acceptance tests are too weak to notice, until proven otherwise. Attack the specified contract — try to make it fail within the spec — rather than run a checklist and confirm. +- Stay bounded by the spec: a gap the spec never settled is not a QA pass/fail; route it back to the specifier (per Rework Routing). +- Enforce conversion fidelity: a QA procedure converted into an executable script must encode the procedure's full intent. 
A green script that asserts nothing is test theater and is itself a defect. +- A structural finding (weak/missing test, ambiguous spec) routes back; a local defect you can fix without re-opening an earlier stage you fix in place. +``` + +- [ ] **Step 2: Confirm against the ADR** + +Run: `cat docs/adr/0005-qa-refutes-not-confirms.md` +Ensure the text matches the ADR's intent (refute, spec-bounded, conversion fidelity / no test theater). + +- [ ] **Step 3: Verify** + +```bash +grep -qi "assume the build does not meet the spec" swarmforge/roles/QA.prompt && echo REFUTE_OK +grep -ci "test theater\|asserts nothing" swarmforge/roles/QA.prompt # >= 1 +grep -c "Fix bugs found by the QA suite" swarmforge/roles/QA.prompt # 0 +``` +Expected: `REFUTE_OK`, conversion-fidelity line present, old confirm line gone. + +- [ ] **Step 4: Commit** + +```bash +git add swarmforge/roles/QA.prompt +git commit -m "feat(qa): refute posture — attack the contract, no test theater (ADR 0005)" +``` + +--- + +## D8: ADR 0007 — UX Engineer role + +**Files:** Create `swarmforge/roles/ux-engineer.prompt`; modify `swarmforge/roles/coder.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/swarmforge.conf` + +- [ ] **Step 1: Recover the ux-engineer role** + +```bash +git show backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt > swarmforge/roles/ux-engineer.prompt +``` +⚠ From `backup/six-pre-reset` (≡ `origin/feat/obs-harness-six`), NOT pipeline-order/baseline (they lack the `observation-harness/` commit step). **STRIP** DESIGN.md scaffold-on-absence + walk-up; make DESIGN.md fix-authority conditional on a feature-file reference, not tree discovery. Ensure it carries: the idle-gate line, the N=3 back-route to coder, the `observation-harness/` commit step, golden snapshots + rendering invariants, the `## Visual quality standards` block (WCAG 4.5:1 / 3:1), notify→cleaner. 
+
+- [ ] **Step 2: Wire coder + specifier**
+
+- `coder.prompt`: add a "read the feature's `## UX Intent` and implement from it alongside the Gherkin" line; change handoff `notify the cleaner` → `notify the ux-engineer`.
+- `specifier.prompt`: add UX INTENT authoring (it authors the feature file's `## UX Intent` section — concrete observable statements across Visual Composition / Information Hierarchy / Interaction Feel / State Transitions). STRIP any DESIGN.md scaffold/walk-up here too (reference-from-feature-file only).
+
+- [ ] **Step 3: Add the conf window after coder**
+
+In `swarmforge.conf`, after the coder line:
+```
+window ux-engineer codex ux-engineer
+```
+
+- [ ] **Step 4: Verify**
+
+```bash
+grep -q "Wait for a handoff" swarmforge/roles/ux-engineer.prompt && echo UX_IDLE_OK
+grep -q "observation-harness" swarmforge/roles/ux-engineer.prompt && echo UX_OBS_OK
+grep -c "scaffold" swarmforge/roles/ux-engineer.prompt  # 0
+grep -q "notify the ux-engineer" swarmforge/roles/coder.prompt && echo CODER_OK
+grep -q "window ux-engineer" swarmforge/swarmforge.conf && echo CONF_OK
+```
+Expected: `UX_IDLE_OK`, `UX_OBS_OK`, zero scaffold, `CODER_OK`, `CONF_OK`.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add swarmforge/roles/ux-engineer.prompt swarmforge/roles/coder.prompt swarmforge/roles/specifier.prompt swarmforge/swarmforge.conf
+git commit -m "feat(roles): UX Engineer after coder; UX Intent authoring + read (ADR 0007)"
+```
+
+---
+
+## D9: ADR 0008 — integrator role + specifier stops merging
+
+**Files:** Create `swarmforge/roles/integrator.prompt`; modify `swarmforge/roles/specifier.prompt`, `swarmforge/roles/QA.prompt`, `swarmforge/swarmforge.conf`
+
+- [ ] **Step 1: Recover the integrator role + apply the FIX**
+
+```bash
+git show backup/six-pre-reset:swarmforge/roles/integrator.prompt > swarmforge/roles/integrator.prompt
+```
+⚠ From `backup/six-pre-reset` (≡ `feat/issue-20-c`), NOT baseline-scenarios-six (still says "notify specifier").
**FIX step 7** to: `Notify the curator that the feature has landed. Include the specifier handoff name and the post-merge master commit hash.` Confirm: one PR/feature, autofix-lint-only, branch → `gh pr create` → watch CI → green `gh pr merge --squash --delete-branch` + post-merge gate, CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`), idle-gate line, agent-retro line.
+
+- [ ] **Step 2: Specifier stops merging + per-feature reset**
+
+In `specifier.prompt`:
+- Drop the merge step (upstream's "merge the changes and ask the user", ~L36); replace the completion line with a placeholder D10 finalizes — for now: "When the work is landed, ask the user for the next feature to add."
+- Add the per-feature worktree reset: on receiving a handoff, `git reset --hard "origin/$(git symbolic-ref refs/remotes/origin/HEAD | sed 's|refs/remotes/origin/||')"` in the specifier's own worktree (recover the exact form from `feat/six-pack-pipeline-order-and-scaffold`). STRIP any `git merge --ff-only origin/master` startup line.
+
+- [ ] **Step 3: QA hands off to integrator + conf windows**
+
+- `QA.prompt`: change the final handoff to `notify the integrator` (replacing the broadcast list).
+- `swarmforge.conf`: change line 1 `window specifier codex master` → `window specifier codex specifier`; insert after QA: `window integrator codex integrator`.
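
The per-feature reset from Step 2 can be sketched as two small shell functions. This is an illustration only, not the wording to paste into `specifier.prompt`: the helper names `default_branch_from_ref` and `reset_to_default_branch` are hypothetical, and the sketch assumes `refs/remotes/origin/HEAD` is set in the worktree (for example via `git remote set-head origin --auto`).

```shell
# Hypothetical helpers; neither name exists in the repo.

# Strip the "refs/remotes/origin/" prefix from a symbolic ref,
# e.g. "refs/remotes/origin/master" -> "master". This has the same
# effect as the sed expression in the one-liner from Step 2.
default_branch_from_ref() {
  printf '%s\n' "${1#refs/remotes/origin/}"
}

# Hard-reset the current worktree to the remote's default branch.
# Returns non-zero when refs/remotes/origin/HEAD is not set, instead
# of resetting to a malformed "origin/" ref.
reset_to_default_branch() {
  ref="$(git symbolic-ref refs/remotes/origin/HEAD)" || return 1
  git reset --hard "origin/$(default_branch_from_ref "$ref")"
}
```

Splitting the lookup from the reset makes the failure case explicit: when the symbolic ref is missing, the one-liner would still run `git reset --hard "origin/"`, whereas this variant stops before touching the worktree.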
+
+- [ ] **Step 4: Verify**
+
+```bash
+grep -q "Notify the curator" swarmforge/roles/integrator.prompt && echo INT_FIX_OK
+grep -q "post-merge master commit hash" swarmforge/roles/integrator.prompt && echo INT_HASH_OK
+grep -q "notify the integrator" swarmforge/roles/QA.prompt && echo QA_INT_OK
+grep -q "symbolic-ref" swarmforge/roles/specifier.prompt && echo SPEC_RESET_OK
+grep -q "window specifier codex specifier" swarmforge/swarmforge.conf && echo CONF_SPEC_OK
+grep -q "window integrator" swarmforge/swarmforge.conf && echo CONF_INT_OK
+grep -c "codex master" swarmforge/swarmforge.conf  # 0
+```
+Expected: all six `*_OK`, zero `codex master`.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add swarmforge/roles/integrator.prompt swarmforge/roles/specifier.prompt swarmforge/roles/QA.prompt swarmforge/swarmforge.conf
+git commit -m "feat(roles): terminal integrator; specifier stops merging, runs own worktree (ADR 0008)"
+```
+
+---
+
+## D10: ADR 0013 — curator role + chain rewiring
+
+Authoritative source = the locked spec's PR-C2 block (budgets **60/40**, NOT the stale 150/300 on artifact branches). **Files:** Create `swarmforge/roles/curator.prompt`; modify `swarmforge/roles/integrator.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/constitution/articles/workflow.prompt`, `swarmforge/swarmforge.conf`
+
+- [ ] **Step 1: Extract the curator from the locked spec**
+
+Run: `git show feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md`
+Copy the **PR-C2 verbatim block** into `swarmforge/roles/curator.prompt`.
Confirm: idle-gate; writes only `AGENTS.md` + `.agents/`; sources `~/.claude/worklog/retros/*.md`; the routing ladder (enforcement-gate backlog → AGENTS.md ≤60 → role files ≤40 → references → skills-on-2nd → upstream → ledger); ledger line `date | session-id | role | failure-class | verdict | summary`; lifecycle (empty-run pass-through, knowledge branch, self-merging PR with metric line, move retros to `processed/`, notify specifier); 9-check per-item algorithm. **Budgets must read 60 and 40.**
+
+- [ ] **Step 2: Rewire the chain**
+
+- `integrator.prompt`: confirm step 7 notifies the curator (done in D9); fix if drifted.
+- `specifier.prompt`: change the wait line to "When the **curator** notifies you that the job is complete, run the per-feature reset, then ask the user for the next feature. The curator's handoff means the knowledge PR for the previous feature has already landed."
+- `workflow.prompt`: append: "The landing chain is integrator → curator → specifier. The curator promotes retro knowledge before the specifier is released; an empty curation run notifies the specifier immediately — the pipeline never stalls on the curator."
+- `swarmforge.conf`: append last: `window curator codex curator`.
+
+- [ ] **Step 3: Verify**
+
+```bash
+grep -q "Wait for a handoff" swarmforge/roles/curator.prompt && echo CUR_IDLE_OK
+grep -Eq "60" swarmforge/roles/curator.prompt && grep -Eq "40" swarmforge/roles/curator.prompt && echo BUDGETS_OK
+grep -c "150\|300" swarmforge/roles/curator.prompt  # 0
+grep -q "When the curator notifies you" swarmforge/roles/specifier.prompt && echo SPEC_CUR_OK
+grep -qi "integrator.*curator.*specifier" swarmforge/constitution/articles/workflow.prompt && echo WF_OK
+grep -c "^window" swarmforge/swarmforge.conf  # 9
+```
+Expected: `CUR_IDLE_OK`, `BUDGETS_OK`, zero 150/300, `SPEC_CUR_OK`, `WF_OK`, and **9** windows (specifier, coder, ux-engineer, cleaner, architect, hardender, QA, integrator, curator).
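
One caveat on the `BUDGETS_OK` check in Step 3: `grep -Eq "60"` matches `60` as a substring, so a file that said `160` or `600` would also pass. A stricter variant is sketched below. It assumes the `-w` (whole-word) option, which GNU and BSD grep both provide, and `check_budgets` is a hypothetical helper name, not something in the repo.

```shell
# Hypothetical helper: succeed only when the file mentions 60 and 40 as
# whole words and never mentions the stale 150 or 300 budgets.
check_budgets() {
  file="$1"
  # -w requires the match to be bounded by non-word characters,
  # so "160" or "400" do not count as "60" or "40".
  grep -qw '60' "$file" || return 1
  grep -qw '40' "$file" || return 1
  # Any whole-word 150 or 300 means the stale budgets leaked in.
  if grep -qwE '150|300' "$file"; then
    return 1
  fi
  return 0
}
```

Usage would be `check_budgets swarmforge/roles/curator.prompt && echo BUDGETS_OK`.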
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add swarmforge/roles/curator.prompt swarmforge/roles/specifier.prompt swarmforge/constitution/articles/workflow.prompt swarmforge/swarmforge.conf
+git commit -m "feat(roles): terminal curator; integrator->curator->specifier chain (ADR 0013)"
+```
+
+---
+
+## D11: ADR 0015 — platform-feasibility stop rule
+
+**Files:** Modify `swarmforge/constitution/articles/workflow.prompt`
+
+- [ ] **Step 1: Add the stop rule**
+
+Append to `workflow.prompt`:
+
+```
+## Platform Feasibility
+- When the spec and the platform conflict — the spec calls for a capability the target platform does not provide — stop and report instead of working around it. A workaround comment ("we can't do X here, so we do Y") is a defect, not a resolution. Wait for a spec revision.
+```
+
+- [ ] **Step 2: Verify**
+
+```bash
+grep -qi "platform" swarmforge/constitution/articles/workflow.prompt && grep -qi "workaround.*defect" swarmforge/constitution/articles/workflow.prompt && echo OK
+```
+Expected: `OK`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add swarmforge/constitution/articles/workflow.prompt
+git commit -m "feat(workflow): platform-feasibility stop rule (ADR 0015)"
+```
+
+---
+
+## D12: ADR 0016 — cleaner boundary-file scan
+
+**Files:** Modify `swarmforge/roles/cleaner.prompt`
+
+- [ ] **Step 1: Add the boundary-file rule**
+
+Recover the cleanest wording from `feat/baseline-scenarios-six:swarmforge/roles/cleaner.prompt`. After the ">100 mutation sites → split" rule, add:
+
+```
+- Also run the mutation scan/count mode on boundary files (the environmentally unsuitable modules excluded from the test tools). If a boundary file exceeds ~15 mutation sites, it holds implementation logic, not adaptation — extract that logic to a testable module before handoff.
+- Treat a test that asserts only a stripped or simplified view of output (e.g. ANSI-stripped text when the real output carries escape codes) as not covering the un-stripped behavior.
Add coverage for the full output.
+```
+
+- [ ] **Step 2: Verify**
+
+```bash
+grep -qi "boundary" swarmforge/roles/cleaner.prompt && grep -q "15" swarmforge/roles/cleaner.prompt && echo BOUNDARY_OK
+grep -qi "stripped" swarmforge/roles/cleaner.prompt && echo STRIPPED_OK
+```
+Expected: `BOUNDARY_OK`, `STRIPPED_OK`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add swarmforge/roles/cleaner.prompt
+git commit -m "feat(cleaner): boundary-file mutation scan at ~15 sites; stripped-view anti-pattern (ADR 0016)"
+```
+
+---
+
+## D13: hardener rendering-invariant property tests (manifest row, no ADR)
+
+Unmanifested divergence found in audit; consistent with ADR 0007/0010. **Files:** Modify `swarmforge/roles/hardender.prompt`
+
+- [ ] **Step 1: Add the rendering-invariant line**
+
+Recover the exact text from `backup/six-pre-reset:swarmforge/roles/hardender.prompt` (~L18) and merge it in (don't lift the whole file). Rule: for pure rendering functions (state → string, no side effects), add property tests asserting structural invariants — required elements present per state, character set bounded to the declared vocabulary, mutually exclusive states never co-rendered. Confirm D2 already stripped Startup Tools and the unauthorized "merge all queued architect handoffs together" line is absent (keep upstream's sorted-filename batch).
+
+- [ ] **Step 2: Verify**
+
+```bash
+grep -qi "rendering" swarmforge/roles/hardender.prompt && grep -qi "property test\|invariant" swarmforge/roles/hardender.prompt && echo OK
+grep -c "merge all queued architect handoffs" swarmforge/roles/hardender.prompt  # 0
+```
+Expected: `OK`, zero unauthorized merge-all line.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add swarmforge/roles/hardender.prompt
+git commit -m "feat(hardener): property tests for pure rendering functions (manifest row)"
+```
+
+---
+
+## D14: ADR 0018 — root `swarm` upgrade subcommand + self-url
+
+The main script half (skill install) is C8. This is the runnable-branch half.
**Files:** Modify the root `swarm` bootstrap (exists on `six-pack`)
+
+- [ ] **Step 1: Inspect current + recover the target deltas**
+
+Run: `git show origin/six-pack:swarm | head -60`
+Run: `git show 8994322:swarm 2>/dev/null | head -120` (adds `upgrade`/`write_source_branch`/`download_from_main`) and `git show ded6019:swarm 2>/dev/null | head -40` (self-url).
+Merge the minimal deltas onto the current six-pack root `swarm`:
+- `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"` (self-referencing; replaces hardcoded `unclebob/swarm-forge`).
+- `download_from_main` (refresh scripts + skills from `main`).
+- `write_source_branch` (record the runnable source branch in `.swarmforge/source-branch`).
+- The `upgrade` subcommand: refresh scripts(main) + prompts(`source-branch`) + force skill reinstall (clear `.swarmforge/skills-installed`).
+
+- [ ] **Step 2: Verify**
+
+```bash
+grep -q "gabadi/swarm-forge" swarm && echo SELF_URL_OK
+{ grep -q "upgrade)" swarm || grep -q '"upgrade"' swarm; } && echo UPGRADE_OK
+{ zsh -n swarm 2>/dev/null || bash -n swarm; } && echo SYNTAX_OK
+```
+Expected: `SELF_URL_OK`, `UPGRADE_OK`, `SYNTAX_OK`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add swarm
+git commit -m "feat(swarm): self-url + upgrade subcommand with source-branch tracking (ADR 0018)"
+```
+
+---
+
+## Finalize PR 2 (SIX-PACK)
+
+- [ ] **Step 1: Whole-track verification**
+
+```bash
+grep -c "^window" swarmforge/swarmforge.conf  # 9, in order
+for r in specifier coder ux-engineer cleaner architect hardender QA integrator curator; do
+  test -f "swarmforge/roles/$r.prompt" || echo "MISSING role file: $r"
+done; echo ROLES_CHECKED
+git diff --stat origin/six-pack  # review: only prompts/articles/templates/conf/swarm changed
+```
+Expected: 9 windows; all 9 role files present; `ROLES_CHECKED`; the diff touches only six-pack-owned files.
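
The whole-track check counts windows but leaves "in order" to manual review. A possible mechanical version is sketched here. It assumes every conf entry has the shape `window <name> <backend> <role>` seen in this plan's examples and treats the second field as the window name; both helper names are hypothetical.

```shell
# Hypothetical helpers; the conf layout is assumed from the examples in
# this plan ("window ux-engineer codex ux-engineer").

# Print the window names from a conf file, one per line, in file order.
window_names() {
  awk '$1 == "window" { print $2 }' "$1"
}

# Succeed only when the window names match the expected pipeline order
# exactly: same names, same sequence, nothing extra.
check_window_order() {
  expected='specifier coder ux-engineer cleaner architect hardender QA integrator curator'
  # Join the names with single spaces and drop the trailing one.
  actual="$(window_names "$1" | tr '\n' ' ' | sed 's/ $//')"
  [ "$actual" = "$expected" ]
}
```

Usage would be `check_window_order swarmforge/swarmforge.conf && echo ORDER_OK`, run next to the existing `grep -c` count.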
+
+- [ ] **Step 2: Push the branch**
+
+```bash
+git push -u origin feat/fork-divergences-six-pack
+```
+
+- [ ] **Step 3: Open the single PR**
+
+```bash
+gh pr create --base six-pack --repo gabadi/swarm-forge \
+  --title "feat: fork divergences — six-pack prompts + constitution + conf" \
+  --body "Re-applies the six-pack fork divergences on pristine upstream, one commit per ADR: 0002 idle-gate, 0003 startup-strip, 0004 back-routing, 0009 spec header, 0011 fidelity manifest, 0010 surface harness, 0005 refute QA, 0007 UX engineer, 0008 integrator, 0013 curator, 0015 platform-feasibility, 0016 cleaner boundary scan, hardener invariants, 0018 root swarm upgrade. Final pipeline: specifier→coder→ux-engineer→cleaner→architect→hardener→QA→integrator→curator (9 windows). DESIGN.md reference-only; curator budgets 60/40; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Section B)."
+```
+
+---
+
+## Out of scope (explicitly NOT implemented)
+
+- **four-pack PR** — frozen (manifest 2026-06-14): pure merge-mirror of `upstream/four-pack`. The issue-20 spec's "PR D on four-pack" is **dropped**.
+- **cmux multiplexer** (`swarm-mux.sh`, `write_deliver_script`/`write_notify_script`/`write_stop_hook`, `MUX_TARGETS`) — DROPPED; stay on upstream's tmux harness.
+- **Ideas G, H, I** — genuinely rejected, no recovery.
+- **DESIGN.md scaffolding** — ADR 0007 wins: reference-from-feature-file only; recovered roles STRIP scaffold-on-absence + walk-up.
+- **curator budgets 150/300** — superseded by the locked spec's 60/40.
+
+---
+
+## Self-Review
+
+**Spec coverage** (manifest sections A/B/C + cross-cutting):
+- Section A (main → PR 1): 0006→C6, 0012→C4, 0014→C3, 0013-skill→C9, 0003→C11 ✓
+- Section B (six-pack → PR 2): 0002→D1, 0009→D4, 0011→D5, 0010→D6, 0005→D7, 0004→D3, 0007→D8, 0008→D9, 0013→D10, 0015→D11, 0016→D12, hardener-row→D13 ✓
+- Section C (uncaptured): B/0017→C2, F/0020→C5, J→C9, N/0018→C8+D14, O→C11, auto-permission/0019→C1, executing-fields→C7, retro-triage/0021→C10, self-url→D14 ✓
+- Cross-cutting: observation-harness shared (D6 QA re-exec, D8 ux-engineer writes, D13 hardener honors); N=3 back-route (D3, carried by D8/D9); refute+surface QA merged across D6→D7; DESIGN.md reference-only (D8); curator chain order (D10) ✓
+
+**Structure:** exactly two branches (`feat/fork-divergences-main`, `feat/fork-divergences-six-pack`), one PR each; per-divergence commits in a linear, dependency-correct order on each branch; no per-ADR branches, no four-pack PR.
+
+**Within-branch ordering:** MAIN — only hard dep is C3 after C2; all `swarmforge.sh` commits are linear so no in-file conflict. SIX-PACK — D1:` + specific STRIP/FIX deltas; verification commands are concrete with expected output.
+
+**Naming consistency:** `resolve_prompt_bundle`, `write_agent_instruction_file`, `write_worktree_advisor`, `write_worktree_permissions`, `ensure_skills_installed`, `install_skills` consistent across C2/C3/C4/C5/C8/C11; markers `.swarmforge/setup-complete` / `.swarmforge/skills-installed` consistent; conf window names match across D8/D9/D10.
+
+**Known soft spots to confirm during execution (not blockers):**
+- C2/C3/C4 line numbers drift — locate by function name.
+- C7 executing-fields: find the actual executing-entry write site in the upstream handoff scripts (NOT a `swarmforge.sh` heredoc as on the cmux lineage).
+- C6 QA holdout path (`qa-e2e`) must match the specifier-authored path — keep the one `QA_HOLDOUT_PATH` constant as the single source of truth.
+- D10 curator: budgets are 60/40 from the locked spec, not 150/300.

From 20844b79122365bacb0b36c08e1691f10677e06e Mon Sep 17 00:00:00 2001
From: gabadi
Date: Sun, 14 Jun 2026 18:18:57 -0300
Subject: [PATCH 18/67] =?UTF-8?q?feat:=20fork=20divergences=20=E2=80=94=20?=
 =?UTF-8?q?main=20script=20+=20skill=20layer=20(#31)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat(swarmforge): autonomous permission mode for unattended roles (ADR 0019)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): pre-resolve role prompt bundle into XML envelope (ADR 0017)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): inject AGENTS.md + .agents/roles into role bundle (ADR 0014)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): per-role model/effort/advisor in swarmforge.conf (ADR 0012)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): enable auto-compaction on role worktrees (ADR 0020)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): sparse-checkout the QA holdout from shaping roles (ADR 0006)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): carry {message,hash,sender} in executing logbook entry (ADR 0002)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): pin-aware idempotent skill install at launch (ADR 0018)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): add agent-retro skill — scoped, capture-first, autonomous (ADR 0013)

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat: restore retro-triage skill (ADR 0021)

Allow .claude/skills/ tracking via gitignore negation so operator-facing skills can be committed alongside the repo.
Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(swarmforge): setup-swarm skill + swarm-ready marker guard + scaffold (ADR 0003, Idea O)

Co-Authored-By: Claude Opus 4.8 (1M context)

* fix(setup-swarm): ask for stack instead of detecting; list only implemented stacks

Co-Authored-By: Claude Opus 4.8 (1M context)

* fix(swarmforge): address PR #31 review on swarmforge.sh

- Drop run-path .gitignore writes (logbook.jsonl/tmp/) from ensure_initial_gitignore and ensure_runtime_git_excludes; the .gitignore scaffold is owned by the setup-swarm skill (ADR 0003). Reverts both functions to upstream form. (comment 3)
- Remove remove_nonessential_clone_files + its call: rm -rf $WORKING_DIR/examples is unbacked data loss on any project with an examples/ dir. (comment 4 / review #1)
- Shell-quote per-role model/effort launch flags with ${(q)...} across all four backends so a quote in swarmforge.conf can't break the launch command. (review #5)

Co-Authored-By: Claude Opus 4.8 (1M context)

* fix(agent-retro): model-aware pricing + correct end_time on truncated tail

- extract.py priced every agent/subagent at hardcoded Opus rates regardless of model (and the rate was stale pre-4.6 Opus $15/$75). Replace PRICE_PER_M with a per-family PRICE_TABLE (opus/sonnet/haiku/fable, current rates) + price_for_model + compute_cost; capture the actual model from each session's and subagent's own assistant records and price accordingly. Unknown models fall back to Opus so cost is never understated. (review #2/#3)
- extract_metadata_lite read end_time from a partial record when the file exceeds the 64KB tail buffer (tail starts mid-line; a timestamp embedded in a large tool_result could win). Drop the partial first tail line before the backward scan over complete lines.
(review #4)

Co-Authored-By: Claude Opus 4.8 (1M context)

* refactor(swarmforge): fold worktree settings into one shared writer (ADR 0020)

write_worktree_permissions (0020 compaction keys) and write_worktree_advisor (0012 advisor model) were two separate inline-python read-modify-write passes over the same .claude/settings.local.json. Replace both with a single write_worktree_settings [advisor] that applies the compaction keys always and advisorModel when set, sharing one RMW. prepare_worktrees calls it for compaction; launch_role calls it with the advisor. Resolves review #8.

Co-Authored-By: Claude Opus 4.8 (1M context)

* feat(retro-triage): also glob curator processed/ archive in detector (ADR 0021)

Co-Authored-By: Claude Sonnet 4.6

* docs: ADR 0002 clear-first delivery engine design spec

Co-Authored-By: Claude Sonnet 4.6

* feat(swarmforge): implement ADR 0002 clear-first delivery engine

- handoff-lib.sh: add handoff_project_dir/from, handoff_pending_dir, handoff_busy_file, handoff_agent_type, handoff_clear_first_deliver
- send-handoff.sh: replace direct notify-agent.sh call with claude-aware idle path (pending queue + noclobber busy marker)
- swarm-stop.sh: new Stop hook that drains pending queue on task end
- swarmforge.sh: write_worktree_settings gains stop_script param; launch_role wires Stop hook for claude roles and drops positional prompt from claude launch_cmd (clear-first delivery handles it)

Co-Authored-By: Claude Sonnet 4.6

---------

Co-authored-by: Claude Opus 4.8 (1M context)
---
 .claude/skills/retro-triage/SKILL.md | 220 ++++++
 .gitignore | 3 +-
 ...-14-adr0002-clear-first-delivery-design.md | 103 +++
 swarmforge/scripts/complete-handoff.sh | 12 +
 swarmforge/scripts/handoff-lib.sh | 127 ++++
 swarmforge/scripts/install-pins.conf | 5 +
 swarmforge/scripts/send-handoff.sh | 20 +-
 swarmforge/scripts/swarm-stop.sh | 53 ++
 swarmforge/scripts/swarmforge.sh | 215 +++++-
 swarmforge/skills/agent-retro/SKILL.md | 183 +++++
 .../skills/agent-retro/scripts/extract.py
| 630 ++++++++++++++++++
 swarmforge/skills/setup-swarm/SKILL.md | 139 ++++
 12 files changed, 1696 insertions(+), 14 deletions(-)
 create mode 100644 .claude/skills/retro-triage/SKILL.md
 create mode 100644 docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md
 create mode 100644 swarmforge/scripts/install-pins.conf
 create mode 100755 swarmforge/scripts/swarm-stop.sh
 create mode 100644 swarmforge/skills/agent-retro/SKILL.md
 create mode 100644 swarmforge/skills/agent-retro/scripts/extract.py
 create mode 100644 swarmforge/skills/setup-swarm/SKILL.md

diff --git a/.claude/skills/retro-triage/SKILL.md b/.claude/skills/retro-triage/SKILL.md
new file mode 100644
index 0000000..3c5c5bb
--- /dev/null
+++ b/.claude/skills/retro-triage/SKILL.md
@@ -0,0 +1,220 @@
+---
+name: retro-triage
+description: Use when unprocessed session retros sit in ~/.claude/worklog/retros/ and a batch needs root-cause analysis across the sessions. Triggers on "triage the retros", "consolidate retros", "what did we learn this batch", "file issues from the last sessions", "any new pains from the swarm runs".
+---
+
+# retro-triage
+
+## Overview
+
+Turn a batch of session retros into a **validated root-cause diagnosis** with framed candidate actionables — written to a consolidation doc. The pipeline is **harvest → reconstruct → diagnose → validate → frame**. You do NOT bucket-and-file: bucketing per-signal scatters root causes and reproduces the retros' own framing. Diagnosis is the product; a human files issues from it.
+
+This is NOT `mattpocock-skills:triage`. That skill triages *incoming* issues from a human reporter through a grilling/labeling state machine. Here there is no reporter — the source is a pile of dead session transcripts, and the output is a diagnosis that must carry its own evidence. Do not hand off to that skill.
+
+**Core principle: the retro is a symptom report, not a diagnosis.
Your job is the diagnosis.** A retro reliably tells you *what hurt* in one role's one session. Its `## What Didn't Work`/`## Actions`/`## What Worked` are the author's framing under a keyhole view — hints, never findings. **The root cause almost never lives inside a single retro.** It lives *across* retros (one upstream decision surfaces as different pains in five roles) and *below* their notice (routine work nobody thought to complain about). If your actionables look like the retros' proposed fixes with an "unverified" label, you have sorted, not diagnosed — and you are wrong.
+
+**Two failure modes this skill exists to prevent (both happened in production):**
+- **Codifying a workaround as a win.** A slick conflict-resolution technique landed in `## What Worked` → got filed as a "pattern worth codifying." It was a workaround for self-inflicted merge conflicts caused by the squash-merge strategy. The real finding was the upstream cause, invisible because nobody's retro said "the strategy made me merge."
+- **Inheriting the retro's fix.** A prior batch filed the retros' proposed "push before handoff" rule. The mechanism was wrong (the workflow already merged by hash; the role *had* pushed). Every issue was closed as mis-framed.
+
+## When to Use
+
+- Unprocessed retros sit in `~/.claude/worklog/retros/` (no `consolidated:` frontmatter) — the curator's `processed/` archive is also scanned, so curated retros stay visible to a later diagnosis
+- Periodically after a swarmforge batch closes
+
+**Do NOT use:**
+- For a single retro — too little signal. Read it directly.
+- To re-process already-stamped retros — skip them.
+- To triage incoming human-reported issues — that is `mattpocock-skills:triage`.
+
+## Inputs
+
+1. **Unprocessed retros** — files in `~/.claude/worklog/retros/*.md` lacking `consolidated:` in their first 5 lines AND whose name lacks `CONSOLIDATED`.
Use this detector exactly (`grep -L` over the whole file gives false positives when "consolidated" appears in the body):
+   ```bash
+   for f in ~/.claude/worklog/retros/*.md ~/.claude/worklog/retros/processed/*.md; do
+     [ -e "$f" ] || continue  # globs that match nothing expand literally
+     case "$(basename "$f")" in *CONSOLIDATED*) continue ;; esac
+     head -5 "$f" | grep -q '^consolidated:' || echo "$f"
+   done
+   ```
+2. **Prior consolidations** — any `*-CONSOLIDATED-actionables.md` in the same dir.
+3. **Open issues** — `gh issue list --state open --limit 50 --json number,title,labels`.
+4. **Project gotchas** — only the `## Gotchas` section of the repo's `AGENTS.md`. Read other repo files (role prompts, constitution, scripts) only when a signal explicitly references them for diagnosis.
+5. **Closed issues (on-demand)** — only when a signal smells pre-decided. Dispatch a Haiku subagent with `gh issue list --state closed --search ""`; do not pull closed-issue text into main context.
+
+Context discipline for the bulk read is defined in Phase 1 (one subagent, never split).
+
+## Phase 1 — Harvest (raw symptoms only, no conclusions)
+
+Extract verbatim. These sections are **the author's framing — input to diagnosis, never output.** Do not let a section's label decide an actionable's fate.
+
+| Section | What it actually is |
+|---|---|
+| `## What Didn't Work` | Symptoms + the author's guessed cause. Keep the symptom verbatim; discard the guess until you re-derive it. |
+| `## Actions` | The author's *proposed fix*. A hypothesis. Never file it as-is. |
+| `## What Worked` | What the author was *proud of*. **Trap:** a thing done well may be a workaround for an upstream problem. Run the workaround test (Phase 3) before believing it's a pattern. |
+| `## Tool Result Waste` | Efficiency symptoms. Usually a symptom of something, not a finding itself. |
+
+Each retro header carries **Session ID**, **Branch**, **Date**, and references **commit SHAs / PRs**.
Capture the commits — they are the traceability anchor (see below).
+
+Do NOT skip token/cost tables: a cluster of "expensive session" lines is often the visible edge of an invisible-work root cause (Phase 3).
+
+**Context discipline:** delegate the bulk read to ONE subagent (not several split by file — root causes cross the split line, and a half-batch reader can't see them). It returns per-file verbatim signals; you do the cross-retro work in the main thread.
+
+## Phase 2 — Reconstruct the episode (independent of the retros)
+
+Before any classification, rebuild what the **system actually did** this batch, from durable artifacts, NOT from what the retros say happened:
+
+- `git log --oneline --graph` over the batch range; note the branch topology and **how branches landed** (squash? merge commit? rebase?). Squash-to-main + long-lived role branches *mechanically* generate divergence — a root cause no single retro will name.
+- Which gates ran and which didn't (`verify` chain, mutation, reality-check, arch-check). Read the actual scripts/config when a symptom touches them.
+- What landed where (PRs, the final commits on `main`).
+
+Write a 3–6 line factual reconstruction. This is the lens you cluster symptoms onto.
+
+## Phase 3 — Diagnose (derive cause across symptoms, then validate)
+
+Cluster the harvested symptoms onto the reconstruction and ask: **what one decision or missing mechanism explains this cluster?** Two mandatory probes, because the retros are blind to both:
+
+1. **Workaround-vs-win.** For every `## What Worked` item and every "we handled it well": *would this work have been necessary if something upstream were right?* If the heroics exist to cope with a self-inflicted problem, the finding is the upstream cause — the technique is evidence of cost, not a pattern to codify.
+2. **Invisible / normalized work.** Scan for work every role did routinely and nobody flagged as pain — repeated merges, branch resets, re-runs, "expensive session" cost lines.
Normalized cost is where the biggest root causes hide, precisely because no retro complains about it.
+
+Then **validate each candidate cause against the artifacts before it becomes an actionable.** Read the prompt/script/config the cause implicates. Kill the ones the evidence contradicts (the "push before handoff" fix died here: `workflow.prompt` already merged by hash). An unvalidated cause is not a finding — it is the retro's guess wearing a label.
+
+**The session transcript IS an artifact — and the retro can be wrong about its own session.** Every retro records a Session ID; that is the durable handle to ground truth. Before quoting any *figure, sequence, or "the user said X"* from a retro, confirm it in the actual transcript (resolve via the `entire` CLI by session id). Retros mis-state: a real batch retro reported "70.41%, 22 survivors" for one file when the number belonged to a *different* file and that session had run no mutation at all. **Delegate this check to ONE subagent** (give it all the session IDs + the claims; one reader so findings stay coherent — never split across agents). Quote what the transcript actually shows, not what the retro says it shows.
+
+## The root-cause record — the unit of output, and the evidence gate
+
+**This skill's own thesis is "prose rules get skipped; prefer a mechanical gate." Apply it to yourself.** "Validate the cause" is prose an agent can claim without doing (this skill was used to write `Validated:` with no artifact read, and to quote a mutation figure that belonged to a different file). The gate is: **every root cause is recorded in this exact shape, and the Evidence block must contain literal receipts. A receipt is a fact someone else could re-pull, not your summary of it.** No receipt → the verdict is `INSUFFICIENT` by default → it is NOT a finding, cannot become an actionable, cannot be filed.
+
+```markdown
+### RC-N — 
+- **Symptoms it explains:** verbatim quotes + which retros/sessions (the cross-retro cluster)
+- **Probes:** workaround-vs-win → ; invisible/normalized-work → 
+- **Evidence (receipts — each line re-pullable by someone else):**
+  - `file:line` quoted, OR `command` + its ACTUAL output, OR transcript quote `@`
+  - …one line per artifact checked; state what each proves or CONTRADICTS
+- **Verdict:** SUPPORTED | WEAKENED | INSUFFICIENT — and it must follow from the receipts above, not from the retro's say-so
+- **Disposition + framing:** → if SUPPORTED, the framed candidate (investigate/decide, target file, default `ready-for-human`)
+```
+
+Hard rules on the record:
+- **A bare `Validated:` / "confirmed" with no receipt line is a forgery.** Treat it as INSUFFICIENT.
+- **The retro's own numbers, sequences, and "the user said X" are claims, not receipts.** A receipt is the transcript quote (by session id), the git output, the `file:line`. If your only source is the retro, your verdict is at best INSUFFICIENT.
+- **A WEAKENED/INSUFFICIENT cause can still be a real finding** — file it as needs-info with the gap stated. What you may NOT do is upgrade it to a prescribed fix.
+
+The buckets below are a **disposition tag on a finished record**, not bins you sort raw signals into.
+
+## Disposition tags (applied to a validated record — never to raw signals)
+
+| # | Tag | Test |
+|---|---|---|
+| 1 | **Failed-to-learn** | Pain recurs AND a prior CONSOLIDATED row or AGENTS.md `## Gotchas` row documented a **specific fix**. Recurs without a prior remediation → Bucket 3. **Always emit the header** (write "None this batch — verified against [N] prior fixes."). |
+| 2 | **Dupe-of-existing-issue** | Matches an open issue. Output: link + a clarification comment (self-contained, see below). |
+| 3 | **New-actionable, issue-shaped** | New pain worth tracking. File a self-contained, traceable issue (see Authoring contract). Default `ready-for-human`.
| +| 4 | **New-actionable, spec-shaped** | Fix needs design before a ticket (new protocol, a system port). State the design question in the issue; still file as `ready-for-human`. | +| 5 | **Needs-info / decision** | Mechanism contested or unclear. `ready-for-human`. | +| 6 | **Pattern-worth-codifying** | A genuine technique worth a rule/template — ONLY after it passes the workaround-vs-win probe (Phase 3). If the "win" exists to cope with an upstream problem, it is NOT a pattern; route the upstream cause to Bucket 3/5 and cite the technique as evidence of cost. | +| 7 | **Already-learned / dropped** | Pain matches a documented rule, the session **followed it**, retro just confirms it worked. One-line note + source. | +| 8 | **Noise** | Not documented anywhere AND "structural to async swarm" / "generic unfixable friction" / "one-off not worth a rule". Explicit rationale per item. Tiebreak vs 7: written rule exists → 7; no rule and no mechanism → 8. | + +## Issue authoring contract (used only at the human-gated filing step) + +A filed issue is a SUPPORTED root-cause record, rewritten to be **self-contained** and **traceable**: a future agent must extract a valid learning from the body *alone*, without reloading any transcript or local file. The record's receipts become the issue's evidence; the record's verdict sets the issue's confidence. + +**Two hard rules, both learned from real failures:** + +1. **No reference to anything local.** Never cite a retro filename, a `~/.claude/worklog/...` path, a consolidation doc, or a session-transcript `.jsonl` path. Those live on one machine and die elsewhere. Cite only repo paths (`swarmforge/...`, `api/src/...`, `AGENTS.md`) and durable handles (commit SHAs, PR/issue numbers). +2. **Preserve the signal verbatim — do not paraphrase it away.** Your prose summary is not the evidence. Quote the exact `## What Didn't Work` / `## Actions` lines, error strings, commands, and user corrections. Paraphrase loses the debuggable signal. 
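Rule 1 above is partly checkable by machine, which fits this skill's stated preference for a gate over prose. The following is a minimal sketch, assuming the issue body is available as a string before filing. The name `find_local_references` and the pattern table are hypothetical illustrations, not part of the skill, and the patterns cover only the local-only references that rule 1 names.

```python
import re

# Hypothetical patterns for the local-only references named in rule 1.
# They are heuristics (assumed, not exhaustive) and may flag false positives,
# e.g. a repo-tracked .jsonl file.
LOCAL_REFERENCE_PATTERNS = {
    "worklog path": re.compile(r"~/\.claude/worklog/\S*"),
    "session transcript": re.compile(r"\S+\.jsonl\b"),
    "consolidation doc": re.compile(r"\S*CONSOLIDATED-actionables\.md"),
}


def find_local_references(body):
    """Return (line_number, label, matched_text) for each suspected local reference."""
    hits = []
    for line_number, line in enumerate(body.splitlines(), start=1):
        for label, pattern in LOCAL_REFERENCE_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((line_number, label, match.group(0)))
    return hits
```

An empty result does not prove the body is self-contained; it only shows that none of the assumed patterns matched. A non-empty result is a reason to stop and rewrite the body before the human-gated filing step.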
+ +**Traceability anchor = the git commit (the `explain` skill pattern).** The durable link from an issue to its origin is the **commit SHA / PR** where the work or pain landed — it is on `origin`, shared and permanent. Provide the resolver line literally: +``` +entire checkpoint explain --commit +``` +The Session UUID is a *secondary, local-only* hint — label it as such; never make it the primary anchor. When a pain never landed as its own commit (halted session, local-only WIP), say so explicitly rather than inventing an anchor. + +**Body schema** (adapted from the `extracting-skill-learnings` skill): +```markdown +> *This was generated by AI during triage.* + +--- +date: YYYY-MM-DD +model: +harness: +source: multi-agent (swarmforge) session retrospectives, +--- + +Pain is stated as fact. The cause carries the record's verdict (SUPPORTED / WEAKENED / INSUFFICIENT) — never assert more confidence than the receipts earned. + +## Raw signals +(verbatim quotes, attributed to role + Session ID; redact secrets/PII as [redacted], keep branch names / SHAs / paths) + +## Defect +**What happened:** ... +**Cause (verdict: SUPPORTED|WEAKENED|INSUFFICIENT):** ... +**Evidence:** the record's receipts — `file:line`, command+output, transcript quote. (NOT the retro's summary.) +**Attribution:** skill workflow | model reasoning | harness enforcement + +## Actionables +| # | Actionable | Target | Confidence | Status | +|---|-----------|--------|-----------|--------| +| 1 | investigate/decide (prescribed fix ONLY if verdict=SUPPORTED on a mechanism you validated) | | from verdict | pending | + +## Traceability +- Landing commit(s) / PR(s): — resolve via `entire checkpoint explain --commit ` +- Session UUID(s) (local transcript only, not durable): () +``` + +## Judgment rules + +- **Pain = fact, cause = hypothesis.** Never promote a retro's guessed cause to asserted root cause. 
Default new actionables to `ready-for-human`; reserve a prescribed fix (and `ready-for-agent`) for a mechanism you independently verified, not merely a plausible one. +- **Don't over-fragment one cause into many fixes.** Several signals often trace to one gap seen from different roles. Collapse into one broad problem statement rather than N prescriptive tickets — especially since each per-signal "fix" is only a hypothesis. Tiebreak for keeping separate: the fixes would be genuinely different commits/PRs. +- **Prefer a lint/CI gate over a prompt/doc edit** when the rule is mechanically checkable — that is what makes rules stick (prose rules get skipped). Note this in Actionables. +- **Source attribution** belongs in YOUR scratch reasoning only (`Source: (primary), (secondary)`). It must NOT appear in issue bodies (authoring rule 1). +- **Conflict rule:** two retros propose different mechanisms for one pain → list both as options, `ready-for-human`. Do not silently pick. +- **Stale-status verification:** for every `[x] done` row in the most recent prior CONSOLIDATED, grep git log / target file for evidence. Flag absences. +- **Governing insight:** include a section only if one meta-pattern explains ≥3 pains. Frame it as a candidate explanation ("these pains *may* share…"), not a proven diagnosis. Forced coherence is fabrication. + +## Output: the consolidation doc (you do NOT file issues) + +Write the diagnosis to `~/.claude/worklog/retros/-CONSOLIDATED-actionables.md`: the Phase-2 reconstruction, the validated root causes, and the bucketed candidates. Filing GitHub issues is a **separate, human-gated step** — present the doc and ask before creating anything. (Both prior auto-filed batches were closed as mis-framed; the validation step is necessary but not yet proven sufficient.) 
+ +When the human approves filing, follow the authoring contract per Bucket-3/Bucket-5 candidate: +- `gh issue create --label "," --body-file ` — category `enhancement`/`bug`; state defaults to `ready-for-human`. Reserve `ready-for-agent` for a mechanism you validated in Phase 3, never a plausible one. +- Bucket 2: post the clarification comment with `gh issue comment`; ensure category+state labels are present. +- Verify every touched issue ends with exactly one category label and one state label. +- Close any issue whose feature has demonstrably landed (cite the merged PR/commit). + +## Post-step: stamp source retros + +After the doc is written, prepend each source retro (every one included, even Bucket 7/8) with: +```yaml +--- +consolidated: YYYY-MM-DD +--- +``` +Without the stamp the detector re-processes them next run. For retros consolidated by a prior pre-existing doc, stamp with that doc's date, not today's. + +## Common mistakes + +- **Paraphrasing the signal instead of quoting it.** Your summary is not the evidence. Quote verbatim. +- **Anchoring on the session UUID or a retro filename.** Both are local-only. Anchor on the commit SHA; UUID is a secondary hint. +- **Stating the retro's guessed cause as fact.** Pain is fact; cause is hypothesis. Label it. +- **Bucketing raw signals instead of validated causes.** The buckets format Phase-3 output. Classifying raw symptoms straight into buckets is the sort-not-diagnose failure. +- **Codifying a workaround as a pattern.** Run the workaround-vs-win probe on every `## What Worked` item first. +- **Skipping the Phase-2 reconstruction.** Without rebuilding what the system did from git/config, cross-retro and invisible-work causes stay invisible. +- **Filing N tickets for one underlying gap.** Collapse to one broad problem statement. +- **Loading all retros into main context, or splitting the read across subagents.** One reader, verbatim; diagnose in the main thread. 
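Returning to the stamping post-step described earlier: because a missing stamp causes the detector to re-process a retro, the stamp is a good candidate for a small helper rather than a manual edit. This is a sketch under stated assumptions. The function name `stamp_retro` is hypothetical, and skipping an already-stamped retro is an assumed behavior that the skill text does not specify.

```python
def stamp_retro(text, consolidated_date):
    """Prepend the `consolidated` front matter unless the retro is already stamped.

    `consolidated_date` is a YYYY-MM-DD string chosen by the caller: today's date,
    or the prior doc's date for retros consolidated by an earlier document.
    """
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        # Scan only the leading front-matter block for an existing stamp.
        for line in lines[1:]:
            if line.strip() == "---":
                break
            if line.startswith("consolidated:"):
                return text  # assumed behavior: keep the original stamp and date
    return f"---\nconsolidated: {consolidated_date}\n---\n{text}"
```

Note that a retro with unrelated front matter gets a second block prepended, which follows the "prepend" wording literally; merging into an existing block would be a separate design choice.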
+ +## Red flags — STOP + +| Thought | Reality | +|---|---| +| "I've bucketed the signals, now I'll file." | You sorted, you didn't diagnose. Do Phase 2–3 first; buckets format *validated causes*. | +| "This went in `## What Worked`, so it's a pattern to codify." | Maybe it's a workaround for an upstream problem. Run the workaround-vs-win probe. | +| "The retro's proposed fix sounds right, I'll file it." | That's the author's guess. Re-derive and validate against the artifact, or you ship a closed-as-mis-framed issue. | +| "Every retro mentions merge pain — five separate findings." | Probably ONE upstream cause (e.g. squash divergence) seen five ways. Reconstruct, then collapse. | +| "No retro complains about it, so it's fine." | Normalized/invisible work hides the biggest causes. Probe for it explicitly. | +| "The cause is obvious / ready-for-agent." | Unvalidated = the retro's guess with a label. Validate against the artifact; default `ready-for-human`. | +| "I'll write `Validated:` / `confirmed` here." | Not without a receipt on the next line. A verdict with no `file:line` / command-output / transcript quote is a forgery → INSUFFICIENT. | +| "The retro says 70.41%, I'll quote that." | The retro's number is a claim, not a receipt. Pull it from the transcript by session id first — it may belong to a different file. | +| "I'll summarize the pain in my own words." | Paraphrase loses the signal. Quote verbatim. 
| diff --git a/.gitignore b/.gitignore index 02be7d6..6180b58 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,6 @@ .DS_Store .env -.claude/ +.claude/* +!.claude/skills/ .swarmforge/ .worktrees/ diff --git a/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md b/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md new file mode 100644 index 0000000..d8c1b1d --- /dev/null +++ b/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md @@ -0,0 +1,103 @@ +# ADR 0002 Clear-First Delivery Engine — Design Spec + +**Date:** 2026-06-14 +**Branch:** feat/fork-divergences-main → PR #31 +**Status:** Approved for TDD implementation + +## Problem + +ADR 0002 specifies a clear-first delivery engine for claude-backend roles. When the engine was designed, the cmux multiplexer provided the Stop hook. When cmux was dropped, the engine was lost. The executing-fields pending item was addressed (commit `7af75c3`) but the engine itself was never built. Upstream types each handoff directly into the terminal with no context clear; the fork requires `/clear` → re-inject bundle → deliver task. + +## Scope + +Claude backend only. `codex`/`grok`/`copilot` keep upstream delivery (direct `notify-agent.sh` tmux path) unchanged. The `claude`/`codex` choice is a per-role config knob (ADR 0012); this ADR only covers the claude path. + +## Shared State + +**Pending queue:** `.swarmforge/handoffs/queue/pending//` +- Files named `---.txt` +- Content: full protocol message (same envelope `send-handoff.sh` builds) +- Written by sender; drained by Stop hook + +**Busy marker:** `.swarmforge/.busy` +- Present = role is executing; absent = idle and accepting delivery +- Created atomically with zsh `noclobber` (`set -C; > file`); only the winner delivers + +## Delivery Function (`handoff-lib.sh`) + +``` +handoff_clear_first_deliver +``` + +1. Look up target tmux session from `.swarmforge/sessions.tsv` +2. 
Read socket from `.swarmforge/tmux-socket`; also try `TMUX` env var fallback (same as `notify-agent.sh`) +3. Send `/clear\n` to tmux session; `sleep 1` +4. If `.swarmforge/prompts/.md` exists: send its content + C-m + C-j; `sleep 0.5` +5. Send protocol message content + C-m + C-j + +**No logbook write here.** The delivery function is called from both the sender (idle path) and the Stop hook (busy path). The sender's `$PWD` is the wrong worktree for the receiver's logbook. Instead: +- **Stop hook writes `executing`** to `$PWD/logbook.jsonl` (correct — hook runs in receiver's worktree) before calling this function +- **Idle path** gets its `executing` entry from `complete-handoff.sh` (called by the agent after receipt) — same as upstream + +## Idle Path (`send-handoff.sh`) + +After building the protocol message, replace the direct `notify-agent.sh ` call with: + +1. Look up target agent type from `.swarmforge/sessions.tsv` (agent column) +2. If not `claude`: fall back to existing `notify-agent.sh "$TARGET" --file "$ARCHIVE_FILE"` (unchanged) +3. If `claude`: + a. Write message to pending queue dir + b. Attempt atomic `set -C; > .swarmforge/.busy` + c. If succeeded (was idle): call `handoff_clear_first_deliver` → role is now busy + d. If failed (already busy): message stays in pending queue; Stop hook drains it + +## Busy Path (`swarm-stop.sh` — new Stop hook) + +Receives JSON on stdin from Claude Code (`session_id`, `cwd`, `hook_event_name`). + +1. Read `SWARMFORGE_ROLE`; if unset, exit 0 (not a swarmforge role) +2. Derive `project_dir` from `cwd` field in stdin JSON (or git fallback) +3. Re-check pending queue — **do NOT noclobber here** (in the normal busy-path the marker is already set from the delivery that started this task; noclobber would cause the hook to bail and never drain the queue) +4. 
If queue non-empty: `touch .busy` (ensure busy, idempotent), write `executing` logbook entry to `$PWD/logbook.jsonl` (hook runs in receiver's worktree), call `handoff_clear_first_deliver`, remove pending file, exit 0 (marker stays = busy) +5. If queue empty: delete `.busy` marker (role goes idle), exit 0 + +The ADR's "re-check before declaring idle" race closure: between the `ls` returning empty and the `rm .busy`, a sender may have written to the pending dir. That sender's `noclobber` wins (`.busy` was just deleted) and delivers immediately. No message is lost. + +## Settings Wiring (`write_worktree_settings`) + +Third parameter: `stop_script` (absolute path to `swarm-stop.sh`). + +Python RMW adds to `.claude/settings.local.json`: +```json +{ + "hooks": { + "Stop": [{"matcher": "", "hooks": [{"type": "command", "command": ""}]}] + } +} +``` + +Called for ALL claude roles in `launch_role` (not just advisor-having ones): +```sh +write_worktree_settings "$role_worktree" "$role_advisor" "$role_script_dir/swarm-stop.sh" +``` + +Non-claude roles: existing call pattern (advisor only, no stop script). + +## Launch Change (PR comment 2 resolution) + +Drop the positional `"$(cat '$prompt_file')"` from the claude `launch_cmd`. The session starts with `--append-system-prompt-file` (system prompt, survives `/clear`), then waits idle. The first task arrives via clear-first delivery which re-injects the bundle as the first conversational message. + +## Presence Ping Exclusion + +Upstream's startup "I'm awake" ping uses `message type: presence`. The Stop hook must not deliver presence messages via the clear-first path. In practice: the pending queue only contains messages put there by `send-handoff.sh` (which only handles `handoff` and `resend-request` types). Presence pings are not routed through `send-handoff.sh`, so they never enter the pending queue. No special check needed. + +## Test Checkpoints (TDD) + +1. 
`handoff_clear_first_deliver` — mock tmux, verify call sequence: clear → sleep 1 → bundle → message (no logbook write in this function) +2. `send-handoff.sh` idle path — claude target, no `.busy`: pending file written, `.busy` created, delivery called +3. `send-handoff.sh` busy path — claude target, `.busy` pre-exists: pending file written, no delivery +4. `send-handoff.sh` non-claude — codex target: no pending file, `notify-agent.sh` called directly +5. `swarm-stop.sh` queue non-empty — pending file present: executing entry written to logbook, delivery called, pending file removed, `.busy` stays +6. `swarm-stop.sh` queue empty — no pending file: `.busy` deleted +7. `swarm-stop.sh` with queue non-empty AND `.busy` already set (normal busy-path case) — hook delivers, does NOT bail on pre-existing marker +8. `write_worktree_settings` with stop_script — resulting JSON has `hooks.Stop` entry with correct command diff --git a/swarmforge/scripts/complete-handoff.sh b/swarmforge/scripts/complete-handoff.sh index b5949b4..76f3a34 100755 --- a/swarmforge/scripts/complete-handoff.sh +++ b/swarmforge/scripts/complete-handoff.sh @@ -1,6 +1,9 @@ #!/usr/bin/env zsh set -euo pipefail +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +source "$SCRIPT_DIR/handoff-lib.sh" + usage() { echo "Usage: complete-handoff.sh --file " >&2 } @@ -46,5 +49,14 @@ if [[ -e "$target" ]]; then target="$COMPLETED_DIR/$(date '+%Y%m%d-%H%M%S')-$base" fi +message="$(< "$QUEUE_FILE")" +msg_hash="$(handoff_field "commit hash" "$QUEUE_FILE" 2>/dev/null || true)" +msg_sender="$(handoff_field "sender role" "$QUEUE_FILE" 2>/dev/null || true)" +printf '{"timestamp":"%s","direction":"executing","message":"%s","hash":"%s","sender":"%s"}\n' \ + "$(handoff_json_escape "$(handoff_timestamp)")" \ + "$(handoff_json_escape "$message")" \ + "$(handoff_json_escape "$msg_hash")" \ + "$(handoff_json_escape "$msg_sender")" >> "$(handoff_logbook_file)" + mv "$QUEUE_FILE" "$target" echo "COMPLETED $target" diff --git 
a/swarmforge/scripts/handoff-lib.sh b/swarmforge/scripts/handoff-lib.sh index f1f2a4a..46fd858 100755 --- a/swarmforge/scripts/handoff-lib.sh +++ b/swarmforge/scripts/handoff-lib.sh @@ -169,3 +169,130 @@ handoff_queue_accepted() { printf '%s' "$message" > "$file" echo "$file" } + +handoff_project_dir() { + local git_common_dir candidate + + if git_common_dir=$(git -C "$PWD" rev-parse --git-common-dir 2>/dev/null); then + if [[ "$git_common_dir" != /* ]]; then + git_common_dir="$(cd "$PWD/$git_common_dir" && pwd)" + fi + candidate="${git_common_dir:h}" + if [[ -f "$candidate/.swarmforge/sessions.tsv" ]]; then + echo "$candidate" + return 0 + fi + fi + + echo "$PWD" +} + +handoff_project_dir_from() { + local start="$1" + local git_common_dir candidate + + if git_common_dir=$(git -C "$start" rev-parse --git-common-dir 2>/dev/null); then + if [[ "$git_common_dir" != /* ]]; then + git_common_dir="$(cd "$start/$git_common_dir" && pwd)" + fi + candidate="${git_common_dir:h}" + if [[ -f "$candidate/.swarmforge/sessions.tsv" ]]; then + echo "$candidate" + return 0 + fi + fi + + echo "$start" +} + +handoff_pending_dir() { + local project_dir="$1" + local role="$2" + echo "$project_dir/.swarmforge/handoffs/queue/pending/$role" +} + +handoff_busy_file() { + local project_dir="$1" + local role="$2" + echo "$project_dir/.swarmforge/$role.busy" +} + +handoff_agent_type() { + local project_dir="$1" + local role="$2" + local sessions_file="$project_dir/.swarmforge/sessions.tsv" + local idx r session display agent + + [[ -f "$sessions_file" ]] || return 1 + while IFS=$'\t' read -r idx r session display agent; do + if [[ "${r:l}" == "${role:l}" ]]; then + echo "$agent" + return 0 + fi + done < "$sessions_file" + return 1 +} + +handoff_clear_first_deliver() { + local project_dir="$1" + local role="$2" + local message_file="$3" + + local sessions_file="$project_dir/.swarmforge/sessions.tsv" + local tmux_socket_file="$project_dir/.swarmforge/tmux-socket" + local 
tmux_env_file="$project_dir/.swarmforge/tmux-env" + local bundle_file="$project_dir/.swarmforge/prompts/${role}.md" + + local target_session="" + local idx r session display agent + while IFS=$'\t' read -r idx r session display agent; do + if [[ "${r:l}" == "${role:l}" ]]; then + target_session="$session" + break + fi + done < "$sessions_file" + + if [[ -z "$target_session" ]]; then + echo "handoff_clear_first_deliver: role not found in sessions.tsv: $role" >&2 + return 1 + fi + + if [[ -z "${TMUX:-}" && -f "$tmux_env_file" ]]; then + TMUX="$(< "$tmux_env_file")" + export TMUX + fi + + local -a tmux_cmd=() + if [[ -n "${TMUX:-}" ]]; then + tmux_cmd=(tmux send-keys -t "$target_session") + else + local socket + socket="$(< "$tmux_socket_file")" + tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") + fi + + # Send /clear then wait for context to reset + "${tmux_cmd[@]}" -l -- '/clear' + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j + sleep 1 + + # Re-inject bundle if present + if [[ -f "$bundle_file" ]]; then + "${tmux_cmd[@]}" -l -- "$(< "$bundle_file")" + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j + sleep 0.5 + fi + + # Send protocol message + "${tmux_cmd[@]}" -l -- "$(< "$message_file")" + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j +} diff --git a/swarmforge/scripts/install-pins.conf b/swarmforge/scripts/install-pins.conf new file mode 100644 index 0000000..2235b98 --- /dev/null +++ b/swarmforge/scripts/install-pins.conf @@ -0,0 +1,5 @@ +# Pinned external dependency versions for swarm install/upgrade. +# Bump a SHA here and commit on main to pull in a newer version. 
+ +# entireio/skills — installed to .claude/skills/ in the target project +ENTIRE_SKILLS_SHA=4c9a02513c3ec6ebabd9a9dc6bd8240854a218ac diff --git a/swarmforge/scripts/send-handoff.sh b/swarmforge/scripts/send-handoff.sh index efe076b..98188fb 100755 --- a/swarmforge/scripts/send-handoff.sh +++ b/swarmforge/scripts/send-handoff.sh @@ -102,7 +102,25 @@ EOF ARCHIVE_FILE="$(handoff_temp_file "send-handoff")" printf '%s' "$MESSAGE" > "$ARCHIVE_FILE" handoff_archive_sent "$STREAM" "$SEQUENCE" "$MESSAGE" -notify-agent.sh "$TARGET" --file "$ARCHIVE_FILE" + +PROJECT_DIR="$(handoff_project_dir)" +TARGET_AGENT="$(handoff_agent_type "$PROJECT_DIR" "$TARGET" 2>/dev/null || echo 'unknown')" + +if [[ "$TARGET_AGENT" != "claude" ]]; then + notify-agent.sh "$TARGET" --file "$ARCHIVE_FILE" +else + PENDING_DIR="$(handoff_pending_dir "$PROJECT_DIR" "$TARGET")" + mkdir -p "$PENDING_DIR" + PENDING_FILE="$PENDING_DIR/${MESSAGE_PRIORITY}-$(handoff_id_timestamp)-${STREAM}-${SEQUENCE}.txt" + printf '%s' "$MESSAGE" > "$PENDING_FILE" + + BUSY_FILE="$(handoff_busy_file "$PROJECT_DIR" "$TARGET")" + if ( set -C; > "$BUSY_FILE" ) 2>/dev/null; then + handoff_clear_first_deliver "$PROJECT_DIR" "$TARGET" "$PENDING_FILE" + rm -f "$PENDING_FILE" + fi +fi + handoff_append_logbook "sent" "$MESSAGE" "$MESSAGE_TYPE $MESSAGE_ID sent to $TARGET" echo "Sent $MESSAGE_ID" diff --git a/swarmforge/scripts/swarm-stop.sh b/swarmforge/scripts/swarm-stop.sh new file mode 100755 index 0000000..f9331da --- /dev/null +++ b/swarmforge/scripts/swarm-stop.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env zsh +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +source "$SCRIPT_DIR/handoff-lib.sh" + +_swarm_stop_main() { + local role="${SWARMFORGE_ROLE:-}" + if [[ -z "$role" ]]; then + return 0 + fi + + local stdin_json="" + if [[ ! 
-t 0 ]]; then + stdin_json="$(cat)" + fi + + local cwd="" + if [[ -n "$stdin_json" ]]; then + cwd="$(printf '%s' "$stdin_json" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("cwd",""))' 2>/dev/null || true)" + fi + + local project_dir="" + if [[ -n "$cwd" && -d "$cwd" ]]; then + project_dir="$(handoff_project_dir_from "$cwd")" + else + project_dir="$(handoff_project_dir)" + fi + + local pending_dir busy_file + pending_dir="$(handoff_pending_dir "$project_dir" "$role")" + busy_file="$(handoff_busy_file "$project_dir" "$role")" + + if [[ ! -d "$pending_dir" ]] || [[ -z "$(ls -A "$pending_dir" 2>/dev/null)" ]]; then + rm -f "$busy_file" + return 0 + fi + + local pending_name pending_file pending_content + pending_name="$(ls "$pending_dir" | sort | head -1)" + pending_file="$pending_dir/$pending_name" + + touch "$busy_file" + + pending_content="$(< "$pending_file")" + handoff_append_logbook "executing" "$pending_content" "clear-first delivery from Stop hook" + + handoff_clear_first_deliver "$project_dir" "$role" "$pending_file" + + rm -f "$pending_file" +} + +_swarm_stop_main diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 1d3e2a6..1755e9a 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -25,6 +25,7 @@ WINDOW_STATE_FILE="$STATE_DIR/windows.tsv" WINDOW_WATCHDOG_LOG="$STATE_DIR/window-watchdog.log" SESSIONS_FILE="$STATE_DIR/sessions.tsv" PROMPTS_DIR="$STATE_DIR/prompts" +QA_HOLDOUT_PATH="${SWARMFORGE_QA_HOLDOUT_PATH:-qa-e2e}" TMUX_SOCKET_DIR="/private/tmp/swarmforge-${UID}" PROJECT_SOCKET_ID="$(printf '%s' "$WORKING_DIR" | cksum)" PROJECT_SOCKET_ID="${PROJECT_SOCKET_ID%% *}" @@ -39,6 +40,9 @@ typeset -a SESSIONS=() typeset -a DISPLAY_NAMES=() typeset -a WORKTREE_NAMES=() typeset -a WORKTREE_PATHS=() +typeset -a ROLE_MODELS=() +typeset -a ROLE_EFFORTS=() +typeset -a ROLE_ADVISORS=() typeset -A ROLE_INDEX=() typeset -A WORKTREE_INDEX=() typeset -i CLEANUP_OWNER_INDEX=1 @@ -199,7 
+203,7 @@ parse_config() { local -a fields fields=(${=line}) - if (( ${#fields[@]} != 4 )); then + if (( ${#fields[@]} < 4 )); then echo -e "${RED}Error:${RESET} Invalid config line $line_no: $line" exit 1 fi @@ -209,6 +213,18 @@ parse_config() { agent="${fields[3]:l}" worktree="${fields[4]}" + local role_model="" role_effort="" role_advisor="" kv key val kv_i + for (( kv_i = 5; kv_i <= ${#fields[@]}; kv_i++ )); do + kv="${fields[$kv_i]}" + key="${kv%%=*}" + val="${kv#*=}" + case "$key" in + model) role_model="$val" ;; + effort) role_effort="$val" ;; + advisor) role_advisor="$val" ;; + esac + done + if [[ "$keyword" != "window" ]]; then echo -e "${RED}Error:${RESET} Unknown config directive on line $line_no: $keyword" exit 1 @@ -251,6 +267,9 @@ parse_config() { SESSIONS+=("$(session_name_for_role "$role")") DISPLAY_NAMES+=("$(display_name_for_role "$role")") WORKTREE_NAMES+=("$worktree") + ROLE_MODELS+=("$role_model") + ROLE_EFFORTS+=("$role_effort") + ROLE_ADVISORS+=("$role_advisor") if [[ "$worktree" == "none" || "$worktree" == "master" ]]; then WORKTREE_PATHS+=("$WORKING_DIR") else @@ -279,7 +298,7 @@ write_sessions_file() { check_helper_scripts() { local helper - for helper in notify-agent.sh send-handoff.sh receive-handoff.sh resend-handoff.sh complete-handoff.sh handoff-lib.sh swarm-cleanup.sh swarm-window-watchdog.sh swarm-terminal-adapter.sh; do + for helper in notify-agent.sh send-handoff.sh receive-handoff.sh resend-handoff.sh complete-handoff.sh handoff-lib.sh swarm-cleanup.sh swarm-stop.sh swarm-window-watchdog.sh swarm-terminal-adapter.sh; do if [[ ! 
-x "$SCRIPT_DIR/$helper" ]]; then echo -e "${RED}Error:${RESET} Required helper script not found or not executable: $SCRIPT_DIR/$helper" exit 1 @@ -329,8 +348,9 @@ write_tmux_env_file() { } prepare_worktrees() { - local i worktree_name worktree_path branch_name + local i role worktree_name worktree_path branch_name for (( i = 1; i <= ${#ROLES[@]}; i++ )); do + role="${ROLES[$i]}" worktree_name="${WORKTREE_NAMES[$i]}" worktree_path="${WORKTREE_PATHS[$i]}" branch_name="swarmforge-${worktree_name}" @@ -342,6 +362,17 @@ prepare_worktrees() { if [[ ! -e "$worktree_path/.git" && ! -d "$worktree_path/.git" ]]; then git -C "$WORKING_DIR" worktree add --force -B "$branch_name" "$worktree_path" HEAD >/dev/null fi + write_worktree_settings "$worktree_path" + + if [[ "$role" != "specifier" && "$role" != "QA" ]]; then + git -C "$worktree_path" sparse-checkout init --no-cone >/dev/null 2>&1 + { + printf '/*\n' + printf '!/%s/\n' "$QA_HOLDOUT_PATH" + } > "$worktree_path/.git/info/sparse-checkout" 2>/dev/null \ + || git -C "$worktree_path" sparse-checkout set --no-cone '/*' "!/${QA_HOLDOUT_PATH}/" >/dev/null 2>&1 + git -C "$worktree_path" read-tree -mu HEAD >/dev/null 2>&1 || true + fi done } @@ -386,14 +417,98 @@ create_role_session() { tmux -S "$TMUX_SOCKET" set-window-option -t "$session:$title" allow-rename off } +# Single read-modify-write over a worktree's .claude/settings.local.json. Always +# applies the ADR 0020 auto-compaction keys; also sets the ADR 0012 advisor model +# when a non-empty one is passed. One shared writer for both concerns (ADR 0020). 
+write_worktree_settings() { + local worktree_path="$1" + local advisor_model="${2:-}" + local stop_script="${3:-}" + local settings_dir="$worktree_path/.claude" + local settings_file="$settings_dir/settings.local.json" + + mkdir -p "$settings_dir" + SETTINGS_FILE="$settings_file" ADVISOR_MODEL="$advisor_model" STOP_SCRIPT="$stop_script" python3 -c ' +import json, os +p = os.environ["SETTINGS_FILE"] +cfg = {} +try: + with open(p) as f: cfg = json.load(f) +except: pass +cfg["autoCompactEnabled"] = True +cfg.setdefault("env", {}) +cfg["env"]["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = "88" +cfg["env"]["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = "200000" +advisor = os.environ.get("ADVISOR_MODEL", "") +if advisor: + cfg["advisorModel"] = advisor +stop = os.environ.get("STOP_SCRIPT", "") +if stop: + cfg.setdefault("hooks", {}) + cfg["hooks"]["Stop"] = [{"matcher": "", "hooks": [{"type": "command", "command": stop}]}] +with open(p, "w") as f: json.dump(cfg, f, indent=2) + ' +} + +resolve_prompt_bundle() { + local role="$1" + typeset -a bundle=() + typeset -A seen=() + typeset -a queue=("$CONSTITUTION_FILE" "$ROLES_DIR/${role}.prompt") + local file rel_path ref ref_abs + + while (( ${#queue[@]} > 0 )); do + file="${queue[1]}" + shift queue + + rel_path="${file#${WORKING_DIR}/}" + [[ ${+seen[$rel_path]} -eq 1 ]] && continue + [[ ! -f "$file" ]] && continue + + seen[$rel_path]=1 + bundle+=("$rel_path") + + while IFS= read -r ref; do + [[ -z "$ref" ]] && continue + ref_abs="$WORKING_DIR/$ref" + [[ ${+seen[$ref]} -eq 0 ]] && queue+=("$ref_abs") + done < <(grep -oE 'swarmforge/[A-Za-z0-9_./-]+\.prompt' "$file" 2>/dev/null || true) + done + + printf '%s\n' "${bundle[@]}" +} + write_agent_instruction_file() { local role="$1" local prompt_file="$2" - - cat > "$prompt_file" <\n' "$role" + printf '\n' + printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below. 
Project knowledge files (AGENTS.md and your role file under .agents/roles/) are included below when present.\n' + printf '\n' + for rel in "${bundle_files[@]}"; do + abs_path="$WORKING_DIR/$rel" + [[ -f "$abs_path" ]] || continue + printf '\n' "$rel" + cat "$abs_path" + printf '\n\n' + done + for knowledge in "AGENTS.md" ".agents/roles/${role}.md"; do + abs_path="$WORKING_DIR/$knowledge" + [[ -f "$abs_path" ]] || continue + printf '\n' "$knowledge" + cat "$abs_path" + printf '\n\n' + done + printf '\n' + } > "$prompt_file" } send_initial_grok_prompt() { @@ -421,6 +536,9 @@ launch_role() { local role_script_dir="$role_worktree/swarmforge/scripts" local prompt_file="$PROMPTS_DIR/${role}.md" local launch_cmd="" + local role_model="${ROLE_MODELS[$index]}" + local role_effort="${ROLE_EFFORTS[$index]}" + local role_advisor="${ROLE_ADVISORS[$index]}" write_agent_instruction_file "$role" "$prompt_file" @@ -430,16 +548,31 @@ launch_role() { case "$agent" in claude) - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude --append-system-prompt-file '$prompt_file' --permission-mode acceptEdits -n 'SwarmForge ${display}' \"\$(cat '$prompt_file')\"" + write_worktree_settings "$role_worktree" "$role_advisor" "$role_script_dir/swarm-stop.sh" + local claude_flags="" + [[ -n "$role_model" ]] && claude_flags+=" --model ${(q)role_model}" + [[ -n "$role_effort" ]] && claude_flags+=" --effort ${(q)role_effort}" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude${claude_flags} --append-system-prompt-file '$prompt_file' --permission-mode auto -n 'SwarmForge ${display}'" ;; codex) - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && codex -C '$role_worktree' \"\$(cat '$prompt_file')\"" + [[ -n "$role_advisor" ]] && write_worktree_settings "$role_worktree" "$role_advisor" + local codex_flags="" + [[ -n 
"$role_model" ]] && codex_flags+=" -c model=${(q)role_model}" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && codex${codex_flags} -C '$role_worktree' \"\$(cat '$prompt_file')\"" ;; copilot) - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && copilot -C '$role_worktree' --name 'SwarmForge ${display}' -i \"\$(cat '$prompt_file')\"" + [[ -n "$role_advisor" ]] && write_worktree_settings "$role_worktree" "$role_advisor" + local copilot_flags="" + [[ -n "$role_model" ]] && copilot_flags+=" --model ${(q)role_model}" + [[ -n "$role_effort" ]] && copilot_flags+=" --effort ${(q)role_effort}" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && copilot${copilot_flags} -C '$role_worktree' --name 'SwarmForge ${display}' -i \"\$(cat '$prompt_file')\"" ;; grok) - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && grok --cwd '$role_worktree' --permission-mode acceptEdits --rules \"\$(cat '$prompt_file')\"" + [[ -n "$role_advisor" ]] && write_worktree_settings "$role_worktree" "$role_advisor" + local grok_flags="" + [[ -n "$role_model" ]] && grok_flags+=" --model ${(q)role_model}" + [[ -n "$role_effort" ]] && grok_flags+=" --effort ${(q)role_effort}" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && grok${grok_flags} --cwd '$role_worktree' --permission-mode auto --rules \"\$(cat '$prompt_file')\"" ;; esac @@ -460,6 +593,57 @@ launch_role() { echo -e " ${CYAN}[${display}]${RESET} started in session ${session}" } +install_skills() { + local skills_src="$SCRIPT_DIR/../skills" + local skills_dst="$WORKING_DIR/.claude/skills" + local pins_file="$SCRIPT_DIR/install-pins.conf" + + [[ ! 
-f "$pins_file" ]] && return 0 + # shellcheck source=/dev/null + source "$pins_file" + + echo -e "${CYAN}Installing skills...${RESET}" + mkdir -p "$skills_dst" + + if [[ -d "$skills_src/agent-retro" ]]; then + rm -rf "$skills_dst/agent-retro" + cp -R "$skills_src/agent-retro" "$skills_dst/agent-retro" + echo -e " ${GREEN}✓${RESET} agent-retro" + else + echo -e " ${YELLOW}⚠${RESET} agent-retro not found at $skills_src/agent-retro — skipping" + fi + + local tmp_skills + tmp_skills="$(mktemp -d)" + local entire_url="https://github.com/entireio/skills/archive/${ENTIRE_SKILLS_SHA}.tar.gz" + if curl -fsSL "$entire_url" | tar -xz --strip-components=1 -C "$tmp_skills" 2>/dev/null; then + for skill_dir in "$tmp_skills/skills"/*/; do + local skill_name + skill_name="$(basename "$skill_dir")" + rm -rf "$skills_dst/$skill_name" + cp -R "$skill_dir" "$skills_dst/$skill_name" + done + rm -rf "$tmp_skills" + echo -e " ${GREEN}✓${RESET} entire skills (${ENTIRE_SKILLS_SHA:0:8})" + printf '%s\n' "$ENTIRE_SKILLS_SHA" > "$STATE_DIR/skills-installed" + else + rm -rf "$tmp_skills" + echo -e " ${YELLOW}⚠${RESET} entire skills unavailable (no network?) — proceeding without them" + fi +} + +ensure_skills_installed() { + local pins_file="$SCRIPT_DIR/install-pins.conf" + [[ ! -f "$pins_file" ]] && return 0 + # shellcheck source=/dev/null + source "$pins_file" + local installed_sentinel="$STATE_DIR/skills-installed" + if [[ -f "$installed_sentinel" ]] && [[ "$(< "$installed_sentinel")" == "$ENTIRE_SKILLS_SHA" ]]; then + return 0 + fi + install_skills +} + choose_cleanup_owner() { CLEANUP_OWNER_INDEX=1 } @@ -472,6 +656,13 @@ ensure_runtime_git_excludes install_shared_constitution_articles "$WORKING_DIR" parse_config check_backend_dependencies +ensure_skills_installed + +if [[ ! -f "$STATE_DIR/setup-complete" ]]; then + echo -e "${RED}Error:${RESET} project is not swarm-ready. Run /setup-swarm first." 
>&2 + exit 1 +fi + prepare_workspace prepare_worktrees choose_cleanup_owner diff --git a/swarmforge/skills/agent-retro/SKILL.md b/swarmforge/skills/agent-retro/SKILL.md new file mode 100644 index 0000000..58a8bd0 --- /dev/null +++ b/swarmforge/skills/agent-retro/SKILL.md @@ -0,0 +1,183 @@ +--- +name: agent-retro +description: Run a conversation retrospective — analyze what happened in this session, what worked, what didn't, and propose concrete improvements. Use when the user says "retro", "retrospective", "what happened in this session", "session review", "what did we do", "analyze this conversation", or when wrapping up a long session. Especially useful after using a skill you're developing. In swarmforge: invoked automatically as the last step before each role goes idle. +compatibility: Primary — requires `entire` CLI (0.6.2+) for transcript extraction. Fallback — Claude Code ~/.claude/projects/ path. Python 3.8+ for the extraction script. +metadata: + author: gabadi/swarm-forge (fork of giannimassi/agent-retro) + version: "0.1.0" +--- + +# agent-retro + +## Step 1 — Extract Session Data + +**Primary path (entire):** +1. Run `entire session current --json` to get the active session ID and worktree path. +2. If a session ID is returned: + - Run `entire session info --transcript > /tmp/retro-session.jsonl` + - Verify: `python3 ${CLAUDE_SKILL_DIR}/scripts/extract.py /tmp/retro-session.jsonl --metadata-only` + - If verification succeeds, run full extraction: `python3 ${CLAUDE_SKILL_DIR}/scripts/extract.py /tmp/retro-session.jsonl --summary > /tmp/retro-extract.json` + - Proceed to Step 2 with `/tmp/retro-extract.json`. + +**Fallback path (Claude Code only):** +If `entire` is not installed or `entire session current` returns no session: +1. Look for session pid files in `~/.claude/sessions/*.json`. Read each, match `cwd` to `$PWD`. Take the most recently modified matching entry. +2. If found: use the `sessionId` to find the transcript in `~/.claude/projects//.jsonl`. 
+3. If not found via pid: take the most recently modified `.jsonl` in `~/.claude/projects//`. +4. Verify: `python3 ${CLAUDE_SKILL_DIR}/scripts/extract.py --metadata-only` +5. Run full extraction: `python3 ${CLAUDE_SKILL_DIR}/scripts/extract.py --summary > /tmp/retro-extract.json` + +**If no transcript is found:** Report "No session transcript found" and stop. Do not fabricate data. + +Raw JSONL is 1MB+ per session — never stream transcript bytes inline into context. Always write to a temp file and pass the path to extract.py. + +--- + +## Step 2 — Read the Conversation Arc + +Read `conversation_arc` from `/tmp/retro-extract.json`. This is the full story of the session: every user message and assistant response in order. + +Identify: +- User corrections ("no, not that", "stop", "undo", "wrong") +- Redirects (user changing direction mid-task) +- Repeated instructions (same request given more than once) +- Pivots (abandoned approaches) +- Friction moments (back-and-forth on a single point) + +--- + +## Step 3 — Classify Outcomes + +Classify what the session produced: +- New code / feature +- Bug fix +- Communication (messages, comments, docs) +- Setup / configuration changes +- Spec or design artifact +- Process improvement +- Review or analysis +- Research +- Skill development + +A session may have multiple outcomes. 
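The correction signals listed in Step 2 can be pre-screened mechanically before reading the arc by hand. The following is a minimal sketch, not part of the skill itself: `CORRECTION_MARKERS` and `find_corrections` are hypothetical names, the marker list is seeded only from the Step 2 examples, and the entries are assumed to carry the `role` and `text` keys that `extract.py` writes into `conversation_arc`.

```python
import re

# Hypothetical marker list, seeded from the Step 2 examples; extend as needed.
CORRECTION_MARKERS = ("no, not that", "stop", "undo", "wrong")


def find_corrections(arc):
    """Return (index, text) pairs for user entries that contain a marker.

    Matching is case-insensitive and anchored on word boundaries, so
    "stop" does not fire on "unstoppable".
    """
    hits = []
    for index, entry in enumerate(arc):
        if entry.get("role") != "user":
            continue  # only user messages can be corrections
        text = entry.get("text", "")
        lowered = text.lower()
        for marker in CORRECTION_MARKERS:
            if re.search(r"\b" + re.escape(marker) + r"\b", lowered):
                hits.append((index, text))
                break  # one hit per message is enough
    return hits


# Illustrative data only; a real arc comes from /tmp/retro-extract.json.
sample_arc = [
    {"role": "user", "text": "Add a retry to the fetch call"},
    {"role": "assistant", "text": "Added a retry loop with exponential backoff."},
    {"role": "user", "text": "No, not that. Undo the backoff and keep it simple."},
]

for index, text in find_corrections(sample_arc):
    print(index, text)
```

In a real retro the list would come from `json.load(open("/tmp/retro-extract.json"))["conversation_arc"]`. Keyword hits are only candidates: redirects, pivots, and repeated instructions still need a human-style read of the surrounding turns.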
+ +--- + +## Step 4 — Analyze What Worked + +Identify: +- First-try successes (task completed without corrections) +- Efficient delegation (agents dispatched with clear scope) +- Good skill matches (right skill for the task) +- Clean conversation flow (no redirects) +- Smart tool choice (right tool, right scope) + +--- + +## Step 5 — Analyze What Didn't Work + +Identify friction patterns: +- User corrections, redirects, repetitions, stops, frustration signals +- Wasted agent dispatches (dispatched but result unused) +- Oversized tool results (large reads never referenced) +- Tool call retries (same tool called multiple times for the same target) +- Abandoned approaches (started, then discarded) +- Over-engineering (more than the task required) +- Under-specification (task started with insufficient context) + +For skill-development retros: read the active SKILL.md (`${CLAUDE_SKILL_DIR}/SKILL.md` of the skill being developed) and identify which instruction caused each friction. + +Read `tool_result_sizes` from the extract — flag any tool result over 50KB that was followed by no further reference to that file. + +--- + +## Step 6 — Propose Actions + +Lead with the defense-first question: **"What defensive rule did this session's work absorb that future maintainers must keep intact?"** Answer it before cataloging friction — rule-shaped learnings surface before cause-shaped ones. + +Capture-first guard: enumerate every candidate learning from Steps 4–5 in full before writing anything to the retro file. Do not filter for "obviousness" or "self-correcting" here — capture everything; the curation stage downstream owns discards. + +For each friction pattern, propose one of these action types: +- `skill-update` — change an existing skill. Include before/after text. +- `skill-create` — create a new skill. +- `rule-update` — change a rule or instruction in CLAUDE.md or a role prompt. +- `rule-create` — create a new rule. 
+- `setup-change` — change a configuration or environment setting. +- `memory-update` — update or create a memory entry. +- `investigate` — flag something for human review (uncertain root cause). +- `acknowledge` — nothing to change; note what worked well. + +Be specific. "Improve X" is not a proposal. "Change the wording in Step 3 from Y to Z" is a proposal. + +**Scope** — tag every proposed action with exactly one scope value: +- `project` — knowledge about the target project (its code, config, tools, conventions). +- `swarmforge` — knowledge about the harness itself (role prompts, constitution, scripts, pipeline mechanics). +- `skill` — a reusable procedure that should become or amend a skill. +- `ephemeral` — true one-offs; recorded for audit, never promoted. + +--- + +## Step 7 — Write the Retro File + +Write to `~/.claude/worklog/retros/YYYY-MM-DD-.md` where `` is a 3–5 word kebab-case summary of the session. + +Structure: +```markdown +# Session Retro: +Date: YYYY-MM-DD +Session ID: +Role: +Branch: +Duration: m +Cost: $ + +## Token Budget +| Category | Tokens | Cost | +|---|---|---| +| Input | N | $N | +| Output | N | $N | +| Cache create | N | $N | +| Cache read | N | $N | +| **Total** | **N** | **$N** | + +## Tool Result Waste + + +## What Worked + + +## What Didn't Work + + +## Actions +| # | Type | Scope | Description | Target | +|---|------|-------|-------------|--------| +| 1 | skill-update | project | ... | ... | +``` + +--- + +## Step 8 — Walk Through Actions + +Determine the mode: + +**Interactive session (a human is present):** +- Present the retro file path and summary counts (N worked, N didn't work, N actions). +- Walk through each proposed action one by one: show type, scope, description, target. Ask: "Apply? [y/n/defer]". Apply approved actions immediately; mark deferred/skipped in the table. +- After the walkthrough, show the final action table with statuses. 
+ +**Autonomous session (swarmforge role, no human in the loop):** +- Do not ask anything. Do not apply any action. +- Mark every action's status as `pending-curation` in the table and finish the retro file. +- The curator role consumes the file downstream; your only job is complete, well-tagged capture. + +--- + +## Step 9 — Preemptive Handoff Recommendation + +Check `session` metadata from the extract: +- If `turn_count` > 500, `duration_seconds` > 14400 (4h), or `estimated_cost_usd` > 300: + - Add a `investigate` action: "Session size threshold reached — consider handoff" + - Include two ready-to-paste prompts: + - For `/compact`: "Continue from: " + - For `/clear`: "Resume from: — key context: <3 bullet points>" diff --git a/swarmforge/skills/agent-retro/scripts/extract.py b/swarmforge/skills/agent-retro/scripts/extract.py new file mode 100644 index 0000000..21c8794 --- /dev/null +++ b/swarmforge/skills/agent-retro/scripts/extract.py @@ -0,0 +1,630 @@ +#!/usr/bin/env python3 +""" +Extract structured data from a Claude Code session transcript (JSONL). + +Usage: + python extract.py [--subagents-dir ] [--summary] [--metadata-only] + +Outputs JSON to stdout. Use --summary for a compact version that omits +individual tool call details (just counts and key events). +Use --metadata-only for cheap session verification (head/tail read only). +""" + +import json +import sys +import os +import glob +from collections import Counter, defaultdict +from pathlib import Path +from datetime import datetime + +# Approximate pricing per million tokens, by model family. +# Update these when Anthropic changes pricing and bump PRICING_LAST_VERIFIED. +# cache_create = 1.25x input (5-minute TTL); cache_read = 0.1x input. 
+PRICING_LAST_VERIFIED = "2026-06-14" +PRICE_TABLE = { + "opus": {"input": 5.0, "output": 25.0, "cache_create": 6.25, "cache_read": 0.50}, + "sonnet": {"input": 3.0, "output": 15.0, "cache_create": 3.75, "cache_read": 0.30}, + "haiku": {"input": 1.0, "output": 5.0, "cache_create": 1.25, "cache_read": 0.10}, + "fable": {"input": 10.0, "output": 50.0, "cache_create": 12.5, "cache_read": 1.00}, +} +# Fall back to the most expensive family for an unknown/empty model so cost is +# never silently understated. +DEFAULT_PRICE_FAMILY = "opus" + + +def price_for_model(model): + """Map a model id/name to its pricing family. Unknown models fall back to + DEFAULT_PRICE_FAMILY.""" + m = (model or "").lower() + for family in ("haiku", "sonnet", "opus", "fable"): + if family in m: + return PRICE_TABLE[family] + return PRICE_TABLE[DEFAULT_PRICE_FAMILY] + + +def compute_cost(tokens, model): + """Cost in USD for a token-usage dict, priced for the given model.""" + p = price_for_model(model) + return ( + tokens["input_tokens"] / 1_000_000 * p["input"] + + tokens["output_tokens"] / 1_000_000 * p["output"] + + tokens["cache_creation_input_tokens"] / 1_000_000 * p["cache_create"] + + tokens["cache_read_input_tokens"] / 1_000_000 * p["cache_read"] + ) + +SCHEMA_VERSION = "0.1.0" + +# Head/tail buffer size for lite reads (matches Claude Code's LITE_READ_BUF_SIZE) +LITE_READ_BUF_SIZE = 65536 + + +def stream_jsonl(path): + """Yield parsed records one at a time without loading the full file.""" + with open(path) as f: + for line in f: + line = line.strip() + if line: + try: + yield json.loads(line) + except json.JSONDecodeError: + continue + + +def read_head_tail(path): + """Read first and last 64KB of a file. 
Returns (head_str, tail_str, file_size).""" + size = os.path.getsize(path) + with open(path, "rb") as f: + head_bytes = f.read(LITE_READ_BUF_SIZE) + head = head_bytes.decode("utf-8", errors="replace") + + if size <= LITE_READ_BUF_SIZE: + return head, head, size + + f.seek(max(0, size - LITE_READ_BUF_SIZE)) + tail_bytes = f.read(LITE_READ_BUF_SIZE) + tail = tail_bytes.decode("utf-8", errors="replace") + + return head, tail, size + + +def extract_json_field(text, key): + """Extract a JSON string field value without full parsing (regex-free). + Matches '"key":"value"' or '"key": "value"' patterns.""" + for pattern in [f'"{key}":"', f'"{key}": "']: + idx = text.find(pattern) + if idx < 0: + continue + start = idx + len(pattern) + i = start + while i < len(text): + if text[i] == "\\": + i += 2 + continue + if text[i] == '"': + return text[start:i] + i += 1 + return None + + +def extract_metadata_lite(path): + """Extract session metadata from head/tail only — no full parse. + Used for session verification and discovery.""" + head, tail, size = read_head_tail(path) + + # Extract from head (start of session) + session_id = extract_json_field(head, "sessionId") + cwd = extract_json_field(head, "cwd") + git_branch = extract_json_field(head, "gitBranch") + version = extract_json_field(head, "version") + start_time = extract_json_field(head, "timestamp") + + # Extract from tail (end of session). When the file exceeds the buffer the + # tail starts mid-line, so the first split element is a partial record whose + # timestamp would be wrong — drop it before scanning. Scan complete lines + # backwards for the last timestamp. 
+ tail_lines = tail.split("\n") + if size > LITE_READ_BUF_SIZE and tail_lines: + tail_lines = tail_lines[1:] + end_time = None + for line in reversed(tail_lines): + ts = extract_json_field(line, "timestamp") + if ts: + end_time = ts + break + + # First user message for verification + first_prompt = None + for line in head.split("\n"): + if '"role":"user"' not in line and '"role": "user"' not in line: + continue + if '"tool_result"' in line: + continue + # Try to extract text content + text = extract_json_field(line, "text") + if text and not text.startswith(""): + first_prompt = text[:200] + break + + duration_seconds = None + if start_time and end_time: + start = parse_ts(start_time) + end = parse_ts(end_time) + if start and end: + duration_seconds = round((end - start).total_seconds()) + + return { + "session_id": session_id, + "cwd": cwd, + "git_branch": git_branch, + "version": version, + "start_time": start_time, + "end_time": end_time, + "duration_seconds": duration_seconds, + "file_size_bytes": size, + "first_prompt": first_prompt, + } + + +def parse_ts(ts_str): + """Parse ISO 8601 timestamp string to datetime.""" + if not ts_str: + return None + try: + return datetime.fromisoformat(ts_str.replace("Z", "+00:00")) + except (ValueError, TypeError): + return None + + +def extract_all_streaming(jsonl_path, subagents_dir=None, summary_mode=False): + """Main extraction pipeline using streaming — processes line-by-line.""" + + # Session metadata + session = { + "session_id": None, + "cwd": None, + "git_branch": None, + "version": None, + "start_time": None, + "end_time": None, + "duration_seconds": None, + "branches_seen": set(), + "model": None, + } + + # Token totals + tokens_total = { + "input_tokens": 0, + "output_tokens": 0, + "cache_creation_input_tokens": 0, + "cache_read_input_tokens": 0, + } + turn_count = 0 + + # Tool tracking + tool_calls = [] + tool_counts = Counter() + total_tool_calls = 0 + + # Tool result sizes (tool_use_id -> size in bytes) + 
tool_result_sizes = {} + + # Conversation arc + arc = [] + + # Git tracking + branches = set() + commits = [] + prs = [] + + # File tracking + files = defaultdict(set) + + for rec in stream_jsonl(jsonl_path): + # --- Session metadata --- + if rec.get("sessionId") and not session["session_id"]: + session["session_id"] = rec["sessionId"] + if rec.get("cwd") and not session["cwd"]: + session["cwd"] = rec["cwd"] + if rec.get("gitBranch"): + if not session["git_branch"]: + session["git_branch"] = rec["gitBranch"] + session["branches_seen"].add(rec["gitBranch"]) + branches.add(rec["gitBranch"]) + if rec.get("version") and not session["version"]: + session["version"] = rec["version"] + + ts = rec.get("timestamp") + if ts: + if not session["start_time"]: + session["start_time"] = ts + session["end_time"] = ts + + msg = rec.get("message", {}) + role = msg.get("role") + content = msg.get("content", "") + usage = msg.get("usage", {}) + + # --- Token usage (assistant messages only) --- + if usage and role == "assistant": + tokens_total["input_tokens"] += usage.get("input_tokens", 0) + tokens_total["output_tokens"] += usage.get("output_tokens", 0) + tokens_total["cache_creation_input_tokens"] += usage.get("cache_creation_input_tokens", 0) + tokens_total["cache_read_input_tokens"] += usage.get("cache_read_input_tokens", 0) + turn_count += 1 + if not session["model"] and msg.get("model"): + session["model"] = msg.get("model") + + # --- Process content blocks --- + if isinstance(content, list): + for block in content: + if not isinstance(block, dict): + continue + + block_type = block.get("type") + + # Tool use blocks (assistant calling tools) + if block_type == "tool_use": + name = block.get("name", "unknown") + tool_input = block.get("input", {}) + tool_counts[name] += 1 + total_tool_calls += 1 + + call_summary = { + "name": name, + "timestamp": ts, + "tool_use_id": block.get("id", ""), + } + + if name == "Agent": + call_summary["agent_description"] = 
tool_input.get("description", "") + call_summary["agent_type"] = tool_input.get("subagent_type", "") + call_summary["agent_model"] = tool_input.get("model", "") + call_summary["agent_prompt_preview"] = tool_input.get("prompt", "")[:300] + call_summary["run_in_background"] = tool_input.get("run_in_background", False) + elif name == "Skill": + call_summary["skill_name"] = tool_input.get("skill", "") + call_summary["skill_args"] = tool_input.get("args", "") + elif name == "Bash": + call_summary["command"] = tool_input.get("command", "")[:300] + elif name in ("Read", "Write", "Edit"): + call_summary["file_path"] = tool_input.get("file_path", "") + elif name in ("Grep", "Glob"): + call_summary["pattern"] = tool_input.get("pattern", "") + elif name in ("TaskCreate", "TaskUpdate", "TaskList", "TaskOutput"): + call_summary["task_detail"] = { + k: v for k, v in tool_input.items() + if k in ("description", "status", "id") + } + elif name == "AskUserQuestion": + questions = tool_input.get("questions", []) + call_summary["questions"] = [q.get("question", "") for q in questions] + elif name.startswith("mcp__"): + call_summary["mcp_inputs_preview"] = json.dumps(tool_input)[:300] + + tool_calls.append(call_summary) + + # Track files + fp = tool_input.get("file_path", "") + if fp: + if name == "Read": + files["read"].add(fp) + elif name == "Write": + files["written"].add(fp) + elif name == "Edit": + files["edited"].add(fp) + + # Track git activity from bash commands + if name == "Bash": + cmd = tool_input.get("command", "") + if "git commit" in cmd: + commits.append({"command": cmd[:200], "timestamp": ts}) + if "gh pr" in cmd: + prs.append({"command": cmd[:200], "timestamp": ts}) + + # Tool result blocks — capture SIZE only, not content + elif block_type == "tool_result": + tool_use_id = block.get("tool_use_id", "") + result_content = block.get("content", "") + if isinstance(result_content, str): + size_bytes = len(result_content.encode("utf-8", errors="replace")) + elif 
isinstance(result_content, list): + # Multi-block results (e.g., images + text) + size_bytes = 0 + for rb in result_content: + if isinstance(rb, dict): + text = rb.get("text", "") + if text: + size_bytes += len(text.encode("utf-8", errors="replace")) + # Image/binary blocks — estimate from base64 if present + data = rb.get("data", "") + if data: + size_bytes += len(data) + elif isinstance(rb, str): + size_bytes += len(rb.encode("utf-8", errors="replace")) + else: + size_bytes = len(json.dumps(result_content).encode("utf-8")) + + if tool_use_id: + tool_result_sizes[tool_use_id] = size_bytes + + # Text blocks — conversation arc (both assistant AND user) + elif block_type == "text": + text = block.get("text", "").strip() + if role == "assistant" and text and len(text) > 20: + arc.append({ + "role": "assistant", + "text": text[:1000], + "timestamp": ts, + }) + + # After processing all blocks in a list-format user message, + # collect text blocks into the arc + if role == "user" and isinstance(content, list): + user_text = "" + for block in content: + if isinstance(block, dict) and block.get("type") == "text": + user_text += block.get("text", "") + elif isinstance(block, str): + user_text += block + user_text = user_text.strip() + if user_text and not user_text.startswith(""): + arc.append({ + "role": "user", + "text": user_text[:2000], + "timestamp": ts, + }) + + # User messages with string content (simple format) + elif role == "user": + text = "" + if isinstance(content, str): + text = content + text = text.strip() + if text and not text.startswith(""): + arc.append({ + "role": "user", + "text": text[:2000], + "timestamp": ts, + }) + + # --- Post-processing --- + + # Compute duration + if session["start_time"] and session["end_time"]: + start = parse_ts(session["start_time"]) + end = parse_ts(session["end_time"]) + if start and end: + session["duration_seconds"] = round((end - start).total_seconds()) + session["branches_seen"] = sorted(session["branches_seen"]) + + # 
Compute cost (priced for the session's own model) + cost = compute_cost(tokens_total, session["model"]) + + # Attach result sizes to tool calls + for call in tool_calls: + tid = call.get("tool_use_id", "") + if tid in tool_result_sizes: + call["result_size_bytes"] = tool_result_sizes[tid] + + # Compute tool result size stats + result_size_stats = {} + if tool_result_sizes: + sizes_by_tool = defaultdict(list) + for call in tool_calls: + if "result_size_bytes" in call: + sizes_by_tool[call["name"]].append(call["result_size_bytes"]) + + for tool_name, sizes in sorted(sizes_by_tool.items(), key=lambda x: -sum(x[1])): + result_size_stats[tool_name] = { + "count": len(sizes), + "total_bytes": sum(sizes), + "avg_bytes": round(sum(sizes) / len(sizes)), + "max_bytes": max(sizes), + } + + # Extract agents + agents = _extract_agents(tool_calls, subagents_dir) + + # Extract skills + skills = [ + {"name": c.get("skill_name", ""), "args": c.get("skill_args", ""), "timestamp": c.get("timestamp")} + for c in tool_calls if c["name"] == "Skill" + ] + + # Warn if agents exist but have no cost data (subagents_dir missing) + agents_without_cost = [a for a in agents if a.get("estimated_cost_usd") is None + and a.get("description") and not a.get("description", "").startswith("[unmatched")] + if agents_without_cost: + print(f"Warning: {len(agents_without_cost)} agent dispatch(es) have no subagent cost data. 
" + f"Pass --subagents-dir to attribute subagent costs.", + file=sys.stderr) + + # Build result + result = { + "schema_version": SCHEMA_VERSION, + "session": session, + "tokens": { + "total": tokens_total, + "turn_count": turn_count, + "estimated_cost_usd": round(cost, 4), + }, + "agents": agents, + "skills": skills, + "git": { + "branches": sorted(branches), + "commits": commits, + "pr_operations": prs, + }, + "files": {k: sorted(v) for k, v in files.items()}, + "conversation_arc": arc, + "tool_result_sizes": result_size_stats, + } + + if summary_mode: + result["tools"] = { + "counts": dict(tool_counts.most_common()), + "total_calls": total_tool_calls, + } + else: + result["tools"] = { + "calls": tool_calls, + "counts": dict(tool_counts.most_common()), + "total_calls": total_tool_calls, + } + + return result + + +def _extract_agents(tool_calls, subagents_dir=None): + """Extract agent dispatch details and match with subagent JSONL files.""" + agents = [] + for call in tool_calls: + if call["name"] == "Agent": + agent = { + "description": call.get("agent_description", ""), + "type": call.get("agent_type", "") or "general-purpose", + "model": call.get("agent_model", "") or "inherited", + "prompt_preview": call.get("agent_prompt_preview", ""), + "background": call.get("run_in_background", False), + "timestamp": call.get("timestamp"), + "tool_use_id": call.get("tool_use_id", ""), + "tokens": None, + "estimated_cost_usd": None, + } + if "result_size_bytes" in call: + agent["result_size_bytes"] = call["result_size_bytes"] + agents.append(agent) + + if subagents_dir and os.path.isdir(subagents_dir): + _match_subagent_files(agents, subagents_dir) + + return agents + + +def _match_subagent_files(agents, subagents_dir): + """Match subagent JSONL files to dispatches using timestamp proximity.""" + subagent_files = sorted(glob.glob(os.path.join(subagents_dir, "*.jsonl"))) + MAX_MATCH_WINDOW_S = 60 + + subagent_info = [] + for sa_file in subagent_files: + sa_tokens = 
{"input_tokens": 0, "output_tokens": 0, + "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0} + sa_start = None + sa_model = None + turn_count = 0 + + for rec in stream_jsonl(sa_file): + msg = rec.get("message", {}) + usage = msg.get("usage", {}) + if usage and msg.get("role") == "assistant": + sa_tokens["input_tokens"] += usage.get("input_tokens", 0) + sa_tokens["output_tokens"] += usage.get("output_tokens", 0) + sa_tokens["cache_creation_input_tokens"] += usage.get("cache_creation_input_tokens", 0) + sa_tokens["cache_read_input_tokens"] += usage.get("cache_read_input_tokens", 0) + turn_count += 1 + if sa_model is None and msg.get("model"): + sa_model = msg.get("model") + if sa_start is None and "timestamp" in rec: + sa_start = parse_ts(rec["timestamp"]) + + # Price each subagent for the model it actually ran on. + sa_cost = compute_cost(sa_tokens, sa_model) + + # Load meta file if present + meta = None + meta_file = sa_file.replace(".jsonl", ".meta.json") + if os.path.exists(meta_file): + with open(meta_file) as f: + meta = json.load(f) + + subagent_info.append({ + "file": os.path.basename(sa_file), + "tokens": sa_tokens, + "cost": round(sa_cost, 4), + "start_time": sa_start, + "model": sa_model, + "meta": meta, + }) + + # Match by timestamp proximity + matched_dispatches = set() + matched_subagents = set() + + for sa_idx, sa in enumerate(subagent_info): + if not sa["start_time"]: + continue + best_match = None + best_delta = None + + for ag_idx, agent in enumerate(agents): + if ag_idx in matched_dispatches: + continue + dispatch_time = parse_ts(agent["timestamp"]) + if not dispatch_time: + continue + delta = abs((sa["start_time"] - dispatch_time).total_seconds()) + if delta <= MAX_MATCH_WINDOW_S and (best_delta is None or delta < best_delta): + best_match = ag_idx + best_delta = delta + + if best_match is not None: + agents[best_match]["tokens"] = sa["tokens"] + agents[best_match]["estimated_cost_usd"] = sa["cost"] + if sa["model"]: + 
agents[best_match]["model"] = sa["model"] + agents[best_match]["subagent_file"] = sa["file"] + agents[best_match]["match_delta_s"] = round(best_delta, 1) + if sa["meta"]: + agents[best_match]["meta"] = sa["meta"] + matched_dispatches.add(best_match) + matched_subagents.add(sa_idx) + + # Report unmatched subagents + for sa_idx, sa in enumerate(subagent_info): + if sa_idx not in matched_subagents: + agents.append({ + "description": f"[unmatched subagent: {sa['file']}]", + "type": "unknown", + "model": sa["model"] or "unknown", + "prompt_preview": "", + "background": False, + "timestamp": str(sa["start_time"]) if sa["start_time"] else None, + "tool_use_id": "", + "tokens": sa["tokens"], + "estimated_cost_usd": sa["cost"], + "subagent_file": sa["file"], + "match_confidence": "unmatched", + "meta": sa["meta"], + }) + + +if __name__ == "__main__": + if len(sys.argv) < 2: + print("Usage: python extract.py [--subagents-dir ] [--summary] [--metadata-only]") + sys.exit(1) + + jsonl_path = sys.argv[1] + subagents_dir = None + summary_mode = "--summary" in sys.argv + metadata_only = "--metadata-only" in sys.argv + + if metadata_only: + result = extract_metadata_lite(jsonl_path) + print(json.dumps(result, indent=2, default=str)) + sys.exit(0) + + if "--subagents-dir" in sys.argv: + idx = sys.argv.index("--subagents-dir") + if idx + 1 < len(sys.argv): + subagents_dir = sys.argv[idx + 1] + else: + # Auto-detect: look for sibling directory with same name as the JSONL + stem = Path(jsonl_path).stem + candidate = Path(jsonl_path).parent / stem / "subagents" + if candidate.is_dir(): + subagents_dir = str(candidate) + + result = extract_all_streaming(jsonl_path, subagents_dir, summary_mode) + print(json.dumps(result, indent=2, default=str)) diff --git a/swarmforge/skills/setup-swarm/SKILL.md b/swarmforge/skills/setup-swarm/SKILL.md new file mode 100644 index 0000000..913b3ad --- /dev/null +++ b/swarmforge/skills/setup-swarm/SKILL.md @@ -0,0 +1,139 @@ +--- +name: setup-swarm 
+description: One-time project setup for SwarmForge. Run before the first `./swarm` launch. Installs language-appropriate quality tools, wires session tracking, writes permission allow-rules, and scaffolds .gitignore. Triggers on "setup swarm", "setup the swarm", "/setup-swarm", "first time setup", or "prepare project for swarm". +compatibility: Requires git, Python 3. Optional but recommended: entire CLI (0.6.2+) for session tracking. +metadata: + author: gabadi/swarm-forge + version: "0.1.0" +--- + +# setup-swarm + +Run this skill **once** before invoking `./swarm`. It prepares the project so the swarm can operate without interruption. If you need to re-run setup, delete `.swarmforge/setup-complete` first. + +--- + +## Step 1 — Ask the operator for the project stack + +Ask the operator: + +> Which stack is this project? +> 1. Go +> 2. Java / Kotlin (JVM) +> 3. JavaScript / TypeScript +> 4. Python +> 5. Rust +> 6. Clojure +> 7. Ruby + +Wait for the operator's answer before proceeding. Do not infer or detect the stack from the repository. + +--- + +## Step 2 — Install quality tools + +Based on the operator's chosen stack, install the mutation, CRAP, and DRY tools. These are the tools that cleaner, hardener, and QA will use during the swarm run. + +**Go:** `go install honnef.co/go/tools/cmd/staticcheck@latest` (CRAP), a mutation tool such as `go-mutesting` if available. + +**Java / Kotlin:** Maven or Gradle plugin for PITest (mutation); PMD or SpotBugs (CRAP/DRY). + +**JavaScript / TypeScript:** `npm install -g stryker-cli` (mutation); ESLint with complexity rules. + +**Python:** `pip install mutmut` (mutation); `radon` (CRAP/DRY metrics). + +**Rust:** `cargo install cargo-mutants` (mutation); Clippy is standard and should already be present. + +**Clojure:** `clj-kondo` (DRY/complexity); mutation support via the project's own test runner. + +**Ruby:** `gem install mutant` (mutation). 
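The per-stack choices above amount to a lookup from the operator's answer to a list of install commands. A minimal sketch of that table follows, under stated assumptions: `STACK_INSTALL_COMMANDS` and `install_commands_for` are hypothetical names, the stack keys are illustrative, and `pip install radon` is an assumed command (the list names `radon` but gives no command for it). Stacks whose tooling is described as a plugin or project-specific setup map to an empty list rather than a guessed command.

```python
# Commands copied from the per-stack list above, except where noted.
STACK_INSTALL_COMMANDS = {
    "go": ["go install honnef.co/go/tools/cmd/staticcheck@latest"],
    "javascript": ["npm install -g stryker-cli"],
    # "pip install radon" is an assumption; the list only names the tool.
    "python": ["pip install mutmut", "pip install radon"],
    "rust": ["cargo install cargo-mutants"],
    "ruby": ["gem install mutant"],
    # No single install command is given for these stacks.
    "jvm": [],
    "clojure": [],
}


def install_commands_for(stack):
    """Return the install commands for the stack the operator chose.

    Raises ValueError for an unknown stack instead of guessing, which
    mirrors the rule that the stack is asked for, never inferred.
    """
    key = stack.strip().lower()
    if key not in STACK_INSTALL_COMMANDS:
        raise ValueError(f"unknown stack: {stack!r}")
    return list(STACK_INSTALL_COMMANDS[key])


for command in install_commands_for("Python"):
    print(command)
```

Returning a copy of the list keeps callers from mutating the table, and the explicit error for an unknown key keeps the operator's answer as the only source of truth.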
+
+Also install the Acceptance Pipeline Specification (APS) tools:
+```
+git clone https://github.com/unclebob/Acceptance-Pipeline-Specification /tmp/aps-build
+cd /tmp/aps-build && go build -o gherkin-parser ./cmd/gherkin-parser && go build -o gherkin-mutator ./cmd/gherkin-mutator
+cp gherkin-parser gherkin-mutator /usr/local/bin/ 2>/dev/null || cp gherkin-parser gherkin-mutator ~/.local/bin/
+```
+Warn and continue if the build fails (APS tools are quality-of-life, not blocking).
+
+---
+
+## Step 3 — Session tracking with entire
+
+```bash
+entire enable --no-github --telemetry=false
+```
+
+Then, for each unique backend listed in `swarmforge/swarmforge.conf` column 3 (e.g. `claude`, `codex`, `copilot`, `grok`):
+```bash
+entire agent add
+```
+
+If `entire` is not installed: print a warning ("entire not found — session tracking skipped") and continue. Setup never blocks on this.
+
+---
+
+## Step 4 — Permission allow-rules
+
+Write minimal allow-rules to `.claude/settings.json` so the integrator and specifier can run their necessary git/gh commands unattended. Read the current file first (create `{}` if absent), merge in these two rules, and write it back:
+
+```json
+{
+  "permissions": {
+    "allow": [
+      "Bash(gh pr merge*)",
+      "Bash(git reset --hard origin/*)"
+    ]
+  }
+}
+```
+
+Use Python to merge (preserve any existing `allow` entries):
+```python
+import json, pathlib
+p = pathlib.Path('.claude/settings.json')
+cfg = json.loads(p.read_text()) if p.exists() else {}
+cfg.setdefault('permissions', {}).setdefault('allow', [])
+for rule in ['Bash(gh pr merge*)', 'Bash(git reset --hard origin/*)']:
+    if rule not in cfg['permissions']['allow']:
+        cfg['permissions']['allow'].append(rule)
+p.parent.mkdir(exist_ok=True)
+p.write_text(json.dumps(cfg, indent=2))
+```
+
+---
+
+## Step 5 — Scaffold .gitignore and probe default branch
+
+Ensure these entries exist in `.gitignore` (append if missing, do not duplicate):
+```
+logbook.jsonl
+tmp/
+.swarmforge/
+.worktrees/
+```
+
+Probe the repository's default remote branch:
+```bash
+git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/origin/||'
+```
+
+If this resolves to a branch name (e.g. `main`, `master`), record it:
+```bash
+mkdir -p .swarmforge
+echo "" > .swarmforge/default-branch
+```
+This file lets the specifier reset to origin's default branch without hard-coding the name.
+
+---
+
+## Step 6 — Emit the swarm-ready marker
+
+```bash
+mkdir -p .swarmforge
+printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(git rev-parse HEAD 2>/dev/null || echo 'no-git')" > .swarmforge/setup-complete
+```
+
+Print: `SwarmForge setup complete. Run ./swarm to start the session.`
+
+The marker's presence is the signal to `./swarm` that the project is ready. If you need to re-run setup, delete this file.

From 336f388787918bfb0b3184bf455a590957437c59 Mon Sep 17 00:00:00 2001
From: "Robert C. Martin"
Date: Sun, 14 Jun 2026 08:24:56 -0500
Subject: [PATCH 19/67] Add shared engineering and workflow articles

---
 .../constitution/articles/engineering.prompt | 40 +++++++++++++++++++
 .../constitution/articles/workflow.prompt    | 14 +++++++
 2 files changed, 54 insertions(+)
 create mode 100644 swarmforge/constitution/articles/engineering.prompt
 create mode 100644 swarmforge/constitution/articles/workflow.prompt

diff --git a/swarmforge/constitution/articles/engineering.prompt b/swarmforge/constitution/articles/engineering.prompt
new file mode 100644
index 0000000..3e189cb
--- /dev/null
+++ b/swarmforge/constitution/articles/engineering.prompt
@@ -0,0 +1,40 @@
+# Engineering Rules
+
+## Startup Tools
+- On startup, acquire the github tools for the project language and get them ready to run.
+- Language tool table:
+  - Go: install with `go install`; mutation `github.com/unclebob/mutate4go`, CRAP `github.com/unclebob/crap4go`, DRY `github.com/unclebob/dry4go`.
+  - Clojure: install with Clojure CLI/deps.edn; mutation `github.com/unclebob/clj-mutate`, CRAP `github.com/unclebob/crap4clj`, DRY `github.com/unclebob/dry4clj`.
+  - Java: install with Maven (`mvn`); mutation `github.com/unclebob/mutate4java`, CRAP `github.com/unclebob/crap4java`, DRY `github.com/unclebob/dry4java`.
+
+## Language Defaults
+- For Clojure projects, prefer Babashka where possible.
+- For Clojure projects, prefer Speclj for unit and behavior tests.
+- For Clojure or Babashka projects using Speclj, use `github.com/unclebob/speclj-structure-check` to validate test syntax. If a Speclj spec file changed, run the structure check before executing the relevant test command.
+- For Java projects, avoid using Maven to run tests; build dedicated test runners and run those instead.
+
+## Design And Testability
+- Work in small, reviewable increments.
+- Prefer the simplest design that supports the current behavior and leaves clear options for the next step.
+- Keep tests close to the behavior being changed.
+- Separate testable modules from environmentally unsuitable modules that open GUIs, depend on external devices, throw environment errors, emit system errors, or hang under automated tests. Maximize testable code and minimize the unsuitable boundary.
+- Only testable modules should participate in tools that run tests, including unit tests, acceptance tests, coverage, mutation testing, CRAP analysis, DRY analysis that invokes tests, and property tests.
+- Keep property tests separate from normal verification. Do not include property-test tags in normal unit coverage, Gherkin acceptance mutation, language mutation tools, CRAP, or coverage commands unless the role owns property-test verification or the user explicitly asks for property tests.
+
+## Acceptance Pipeline
+- Use github.com/unclebob/Acceptance-Pipeline-Specification for Gherkin acceptance tests.
+- The Acceptance Pipeline Specification supplies `gherkin-parser` and `gherkin-mutator`; install or build those commands from that repository instead of reimplementing them in the project.
+- Project-specific acceptance pipeline components are the acceptance entrypoint generator, acceptance runtime, project step handlers, runner adapter, and convenience scripts.
+- Gherkin acceptance mutation means running `gherkin-mutator` to mutate Gherkin example values.
+- Gherkin acceptance mutation runs must report periodic progress/status so agents can distinguish normal long-running work from a hang.
+
+## Verification
+- Before running language, build, or test commands, prefer project-local cache/configuration paths inside the assigned worktree. Avoid default cache locations that write outside the project and may trigger sandbox or permission restrictions.
+- Run acceptance generation and acceptance tests sequentially.
+- Avoid running whole-suite language test commands concurrently with acceptance generation.
+- Run the relevant local verification command before handoff whenever the project has one.
+
+## Guardrails
+- Do not edit mutation testing or Gherkin acceptance mutation manifests by hand; allow approved mutation tools to update those manifests as part of their normal runs.
+- Do not commit unrelated local changes or generated artifacts unless required for the task.
+- Before relying on an unfamiliar command, inspect local help or project documentation.
diff --git a/swarmforge/constitution/articles/workflow.prompt b/swarmforge/constitution/articles/workflow.prompt
new file mode 100644
index 0000000..9de6d1b
--- /dev/null
+++ b/swarmforge/constitution/articles/workflow.prompt
@@ -0,0 +1,14 @@
+# Workflow Rules
+
+## Worktree Discipline
+- At startup, discover and remember the branch or worktree assigned to your role.
+- If your assigned worktree is `master`, work in the main project checkout on its current branch; do not expect or create a `.worktrees/` directory for that role.
+- Work only in your assigned branch or worktree.
+- Do not inspect, diff, merge, or base work on another branch unless that branch is specifically named in a handoff or explicit user instruction.
+- Do not run `./swarm` from an agent worktree to repair helper scripts. If `notify-agent.sh` is missing from PATH, stop and report the startup failure.
+
+## Temporary Files
+- Use `./tmp/` in your assigned worktree for temporary files; do not use `/tmp`.
+
+## Failure Conditions
+- If the expected git layout or assigned worktree is missing, stop and report instead of silently working in the wrong place.

From e6546870c780af38bbba2970bfcf2fa6a2fb0151 Mon Sep 17 00:00:00 2001
From: "Robert C. Martin"
Date: Sun, 14 Jun 2026 08:26:55 -0500
Subject: [PATCH 20/67] Prefer Babashka APS tools by default

---
 swarmforge/constitution/articles/engineering.prompt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/swarmforge/constitution/articles/engineering.prompt b/swarmforge/constitution/articles/engineering.prompt
index 3e189cb..0297b5b 100644
--- a/swarmforge/constitution/articles/engineering.prompt
+++ b/swarmforge/constitution/articles/engineering.prompt
@@ -24,6 +24,8 @@
 ## Acceptance Pipeline
 - Use github.com/unclebob/Acceptance-Pipeline-Specification for Gherkin acceptance tests.
 - The Acceptance Pipeline Specification supplies `gherkin-parser` and `gherkin-mutator`; install or build those commands from that repository instead of reimplementing them in the project.
+- Prefer the Babashka APS tools for `gherkin-parser`, `gherkin-mutator`, and related APS support commands.
+- Use Go-based APS tools only if the Babashka APS tools do not work in the current project environment.
 - Project-specific acceptance pipeline components are the acceptance entrypoint generator, acceptance runtime, project step handlers, runner adapter, and convenience scripts.
 - Gherkin acceptance mutation means running `gherkin-mutator` to mutate Gherkin example values.
 - Gherkin acceptance mutation runs must report periodic progress/status so agents can distinguish normal long-running work from a hang.

From 8409333e1cffccc174c95598a3454ded597076c3 Mon Sep 17 00:00:00 2001
From: "Robert C. Martin"
Date: Sun, 14 Jun 2026 08:42:22 -0500
Subject: [PATCH 21/67] Document constitution article inheritance

---
 README.md | 58 ++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index 4857545..3a9a3b2 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ Do not spend any money on a bankrbot SWARM token.
 
 ## Intent
 
-This `main` branch is documentary: it explains the system and carries the shared operational scripts. The runnable `four-pack` and `six-pack` branches carry the project-facing configurations and role prompts that define specific workflows.
+This `main` branch is documentary: it explains the system and carries the shared operational scripts and default constitution articles. The runnable workflow branches carry the project-facing configurations, role prompts, and local constitution articles that define specific workflows.
 
 SwarmForge is an agent coordination system that facilitates communication between agents working in different git worktrees.
 
@@ -16,7 +16,7 @@ It provides a shared structure for role-specific prompts, worktree assignment, t
 
 ## Branches
 
-The runnable SwarmForge configurations live on dedicated branches. Each branch contains the `swarmforge/swarmforge.conf`, constitution, and role prompts for one workflow. At startup, its `./swarm` wrapper copies the shared operational scripts from `main` when they are not already present, then launches that branch's local configuration.
+The runnable SwarmForge configurations live on dedicated branches. Each branch contains the `swarmforge/swarmforge.conf`, local constitution articles, and role prompts for one workflow. At startup, its `./swarm` wrapper copies the shared operational scripts and shared constitution articles from `main` when they are not already present, then launches that branch's local configuration.
 
 ### `four-pack`
 
@@ -68,7 +68,7 @@ After copying a runnable branch, start the swarm from the target project:
 ./swarm
 ```
 
-The `./swarm` wrapper keeps the runnable branch small. On first use, if `swarmforge/scripts/` is missing, it downloads the `main` branch archive, copies the shared operational scripts from `swarmforge/scripts/`, and then launches `swarmforge/scripts/swarmforge.sh`. Later runs reuse the existing local scripts directory instead of overwriting it.
+The `./swarm` wrapper keeps the runnable branch small. On first use, if `swarmforge/scripts/` is missing, it downloads the `main` branch archive, copies the shared operational scripts from `swarmforge/scripts/`, stages shared constitution articles from `swarmforge/constitution/articles/`, and then launches `swarmforge/scripts/swarmforge.sh`. Later runs reuse the existing local scripts directory instead of overwriting it.
 
 The windows should open automatically.
 
@@ -91,12 +91,12 @@ SwarmForge is a lightweight, tmux-based orchestration layer that:
 
 - **Config-Driven Topology** — The swarm shape comes from `swarmforge/swarmforge.conf`, not hardcoded shell variables.
 - **Project-Local Roles** — Each role is defined by `swarmforge/roles/.prompt` in the working tree being orchestrated.
-- **Layered Constitution** — `swarmforge/constitution.prompt` can delegate to subordinate files such as `swarmforge/constitution/project.prompt`, `engineering.prompt`, and `workflow.prompt`.
+- **Layered Constitution** — `swarmforge/constitution.prompt` directs agents to read article files under `swarmforge/constitution/articles/`.
 - **Backend Selection Per Role** — A role can launch `claude`, `codex`, `copilot`, or `grok`.
 - **Observable Swarm** — Open one Terminal window per role and watch the sessions in real time.
 - **Self-Hosted & Lightweight** — Runs locally in tmux and Terminal with minimal machinery.
 
-## Constitution And Roles
+## Constitution Structure
 
 Each runnable branch contains a `swarmforge/` directory with this general layout:
 
@@ -105,15 +105,40 @@ swarmforge/
   swarmforge.conf
   constitution.prompt
   constitution/
-    project.prompt
-    engineering.prompt
-    workflow.prompt
+    articles/
+      project.prompt
+      local-engineering.prompt
+      local-workflow.prompt
+      ...
   roles/
     .prompt
     ...
 ```
 
-`constitution.prompt` is the entry point. It can define precedence and direct agents to read subordinate constitution files in order. That lets you separate project-specific rules from engineering rules and workflow rules without forcing everything into one large prompt.
+`constitution.prompt` is the entry point. Runnable branches normally use it to tell agents to read every file in `swarmforge/constitution/articles/`.
+
+Shared default articles live on `main` under:
+
+```text
+swarmforge/constitution/articles/
+  engineering.prompt
+  handoffs.prompt
+  workflow.prompt
+```
+
+At startup, SwarmForge installs missing shared articles into the runnable branch's `swarmforge/constitution/articles/` directory before creating role worktrees. It also installs missing shared articles into each role worktree during script synchronization. Existing local files are skipped, so a runnable branch can override a shared article by committing an article with the same filename.
+
+Pack-specific additions and exceptions should use explicit local filenames rather than editing shared articles. Current conventions are:
+
+- `project.prompt` for the workflow's project shape and local topology.
+- `local-engineering.prompt` for workflow-specific engineering rules.
+- `local-workflow.prompt` for workflow-specific flow rules.
+
+The `local-*.prompt` naming convention means "add to or specialize the shared default article for this runnable branch." Use it when the shared article remains valid and the branch only needs extra requirements, exceptions, or narrower instructions. Do not use `local-*.prompt` for a full replacement; use the shared filename instead when the branch intentionally overrides the shared article.
+
+For example, `main` can provide a shared `workflow.prompt`, while `six-pack` can add `local-workflow.prompt` for QA-specific handoff behavior. If a branch needs to replace the shared workflow article completely, it can commit its own `workflow.prompt`; startup will treat that local file as an override and will not copy the shared one over it.
+
+## Roles
 
 Each role in `swarmforge/swarmforge.conf` maps to a corresponding `swarmforge/roles/.prompt` file.
 
@@ -122,13 +147,14 @@ Each role in `swarmforge/swarmforge.conf` maps to a corresponding `swarmforge/ro
 In a runnable branch:
 
 1. SwarmForge reads `swarmforge/swarmforge.conf`.
-2. The root `./swarm` wrapper copies shared helper scripts and terminal adapters from the `main` branch when `swarmforge/scripts/` is not already present.
-3. Startup validates the configured role prompts, helper scripts, and terminal adapters.
-4. If the target directory is not already a git repository, startup initializes one and creates the first commit.
-5. Startup creates one git worktree per configured role under `.worktrees/`, unless the role is assigned to `master` or `none`.
-6. Startup syncs `swarmforge/scripts/` into each role worktree and puts that local scripts directory on each agent's `PATH`, so agents use `notify-agent.sh` without reaching back into the master checkout.
-7. SwarmForge creates tmux sessions, opens terminal windows, and launches each configured backend in its assigned worktree.
-8. Roles communicate through sequenced handoff files. Agents write `.swarmforge/notify/request` and run `notify-agent.sh`; the helper assigns message ids and sequence numbers, archives sent messages, records logbook entries, validates receive ordering, and requests resends when gaps are detected.
+2. The root `./swarm` wrapper copies shared helper scripts, terminal adapters, and shared constitution articles from the `main` branch when they are not already present.
+3. Startup installs missing shared constitution articles into `swarmforge/constitution/articles/`, skipping any local article file that already exists.
+4. Startup validates the configured role prompts, helper scripts, and terminal adapters.
+5. If the target directory is not already a git repository, startup initializes one and creates the first commit.
+6. Startup creates one git worktree per configured role under `.worktrees/`, unless the role is assigned to `master` or `none`.
+7. Startup syncs `swarmforge/scripts/` and missing shared constitution articles into each role worktree and puts that local scripts directory on each agent's `PATH`, so agents use `notify-agent.sh` without reaching back into the master checkout.
+8. SwarmForge creates tmux sessions, opens terminal windows, and launches each configured backend in its assigned worktree.
+9. Roles communicate through sequenced handoff files. Agents write `.swarmforge/notify/request` and run `notify-agent.sh`; the helper assigns message ids and sequence numbers, archives sent messages, records logbook entries, validates receive ordering, and requests resends when gaps are detected.
 
 ## Handoff Helpers
 

From 5e21205ec4e5b8ce0c5ae68cdf32644101a6d982 Mon Sep 17 00:00:00 2001
From: "Robert C. Martin"
Date: Sun, 14 Jun 2026 08:50:56 -0500
Subject: [PATCH 22/67] Rename handoff command to swarm-handoff

---
 README.md                                     |  46 ++--
 .../constitution/articles/handoffs.prompt     |  14 +-
 .../constitution/articles/workflow.prompt     |   2 +-
 swarmforge/scripts/notify-agent.sh            | 217 +----------------
 swarmforge/scripts/resend-handoff.sh          |   2 +-
 swarmforge/scripts/send-handoff.sh            |   2 +-
 swarmforge/scripts/swarm-handoff              | 220 ++++++++++++++++++
 swarmforge/scripts/swarmforge.sh              |   4 +-
 8 files changed, 256 insertions(+), 251 deletions(-)
 create mode 100755 swarmforge/scripts/swarm-handoff

diff --git a/README.md b/README.md
index 3a9a3b2..96bcecb 100644
--- a/README.md
+++ b/README.md
@@ -152,18 +152,18 @@ In a runnable branch:
 4. Startup validates the configured role prompts, helper scripts, and terminal adapters.
 5. If the target directory is not already a git repository, startup initializes one and creates the first commit.
 6. Startup creates one git worktree per configured role under `.worktrees/`, unless the role is assigned to `master` or `none`.
-7. Startup syncs `swarmforge/scripts/` and missing shared constitution articles into each role worktree and puts that local scripts directory on each agent's `PATH`, so agents use `notify-agent.sh` without reaching back into the master checkout.
+7. Startup syncs `swarmforge/scripts/` and missing shared constitution articles into each role worktree and puts that local scripts directory on each agent's `PATH`, so agents use `swarm-handoff` without reaching back into the master checkout.
 8. SwarmForge creates tmux sessions, opens terminal windows, and launches each configured backend in its assigned worktree.
-9. Roles communicate through sequenced handoff files. Agents write `.swarmforge/notify/request` and run `notify-agent.sh`; the helper assigns message ids and sequence numbers, archives sent messages, records logbook entries, validates receive ordering, and requests resends when gaps are detected.
+9. Roles communicate through sequenced handoff files. Agents write `.swarmforge/notify/request` and run `swarm-handoff`; the helper assigns message ids and sequence numbers, archives sent messages, records logbook entries, validates receive ordering, and requests resends when gaps are detected.
 
 ## Handoff Helpers
 
-Startup syncs the shared helper scripts into every role worktree under `swarmforge/scripts/` and puts that local directory on the agent's `PATH`. Agents should use the request-file form of `notify-agent.sh` rather than running helper scripts from another worktree.
+Startup syncs the shared helper scripts into every role worktree under `swarmforge/scripts/` and puts that local directory on the agent's `PATH`. Agents should use the request-file form of `swarm-handoff` rather than running helper scripts from another worktree.
 
 The agent-facing command is stable:
 
 ```sh
-notify-agent.sh
+swarm-handoff
 ```
 
 Before running it, write `.swarmforge/notify/request` in the assigned worktree. To send a normal handoff:
@@ -199,13 +199,13 @@ file: ./.swarmforge/handoffs/queue/accepted/.txt
 
 The shared script directory also contains implementation helpers:
 
-- `notify-agent.sh` is the public entry point and low-level tmux transport.
+- `swarm-handoff` is the public entry point and low-level tmux transport.
 - `send-handoff.sh` builds sequenced protocol messages, archives outbound handoffs, sends them, and logs successful sends.
 - `receive-handoff.sh` normalizes incoming captures, validates protocol messages, records received or queued entries, queues accepted handoffs for role work, and generates resend requests when ordering gaps appear.
 - `resend-handoff.sh` replays archived outbound handoffs in response to resend requests.
 - `handoff-lib.sh` contains shared parsing, id generation, sequence, archive, and logbook functions.
 
-Agents normally call only `notify-agent.sh` with no arguments after writing `.swarmforge/notify/request`. The explicit subcommands and other scripts are kept separate so the transport, sequencing, receive validation, and replay behavior are easy to inspect, test, and use manually when needed.
+Agents normally call only `swarm-handoff` with no arguments after writing `.swarmforge/notify/request`. The explicit subcommands and other scripts are kept separate so the transport, sequencing, receive validation, and replay behavior are easy to inspect, test, and use manually when needed.
 
 ## Avoiding Escalation During Handoffs
 
@@ -214,24 +214,24 @@ Handoff commands must stay inside the agent's sandbox as much as possible. The t
 When an agent sends, receives, or completes a handoff, it should write `.swarmforge/notify/request` and run:
 
 ```sh
-notify-agent.sh
+swarm-handoff
 ```
 
-This stable command shape is intentional. Some command approval systems approve future commands by their literal command prefix. If every handoff uses a different command line, each target, file path, priority, or queue filename can create a new approval prompt. The request file moves those variable arguments into local project state, so a single approval for `notify-agent.sh` covers normal handoff send, receive, resend, and completion operations.
+This stable command shape is intentional. Some command approval systems approve future commands by their literal command prefix. If every handoff uses a different command line, each target, file path, priority, or queue filename can create a new approval prompt. The request file moves those variable arguments into local project state, so a single approval for `swarm-handoff` covers normal handoff send, receive, resend, and completion operations.
 
-Do not have agents run `tmux -S ...` directly. The `notify-agent.sh` transport detects when it is already running inside a tmux pane and uses the inherited tmux client context:
+Do not have agents run `tmux -S ...` directly. The `swarm-handoff` transport detects when it is already running inside a tmux pane and uses the inherited tmux client context:
 
 ```sh
 tmux send-keys -t "$TARGET_SESSION" ...
 ```
 
-That avoids naming or opening the tmux socket from the agent process. Some agent command runners sanitize the environment and remove `TMUX` even though the agent itself was launched inside tmux. To handle that, `swarmforge.sh` writes the active tmux client value to `.swarmforge/tmux-env` after creating the swarm sessions and syncs that file into each role worktree. If `notify-agent.sh` starts without `TMUX`, it restores `TMUX` from `.swarmforge/tmux-env` and still uses plain `tmux send-keys`.
+That avoids naming or opening the tmux socket from the agent process. Some agent command runners sanitize the environment and remove `TMUX` even though the agent itself was launched inside tmux. To handle that, `swarmforge.sh` writes the active tmux client value to `.swarmforge/tmux-env` after creating the swarm sessions and syncs that file into each role worktree. If `swarm-handoff` starts without `TMUX`, it restores `TMUX` from `.swarmforge/tmux-env` and still uses plain `tmux send-keys`.
 
 The explicit `tmux -S ` path is kept only as a fallback for manual helper use outside tmux.
 
 Do not ask agents to remove queue files with `rm`, `rm -f`, or ad hoc cleanup commands. The completion helper moves the accepted queue file to `.swarmforge/handoffs/queue/completed/`, which keeps cleanup predictable and avoids destructive-command escalation.
 
-The operational rule is simple: agents write `.swarmforge/notify/request`, run `notify-agent.sh`, do not call `tmux` directly, and do not delete queue files directly.
+The operational rule is simple: agents write `.swarmforge/notify/request`, run `swarm-handoff`, do not call `tmux` directly, and do not delete queue files directly.
 
 ## Communication Protocol
 
@@ -243,7 +243,7 @@ target:
 file: ./tmp/-handoff.txt
 ```
 
-Then it runs `notify-agent.sh`. Priority handoffs add `priority: NN` to the request file; normal handoffs default to priority `50`.
+Then it runs `swarm-handoff`. Priority handoffs add `priority: NN` to the request file; normal handoffs default to priority `50`.
 
 The send helper wraps that body with protocol fields:
 
@@ -260,7 +260,7 @@ commit hash: 1234567890
 
 The `message id` timestamp is human-readable and roughly sortable. Message type, sender, target, and sequence are separate fields so the id does not duplicate protocol data. Sequence numbers are per sender-target stream. For example, `coder-cleaner` has its own sequence, and `cleaner-coder` has a separate reverse sequence. The six-character suffix prevents id collisions when two messages are created in the same second.
 
-The helper reads `branch name` and the 10-character `commit hash` from the sender's current git worktree at send time. Agents should commit the state being handed off, send the handoff immediately, and avoid making another commit until `notify-agent.sh` completes successfully. The generated branch and commit fields are the authoritative state for the receiver to merge.
+The helper reads `branch name` and the 10-character `commit hash` from the sender's current git worktree at send time. Agents should commit the state being handed off, send the handoff immediately, and avoid making another commit until `swarm-handoff` completes successfully. The generated branch and commit fields are the authoritative state for the receiver to merge.
 
 The sender archives each outbound message under:
 
@@ -277,7 +277,7 @@ command: receive
 file: ./tmp/incoming-handoff.txt
 ```
 
-Then it runs `notify-agent.sh`.
+Then it runs `swarm-handoff`.
 
 The receive helper ignores any leading terminal noise before the first valid `message type: handoff` or `message type: resend-request` header. It archives and queues only the normalized protocol message, not the noisy capture.
 
@@ -287,22 +287,22 @@ The receive helper checks `message type`, `message id`, sender, target, sequence
 .swarmforge/handoffs/queue/accepted/---.txt
 ```
 
-Agents process only accepted queue files and do not rerun `notify-agent.sh receive` on them. After the corresponding role work is complete, agents complete accepted queue files with:
+Agents process only accepted queue files and do not rerun `swarm-handoff receive` on them. After the corresponding role work is complete, agents complete accepted queue files with:
 
 ```text
 command: complete
 file: ./.swarmforge/handoffs/queue/accepted/.txt
 ```
 
-Then they run `notify-agent.sh`. The completion helper moves the queue file into `.swarmforge/handoffs/queue/completed/`.
+Then they run `swarm-handoff`. The completion helper moves the queue file into `.swarmforge/handoffs/queue/completed/`.
 
-`notify-agent.sh` also keeps explicit command forms for manual helper implementation and diagnostics:
+`swarm-handoff` also keeps explicit command forms for manual helper implementation and diagnostics:
 
 ```sh
-notify-agent.sh send --file ./tmp/-handoff.txt
-notify-agent.sh receive --file ./tmp/incoming-handoff.txt
-notify-agent.sh complete --file ./.swarmforge/handoffs/queue/accepted/.txt
-notify-agent.sh --file
+swarm-handoff send --file ./tmp/-handoff.txt
+swarm-handoff receive --file ./tmp/incoming-handoff.txt
+swarm-handoff complete --file ./.swarmforge/handoffs/queue/accepted/.txt
+swarm-handoff --file
 ```
 
 Agents should not use the low-level target/file transport form for normal handoffs because it bypasses sequencing, archiving, resend recovery, and logbook handling. Agents should prefer the no-argument request-file form over explicit subcommands to keep command approvals stable.
@@ -311,7 +311,7 @@ Agents should not use the low-level target/file transport form for normal handof
 
 The protocol is designed for eventual correction rather than tmux-pane sniffing.
 
-If `notify-agent.sh receive` sees the next expected handoff sequence, it queues the handoff for role work. If it sees a sequence gap, it archives the out-of-order message, appends a queued logbook entry, sends a `resend-request` back to the sender, and prints `DO NOT PROCESS`.
+If `swarm-handoff receive` sees the next expected handoff sequence, it queues the handoff for role work. If it sees a sequence gap, it archives the out-of-order message, appends a queued logbook entry, sends a `resend-request` back to the sender, and prints `DO NOT PROCESS`.
 
 The resend request is itself a sequenced message in the reverse sender-target stream:
 
@@ -330,7 +330,7 @@ resend sequences: 000003-000005
 
 The missing range includes the out-of-order message that exposed the gap. That keeps recovery simple: the original sender replays one contiguous range, and the receiver processes messages only when they arrive in sequence.
 
-When a sender receives a `resend-request`, `notify-agent.sh receive` calls `resend-handoff.sh`, which reads archived messages from `.swarmforge/handoffs/sent/` and resends each requested sequence. Resent messages are logged as sent only after the low-level notification succeeds.
+When a sender receives a `resend-request`, `swarm-handoff receive` calls `resend-handoff.sh`, which reads archived messages from `.swarmforge/handoffs/sent/` and resends each requested sequence. Resent messages are logged as sent only after the low-level notification succeeds.
 
 Duplicate or stale messages are archived and logged as queued, but the helper prints `DO NOT PROCESS`. Agents should not merge, apply, or otherwise act on a handoff unless it appears in the accepted queue.
 
diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt
index fb83c01..4686162 100644
--- a/swarmforge/constitution/articles/handoffs.prompt
+++ b/swarmforge/constitution/articles/handoffs.prompt
@@ -3,23 +3,23 @@
 ## Startup Notification
 - After reading the constitution and your role prompt at startup, send an awake notification to `specifier`.
 - Write exactly `I'm awake` to `./tmp/specifier-awake.txt`.
-- Write `.swarmforge/notify/request` with `command: send`, `target: specifier`, and `file: ./tmp/specifier-awake.txt`, then run `notify-agent.sh` with no arguments.
+- Write `.swarmforge/notify/request` with `command: send`, `target: specifier`, and `file: ./tmp/specifier-awake.txt`, then run `swarm-handoff` with no arguments.
 - This startup notification is only a presence signal; it does not replace any role-specific handoff rule.
 
 ## Logbook
 - Maintain the tracked root `logbook.jsonl` file for every handoff received or sent.
 - Keep `logbook.jsonl` as append-only JSON Lines. Each line is one complete JSON object; never rewrite, reformat, sort, summarize, or merge existing entries.
- Each log entry must include `timestamp`, `direction`, `message`, and `note` fields. Use `direction` values `received`, `sent`, or `queued`. -- Use `.swarmforge/notify/request` plus `notify-agent.sh` to append normal sent, received, queued, and resend-request entries to `logbook.jsonl`. -- Run `notify-agent.sh` with no environment-variable prefix. Do not run `SWARMFORGE_ROLE= notify-agent.sh`; startup already sets `SWARMFORGE_ROLE`, and the handoff command must stay stable. +- Use `.swarmforge/notify/request` plus `swarm-handoff` to append normal sent, received, queued, and resend-request entries to `logbook.jsonl`. +- Run `swarm-handoff` with no environment-variable prefix. Do not run `SWARMFORGE_ROLE= swarm-handoff`; startup already sets `SWARMFORGE_ROLE`, and the handoff command must stay stable. - Do not hand-edit handoff logbook entries unless a helper script fails and explicit user direction says to repair the logbook manually. - If a merge conflict occurs in `logbook.jsonl`, keep all complete JSON lines from both sides. Preserve message text exactly; order chronologically when obvious, otherwise keep both groups intact. - Commit `logbook.jsonl` updates with the next substantive role-owned commit or outgoing handoff commit. Make a logbook-only commit only when no other work will be committed for that handoff, such as a queued handoff that requires no role-specific work. ## Handoff Format - For every outgoing handoff, write only the role-specific handoff body to `./tmp/-handoff.txt`. -- To send that handoff, write `.swarmforge/notify/request` with `command: send`, `target: `, and `file: ./tmp/-handoff.txt`, then run `notify-agent.sh` with no arguments. -- Before sending a handoff, commit all role-owned work and any required `logbook.jsonl` updates. Do not make another commit until `notify-agent.sh` completes successfully. 
+- To send that handoff, write `.swarmforge/notify/request` with `command: send`, `target: `, and `file: ./tmp/-handoff.txt`, then run `swarm-handoff` with no arguments. +- Before sending a handoff, commit all role-owned work and any required `logbook.jsonl` updates. Do not make another commit until `swarm-handoff` completes successfully. - The send helper wraps the body with `message type`, `message id`, `sender role`, `target role`, `message sequence`, `message priority`, `branch name`, and 10-character `commit hash` fields, archives the sent message, appends the sent logbook entry, and sends the message. - Add `priority: NN` to `.swarmforge/notify/request` only when a role rule or explicit user instruction requires it; normal handoffs default to priority `50`. - Treat the generated `branch name` and `commit hash` fields as the authoritative state for the receiver to merge; do not type branch names or commit hashes into the handoff body. @@ -32,7 +32,7 @@ ## Receiving Handoffs - When receiving any handoff or resend request, save the complete incoming message to `./tmp/incoming-handoff.txt` or another local file. -- To receive the saved message, write `.swarmforge/notify/request` with `command: receive` and `file: `, then run `notify-agent.sh` with no arguments. +- To receive the saved message, write `.swarmforge/notify/request` with `command: receive` and `file: `, then run `swarm-handoff` with no arguments. - The receive helper ignores any leading terminal noise before the first valid protocol header, validates sequence order, archives the normalized message, and queues accepted handoffs under `.swarmforge/handoffs/queue/accepted/`. - Do not process the saved incoming file directly. Process only accepted queue files created by the receive helper. - If the receive helper reports `DO NOT PROCESS`, do not merge, apply, or otherwise act on that handoff; the helper has detected an out-of-order, duplicate, or stale message and has requested any required resend. 
@@ -46,5 +46,5 @@ - Queue file names begin with the sender-specified two-digit priority, then a timestamp, stream, and sequence. Process accepted queue files in sorted filename order. - Do not receive accepted queue files again. They have already passed receive validation. - If one or more accepted handoffs are queued while you are busy, finish the current job before acting on queued messages. -- After finishing the accepted queue file, write `.swarmforge/notify/request` with `command: complete` and `file: `, then run `notify-agent.sh` with no arguments. +- After finishing the accepted queue file, write `.swarmforge/notify/request` with `command: complete` and `file: `, then run `swarm-handoff` with no arguments. - Do not delete accepted queue files directly. diff --git a/swarmforge/constitution/articles/workflow.prompt b/swarmforge/constitution/articles/workflow.prompt index 9de6d1b..b675dbe 100644 --- a/swarmforge/constitution/articles/workflow.prompt +++ b/swarmforge/constitution/articles/workflow.prompt @@ -5,7 +5,7 @@ - If your assigned worktree is `master`, work in the main project checkout on its current branch; do not expect or create a `.worktrees/` directory for that role. - Work only in your assigned branch or worktree. - Do not inspect, diff, merge, or base work on another branch unless that branch is specifically named in a handoff or explicit user instruction. -- Do not run `./swarm` from an agent worktree to repair helper scripts. If `notify-agent.sh` is missing from PATH, stop and report the startup failure. +- Do not run `./swarm` from an agent worktree to repair helper scripts. If `swarm-handoff` is missing from PATH, stop and report the startup failure. ## Temporary Files - Use `./tmp/` in your assigned worktree for temporary files; do not use `/tmp`. 
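The queue rules above (process accepted files in sorted filename order, then write a `complete` request) can be sketched as two small shell functions. This is a minimal sketch under stated assumptions, not code from the repository: `next_accepted` and `write_complete_request` are hypothetical names, the optional directory and request-path arguments exist only to make the sketch testable, and creating the request directory with `mkdir -p` is a convenience added here.

```sh
# Hypothetical helpers; not part of the swarm-handoff scripts.

# next_accepted [QUEUE_DIR]
# Prints the first accepted queue file in sorted filename order.
# Because names start with the two-digit priority, lower numbers sort first.
next_accepted() {
  queue_dir=${1:-.swarmforge/handoffs/queue/accepted}
  next=$(find "$queue_dir" -maxdepth 1 -type f -name '*.txt' 2>/dev/null \
    | LC_ALL=C sort | head -n 1)
  [ -n "$next" ] || return 1
  printf '%s\n' "$next"
}

# write_complete_request QUEUE_FILE [REQUEST_FILE]
# Writes the request that asks the helper to complete QUEUE_FILE.
# QUEUE_FILE should be a relative path; the helper rejects absolute paths.
write_complete_request() {
  queue_file=$1
  request=${2:-.swarmforge/notify/request}
  mkdir -p "$(dirname "$request")"
  printf 'command: complete\nfile: %s\n' "$queue_file" > "$request"
}

# After writing the request, the agent would run the stable command:
#   swarm-handoff
```

Keeping the variable parts (which file, which command) in the request file is what lets the final step stay a bare `swarm-handoff` invocation with no arguments.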
diff --git a/swarmforge/scripts/notify-agent.sh b/swarmforge/scripts/notify-agent.sh index 06b5182..6ad52bc 100755 --- a/swarmforge/scripts/notify-agent.sh +++ b/swarmforge/scripts/notify-agent.sh @@ -2,219 +2,4 @@ set -euo pipefail SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" - -usage() { - echo "Usage: notify-agent.sh" >&2 - echo "Usage: notify-agent.sh send --file [--sender ] [--priority NN]" >&2 - echo " notify-agent.sh receive --file [--receiver ]" >&2 - echo " notify-agent.sh complete --file " >&2 - echo " notify-agent.sh --file " >&2 -} - -find_project_dir() { - local git_common_dir worktree_root - - if worktree_root=$(git -C "$SCRIPT_DIR" rev-parse --show-toplevel 2>/dev/null); then - if [[ -f "$worktree_root/.swarmforge/sessions.tsv" && -f "$worktree_root/.swarmforge/tmux-socket" ]]; then - echo "$worktree_root" - return 0 - fi - fi - - if git_common_dir=$(git -C "$SCRIPT_DIR" rev-parse --git-common-dir 2>/dev/null); then - if [[ "$git_common_dir" != /* ]]; then - git_common_dir="$(cd "$SCRIPT_DIR/$git_common_dir" && pwd)" - fi - local project_dir="${git_common_dir:h}" - if [[ -f "$project_dir/.swarmforge/sessions.tsv" ]]; then - echo "$project_dir" - return 0 - fi - fi - - echo "${SCRIPT_DIR:h:h}" -} - -request_field() { - local field="$1" - local file="$2" - local line - - line="$(grep -m 1 -E "^${field}: " "$file" || true)" - [[ -n "$line" ]] || return 1 - printf '%s\n' "${line#*: }" -} - -safe_request_path() { - local path="$1" - - [[ -n "$path" ]] || return 1 - [[ "$path" != /* ]] || return 1 - [[ "$path" != "." ]] || return 1 - [[ "$path" != ".." ]] || return 1 - [[ "$path" != "../"* ]] || return 1 - [[ "$path" != *"/../"* ]] || return 1 - [[ "$path" != *"/.." 
]] || return 1 - return 0 -} - -run_request_file() { - local project_dir request_file command target file priority sender receiver archive_dir archive_file exit_status - local -a args - - project_dir="$(find_project_dir)" - request_file="$project_dir/.swarmforge/notify/request" - archive_dir="$project_dir/.swarmforge/notify/archive" - - if [[ ! -f "$request_file" ]]; then - echo "Notify request file not found: $request_file" >&2 - exit 1 - fi - - command="$(request_field command "$request_file")" || { - echo "Notify request missing command: $request_file" >&2 - exit 1 - } - file="$(request_field file "$request_file")" || { - echo "Notify request missing file: $request_file" >&2 - exit 1 - } - priority="$(request_field priority "$request_file" || true)" - sender="$(request_field sender "$request_file" || true)" - receiver="$(request_field receiver "$request_file" || true)" - - if ! safe_request_path "$file"; then - echo "Notify request file must be a safe relative path: $file" >&2 - exit 1 - fi - - case "$command" in - send) - target="$(request_field target "$request_file")" || { - echo "Notify send request missing target: $request_file" >&2 - exit 1 - } - args=("$SCRIPT_DIR/send-handoff.sh" "$target" "--file" "$file") - [[ -z "$sender" ]] || args+=("--sender" "$sender") - [[ -z "$priority" ]] || args+=("--priority" "$priority") - ;; - receive) - args=("$SCRIPT_DIR/receive-handoff.sh" "--file" "$file") - [[ -z "$receiver" ]] || args+=("--receiver" "$receiver") - ;; - complete) - args=("$SCRIPT_DIR/complete-handoff.sh" "--file" "$file") - ;; - *) - echo "Unknown notify request command: $command" >&2 - exit 1 - ;; - esac - - set +e - "${args[@]}" - exit_status=$? 
- set -e - - if [[ "$exit_status" -eq 0 ]]; then - mkdir -p "$archive_dir" - archive_file="$archive_dir/$(date '+%Y%m%d-%H%M%S')-$$.request" - mv "$request_file" "$archive_file" - echo "Archived notify request: $archive_file" - fi - - exit "$exit_status" -} - -if [[ $# -eq 0 ]]; then - run_request_file -fi - -if [[ $# -gt 0 ]]; then - case "$1" in - send) - shift - exec "$SCRIPT_DIR/send-handoff.sh" "$@" - ;; - receive) - shift - exec "$SCRIPT_DIR/receive-handoff.sh" "$@" - ;; - resend) - shift - exec "$SCRIPT_DIR/resend-handoff.sh" "$@" - ;; - complete) - shift - exec "$SCRIPT_DIR/complete-handoff.sh" "$@" - ;; - esac -fi - -if [[ $# -ne 3 || "${2:-}" != "--file" ]]; then - usage - exit 1 -fi - -TARGET="$1" -MESSAGE_FILE="$3" - -PROJECT_DIR="$(find_project_dir)" -SESSIONS_FILE="$PROJECT_DIR/.swarmforge/sessions.tsv" -TMUX_SOCKET_FILE="$PROJECT_DIR/.swarmforge/tmux-socket" -TMUX_ENV_FILE="$PROJECT_DIR/.swarmforge/tmux-env" - -if [[ ! -f "$SESSIONS_FILE" ]]; then - echo "Sessions file not found: $SESSIONS_FILE" >&2 - exit 1 -fi - -resolve_session() { - local target="${1:l}" - local index role session display agent - - while IFS=$'\t' read -r index role session display agent; do - if [[ "$target" == "${index:l}" || "$target" == "${role:l}" ]]; then - echo "$session" - return 0 - fi - done < "$SESSIONS_FILE" - - return 1 -} - -TARGET_SESSION=$(resolve_session "$TARGET") || { - echo "Unknown target: $TARGET" >&2 - exit 1 -} - -if [[ ! -f "$MESSAGE_FILE" ]]; then - echo "Message file not found: $MESSAGE_FILE" >&2 - exit 1 -fi -MESSAGE="$(< "$MESSAGE_FILE")" - -if [[ -z "${TMUX:-}" && -f "$TMUX_ENV_FILE" ]]; then - TMUX="$(< "$TMUX_ENV_FILE")" - export TMUX -fi - -if [[ -n "${TMUX:-}" ]]; then - tmux send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" - sleep 0.15 - tmux send-keys -t "$TARGET_SESSION" C-m - sleep 0.05 - tmux send-keys -t "$TARGET_SESSION" C-j -else - if [[ ! 
-f "$TMUX_SOCKET_FILE" ]]; then - echo "Tmux socket file not found: $TMUX_SOCKET_FILE" >&2 - exit 1 - fi - - TMUX_SOCKET="$(< "$TMUX_SOCKET_FILE")" - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" - sleep 0.15 - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-m - sleep 0.05 - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-j -fi +exec "$SCRIPT_DIR/swarm-handoff" "$@" diff --git a/swarmforge/scripts/resend-handoff.sh b/swarmforge/scripts/resend-handoff.sh index 4b7899a..1ba9680 100755 --- a/swarmforge/scripts/resend-handoff.sh +++ b/swarmforge/scripts/resend-handoff.sh @@ -63,7 +63,7 @@ for (( n = start_num; n <= end_num; n++ )); do echo "Archived handoff not found: $archived" >&2 exit 1 fi - notify-agent.sh "$TARGET" --file "$archived" + swarm-handoff "$TARGET" --file "$archived" handoff_append_logbook "sent" "$(< "$archived")" "resent $STREAM sequence $seq to $TARGET" echo "Resent $STREAM $seq" done diff --git a/swarmforge/scripts/send-handoff.sh b/swarmforge/scripts/send-handoff.sh index 98188fb..c76b6f9 100755 --- a/swarmforge/scripts/send-handoff.sh +++ b/swarmforge/scripts/send-handoff.sh @@ -107,7 +107,7 @@ PROJECT_DIR="$(handoff_project_dir)" TARGET_AGENT="$(handoff_agent_type "$PROJECT_DIR" "$TARGET" 2>/dev/null || echo 'unknown')" if [[ "$TARGET_AGENT" != "claude" ]]; then - notify-agent.sh "$TARGET" --file "$ARCHIVE_FILE" + swarm-handoff "$TARGET" --file "$ARCHIVE_FILE" else PENDING_DIR="$(handoff_pending_dir "$PROJECT_DIR" "$TARGET")" mkdir -p "$PENDING_DIR" diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff new file mode 100755 index 0000000..278ef08 --- /dev/null +++ b/swarmforge/scripts/swarm-handoff @@ -0,0 +1,220 @@ +#!/usr/bin/env zsh +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +usage() { + echo "Usage: swarm-handoff" >&2 + echo "Usage: swarm-handoff send --file [--sender ] [--priority NN]" >&2 + echo " swarm-handoff receive --file [--receiver ]" >&2 + 
echo " swarm-handoff complete --file " >&2 + echo " swarm-handoff --file " >&2 +} + +find_project_dir() { + local git_common_dir worktree_root + + if worktree_root=$(git -C "$SCRIPT_DIR" rev-parse --show-toplevel 2>/dev/null); then + if [[ -f "$worktree_root/.swarmforge/sessions.tsv" && -f "$worktree_root/.swarmforge/tmux-socket" ]]; then + echo "$worktree_root" + return 0 + fi + fi + + if git_common_dir=$(git -C "$SCRIPT_DIR" rev-parse --git-common-dir 2>/dev/null); then + if [[ "$git_common_dir" != /* ]]; then + git_common_dir="$(cd "$SCRIPT_DIR/$git_common_dir" && pwd)" + fi + local project_dir="${git_common_dir:h}" + if [[ -f "$project_dir/.swarmforge/sessions.tsv" ]]; then + echo "$project_dir" + return 0 + fi + fi + + echo "${SCRIPT_DIR:h:h}" +} + +request_field() { + local field="$1" + local file="$2" + local line + + line="$(grep -m 1 -E "^${field}: " "$file" || true)" + [[ -n "$line" ]] || return 1 + printf '%s\n' "${line#*: }" +} + +safe_request_path() { + local path="$1" + + [[ -n "$path" ]] || return 1 + [[ "$path" != /* ]] || return 1 + [[ "$path" != "." ]] || return 1 + [[ "$path" != ".." ]] || return 1 + [[ "$path" != "../"* ]] || return 1 + [[ "$path" != *"/../"* ]] || return 1 + [[ "$path" != *"/.." ]] || return 1 + return 0 +} + +run_request_file() { + local project_dir request_file command target file priority sender receiver archive_dir archive_file exit_status + local -a args + + project_dir="$(find_project_dir)" + request_file="$project_dir/.swarmforge/notify/request" + archive_dir="$project_dir/.swarmforge/notify/archive" + + if [[ ! 
-f "$request_file" ]]; then + echo "Notify request file not found: $request_file" >&2 + exit 1 + fi + + command="$(request_field command "$request_file")" || { + echo "Notify request missing command: $request_file" >&2 + exit 1 + } + file="$(request_field file "$request_file")" || { + echo "Notify request missing file: $request_file" >&2 + exit 1 + } + priority="$(request_field priority "$request_file" || true)" + sender="$(request_field sender "$request_file" || true)" + receiver="$(request_field receiver "$request_file" || true)" + + if ! safe_request_path "$file"; then + echo "Notify request file must be a safe relative path: $file" >&2 + exit 1 + fi + + case "$command" in + send) + target="$(request_field target "$request_file")" || { + echo "Notify send request missing target: $request_file" >&2 + exit 1 + } + args=("$SCRIPT_DIR/send-handoff.sh" "$target" "--file" "$file") + [[ -z "$sender" ]] || args+=("--sender" "$sender") + [[ -z "$priority" ]] || args+=("--priority" "$priority") + ;; + receive) + args=("$SCRIPT_DIR/receive-handoff.sh" "--file" "$file") + [[ -z "$receiver" ]] || args+=("--receiver" "$receiver") + ;; + complete) + args=("$SCRIPT_DIR/complete-handoff.sh" "--file" "$file") + ;; + *) + echo "Unknown notify request command: $command" >&2 + exit 1 + ;; + esac + + set +e + "${args[@]}" + exit_status=$? 
+ set -e + + if [[ "$exit_status" -eq 0 ]]; then + mkdir -p "$archive_dir" + archive_file="$archive_dir/$(date '+%Y%m%d-%H%M%S')-$$.request" + mv "$request_file" "$archive_file" + echo "Archived notify request: $archive_file" + fi + + exit "$exit_status" +} + +if [[ $# -eq 0 ]]; then + run_request_file +fi + +if [[ $# -gt 0 ]]; then + case "$1" in + send) + shift + exec "$SCRIPT_DIR/send-handoff.sh" "$@" + ;; + receive) + shift + exec "$SCRIPT_DIR/receive-handoff.sh" "$@" + ;; + resend) + shift + exec "$SCRIPT_DIR/resend-handoff.sh" "$@" + ;; + complete) + shift + exec "$SCRIPT_DIR/complete-handoff.sh" "$@" + ;; + esac +fi + +if [[ $# -ne 3 || "${2:-}" != "--file" ]]; then + usage + exit 1 +fi + +TARGET="$1" +MESSAGE_FILE="$3" + +PROJECT_DIR="$(find_project_dir)" +SESSIONS_FILE="$PROJECT_DIR/.swarmforge/sessions.tsv" +TMUX_SOCKET_FILE="$PROJECT_DIR/.swarmforge/tmux-socket" +TMUX_ENV_FILE="$PROJECT_DIR/.swarmforge/tmux-env" + +if [[ ! -f "$SESSIONS_FILE" ]]; then + echo "Sessions file not found: $SESSIONS_FILE" >&2 + exit 1 +fi + +resolve_session() { + local target="${1:l}" + local index role session display agent + + while IFS=$'\t' read -r index role session display agent; do + if [[ "$target" == "${index:l}" || "$target" == "${role:l}" ]]; then + echo "$session" + return 0 + fi + done < "$SESSIONS_FILE" + + return 1 +} + +TARGET_SESSION=$(resolve_session "$TARGET") || { + echo "Unknown target: $TARGET" >&2 + exit 1 +} + +if [[ ! -f "$MESSAGE_FILE" ]]; then + echo "Message file not found: $MESSAGE_FILE" >&2 + exit 1 +fi +MESSAGE="$(< "$MESSAGE_FILE")" + +if [[ -z "${TMUX:-}" && -f "$TMUX_ENV_FILE" ]]; then + TMUX="$(< "$TMUX_ENV_FILE")" + export TMUX +fi + +if [[ -n "${TMUX:-}" ]]; then + tmux send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" + sleep 0.15 + tmux send-keys -t "$TARGET_SESSION" C-m + sleep 0.05 + tmux send-keys -t "$TARGET_SESSION" C-j +else + if [[ ! 
-f "$TMUX_SOCKET_FILE" ]]; then + echo "Tmux socket file not found: $TMUX_SOCKET_FILE" >&2 + exit 1 + fi + + TMUX_SOCKET="$(< "$TMUX_SOCKET_FILE")" + tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" + sleep 0.15 + tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-m + sleep 0.05 + tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-j +fi diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 1755e9a..87d4e88 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -298,7 +298,7 @@ write_sessions_file() { check_helper_scripts() { local helper - for helper in notify-agent.sh send-handoff.sh receive-handoff.sh resend-handoff.sh complete-handoff.sh handoff-lib.sh swarm-cleanup.sh swarm-stop.sh swarm-window-watchdog.sh swarm-terminal-adapter.sh; do + for helper in swarm-handoff notify-agent.sh send-handoff.sh receive-handoff.sh resend-handoff.sh complete-handoff.sh handoff-lib.sh swarm-cleanup.sh swarm-stop.sh swarm-window-watchdog.sh swarm-terminal-adapter.sh; do if [[ ! -x "$SCRIPT_DIR/$helper" ]]; then echo -e "${RED}Error:${RESET} Required helper script not found or not executable: $SCRIPT_DIR/$helper" exit 1 @@ -705,7 +705,7 @@ for (( i = 1; i <= ${#ROLES[@]}; i++ )); do echo -e " ${DISPLAY_NAMES[$i]}: ${SESSIONS[$i]}" done echo "" -echo -e "${GREEN}Tip: Write .swarmforge/notify/request, then run notify-agent.sh while the swarm is running.${RESET}" +echo -e "${GREEN}Tip: Write .swarmforge/notify/request, then run swarm-handoff while the swarm is running.${RESET}" echo -e "${GREEN}Tip: Reattach manually with 'tmux -S $TMUX_SOCKET attach-session -t ' if needed.${RESET}" echo "" From 1ea174a704052008ef7434c844119da5c31ed7a6 Mon Sep 17 00:00:00 2001 From: "Robert C. 
Martin" Date: Sun, 14 Jun 2026 09:14:47 -0500 Subject: [PATCH 23/67] Keep handoff logbook in swarm state --- README.md | 6 ++++-- swarmforge/constitution/articles/handoffs.prompt | 12 +++++------- swarmforge/scripts/handoff-lib.sh | 4 +++- 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index 96bcecb..eff189a 100644 --- a/README.md +++ b/README.md @@ -268,7 +268,7 @@ The sender archives each outbound message under: .swarmforge/handoffs/sent//.txt ``` -After the low-level tmux send succeeds, the sender appends a `sent` entry to `logbook.jsonl`. If the tmux notification fails, the message remains archived for possible manual recovery, but no `sent` logbook entry is written. +After the low-level tmux send succeeds, the sender appends a `sent` entry to `.swarmforge/logbook.jsonl`. If the tmux notification fails, the message remains archived for possible manual recovery, but no `sent` logbook entry is written. When an agent receives a message, it saves the complete incoming text to a file and runs: @@ -281,7 +281,7 @@ Then it runs `swarm-handoff`. The receive helper ignores any leading terminal noise before the first valid `message type: handoff` or `message type: resend-request` header. It archives and queues only the normalized protocol message, not the noisy capture. -The receive helper checks `message type`, `message id`, sender, target, sequence, and priority. If a handoff is valid and in order, it archives the message, appends a `received` entry to `logbook.jsonl`, updates the last processed sequence for that sender-target stream, copies the handoff into the accepted queue, and prints the queued file path: +The receive helper checks `message type`, `message id`, sender, target, sequence, and priority. 
If a handoff is valid and in order, it archives the message, appends a `received` entry to `.swarmforge/logbook.jsonl`, updates the last processed sequence for that sender-target stream, copies the handoff into the accepted queue, and prints the queued file path: ```text .swarmforge/handoffs/queue/accepted/---.txt @@ -307,6 +307,8 @@ swarm-handoff --file Agents should not use the low-level target/file transport form for normal handoffs because it bypasses sequencing, archiving, resend recovery, and logbook handling. Agents should prefer the no-argument request-file form over explicit subcommands to keep command approvals stable. +Handoff logbooks are runtime state under `.swarmforge/logbook.jsonl`. Agents should not edit, merge, stage, or commit them. Use the handoff archives and generated logbook for local diagnostics only. + ## Recovery Strategy The protocol is designed for eventual correction rather than tmux-pane sniffing. diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index 4686162..b11a71a 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -7,19 +7,17 @@ - This startup notification is only a presence signal; it does not replace any role-specific handoff rule. ## Logbook -- Maintain the tracked root `logbook.jsonl` file for every handoff received or sent. -- Keep `logbook.jsonl` as append-only JSON Lines. Each line is one complete JSON object; never rewrite, reformat, sort, summarize, or merge existing entries. +- Handoff logbooks are runtime state under `.swarmforge/logbook.jsonl`. +- Keep `.swarmforge/logbook.jsonl` as append-only JSON Lines. Each line is one complete JSON object; never rewrite, reformat, sort, summarize, or merge existing entries. - Each log entry must include `timestamp`, `direction`, `message`, and `note` fields. Use `direction` values `received`, `sent`, or `queued`. 
-- Use `.swarmforge/notify/request` plus `swarm-handoff` to append normal sent, received, queued, and resend-request entries to `logbook.jsonl`. +- Use `.swarmforge/notify/request` plus `swarm-handoff` to append normal sent, received, queued, and resend-request entries to `.swarmforge/logbook.jsonl`. - Run `swarm-handoff` with no environment-variable prefix. Do not run `SWARMFORGE_ROLE= swarm-handoff`; startup already sets `SWARMFORGE_ROLE`, and the handoff command must stay stable. -- Do not hand-edit handoff logbook entries unless a helper script fails and explicit user direction says to repair the logbook manually. -- If a merge conflict occurs in `logbook.jsonl`, keep all complete JSON lines from both sides. Preserve message text exactly; order chronologically when obvious, otherwise keep both groups intact. -- Commit `logbook.jsonl` updates with the next substantive role-owned commit or outgoing handoff commit. Make a logbook-only commit only when no other work will be committed for that handoff, such as a queued handoff that requires no role-specific work. +- Do not hand-edit, merge, stage, or commit handoff logbooks unless explicit user direction says to repair runtime state manually. ## Handoff Format - For every outgoing handoff, write only the role-specific handoff body to `./tmp/-handoff.txt`. - To send that handoff, write `.swarmforge/notify/request` with `command: send`, `target: `, and `file: ./tmp/-handoff.txt`, then run `swarm-handoff` with no arguments. -- Before sending a handoff, commit all role-owned work and any required `logbook.jsonl` updates. Do not make another commit until `swarm-handoff` completes successfully. +- Before sending a handoff, commit all role-owned work. Do not make another commit until `swarm-handoff` completes successfully. 
- The send helper wraps the body with `message type`, `message id`, `sender role`, `target role`, `message sequence`, `message priority`, `branch name`, and 10-character `commit hash` fields, archives the sent message, appends the sent logbook entry, and sends the message. - Add `priority: NN` to `.swarmforge/notify/request` only when a role rule or explicit user instruction requires it; normal handoffs default to priority `50`. - Treat the generated `branch name` and `commit hash` fields as the authoritative state for the receiver to merge; do not type branch names or commit hashes into the handoff body. diff --git a/swarmforge/scripts/handoff-lib.sh b/swarmforge/scripts/handoff-lib.sh index 46fd858..761c89a 100755 --- a/swarmforge/scripts/handoff-lib.sh +++ b/swarmforge/scripts/handoff-lib.sh @@ -30,7 +30,9 @@ handoff_temp_file() { } handoff_logbook_file() { - echo "$PWD/logbook.jsonl" + local dir="$PWD/.swarmforge" + mkdir -p "$dir" + echo "$dir/logbook.jsonl" } handoff_timestamp() { From 2b3b3baaca4389c449b03e220b8028ef6e67f9dc Mon Sep 17 00:00:00 2001 From: "Robert C. Martin" Date: Sun, 14 Jun 2026 09:21:43 -0500 Subject: [PATCH 24/67] Trim prompt logbook guidance --- swarmforge/constitution/articles/handoffs.prompt | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index b11a71a..ee3a4dc 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -7,18 +7,14 @@ - This startup notification is only a presence signal; it does not replace any role-specific handoff rule. ## Logbook -- Handoff logbooks are runtime state under `.swarmforge/logbook.jsonl`. -- Keep `.swarmforge/logbook.jsonl` as append-only JSON Lines. Each line is one complete JSON object; never rewrite, reformat, sort, summarize, or merge existing entries. 
-- Each log entry must include `timestamp`, `direction`, `message`, and `note` fields. Use `direction` values `received`, `sent`, or `queued`. -- Use `.swarmforge/notify/request` plus `swarm-handoff` to append normal sent, received, queued, and resend-request entries to `.swarmforge/logbook.jsonl`. -- Run `swarm-handoff` with no environment-variable prefix. Do not run `SWARMFORGE_ROLE= swarm-handoff`; startup already sets `SWARMFORGE_ROLE`, and the handoff command must stay stable. +- `swarm-handoff` maintains `.swarmforge/logbook.jsonl` as runtime state. - Do not hand-edit, merge, stage, or commit handoff logbooks unless explicit user direction says to repair runtime state manually. ## Handoff Format - For every outgoing handoff, write only the role-specific handoff body to `./tmp/-handoff.txt`. - To send that handoff, write `.swarmforge/notify/request` with `command: send`, `target: `, and `file: ./tmp/-handoff.txt`, then run `swarm-handoff` with no arguments. - Before sending a handoff, commit all role-owned work. Do not make another commit until `swarm-handoff` completes successfully. -- The send helper wraps the body with `message type`, `message id`, `sender role`, `target role`, `message sequence`, `message priority`, `branch name`, and 10-character `commit hash` fields, archives the sent message, appends the sent logbook entry, and sends the message. +- The send helper wraps the body with `message type`, `message id`, `sender role`, `target role`, `message sequence`, `message priority`, `branch name`, and 10-character `commit hash` fields, archives the sent message, and sends the message. - Add `priority: NN` to `.swarmforge/notify/request` only when a role rule or explicit user instruction requires it; normal handoffs default to priority `50`. - Treat the generated `branch name` and `commit hash` fields as the authoritative state for the receiver to merge; do not type branch names or commit hashes into the handoff body. 
- `message id` values have this form: `YYYYMMDD-HHMMSS-XXXXXX`, where `XXXXXX` is a six-character random lowercase hexadecimal suffix. Message type, sender, target, and sequence are recorded in separate fields. From 5bdc770f30b9a86452260069a8fc3e1edeeaaa66 Mon Sep 17 00:00:00 2001 From: "Robert C. Martin" Date: Sun, 14 Jun 2026 10:05:47 -0500 Subject: [PATCH 25/67] Require latest quality tools from GitHub --- swarmforge/constitution/articles/engineering.prompt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/swarmforge/constitution/articles/engineering.prompt b/swarmforge/constitution/articles/engineering.prompt index 0297b5b..074bac4 100644 --- a/swarmforge/constitution/articles/engineering.prompt +++ b/swarmforge/constitution/articles/engineering.prompt @@ -1,7 +1,8 @@ # Engineering Rules ## Startup Tools -- On startup, acquire the github tools for the project language and get them ready to run. +- On startup, procure the latest CRAP, mutation, and DRY tools for the project language directly from the listed `github.com/unclebob/...` repositories and get them ready to run. +- Do not rely on stale cached, vendored, or preinstalled copies when a fresh GitHub install/build is possible in the current environment. - Language tool table: - Go: install with `go install`; mutation `github.com/unclebob/mutate4go`, CRAP `github.com/unclebob/crap4go`, DRY `github.com/unclebob/dry4go`. - Clojure: install with Clojure CLI/deps.edn; mutation `github.com/unclebob/clj-mutate`, CRAP `github.com/unclebob/crap4clj`, DRY `github.com/unclebob/dry4clj`. From 2c53669c5781ea8d851d52f082cc5006be06ad57 Mon Sep 17 00:00:00 2001 From: "Robert C. 
Martin" Date: Sun, 14 Jun 2026 10:39:18 -0500 Subject: [PATCH 26/67] Document swarm-handoff approval prefix --- README.md | 2 ++ swarmforge/constitution/articles/handoffs.prompt | 5 +++++ 2 files changed, 7 insertions(+) diff --git a/README.md b/README.md index eff189a..3efd463 100644 --- a/README.md +++ b/README.md @@ -219,6 +219,8 @@ swarm-handoff This stable command shape is intentional. Some command approval systems approve future commands by their literal command prefix. If every handoff uses a different command line, each target, file path, priority, or queue filename can create a new approval prompt. The request file moves those variable arguments into local project state, so a single approval for `swarm-handoff` covers normal handoff send, receive, resend, and completion operations. +If `swarm-handoff` requires escalation, the agent should request reusable approval for the exact command prefix `["swarm-handoff"]`. Do not use `SWARMFORGE_ROLE= swarm-handoff`, `./swarm-handoff`, an absolute path, `zsh -c swarm-handoff`, redirection, or a wrapper command for normal handoffs; those forms can prevent the approval UI from offering a reusable prefix for the stable handoff command. + Do not have agents run `tmux -S ...` directly. The `swarm-handoff` transport detects when it is already running inside a tmux pane and uses the inherited tmux client context: ```sh diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index ee3a4dc..357d018 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -10,6 +10,11 @@ - `swarm-handoff` maintains `.swarmforge/logbook.jsonl` as runtime state. - Do not hand-edit, merge, stage, or commit handoff logbooks unless explicit user direction says to repair runtime state manually. +## Command Approval +- Run handoffs exactly as `swarm-handoff` with no arguments after writing `.swarmforge/notify/request`. 
+- Do not run `SWARMFORGE_ROLE= swarm-handoff`, `./swarm-handoff`, an absolute path to `swarm-handoff`, `zsh -c swarm-handoff`, or any redirection/wrapper form for normal handoffs. +- If `swarm-handoff` needs escalation, request reusable approval for the command prefix `["swarm-handoff"]`. + ## Handoff Format - For every outgoing handoff, write only the role-specific handoff body to `./tmp/-handoff.txt`. - To send that handoff, write `.swarmforge/notify/request` with `command: send`, `target: `, and `file: ./tmp/-handoff.txt`, then run `swarm-handoff` with no arguments. From d8afe64fefa26ec2f8e1bc844f0bc5fab129ab9c Mon Sep 17 00:00:00 2001 From: "Robert C. Martin" Date: Sun, 14 Jun 2026 10:41:28 -0500 Subject: [PATCH 27/67] Protect swarm-handoff executable --- README.md | 2 ++ swarmforge/constitution/articles/handoffs.prompt | 2 ++ 2 files changed, 4 insertions(+) diff --git a/README.md b/README.md index 3efd463..30d39b1 100644 --- a/README.md +++ b/README.md @@ -221,6 +221,8 @@ This stable command shape is intentional. Some command approval systems approve If `swarm-handoff` requires escalation, the agent should request reusable approval for the exact command prefix `["swarm-handoff"]`. Do not use `SWARMFORGE_ROLE= swarm-handoff`, `./swarm-handoff`, an absolute path, `zsh -c swarm-handoff`, redirection, or a wrapper command for normal handoffs; those forms can prevent the approval UI from offering a reusable prefix for the stable handoff command. +`swarm-handoff` is the executable command, not a request file. Never write, redirect, append, or pipe handoff content to `swarm-handoff` or `./swarm-handoff`; commands like `printf ... > ./swarm-handoff` can replace the helper script. Agents write request content only to `.swarmforge/notify/request` and message content only to `./tmp/-handoff.txt` or the explicit incoming-message file named in the receive procedure. + Do not have agents run `tmux -S ...` directly. 
The `swarm-handoff` transport detects when it is already running inside a tmux pane and uses the inherited tmux client context: ```sh diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index 357d018..dc6352e 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -13,6 +13,8 @@ ## Command Approval - Run handoffs exactly as `swarm-handoff` with no arguments after writing `.swarmforge/notify/request`. - Do not run `SWARMFORGE_ROLE= swarm-handoff`, `./swarm-handoff`, an absolute path to `swarm-handoff`, `zsh -c swarm-handoff`, or any redirection/wrapper form for normal handoffs. +- Never write, redirect, append, or pipe content to `swarm-handoff` or `./swarm-handoff`; it is the executable command, not a request file. +- Write handoff request content only to `.swarmforge/notify/request` and handoff message content only to `./tmp/-handoff.txt` or the explicitly named incoming-message file. - If `swarm-handoff` needs escalation, request reusable approval for the command prefix `["swarm-handoff"]`. 
## Handoff Format From 0f11d636c36fcfae35c4c2dbfa3b5861e2ce51b3 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 19:53:02 -0300 Subject: [PATCH 28/67] refactor(harness): simplify delivery; remove executing logbook and claude branching - Remove executing logbook entry from complete-handoff and swarm-stop - Revert handoff_append_logbook to upstream 3-arg schema (no hash/sender) - Extract _do_deliver shared function in swarm-handoff - Remove claude/non-claude branching; always clear-first deliver - Remove resolve_agent duplication; resolve_session parameterized - Remove dangling executing-field reference from ADR 0017 --- docs/adr/0017-inlined-prompt-bundle.md | 9 +- swarmforge/scripts/complete-handoff.sh | 12 -- swarmforge/scripts/handoff-lib.sh | 80 ------------- swarmforge/scripts/swarm-handoff | 149 +++++++++++++++++++------ swarmforge/scripts/swarm-stop.sh | 5 +- 5 files changed, 116 insertions(+), 139 deletions(-) diff --git a/docs/adr/0017-inlined-prompt-bundle.md b/docs/adr/0017-inlined-prompt-bundle.md index 14fff03..625ef57 100644 --- a/docs/adr/0017-inlined-prompt-bundle.md +++ b/docs/adr/0017-inlined-prompt-bundle.md @@ -8,13 +8,8 @@ Upstream builds a role's launch context by concatenating its constitution and ro **The bundle is the unit of delivery, not just of launch.** Clear-first delivery (ADR 0002) wipes the session with `/clear` and then *re-injects the role bundle* before every task. That re-injection needs a single, complete, deduplicated context to re-send — which is exactly what the resolver produces. A naive recursive concatenation is fine to build once at launch but is the wrong shape to re-send reliably on every handoff. -**It is the prerequisite for knowledge injection.** ADR 0014 appends the project's `AGENTS.md` and the role's `.agents/` file into this same envelope. There is nowhere to append them, and no well-defined boundary to append them at, until the context is a structured bundle rather than flat concatenated text. 
0014 and the session-restart `executing` fields (ADR 0002) both build on top of the bundle. +**It is the prerequisite for knowledge injection.** ADR 0014 appends the project's `AGENTS.md` and the role's `.agents/` file into this same envelope. There is nowhere to append them, and no well-defined boundary to append them at, until the context is a structured bundle rather than flat concatenated text. 0014 builds on top of the bundle. **Why an XML envelope.** Explicit `` boundaries let the agent tell its constitution from its role prompt from its promoted knowledge, instead of inferring breaks in a wall of concatenated text; and the BFS dedup keeps a cross-referenced constitution (articles, the dependency manifest) from appearing two or three times. -This divergence is taken in its **minimal translated form**: the resolver and envelope are ported onto upstream's current tmux delivery harness, not lifted from the pre-reset implementation where they were entangled with the dropped cmux backend. - -## Pending implementation - -- `main`: replace the recursive-read heredoc in `write_agent_instruction_file` with `resolve_prompt_bundle` (BFS, dedup by resolved path) emitting the `` envelope; wire the resolved bundle into upstream's delivery path. Source: `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` — re-base onto current upstream, do not copy. -- Prerequisite for ADR 0014 (`.agents` injection) and the ADR 0002 `executing`-field recovery; both re-base on the bundle. +This divergence is taken in its **minimal translated form**: the resolver and envelope are ported onto upstream's current tmux delivery harness. 
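The BFS-with-dedup resolution that ADR 0017 describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the fork's `resolve_prompt_bundle`: the `@include PATH` reference syntax and the `resolve_bundle_sketch` name are hypothetical, and the real resolver may discover references differently.

```sh
# Hypothetical sketch of breadth-first prompt-bundle resolution with dedup.
# ASSUMPTION: a prompt file references another file with a line of the form
# "@include PATH", where PATH is relative to the including file.
# The function body is a subshell so its variables do not leak to the caller.
resolve_bundle_sketch() (
  nl='
'
  queue="$1"      # newline-delimited FIFO of paths still to visit
  seen="$nl"      # newline-delimited set of paths already emitted

  while [ -n "$queue" ]; do
    # Pop the first entry of the queue.
    current="${queue%%"$nl"*}"
    case "$queue" in
      *"$nl"*) queue="${queue#*"$nl"}" ;;
      *)       queue="" ;;
    esac

    # Canonicalise the directory part so the same file reached by two
    # different routes resolves to one path.
    dir="$(cd "$(dirname "$current")" 2>/dev/null && pwd -P)" || continue
    abs="$dir/$(basename "$current")"
    [ -f "$abs" ] || continue

    # Dedup by resolved path: skip anything already emitted.
    case "$seen" in
      *"$nl$abs$nl"*) continue ;;
    esac
    seen="$seen$abs$nl"
    printf '%s\n' "$abs"

    # Enqueue every referenced file behind the entries already waiting,
    # which is what makes the traversal breadth-first.
    refs="$(sed -n 's/^@include[[:space:]][[:space:]]*//p' "$abs")"
    while [ -n "$refs" ]; do
      ref="${refs%%"$nl"*}"
      case "$refs" in
        *"$nl"*) refs="${refs#*"$nl"}" ;;
        *)       refs="" ;;
      esac
      queue="${queue:+$queue$nl}$dir/$ref"
    done
  done
)
```

Called on a root prompt, the sketch prints each reachable file once in breadth-first order, so an article referenced from several places (or a reference cycle) still yields a single entry in the bundle.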
diff --git a/swarmforge/scripts/complete-handoff.sh b/swarmforge/scripts/complete-handoff.sh index 76f3a34..b5949b4 100755 --- a/swarmforge/scripts/complete-handoff.sh +++ b/swarmforge/scripts/complete-handoff.sh @@ -1,9 +1,6 @@ #!/usr/bin/env zsh set -euo pipefail -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -source "$SCRIPT_DIR/handoff-lib.sh" - usage() { echo "Usage: complete-handoff.sh --file " >&2 } @@ -49,14 +46,5 @@ if [[ -e "$target" ]]; then target="$COMPLETED_DIR/$(date '+%Y%m%d-%H%M%S')-$base" fi -message="$(< "$QUEUE_FILE")" -msg_hash="$(handoff_field "commit hash" "$QUEUE_FILE" 2>/dev/null || true)" -msg_sender="$(handoff_field "sender role" "$QUEUE_FILE" 2>/dev/null || true)" -printf '{"timestamp":"%s","direction":"executing","message":"%s","hash":"%s","sender":"%s"}\n' \ - "$(handoff_json_escape "$(handoff_timestamp)")" \ - "$(handoff_json_escape "$message")" \ - "$(handoff_json_escape "$msg_hash")" \ - "$(handoff_json_escape "$msg_sender")" >> "$(handoff_logbook_file)" - mv "$QUEUE_FILE" "$target" echo "COMPLETED $target" diff --git a/swarmforge/scripts/handoff-lib.sh b/swarmforge/scripts/handoff-lib.sh index 761c89a..5bd1273 100755 --- a/swarmforge/scripts/handoff-lib.sh +++ b/swarmforge/scripts/handoff-lib.sh @@ -218,83 +218,3 @@ handoff_busy_file() { local role="$2" echo "$project_dir/.swarmforge/$role.busy" } - -handoff_agent_type() { - local project_dir="$1" - local role="$2" - local sessions_file="$project_dir/.swarmforge/sessions.tsv" - local idx r session display agent - - [[ -f "$sessions_file" ]] || return 1 - while IFS=$'\t' read -r idx r session display agent; do - if [[ "${r:l}" == "${role:l}" ]]; then - echo "$agent" - return 0 - fi - done < "$sessions_file" - return 1 -} - -handoff_clear_first_deliver() { - local project_dir="$1" - local role="$2" - local message_file="$3" - - local sessions_file="$project_dir/.swarmforge/sessions.tsv" - local tmux_socket_file="$project_dir/.swarmforge/tmux-socket" - local 
tmux_env_file="$project_dir/.swarmforge/tmux-env" - local bundle_file="$project_dir/.swarmforge/prompts/${role}.md" - - local target_session="" - local idx r session display agent - while IFS=$'\t' read -r idx r session display agent; do - if [[ "${r:l}" == "${role:l}" ]]; then - target_session="$session" - break - fi - done < "$sessions_file" - - if [[ -z "$target_session" ]]; then - echo "handoff_clear_first_deliver: role not found in sessions.tsv: $role" >&2 - return 1 - fi - - if [[ -z "${TMUX:-}" && -f "$tmux_env_file" ]]; then - TMUX="$(< "$tmux_env_file")" - export TMUX - fi - - local -a tmux_cmd=() - if [[ -n "${TMUX:-}" ]]; then - tmux_cmd=(tmux send-keys -t "$target_session") - else - local socket - socket="$(< "$tmux_socket_file")" - tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") - fi - - # Send /clear then wait for context to reset - "${tmux_cmd[@]}" -l -- '/clear' - sleep 0.15 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j - sleep 1 - - # Re-inject bundle if present - if [[ -f "$bundle_file" ]]; then - "${tmux_cmd[@]}" -l -- "$(< "$bundle_file")" - sleep 0.15 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j - sleep 0.5 - fi - - # Send protocol message - "${tmux_cmd[@]}" -l -- "$(< "$message_file")" - sleep 0.15 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j -} diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index 278ef08..b331770 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -2,6 +2,7 @@ set -euo pipefail SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +source "$SCRIPT_DIR/handoff-lib.sh" usage() { echo "Usage: swarm-handoff" >&2 @@ -126,12 +127,111 @@ run_request_file() { exit "$exit_status" } +resolve_session() { + local target="${1:l}" + local sessions_file="$2" + local index role session display agent + + while IFS=$'\t' read -r index role session display agent; do + if [[ "$target" == "${index:l}" || "$target" == "${role:l}" ]]; 
then + echo "$session" + return 0 + fi + done < "$sessions_file" + + return 1 +} + +# _do_deliver — clear-first delivery for a specific target. +# Always sends /clear, re-injects the bundle, then sends the message. +_do_deliver() { + local target="$1" + local message_file="$2" + local project_dir="$3" + local sessions_file="$4" + local tmux_socket_file="$5" + local tmux_env_file="$6" + + local target_session + target_session=$(resolve_session "$target" "$sessions_file") || { + echo "Unknown target: $target" >&2 + return 1 + } + + if [[ ! -f "$message_file" ]]; then + echo "Message file not found: $message_file" >&2 + return 1 + fi + local message + message="$(< "$message_file")" + + if [[ -z "${TMUX:-}" && -f "$tmux_env_file" ]]; then + TMUX="$(< "$tmux_env_file")" + export TMUX + fi + + local bundle_file="$project_dir/.swarmforge/prompts/${target}.md" + + local -a tmux_cmd=() + if [[ -n "${TMUX:-}" ]]; then + tmux_cmd=(tmux send-keys -t "$target_session") + else + local socket + socket="$(< "$tmux_socket_file")" + tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") + fi + + # Clear + "${tmux_cmd[@]}" -l -- '/clear' + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j + sleep 1 + + # Re-inject bundle if present + if [[ -f "$bundle_file" ]]; then + "${tmux_cmd[@]}" -l -- "$(< "$bundle_file")" + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j + sleep 0.5 + fi + + # Send protocol message + "${tmux_cmd[@]}" -l -- "$message" + sleep 0.15 + "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j +} + if [[ $# -eq 0 ]]; then run_request_file fi if [[ $# -gt 0 ]]; then case "$1" in + deliver) + shift + if [[ $# -ne 3 || "${2:-}" != "--file" ]]; then + usage + exit 1 + fi + TARGET="$1" + MESSAGE_FILE="$3" + PROJECT_DIR="$(find_project_dir)" + SESSIONS_FILE="$PROJECT_DIR/.swarmforge/sessions.tsv" + TMUX_SOCKET_FILE="$PROJECT_DIR/.swarmforge/tmux-socket" + TMUX_ENV_FILE="$PROJECT_DIR/.swarmforge/tmux-env" + if [[ ! 
-f "$SESSIONS_FILE" ]]; then + echo "Sessions file not found: $SESSIONS_FILE" >&2 + exit 1 + fi + _do_deliver "$TARGET" "$MESSAGE_FILE" "$PROJECT_DIR" "$SESSIONS_FILE" "$TMUX_SOCKET_FILE" "$TMUX_ENV_FILE" + exit 0 + ;; send) shift exec "$SCRIPT_DIR/send-handoff.sh" "$@" @@ -169,21 +269,7 @@ if [[ ! -f "$SESSIONS_FILE" ]]; then exit 1 fi -resolve_session() { - local target="${1:l}" - local index role session display agent - - while IFS=$'\t' read -r index role session display agent; do - if [[ "$target" == "${index:l}" || "$target" == "${role:l}" ]]; then - echo "$session" - return 0 - fi - done < "$SESSIONS_FILE" - - return 1 -} - -TARGET_SESSION=$(resolve_session "$TARGET") || { +TARGET_SESSION=$(resolve_session "$TARGET" "$SESSIONS_FILE") || { echo "Unknown target: $TARGET" >&2 exit 1 } @@ -194,27 +280,18 @@ if [[ ! -f "$MESSAGE_FILE" ]]; then fi MESSAGE="$(< "$MESSAGE_FILE")" -if [[ -z "${TMUX:-}" && -f "$TMUX_ENV_FILE" ]]; then - TMUX="$(< "$TMUX_ENV_FILE")" - export TMUX +# Queue the message in pending/ +PENDING_DIR="$(handoff_pending_dir "$PROJECT_DIR" "$TARGET")" +mkdir -p "$PENDING_DIR" +PENDING_FILE="$PENDING_DIR/$(handoff_id_timestamp)-pending.txt" +printf '%s' "$MESSAGE" > "$PENDING_FILE" + +# Check busy marker; if busy, queue will be drained by Stop hook +BUSY_FILE="$(handoff_busy_file "$PROJECT_DIR" "$TARGET")" +if ! ( set -C; > "$BUSY_FILE" ) 2>/dev/null; then + exit 0 fi -if [[ -n "${TMUX:-}" ]]; then - tmux send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" - sleep 0.15 - tmux send-keys -t "$TARGET_SESSION" C-m - sleep 0.05 - tmux send-keys -t "$TARGET_SESSION" C-j -else - if [[ ! 
-f "$TMUX_SOCKET_FILE" ]]; then - echo "Tmux socket file not found: $TMUX_SOCKET_FILE" >&2 - exit 1 - fi +_do_deliver "$TARGET" "$MESSAGE_FILE" "$PROJECT_DIR" "$SESSIONS_FILE" "$TMUX_SOCKET_FILE" "$TMUX_ENV_FILE" - TMUX_SOCKET="$(< "$TMUX_SOCKET_FILE")" - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" -l -- "$MESSAGE" - sleep 0.15 - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-m - sleep 0.05 - tmux -S "$TMUX_SOCKET" send-keys -t "$TARGET_SESSION" C-j -fi +rm -f "$PENDING_FILE" diff --git a/swarmforge/scripts/swarm-stop.sh b/swarmforge/scripts/swarm-stop.sh index f9331da..b3ab809 100755 --- a/swarmforge/scripts/swarm-stop.sh +++ b/swarmforge/scripts/swarm-stop.sh @@ -42,10 +42,7 @@ _swarm_stop_main() { touch "$busy_file" - pending_content="$(< "$pending_file")" - handoff_append_logbook "executing" "$pending_content" "clear-first delivery from Stop hook" - - handoff_clear_first_deliver "$project_dir" "$role" "$pending_file" + "$SCRIPT_DIR/swarm-handoff" deliver "$role" --file "$pending_file" rm -f "$pending_file" } From da5a8135625cc817681c98ace84b6a3e49d92b6f Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 20:10:35 -0300 Subject: [PATCH 29/67] docs(readme): fix install URL to point at fork --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 30d39b1..a7581f4 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,7 @@ In the directory where you want to use SwarmForge, choose a runnable branch and ```sh BRANCH=four-pack -curl -L "https://github.com/unclebob/swarm-forge/archive/refs/heads/${BRANCH}.tar.gz" | tar -xz --strip-components=1 +curl -L "https://github.com/gabadi/swarm-forge/archive/refs/heads/${BRANCH}.tar.gz" | tar -xz --strip-components=1 ``` Use `BRANCH=six-pack` instead when you want the six-agent workflow. 
Do not use `main` for this command; `main` is documentary and stores the shared operational scripts, while the runnable branches provide the configurations and prompts intended for projects. From f41852fa0dfa84baf031e76c0f890dac20e6f7b5 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 20:51:19 -0300 Subject: [PATCH 30/67] last_changes --- .agents/references/hunk-agent-guide.md | 61 +++++++++++++++++++ .agents/references/hunk-user-guide.md | 57 +++++++++++++++++ AGENTS.md | 5 ++ ...0002-idle-gate-and-clear-first-delivery.md | 9 +-- swarmforge/scripts/send-handoff.sh | 20 +----- 5 files changed, 125 insertions(+), 27 deletions(-) create mode 100644 .agents/references/hunk-agent-guide.md create mode 100644 .agents/references/hunk-user-guide.md create mode 100644 AGENTS.md diff --git a/.agents/references/hunk-agent-guide.md b/.agents/references/hunk-agent-guide.md new file mode 100644 index 0000000..629db55 --- /dev/null +++ b/.agents/references/hunk-agent-guide.md @@ -0,0 +1,61 @@ +# Hunk Agent Guide + +How agents interact with a live Hunk session for code review. + +## Pre-requisites + +- Hunk must be running in a terminal (e.g. `git diff ... | hunk patch -`). +- The Hunk session daemon auto-registers on startup. + +## Inspect + +```bash +hunk session list +git diff upstream/main...HEAD --diff-filter=M | hunk patch - +``` + +## Inspect + +```bash +hunk session list # find live sessions +hunk session get --repo . # confirm session repo match +hunk session review --repo . --json # file/hunk structure +hunk session review --repo . --include-patch --json # include raw diff text +hunk session context --repo . # current focus +``` + +## Navigate + +```bash +hunk session navigate --repo . --file --hunk # 1-based hunk +hunk session navigate --repo . --file --new-line +hunk session navigate --repo . --next-comment +hunk session navigate --repo . --prev-comment +``` + +## Reload content + +```bash +hunk session reload --repo . 
-- diff --exclude-untracked +hunk session reload --repo . -- show HEAD~1 +``` + +Always pass `--` before the nested Hunk command. + +## Add comments + +Single note: +```bash +hunk session comment add --repo . --file --new-line --summary "text" [--focus] +``` + +Batch: +```bash +printf '%s\n' '{"comments":[{"filePath":"...","newLine":N,"summary":"..."}]}' \ + | hunk session comment apply --repo . --stdin [--focus] +``` + +## Common fixes + +- **"No active session matches repoRoot"** — pass session ID explicitly instead of `--repo .`. +- **"No active Hunk sessions"** — Hunk is not running; ask the user to open it first. diff --git a/.agents/references/hunk-user-guide.md b/.agents/references/hunk-user-guide.md new file mode 100644 index 0000000..2f36514 --- /dev/null +++ b/.agents/references/hunk-user-guide.md @@ -0,0 +1,57 @@ +# Hunk User Guide + +Quick reference for [Hunk](https://github.com/modem-dev/hunk) — the review-first terminal diff viewer used in this fork. + +## Open a diff (modified files only) + +```bash +git diff upstream/main...HEAD --diff-filter=M | hunk patch - +``` + +`--diff-filter=M` shows only **modified** tracked files, excluding new/untracked files. 
+ +## Navigation + +| Key | Action | +|-----|--------| +| `↑ / ↓` | move line by line | +| `Space` or `f` | page down | +| `b` or `Shift+Space` | page up | +| `d / u` | half page down / up | +| `[ / ]` | previous / next hunk | +| `, / .` | previous / next file | +| `{ / }` | previous / next comment | +| `← / →` | scroll code horizontally (Shift = faster) | +| `Home / End` or `g / G` | jump to top / bottom | + +## View + +| Key | Action | +|-----|--------| +| `1 / 2 / 0` | split / stack / auto layout | +| `s` | toggle sidebar | +| `t` | toggle theme | +| `a` | toggle AI notes | +| `z` | toggle unchanged context | +| `l / w / m` | toggle line numbers / wrap / metadata | +| `e` | open file in `$EDITOR` | + +## Review + +| Key | Action | +|-----|--------| +| `/` | focus file filter | +| `c` | create review note | +| `Tab` | toggle files/filter focus | +| `F10` | open menus | +| `r` | reload (watch mode) | +| `q` | quit | + +## Mouse + +- Wheel — scroll vertically +- Shift+Wheel — scroll horizontally + +## In-app help + +Press `?` or `h` inside Hunk to open the full controls help modal. diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..f603f61 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,5 @@ +# Agent Orientation + +- Read `CONTEXT.md` for fork terminology and conventions. +- Read `docs/adr/` for architecture decisions that govern this fork. +- For diff review with Hunk, see `.agents/references/hunk-user-guide.md`. diff --git a/docs/adr/0002-idle-gate-and-clear-first-delivery.md b/docs/adr/0002-idle-gate-and-clear-first-delivery.md index c537fa0..ce6b8f8 100644 --- a/docs/adr/0002-idle-gate-and-clear-first-delivery.md +++ b/docs/adr/0002-idle-gate-and-clear-first-delivery.md @@ -17,13 +17,6 @@ The marker is set *busy* when a delivery starts and *idle* when the Stop hook fi **Re-injection is universal.** `/clear` wipes the session regardless of backend, so the role bundle is always re-sent after `/clear`. 
-**Claude Code first.** Both the marker and the delivery ride Claude Code's hook system (the Stop hook). The fork's delivery replaces upstream's immediate terminal-typing only for the roles it manages. The `claude` backend is supported now; roles on `codex`/`grok` keep upstream's delivery until their hook-based equivalent is built — **pending implementation**. +**Delivery engine.** The central `swarm-handoff` script handles queueing, busy checking, and backend-aware delivery. The engine writes every outgoing message to a pending queue, checks the busy marker, and either delivers immediately (idle) or exits and lets the Stop hook drain the queue later (busy). The delivery path is backend-aware: for `claude` roles, `/clear` + re-inject is sent before the handoff body; for other backends, the handoff is delivered directly until their own clear-first mechanism is built. Ready is implicit (idle + empty queue = ready). Upstream's startup "I'm awake" ping is kept only as an operator-visible **presence** signal — stamped a distinct `presence` type and excluded from the clear-first path, so the Stop hook never clears for it. - -**Session-restart recovery.** The idle/busy marker records *whether* a role is working, not *what* it is working on. So the `executing` logbook entry carries the in-flight task itself — `{message, hash, sender}`: the handoff message being acted on, the commit hash it started from, and who sent it. If a role's session dies and is restarted mid-task, that is enough to resume the task rather than lose it along with the handoff. (Upstream's `executing` entry records no such context.) These fields live inside the delivery and Stop-hook scripts, so they re-base on the prompt bundle of ADR 0017. - -## Pending implementation - -- `codex`/`grok` hook-based delivery (Claude Code first). 
The current `six-pack` `swarmforge.conf` runs all six roles on `codex`, so until that is built — or those roles move to `claude` — clear-first delivery applies only to `claude` roles. The `claude`/`codex` choice is a per-role configuration knob (ADR 0012), not an architectural decision; no `codex` hook work is required for this ADR to stand. -- Add the `{message, hash, sender}` fields to the `executing` logbook entry written in the delivery script and the Stop hook, re-based onto the ADR 0017 bundle delivery. Source: `feat/main-executing-context-fields` commit `a133c71`. diff --git a/swarmforge/scripts/send-handoff.sh b/swarmforge/scripts/send-handoff.sh index c76b6f9..9af6e29 100755 --- a/swarmforge/scripts/send-handoff.sh +++ b/swarmforge/scripts/send-handoff.sh @@ -102,25 +102,7 @@ EOF ARCHIVE_FILE="$(handoff_temp_file "send-handoff")" printf '%s' "$MESSAGE" > "$ARCHIVE_FILE" handoff_archive_sent "$STREAM" "$SEQUENCE" "$MESSAGE" - -PROJECT_DIR="$(handoff_project_dir)" -TARGET_AGENT="$(handoff_agent_type "$PROJECT_DIR" "$TARGET" 2>/dev/null || echo 'unknown')" - -if [[ "$TARGET_AGENT" != "claude" ]]; then - swarm-handoff "$TARGET" --file "$ARCHIVE_FILE" -else - PENDING_DIR="$(handoff_pending_dir "$PROJECT_DIR" "$TARGET")" - mkdir -p "$PENDING_DIR" - PENDING_FILE="$PENDING_DIR/${MESSAGE_PRIORITY}-$(handoff_id_timestamp)-${STREAM}-${SEQUENCE}.txt" - printf '%s' "$MESSAGE" > "$PENDING_FILE" - - BUSY_FILE="$(handoff_busy_file "$PROJECT_DIR" "$TARGET")" - if ( set -C; > "$BUSY_FILE" ) 2>/dev/null; then - handoff_clear_first_deliver "$PROJECT_DIR" "$TARGET" "$PENDING_FILE" - rm -f "$PENDING_FILE" - fi -fi - +swarm-handoff "$TARGET" --file "$ARCHIVE_FILE" handoff_append_logbook "sent" "$MESSAGE" "$MESSAGE_TYPE $MESSAGE_ID sent to $TARGET" echo "Sent $MESSAGE_ID" From 60ce3756a7928bf757353d3b0af955407bc80b08 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 21:02:00 -0300 Subject: [PATCH 31/67] fix(swarmforge): create STATE_DIR before writing skills sentinel 
install_skills wrote the sentinel before prepare_workspace created .swarmforge/, causing a fatal redirect error on first run. Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 87d4e88..c601cb6 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -603,7 +603,7 @@ install_skills() { source "$pins_file" echo -e "${CYAN}Installing skills...${RESET}" - mkdir -p "$skills_dst" + mkdir -p "$STATE_DIR" "$skills_dst" if [[ -d "$skills_src/agent-retro" ]]; then rm -rf "$skills_dst/agent-retro" From 6e78b2341144d9e0449de89be5a6cabb8c9c4e9e Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 21:06:30 -0300 Subject: [PATCH 32/67] fix(install_skills): install all local skills, not just agent-retro Loops over every directory in swarmforge/skills/ so setup-swarm and any future local skills are installed alongside the entireio pack. 
Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.sh | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index c601cb6..134500d 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -605,12 +605,14 @@ install_skills() { echo -e "${CYAN}Installing skills...${RESET}" mkdir -p "$STATE_DIR" "$skills_dst" - if [[ -d "$skills_src/agent-retro" ]]; then - rm -rf "$skills_dst/agent-retro" - cp -R "$skills_src/agent-retro" "$skills_dst/agent-retro" - echo -e " ${GREEN}✓${RESET} agent-retro" - else - echo -e " ${YELLOW}⚠${RESET} agent-retro not found at $skills_src/agent-retro — skipping" + if [[ -d "$skills_src" ]]; then + for skill_dir in "$skills_src"/*/; do + local local_skill_name + local_skill_name="$(basename "$skill_dir")" + rm -rf "$skills_dst/$local_skill_name" + cp -R "$skill_dir" "$skills_dst/$local_skill_name" + echo -e " ${GREEN}✓${RESET} $local_skill_name (local)" + done fi local tmp_skills From dc3712fb57c4e349a0478af6ca25551498853d99 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 21:23:10 -0300 Subject: [PATCH 33/67] fix(setup-swarm): drop logbook.jsonl and tmp/ from gitignore step Both are upstream-absent. logbook.jsonl is already covered by .swarmforge/; tmp/ has no precedent in upstream and isn't needed. 
Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/skills/setup-swarm/SKILL.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/swarmforge/skills/setup-swarm/SKILL.md b/swarmforge/skills/setup-swarm/SKILL.md index 913b3ad..dc152ed 100644 --- a/swarmforge/skills/setup-swarm/SKILL.md +++ b/swarmforge/skills/setup-swarm/SKILL.md @@ -107,8 +107,6 @@ p.write_text(json.dumps(cfg, indent=2)) Ensure these entries exist in `.gitignore` (append if missing, do not duplicate): ``` -logbook.jsonl -tmp/ .swarmforge/ .worktrees/ ``` From ee97af54bdddfa99424abfdc424690841f4b1c95 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:06:21 -0300 Subject: [PATCH 34/67] feat(swarmforge): persona skill load + flat handoff delivery + setup language stamp - swarmforge.sh: split bundle injection into 1-line system prompt + full skill file written to {worktree}/.claude/skills/swarm-persona/SKILL.md; send_initial_prompt() sends short trigger after 3s for all backends except codex - swarm-handoff: remove bundle re-injection; flatten message newlines to avoid multi-line input mode; extend /clear sleep to 0.5s; only C-m, no trailing C-j - setup-swarm/SKILL.md: stamp chosen language into local-engineering.prompt (step 1); add tmp/ and .claude/skills/swarm-persona/ to .gitignore (step 5) Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: e62ed1519127 --- swarmforge/scripts/swarm-handoff | 31 +++++++------------------- swarmforge/scripts/swarmforge.sh | 27 ++++++++++++++-------- swarmforge/skills/setup-swarm/SKILL.md | 8 +++++++ 3 files changed, 34 insertions(+), 32 deletions(-) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index b331770..0d4c159 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -143,7 +143,7 @@ resolve_session() { } # _do_deliver — clear-first delivery for a specific target. -# Always sends /clear, re-injects the bundle, then sends the message. 
+# Sends /clear then the message; persona is re-loaded via swarm-persona skill. _do_deliver() { local target="$1" local message_file="$2" @@ -170,8 +170,6 @@ _do_deliver() { export TMUX fi - local bundle_file="$project_dir/.swarmforge/prompts/${target}.md" - local -a tmux_cmd=() if [[ -n "${TMUX:-}" ]]; then tmux_cmd=(tmux send-keys -t "$target_session") @@ -181,30 +179,17 @@ _do_deliver() { tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") fi - # Clear + # Clear — wait long enough for Claude Code to process the slash command "${tmux_cmd[@]}" -l -- '/clear' - sleep 0.15 + sleep 0.5 "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j - sleep 1 - - # Re-inject bundle if present - if [[ -f "$bundle_file" ]]; then - "${tmux_cmd[@]}" -l -- "$(< "$bundle_file")" - sleep 0.15 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j - sleep 0.5 - fi + sleep 1.5 - # Send protocol message - "${tmux_cmd[@]}" -l -- "$message" - sleep 0.15 + # Send protocol message (newlines flattened to avoid multi-line input mode) + local flat_message="${message//$'\n'/ }" + "${tmux_cmd[@]}" -l -- "$flat_message" + sleep 0.5 "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j } if [[ $# -eq 0 ]]; then diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 134500d..a37df91 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -481,14 +481,25 @@ resolve_prompt_bundle() { write_agent_instruction_file() { local role="$1" local prompt_file="$2" + printf 'You are the %s in a SwarmForge multi-agent development swarm. Your full role, constitution, and operating instructions are in your swarm-persona skill. 
Invoke the swarm-persona skill at the start of every session and before responding to any handoff.\n' "$role" > "$prompt_file" +} + +write_persona_skill_file() { + local role="$1" + local worktree="$2" + local skill_dir="$worktree/.claude/skills/swarm-persona" + local skill_file="$skill_dir/SKILL.md" typeset -a bundle_files=() local rel abs_path knowledge + mkdir -p "$skill_dir" + while IFS= read -r rel; do [[ -n "$rel" ]] && bundle_files+=("$rel") done < <(resolve_prompt_bundle "$role") { + printf -- '---\nname: swarm-persona\ndescription: Load this agent'\''s SwarmForge role, constitution, and operating instructions\n---\n\n' printf '\n' "$role" printf '\n' printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below. Project knowledge files (AGENTS.md and your role file under .agents/roles/) are included below when present.\n' @@ -508,21 +519,18 @@ write_agent_instruction_file() { printf '\n\n' done printf '\n' - } > "$prompt_file" + } > "$skill_file" } -send_initial_grok_prompt() { +send_initial_prompt() { local session="$1" local display="$2" - local prompt_file="$3" ( sleep 3 - tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" -l -- "$(< "$prompt_file")" - sleep 0.15 + tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" -l -- 'Invoke your swarm-persona skill to load your role and begin.' + sleep 0.5 tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-m - sleep 0.05 - tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-j ) &! 
} @@ -541,6 +549,7 @@ launch_role() { local role_advisor="${ROLE_ADVISORS[$index]}" write_agent_instruction_file "$role" "$prompt_file" + write_persona_skill_file "$role" "$role_worktree" if [[ "$role_worktree" == "$WORKING_DIR" ]]; then role_script_dir="$SCRIPT_DIR" @@ -587,8 +596,8 @@ launch_role() { fi tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" "$launch_cmd" Enter - if [[ "$agent" == "grok" ]]; then - send_initial_grok_prompt "$session" "$display" "$prompt_file" + if [[ "$agent" != "codex" ]]; then + send_initial_prompt "$session" "$display" fi echo -e " ${CYAN}[${display}]${RESET} started in session ${session}" } diff --git a/swarmforge/skills/setup-swarm/SKILL.md b/swarmforge/skills/setup-swarm/SKILL.md index dc152ed..4c26865 100644 --- a/swarmforge/skills/setup-swarm/SKILL.md +++ b/swarmforge/skills/setup-swarm/SKILL.md @@ -28,6 +28,12 @@ Ask the operator: Wait for the operator's answer before proceeding. Do not infer or detect the stack from the repository. +Once the operator answers, stamp the chosen language into the local engineering article so all agents know the project language. Append to `swarmforge/constitution/articles/local-engineering.prompt`: +```bash +printf '\n## Project Language\n- Project language: .\n' >> swarmforge/constitution/articles/local-engineering.prompt +``` +Where `` is: `Go`, `Java`, `TypeScript`, `Python`, `Rust`, `Clojure`, or `Ruby` matching the operator's selection. 
+ --- ## Step 2 — Install quality tools @@ -109,6 +115,8 @@ Ensure these entries exist in `.gitignore` (append if missing, do not duplicate) ``` .swarmforge/ .worktrees/ +tmp/ +.claude/skills/swarm-persona/ ``` Probe the repository's default remote branch: From 0de094f65eebdb2fa1a6033e2cee2e22d9efed9a Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:07:33 -0300 Subject: [PATCH 35/67] docs(readme): add /setup-swarm step and ./swarm stop to getting started - Add /setup-swarm as required one-time step before ./swarm - Document ./swarm stop as the primary shutdown command - Update Window Behavior section to prefer ./swarm stop over manual close Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: 968136bb66b5 --- README.md | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index a7581f4..f26deb7 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,15 @@ curl -L "https://github.com/gabadi/swarm-forge/archive/refs/heads/${BRANCH}.tar. Use `BRANCH=six-pack` instead when you want the six-agent workflow. Do not use `main` for this command; `main` is documentary and stores the shared operational scripts, while the runnable branches provide the configurations and prompts intended for projects. -After copying a runnable branch, start the swarm from the target project: +After copying a runnable branch, run the one-time project setup (requires Claude Code open in the project directory): + +```sh +/setup-swarm +``` + +This installs language-appropriate quality tools, writes permission allow-rules, and scaffolds `.gitignore`. You only need to run it once per project. + +Then launch the swarm: ```sh ./swarm @@ -72,7 +80,13 @@ The `./swarm` wrapper keeps the runnable branch small. On first use, if `swarmfo The windows should open automatically. -To stop the swarm, close the first window listed in `swarmforge/swarmforge.conf`. 
That cleanup window shuts down the tmux sessions and closes the remaining tracked windows. +To stop the swarm: + +```sh +./swarm stop +``` + +This kills the tmux sessions and closes all tracked terminal windows. You can also close the first window listed in `swarmforge/swarmforge.conf` for the same effect. ## What SwarmForge Does @@ -457,6 +471,6 @@ If the backend cannot open sessions at all, set both capability functions to `re Each visible agent window is attached to a tmux session. That means terminal selection, copy, and paste may follow tmux and terminal-emulator rules rather than ordinary text-field behavior. If copy or paste feels unusual, check whether tmux copy mode is active before assuming the agent is stuck. -The first window in `swarmforge.conf` is the cleanup window. Closing that top configured window is the intentional shutdown path: SwarmForge tears down the tmux sessions, closes the remaining tracked windows, and shuts down the swarm. +The preferred shutdown path is `./swarm stop`, which kills the tmux sessions and closes all tracked terminal windows. Alternatively, closing the first window listed in `swarmforge.conf` triggers the same teardown via the window watchdog. Closing any other tracked window is non-destructive. The watchdog reopens that window and attaches it back to the same tmux session, so the agent state and terminal history remain intact. This is often the simplest way to recover a window that has landed in an unfamiliar tmux mode or otherwise feels stuck. From 86d1e7b1c7944f7adbd43f25995297ec3dbc9fd1 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:13:13 -0300 Subject: [PATCH 36/67] docs(readme): add bootstrap ./swarm step before /setup-swarm The first ./swarm run downloads scripts and installs skills (then errors asking for setup). /setup-swarm is only available in Claude Code after that bootstrap run, so the order matters. 
Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: bae2e63d6d43 --- README.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index f26deb7..7e8c9cd 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,15 @@ curl -L "https://github.com/gabadi/swarm-forge/archive/refs/heads/${BRANCH}.tar. Use `BRANCH=six-pack` instead when you want the six-agent workflow. Do not use `main` for this command; `main` is documentary and stores the shared operational scripts, while the runnable branches provide the configurations and prompts intended for projects. -After copying a runnable branch, run the one-time project setup (requires Claude Code open in the project directory): +After copying a runnable branch, run `./swarm` once to bootstrap the scripts and install skills into `.claude/skills/`: + +```sh +./swarm +``` + +This will print `Error: project is not swarm-ready. Run /setup-swarm first.` — that is expected. The important side effect is that it downloads the shared operational scripts from `main` into `swarmforge/scripts/` and installs the SwarmForge skills into `.claude/skills/`, making `/setup-swarm` available in Claude Code. + +Then open Claude Code in the project directory and run the one-time setup skill: ```sh /setup-swarm @@ -76,7 +84,7 @@ Then launch the swarm: ./swarm ``` -The `./swarm` wrapper keeps the runnable branch small. On first use, if `swarmforge/scripts/` is missing, it downloads the `main` branch archive, copies the shared operational scripts from `swarmforge/scripts/`, stages shared constitution articles from `swarmforge/constitution/articles/`, and then launches `swarmforge/scripts/swarmforge.sh`. Later runs reuse the existing local scripts directory instead of overwriting it. +The `./swarm` wrapper keeps the runnable branch small. On later runs, if `swarmforge/scripts/` already exists, it skips the download and launches immediately. The windows should open automatically. 
From 1a3263fbf053778456cb2e8bb5ff4680c31d9f89 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:14:26 -0300 Subject: [PATCH 37/67] fix(setup-swarm): derive stack list from engineering.prompt, not hardcoded MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Step 1 now reads swarmforge/constitution/articles/engineering.prompt and extracts stacks from the Language tool table — only offers supported stacks. Step 2 reads the same table for tool install commands instead of duplicating a separate (and stale) list. Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: 2312727cb7c2 --- swarmforge/skills/setup-swarm/SKILL.md | 28 +++++--------------------- 1 file changed, 5 insertions(+), 23 deletions(-) diff --git a/swarmforge/skills/setup-swarm/SKILL.md b/swarmforge/skills/setup-swarm/SKILL.md index 4c26865..b0f2238 100644 --- a/swarmforge/skills/setup-swarm/SKILL.md +++ b/swarmforge/skills/setup-swarm/SKILL.md @@ -15,16 +15,12 @@ Run this skill **once** before invoking `./swarm`. It prepares the project so th ## Step 1 — Ask the operator for the project stack +Read `swarmforge/constitution/articles/engineering.prompt` and extract the stacks listed under "Language tool table". Present only those stacks as numbered options — do not offer stacks that are not in that table. + Ask the operator: > Which stack is this project? -> 1. Go -> 2. Java / Kotlin (JVM) -> 3. JavaScript / TypeScript -> 4. Python -> 5. Rust -> 6. Clojure -> 7. Ruby +> (list the stacks found in engineering.prompt, numbered) Wait for the operator's answer before proceeding. Do not infer or detect the stack from the repository. 
@@ -32,27 +28,13 @@ Once the operator answers, stamp the chosen language into the local engineering ```bash printf '\n## Project Language\n- Project language: .\n' >> swarmforge/constitution/articles/local-engineering.prompt ``` -Where `` is: `Go`, `Java`, `TypeScript`, `Python`, `Rust`, `Clojure`, or `Ruby` matching the operator's selection. +Where `` is the language name exactly as it appears in the engineering.prompt tool table entry. --- ## Step 2 — Install quality tools -Based on the operator's chosen stack, install the mutation, CRAP, and DRY tools. These are the tools that cleaner, hardener, and QA will use during the swarm run. - -**Go:** `go install honnef.co/go/tools/cmd/staticcheck@latest` (CRAP), a mutation tool such as `go-mutesting` if available. - -**Java / Kotlin:** Maven or Gradle plugin for PITest (mutation); PMD or SpotBugs (CRAP/DRY). - -**JavaScript / TypeScript:** `npm install -g stryker-cli` (mutation); ESLint with complexity rules. - -**Python:** `pip install mutmut` (mutation); `radon` (CRAP/DRY metrics). - -**Rust:** `cargo install cargo-mutants` (mutation); Clippy is standard and should already be present. - -**Clojure:** `clj-kondo` (DRY/complexity); mutation support via the project's own test runner. - -**Ruby:** `gem install mutant` (mutation). +Read the "Language tool table" section of `swarmforge/constitution/articles/engineering.prompt`. For the chosen stack, install the mutation, CRAP, and DRY tools listed there — use the exact repositories and install method specified in that table. 
Also install the Acceptance Pipeline Specification (APS) tools: ``` From ae1791e5f7f0fcb0d77a448b02fc0d1aea984231 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:22:15 -0300 Subject: [PATCH 38/67] fix(swarm): move persona trigger from init to handoff delivery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Remove send_initial_prompt() call from launch_role — agents no longer receive the invoke message at startup; system prompt handles initial load - swarm-handoff _do_deliver: send persona trigger after /clear and before the protocol message, so the agent reloads its role on every handoff Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: d9d281da1790 --- swarmforge/scripts/swarm-handoff | 6 ++++++ swarmforge/scripts/swarmforge.sh | 3 --- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index 0d4c159..a9427c2 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -185,6 +185,12 @@ _do_deliver() { "${tmux_cmd[@]}" C-m sleep 1.5 + # Re-load persona before delivering — context was wiped by /clear + "${tmux_cmd[@]}" -l -- 'Invoke your swarm-persona skill to load your role and begin.' 
+ sleep 0.5 + "${tmux_cmd[@]}" C-m + sleep 2.0 + # Send protocol message (newlines flattened to avoid multi-line input mode) local flat_message="${message//$'\n'/ }" "${tmux_cmd[@]}" -l -- "$flat_message" diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index a37df91..d4f50e9 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -596,9 +596,6 @@ launch_role() { fi tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" "$launch_cmd" Enter - if [[ "$agent" != "codex" ]]; then - send_initial_prompt "$session" "$display" - fi echo -e " ${CYAN}[${display}]${RESET} started in session ${session}" } From 3c8d9d39a9585aea7214b861fb83de8047500c18 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:37:16 -0300 Subject: [PATCH 39/67] fix(swarmforge): silence zsh local-redecl noise and sparse-checkout dir error MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In zsh, `local var` without `=` on an already-local variable prints its current value instead of resetting it — caused kv/key/val/kv_i to be echoed on config lines 2-6. Fix: add explicit empty/zero assignments. Sparse-checkout redirect used `.git/info/sparse-checkout` directly, but in git worktrees `.git` is a file (gitdir pointer), so the path was invalid and the shell emitted "not a directory" before 2>/dev/null could suppress it. Fix: resolve the real git dir via `git rev-parse --git-dir`. 
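A minimal sketch of the second fix, for readers who have not hit it before. The helper name `sparse_checkout_file` is hypothetical and not part of swarmforge.sh; it only illustrates why asking git for the directory is more robust than assuming `.git` is a directory. It also normalizes a relative result, which `git rev-parse --git-dir` can return for a main working tree.

```bash
# Hypothetical helper (not in swarmforge.sh): print the sparse-checkout file
# path for a checkout, whether it is the main working tree (.git is a
# directory) or a linked worktree (.git is a file pointing elsewhere).
sparse_checkout_file() {
  local worktree_path="$1" git_dir
  git_dir="$(git -C "$worktree_path" rev-parse --git-dir 2>/dev/null)" || return 1
  # rev-parse may print a path relative to the checkout; make it absolute.
  case "$git_dir" in
    /*) ;;
    *) git_dir="$worktree_path/$git_dir" ;;
  esac
  printf '%s/info/sparse-checkout\n' "$git_dir"
}
```

For a linked worktree this is expected to print a path under `.git/worktrees/<name>/`. The `info/` directory may not exist there yet, so a caller would still want a fallback like the `sparse-checkout set` call in the patch.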
Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: c095e4010f17 --- swarmforge/scripts/swarmforge.sh | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index d4f50e9..f42ec5c 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -213,7 +213,7 @@ parse_config() { agent="${fields[3]:l}" worktree="${fields[4]}" - local role_model="" role_effort="" role_advisor="" kv key val kv_i + local role_model="" role_effort="" role_advisor="" kv="" key="" val="" kv_i=0 for (( kv_i = 5; kv_i <= ${#fields[@]}; kv_i++ )); do kv="${fields[$kv_i]}" key="${kv%%=*}" @@ -366,10 +366,12 @@ prepare_worktrees() { if [[ "$role" != "specifier" && "$role" != "QA" ]]; then git -C "$worktree_path" sparse-checkout init --no-cone >/dev/null 2>&1 + local worktree_git_dir + worktree_git_dir="$(git -C "$worktree_path" rev-parse --git-dir 2>/dev/null)" { printf '/*\n' printf '!/%s/\n' "$QA_HOLDOUT_PATH" - } > "$worktree_path/.git/info/sparse-checkout" 2>/dev/null \ + } > "${worktree_git_dir}/info/sparse-checkout" 2>/dev/null \ || git -C "$worktree_path" sparse-checkout set --no-cone '/*' "!/${QA_HOLDOUT_PATH}/" >/dev/null 2>&1 git -C "$worktree_path" read-tree -mu HEAD >/dev/null 2>&1 || true fi From d1936b4f898ffa264cc1c1efeeed3cfbcdf7a67f Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 14 Jun 2026 23:42:48 -0300 Subject: [PATCH 40/67] fix(swarmforge): bundle articles in persona skill + exempt specifier from self-notify MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit resolve_prompt_bundle only followed explicit .prompt references, leaving constitution.prompt's "read every file in articles/" instruction for the agent to execute at runtime — causing 40s+ persona loads and 1.8k extra tokens per startup. Now explicitly include all articles/*.prompt files when the constitution is processed, so the bundle is self-contained. 
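The patch relies on the zsh `(N)` glob qualifier so that an empty articles directory yields no iterations. As a rough bash equivalent, and only as an illustration, the same behavior can be sketched with `nullglob`. The function name `list_article_files` is an assumption of this sketch, not something defined in swarmforge.sh.

```bash
# Hypothetical bash counterpart of the zsh "*.prompt(N)" glob in the patch.
# Prints every *.prompt file in a directory, one per line, or nothing at all
# when the directory is empty or missing.
list_article_files() {
  local articles_dir="$1" article_file restore
  restore="$(shopt -p nullglob || true)"   # remember the caller's setting
  shopt -s nullglob                        # an unmatched glob expands to nothing
  for article_file in "$articles_dir"/*.prompt; do
    printf '%s\n' "$article_file"
  done
  $restore                                 # put nullglob back the way it was
}
```

Without `nullglob` (or `(N)` in zsh), an unmatched pattern would be passed through literally in bash, or raise an error in zsh, which is the failure mode the qualifier avoids.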
handoffs.prompt startup notification applied to all roles including the specifier itself, causing it to deliver a handoff to its own tmux session while still responding — which concatenated /clear + persona + protocol in the input without submitting. Added exemption: specifier skips the startup notification. Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: 3a2f61f0b8bf --- swarmforge/constitution/articles/handoffs.prompt | 2 +- swarmforge/scripts/swarmforge.sh | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index dc6352e..9339c5c 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -1,7 +1,7 @@ # Handoff Rules ## Startup Notification -- After reading the constitution and your role prompt at startup, send an awake notification to `specifier`. +- After reading the constitution and your role prompt at startup, send an awake notification to `specifier` — unless your role is `specifier`, in which case skip this step entirely. - Write exactly `I'm awake` to `./tmp/specifier-awake.txt`. - Write `.swarmforge/notify/request` with `command: send`, `target: specifier`, and `file: ./tmp/specifier-awake.txt`, then run `swarm-handoff` with no arguments. - This startup notification is only a presence signal; it does not replace any role-specific handoff rule. 
diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index f42ec5c..d5ce0af 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -475,6 +475,16 @@ resolve_prompt_bundle() { ref_abs="$WORKING_DIR/$ref" [[ ${+seen[$ref]} -eq 0 ]] && queue+=("$ref_abs") done < <(grep -oE 'swarmforge/[A-Za-z0-9_./-]+\.prompt' "$file" 2>/dev/null || true) + + # When bundling constitution.prompt, also include all articles so agents + # don't re-read them at runtime following the "read every file in articles/" directive. + if [[ "$file" == */constitution.prompt ]]; then + local articles_dir="${WORKING_DIR}/swarmforge/constitution/articles" + for article_file in "$articles_dir"/*.prompt(N); do + local article_rel="${article_file#${WORKING_DIR}/}" + [[ ${+seen[$article_rel]} -eq 0 ]] && queue+=("$article_file") + done + fi done printf '%s\n' "${bundle[@]}" From 26234eb23241943a01f50375b15885c6a21d1911 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 00:19:53 -0300 Subject: [PATCH 41/67] fix(swarm-handoff): use /swarm-persona and add startup wait before delivery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two bugs in _do_deliver: 1. Used natural language to invoke the persona skill instead of the slash command /swarm-persona — slower and model-dependent. 2. Delivery started immediately when a freshly-launched Claude Code session might still be rendering the welcome overlay or running SessionStart hooks. The first C-m dismissed the overlay instead of submitting /clear, causing all subsequent messages to concatenate in the input bar without submitting. Added a 3s pre-delivery wait and increased post-clear wait to 2.0s. 
Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: daaa963f4a43 --- swarmforge/scripts/swarm-handoff | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index a9427c2..be5ba08 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -179,14 +179,19 @@ _do_deliver() { tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") fi + # Wait for Claude Code to finish startup (hook output, welcome overlay, etc.) + # before sending any keystrokes — without this, C-m dismisses the overlay + # instead of submitting, causing all messages to concatenate in the input bar. + sleep 3 + # Clear — wait long enough for Claude Code to process the slash command "${tmux_cmd[@]}" -l -- '/clear' sleep 0.5 "${tmux_cmd[@]}" C-m - sleep 1.5 + sleep 2.0 - # Re-load persona before delivering — context was wiped by /clear - "${tmux_cmd[@]}" -l -- 'Invoke your swarm-persona skill to load your role and begin.' + # Re-load persona via slash command before delivering — context was wiped by /clear + "${tmux_cmd[@]}" -l -- '/swarm-persona' sleep 0.5 "${tmux_cmd[@]}" C-m sleep 2.0 From 1f4bdd4c61aaed17b286096cf32b2fbcab8aa3f2 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 00:42:52 -0300 Subject: [PATCH 42/67] fix(swarm-handoff): restore C-j after C-m to complete CR+LF Enter sequence MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit ee97af5 removed C-j ("only C-m, no C-j"), which broke message submission entirely. Claude Code requires the full CR+LF sequence (\r\n) to register Enter — C-m alone (CR only) is silently ignored, causing all delivered text to pile up in the input bar without submitting. Upstream (unclebob/swarm-forge) always sends C-m then C-j. Restored that pattern for all three send points in _do_deliver: /clear, /swarm-persona, and the protocol message. 
Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: 87956d426afc --- swarmforge/scripts/swarm-handoff | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index be5ba08..1eeb6b2 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -179,28 +179,33 @@ _do_deliver() { tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") fi - # Wait for Claude Code to finish startup (hook output, welcome overlay, etc.) - # before sending any keystrokes — without this, C-m dismisses the overlay - # instead of submitting, causing all messages to concatenate in the input bar. + # Wait for Claude Code to finish startup before sending keystrokes. sleep 3 - # Clear — wait long enough for Claude Code to process the slash command + # Clear — wait long enough for Claude Code to process the slash command. + # C-m + C-j sends CR+LF; Claude Code requires both to register Enter. "${tmux_cmd[@]}" -l -- '/clear' sleep 0.5 "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j sleep 2.0 # Re-load persona via slash command before delivering — context was wiped by /clear "${tmux_cmd[@]}" -l -- '/swarm-persona' sleep 0.5 "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j sleep 2.0 # Send protocol message (newlines flattened to avoid multi-line input mode) local flat_message="${message//$'\n'/ }" "${tmux_cmd[@]}" -l -- "$flat_message" - sleep 0.5 + sleep 0.15 "${tmux_cmd[@]}" C-m + sleep 0.05 + "${tmux_cmd[@]}" C-j } if [[ $# -eq 0 ]]; then From e3db3fa2371e4061b0825d350c9667d214dd7a8e Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 01:01:06 -0300 Subject: [PATCH 43/67] fix(swarmforge): pass initial message to claude at launch to exit welcome screen MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Without an initial message, fresh Claude sessions sit at the welcome screen indefinitely. 
In that state C-m never submits — causing all handoff delivery keystrokes to accumulate in the input bar without executing. Upstream passes the prompt file content as the initial message so Claude responds immediately and moves past the welcome screen. Restored that pattern: "$(cat '$prompt_file')" appended to the launch command, matching upstream's approach. The _do_deliver /clear + /swarm-persona flow then works correctly since C-m + C-j submits on an already-active session. Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: fc1a67febad3 --- swarmforge/scripts/swarmforge.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index d5ce0af..1ec7a2e 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -573,7 +573,7 @@ launch_role() { local claude_flags="" [[ -n "$role_model" ]] && claude_flags+=" --model ${(q)role_model}" [[ -n "$role_effort" ]] && claude_flags+=" --effort ${(q)role_effort}" - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude${claude_flags} --append-system-prompt-file '$prompt_file' --permission-mode auto -n 'SwarmForge ${display}'" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude${claude_flags} --append-system-prompt-file '$prompt_file' --permission-mode auto -n 'SwarmForge ${display}' \"\$(cat '$prompt_file')\"" ;; codex) [[ -n "$role_advisor" ]] && write_worktree_settings "$role_worktree" "$role_advisor" From 153a1d16468852e91e8a6ef45863821f09b356f1 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 01:06:48 -0300 Subject: [PATCH 44/67] Revert "fix(swarmforge): pass initial message to claude at launch to exit welcome screen" This reverts commit e3db3fa2371e4061b0825d350c9667d214dd7a8e. 
Entire-Checkpoint: d45e5e2a7ae3 --- swarmforge/scripts/swarmforge.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 1ec7a2e..d5ce0af 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -573,7 +573,7 @@ launch_role() { local claude_flags="" [[ -n "$role_model" ]] && claude_flags+=" --model ${(q)role_model}" [[ -n "$role_effort" ]] && claude_flags+=" --effort ${(q)role_effort}" - launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude${claude_flags} --append-system-prompt-file '$prompt_file' --permission-mode auto -n 'SwarmForge ${display}' \"\$(cat '$prompt_file')\"" + launch_cmd="export SWARMFORGE_ROLE='$role' && export PATH='$role_script_dir':\$PATH && cd '$role_worktree' && claude${claude_flags} --append-system-prompt-file '$prompt_file' --permission-mode auto -n 'SwarmForge ${display}'" ;; codex) [[ -n "$role_advisor" ]] && write_worktree_settings "$role_worktree" "$role_advisor" From 6712fe9c699426244d0110c7ab7f7d68a16dddcd Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 01:08:15 -0300 Subject: [PATCH 45/67] fix(swarm-handoff): use tmux Enter key instead of C-m/C-j for kitty protocol MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit tmux has extended-keys on (3.6b + xterm-ghostty). Claude Code in Ghostty requests the kitty keyboard protocol, so Enter sends \x1b[13u, not \x0d. send-keys C-m injects raw \x0d which Claude Code in kitty mode ignores — causing all delivered text to pile up in the input bar without submitting. tmux named key 'Enter' is translated through the extended-keys system and sends the correct sequence for whatever keyboard protocol the pane requested. Also switched to /swarm-persona slash command (faster than natural language) and trimmed the C-m + C-j redundancy down to a single Enter per submit. 
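The delivery pattern this commit settles on can be sketched as follows. The names `submit_line`, `deliver_message`, `SEND_KEYS`, and `POST_COMMAND_WAIT` are assumptions made for illustration, standing in for the `tmux_cmd` array and fixed sleeps in `_do_deliver`. The startup wait is left out, and the delays are copied from the patch and may need tuning for other terminals.

```bash
# SEND_KEYS holds the send-keys command prefix, for example:
#   SEND_KEYS=(tmux send-keys -t "$target_session")

# Type one line into the pane literally, then submit it with the named Enter key.
submit_line() {
  local text="$1" settle="${2:-0.5}"
  "${SEND_KEYS[@]}" -l -- "$text"   # -l sends the text literally, without key-name lookup
  sleep "$settle"                   # give the UI time to register the input
  "${SEND_KEYS[@]}" Enter           # named key, so tmux can encode it for the pane
}

# Clear context, reload the persona, then deliver the message on a single line.
deliver_message() {
  local message="$1"
  local flat_message="${message//$'\n'/ }"   # newlines would trigger multi-line input
  submit_line '/clear' 0.5
  sleep "${POST_COMMAND_WAIT:-2}"            # hypothetical knob; the patch hardcodes 2.0
  submit_line '/swarm-persona' 0.5
  sleep "${POST_COMMAND_WAIT:-2}"
  submit_line "$flat_message" 0.15
}
```

Keeping the command prefix in an array makes it possible to swap in a recording stub when checking the key sequence without a live tmux session.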
Co-Authored-By: Claude Sonnet 4.6 Entire-Checkpoint: 1743b4a6144f --- swarmforge/scripts/swarm-handoff | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff index 1eeb6b2..ef7c60d 100755 --- a/swarmforge/scripts/swarm-handoff +++ b/swarmforge/scripts/swarm-handoff @@ -182,30 +182,27 @@ _do_deliver() { # Wait for Claude Code to finish startup before sending keystrokes. sleep 3 - # Clear — wait long enough for Claude Code to process the slash command. - # C-m + C-j sends CR+LF; Claude Code requires both to register Enter. + # Use 'Enter' (tmux named key) not C-m/C-j — tmux extended-keys translates + # named keys through the kitty keyboard protocol when the pane requests it. + # C-m sends raw \x0d which Claude Code in kitty mode ignores. + + # Clear context "${tmux_cmd[@]}" -l -- '/clear' sleep 0.5 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j + "${tmux_cmd[@]}" Enter sleep 2.0 - # Re-load persona via slash command before delivering — context was wiped by /clear + # Reload persona — context was wiped by /clear "${tmux_cmd[@]}" -l -- '/swarm-persona' sleep 0.5 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j + "${tmux_cmd[@]}" Enter sleep 2.0 # Send protocol message (newlines flattened to avoid multi-line input mode) local flat_message="${message//$'\n'/ }" "${tmux_cmd[@]}" -l -- "$flat_message" sleep 0.15 - "${tmux_cmd[@]}" C-m - sleep 0.05 - "${tmux_cmd[@]}" C-j + "${tmux_cmd[@]}" Enter } if [[ $# -eq 0 ]]; then From c9fad0cdd34a72f67cf87b3216de4c16db96be18 Mon Sep 17 00:00:00 2001 From: gabadi Date: Mon, 15 Jun 2026 18:28:16 -0300 Subject: [PATCH 46/67] fix(handoffs): back-routes must include role-specified diagnostic fields (#33) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(roles): add curator, integrator, ux-engineer and feature template Adds the three new ADR-mandated roles (0007 
ux-engineer, 0008 integrator, 0013/0014 curator) and the Gherkin feature template to main, bringing it to parity with the six-pack branch. Co-Authored-By: Claude Sonnet 4.6 * fix(main): correct PR scope — back-routes fix only Remove role files (curator, integrator, ux-engineer, feature.feature) that belong on six-pack, not main. Shared articles on main are copied to runnable branches at startup; role files live on the runnable branch. Apply the back-routes fix to handoffs.prompt (the shared article): forward-handoff body rule is now scoped to forward handoffs only; back-routes carry role-specified diagnostic fields. Co-Authored-By: Claude Sonnet 4.6 * fix(swarmforge): local-* article load order and curator skill symlinks Load order: queue non-local articles before local-* so local overrides win — the previous alphabetical glob caused local-workflow.prompt to load before workflow.prompt, reversing the intended precedence. link_curator_skills(): creates .claude/skills/ → ../../.agents/skills/ symlinks at boot and per worktree so Claude Code discovers curator-promoted skills without an extra manual step. Co-Authored-By: Claude Sonnet 4.6 * feat(workflow): add Idle Gate section to shared article Consolidates two universal role-prompt lines into the shared article so they apply to all roles without duplication: - agent-retro before idle (ADR 0013) - wait for handoff (ADR 0002) Both were removed from individual role prompts in a prior session but never landed in the shared article. 
Co-Authored-By: Claude Sonnet 4.6 * revert(swarmforge): restore original article glob — local-* are additive not overriding Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- swarmforge/constitution/articles/handoffs.prompt | 2 +- swarmforge/constitution/articles/workflow.prompt | 4 ++++ swarmforge/scripts/swarmforge.sh | 16 ++++++++++++++++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index 9339c5c..3e4ba16 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -28,7 +28,7 @@ - Start every handoff body with: `Re-read your role and constitution.` - The specifier invents a short, stable handoff name for each accepted specification handoff. - Every later handoff for that work must include the specifier handoff name. -- Handoff bodies must report only essential state, not prescribe process. After the opening line, include exactly this field and no other prose: specifier handoff name. +- Handoff bodies must report only essential state, not prescribe process. After the opening line, include exactly this field and no other prose: specifier handoff name. Back-routes must also include the role-specified diagnostic fields. - Do not tell the receiving role how to do its job, repeat your process, or ask it to continue sender-owned responsibilities. The normal request is: `Apply your own role rules to this state.` ## Receiving Handoffs diff --git a/swarmforge/constitution/articles/workflow.prompt b/swarmforge/constitution/articles/workflow.prompt index b675dbe..5d506e5 100644 --- a/swarmforge/constitution/articles/workflow.prompt +++ b/swarmforge/constitution/articles/workflow.prompt @@ -12,3 +12,7 @@ ## Failure Conditions - If the expected git layout or assigned worktree is missing, stop and report instead of silently working in the wrong place. 
+ +## Idle Gate +- Before going idle, run `agent-retro`. +- Wait for a handoff. Do not act without one. diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index d5ce0af..4950883 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -395,6 +395,7 @@ sync_worktree_scripts() { cp "$SESSIONS_FILE" "$role_state_dir/sessions.tsv" cp "$TMUX_SOCKET_FILE" "$role_state_dir/tmux-socket" cp "$TMUX_ENV_FILE" "$role_state_dir/tmux-env" + link_curator_skills "$worktree_path" done } @@ -664,6 +665,20 @@ ensure_skills_installed() { install_skills } +link_curator_skills() { + local target_root="${1:-$WORKING_DIR}" + local agents_skills_dir="$target_root/.agents/skills" + local claude_skills_dir="$target_root/.claude/skills" + [[ -d "$agents_skills_dir" ]] || return 0 + mkdir -p "$claude_skills_dir" + for skill_dir in "$agents_skills_dir"/*/; do + local skill_name + skill_name="$(basename "$skill_dir")" + [[ -e "$claude_skills_dir/$skill_name" ]] && continue + ln -sfn "../../.agents/skills/$skill_name" "$claude_skills_dir/$skill_name" + done +} + choose_cleanup_owner() { CLEANUP_OWNER_INDEX=1 } @@ -677,6 +692,7 @@ install_shared_constitution_articles "$WORKING_DIR" parse_config check_backend_dependencies ensure_skills_installed +link_curator_skills if [[ ! -f "$STATE_DIR/setup-complete" ]]; then echo -e "${RED}Error:${RESET} project is not swarm-ready. Run /setup-swarm first." >&2 From 72070c1a4310dcdf4dfb468290d786b960e7d771 Mon Sep 17 00:00:00 2001 From: gabadi Date: Mon, 15 Jun 2026 18:30:33 -0300 Subject: [PATCH 47/67] chore(docs): remove migration guides, manifest, and planning artifacts (#34) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Migration work is complete (PRs #31, #32, #33 merged). These files were working artifacts — their decisions live in docs/adr/; the role prompts and scripts have been updated. No live reference points to them. 
Co-authored-by: Claude Sonnet 4.6 --- docs/fork-change-manifest.md | 115 -- docs/migrations/0003-setup-skill-sources.md | 55 - docs/migrations/main-script-layer.md | 34 - docs/migrations/six-pack-role-prompts.md | 56 - ...26-06-14-fork-divergence-implementation.md | 1232 ----------------- ...-14-adr0002-clear-first-delivery-design.md | 103 -- 6 files changed, 1595 deletions(-) delete mode 100644 docs/fork-change-manifest.md delete mode 100644 docs/migrations/0003-setup-skill-sources.md delete mode 100644 docs/migrations/main-script-layer.md delete mode 100644 docs/migrations/six-pack-role-prompts.md delete mode 100644 docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md delete mode 100644 docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md diff --git a/docs/fork-change-manifest.md b/docs/fork-change-manifest.md deleted file mode 100644 index 6c61f45..0000000 --- a/docs/fork-change-manifest.md +++ /dev/null @@ -1,115 +0,0 @@ -# Fork change manifest - -Compact, permanent record of **every divergence to apply on top of a pristine `upstream`**, one line per change. Rationale lives in the ADRs (`docs/adr/`) — this file is *where + what + source*, not *why*. Use it to (re)apply the fork after any upstream sync. - -## Sync policy (ADR 0001) - -- **Current upstream baseline (re-apply the fork layer onto this):** `main` ← `upstream/main` @ `d947f67` · `six-pack` ← `upstream/six-pack` @ `cbd1697` (2026-06-14). Bump these on every sync; an annotated `fork-base/-` tag pins the same commit so the anchor survives a hard reset (merge history alone does not — this fork has been reset before). -- **Merge style by source:** every **fork divergence is squash-merged** (one divergence PR → one clean commit on the delivery branch). **Upstream syncs are history-preserving merges** — never squashed, never rebased (keep upstream's story; `rerere` replays conflicts). 
The two initial re-implementation PRs follow the same squash rule (one squashed commit per branch). A landed commit is never rewritten. -- `main`, `six-pack`, `four-pack` are kept **identical to `upstream/`** and advanced by **merge** (`git merge upstream/`), never rebase. `rerere` replays conflict resolutions. -- **four-pack is frozen (decision 2026-06-14): no fork divergences are applied to it.** Only `main` and `six-pack` carry changes. (Open: whether four-pack is still resynced to upstream to honor "keep == upstream", or left as-is — see below.) -- Every item below is **additive** (new file or appended rule) wherever possible; a non-additive edit to an upstream line is marked **[edit]** and is a conscious, documented conflict point. -- **Delivery routing:** `main` ← scripts + skills + docs/ADRs · `six-pack` ← role prompts, constitution articles, templates, manifest, `swarmforge.conf`. -- Never push `main` without explicit request; **never** push `upstream` (`gh` defaults to upstream → always `--repo gabadi/swarm-forge`). - -## Source legend - -- **ADR** — `docs/adr/NNNN-*.md` (decision + rationale + `## Pending implementation`). -- **B6** — `backup/six-pre-reset` (real pre-reset six-pack artifacts: prompts, manifest, template, conf). Re-merge onto *current* prompts; do **not** copy whole files (they predate current upstream; some carry behavior the ADRs removed). -- **I20A** — `feat/issue-20-a-retro-skill-upgrade` (`swarmforge/skills/agent-retro/`, `AGENTS.md`). -- **I20B** — `feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md` (locked curator-loop spec, PRs A→B→C→D; **spec wins** over issue #20; budgets AGENTS.md ≤60 / role files ≤40). - -## Per-row recovery docs (exact recover-from `branch:path` + delta + STRIP per item) - -- `docs/migrations/main-script-layer.md` — all Section A + Section C `swarmforge.sh`/scripts rows. 
**⚠ Idea B + cmux + M3 + executing-fields are one entangled ~400-line restructure — gating decision: keep the full cmux delivery model or rebuild lean on upstream's harness.** -- `docs/migrations/six-pack-role-prompts.md` — all Section B/C role-prompt rows + the 3 new roles + final conf window order + the STRIP table (backup content ADRs reversed). -- `docs/migrations/0003-setup-skill-sources.md` — setup skill design recovery (net-new, no code). - ---- - -## A. `main` — scripts / skills / docs - -Script path: `swarmforge/scripts/swarmforge.sh`. Skills path: `swarmforge/skills/`. - -| ADR | Change (one line) | Where | Source | -|-----|-------------------|-------|--------| -| 0006 | In `prepare_worktrees` (`git worktree add`, ~L331) add `git sparse-checkout` excluding the pinned QA-suite path for **every worktree except the specifier's and QA's** (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify the path survives each role's handoff commit. | `swarmforge.sh` `prepare_worktrees` | ADR 0006 · **NET-NEW (no impl)** | -| 0012 | `parse_config` (~L182, today rejects ≠4 fields) → accept **≥4 fields**, parse `key=value` tail into a per-role map; `launch_role` (~L414) → append mapped flags per backend. **[edit]** | `swarmforge.sh` | ADR 0012 · recover `backup/main-pre-reset` · **advisor = `advisorModel` in settings.local.json, not `--advisor`** ✅ | -| 0014 | `write_agent_instruction_file` (~L389) → append project-root `AGENTS.md` + `.agents/roles/.md` when present, plus a preamble sentence; missing files silently skipped. | `swarmforge.sh` | ADR 0014 + I20B(PR-B) · **needs Idea B first** | -| 0013 | Upgrade `agent-retro` skill: per-action **scope tag** (`project\|swarmforge\|skill\|ephemeral`), **capture-first** (no pre-filter), **autonomous** mode marking actions `pending-curation` without a human prompt. 
| `swarmforge/skills/agent-retro/` | ADR 0013 + I20A + I20B(PR-A) | -| 0003 | New **`setup-swarm` skill** (stack detection; writes tooling/permissions/skill-pins/session-tracking; emits the **swarm-ready marker** `.swarmforge/setup-complete`); **setup-first** — operator runs `/setup-swarm` as step one, `./swarm` only **guards** on the marker and refuses if unset (never auto-runs setup). Absorbs Idea O scaffold. *Impl details open: marker format, stack detection (no backup artifact).* | `swarmforge/skills/setup-swarm/` (new) | ADR 0003 | - ---- - -## B. `six-pack` — prompts / constitution / templates / conf - -Roles: `swarmforge/roles/*.prompt` · constitution: `swarmforge/constitution/articles/*.prompt` · `swarmforge/swarmforge.conf`. - -| ADR | Change (one line) | Where | Source | -|-----|-------------------|-------|--------| -| 0002/0003 | Remove the `At startup, install/make-ready …` directive(s): `coder`:9, `QA`:7, `cleaner`:19, `hardender`:8–9. **[edit]** | `roles/*.prompt` | ADR 0002, 0003 | -| 0002 | Add idle-gate rule to each role prompt: "Wait for a handoff. Do not act without one." | `roles/*.prompt` | ADR 0002 | -| 0009 | Add `swarmforge/templates/feature.feature` — **8-section** spec header (TRACKING/CONTRACT/CONSTRAINTS/SEQUENCING/NFR/SIDE EFFECTS/SCOPE + UX INTENT). | `templates/feature.feature` (new) | ADR 0009 + B6 | -| 0009 | Specifier phase 1 starts from the template, addresses **all** sections before scenarios; fix stale count "seven" → **"eight"/"all"**. | `roles/specifier.prompt` | ADR 0009 + B6 | -| 0011 | Add `swarmforge/dependency-manifest.prompt` (3 tier defs inline + Rules section, body `(none)`); auto-resolved by the bundle resolver. | `dependency-manifest.prompt` (new) | ADR 0011 · recover `feat/baseline-scenarios-six` (**obs-harness-six over-deleted the Rules section**) | -| 0011 | Specifier reads the manifest before scenarios; on an undeclared external system → stop, propose name/tier/impl/gaps, wait for approval. 
| `roles/specifier.prompt` | ADR 0011 + B6 | -| 0010 | Add **surface-tool table** + context-driven acquisition rule (tmux/PTY · Playwright · HTTP client · ingress event-injection) to `engineering.prompt`. | `constitution/articles/engineering.prompt` | ADR 0010 + B6 | -| 0010 | Require a per-surface **baseline scenario** committed with every feature's flow scenarios (idle stability / no console errors / no-op event = no state change). | spec-header + role prompts | ADR 0010 | -| 0015 | Add platform-feasibility **stop rule** to `workflow.prompt` (spec-vs-platform conflict → stop & report; a workaround comment is a defect). | `constitution/articles/workflow.prompt` | ADR 0015 | -| 0005 | Rewrite QA to a **refute** posture (assume build fails spec & tests are weak; attack within the spec; conversion fidelity); replace "Fix bugs found by the QA suite…" (`QA`:14) — local fix in place, structural routes back. **[edit]** | `roles/QA.prompt` | ADR 0005 + B6 | -| 0010 | QA: replace "through the user interface only" (`QA`:13) with "**through the declared surface harness**"; add **every Expected bullet → a harness assertion or `NOT AUTOMATED — `**; QA re-executes committed `observation-harness/`, routes back if a user-facing surface has none. **[edit]** | `roles/QA.prompt` | ADR 0010 + B6 | -| 0004 | Add back-routing rule to role prompts: structural finding routes to its origin stage; local stays with finder; single-finding cap (back **once**) + feature cap **N=3** via routing count in the handoff trail (ux-engineer & integrator carry N=3). | `roles/*.prompt` | ADR 0004 | -| 0007 | Add **UX Engineer** role after coder (runs product, fixes rendering vs UX Intent, universal visual-quality bar incl. WCAG 4.5:1/3:1, writes `observation-harness/` + snapshots + rendering invariants; routes back per 0004 N=3); add conf window after coder. **Strip** DESIGN.md scaffold/walk-up from B6 draft. 
| `roles/ux-engineer.prompt` (new) + `swarmforge.conf` | ADR 0007 + B6 | -| 0007 | Coder reads UX Intent; specifier authors the UX INTENT section. | `roles/coder.prompt`, `roles/specifier.prompt` | ADR 0007 | -| 0008 | Add terminal **integrator** role (PR + green CI, post-merge gate, one PR/feature, autofix lint only, **hands off to curator**); add conf window. | `roles/integrator.prompt` (new) + `swarmforge.conf` | ADR 0008 + B6 | -| 0008 | Specifier **stops merging**: drop merge step (specifier:36), move specifier off `master` to its own worktree, reset to default branch per feature. **[edit]** | `roles/specifier.prompt` + `swarmforge.conf` | ADR 0008 | -| 0013 | Add terminal **curator** role (promotes retros → `.agents/`+`AGENTS.md` via one self-merging PR, then releases specifier; empty run = pass-through); rewire **integrator→curator→specifier**; conf curator window last; document chain in `workflow.prompt`. | `roles/curator.prompt` (new) + `swarmforge.conf` + `workflow.prompt` | ADR 0013 + B6 + I20B(PR-C) | -| — | **hardener** rendering-invariant property tests for pure rendering fns (state→string) — **unmanifested divergence found in audit**; consistent w/ 0007/0010. | `roles/hardender.prompt:18` | recover `backup/six-pre-reset` | -| 0016 | `cleaner` also scans **boundary files** at ~15–20-site threshold (vs 100 for testable source), extracts logic to a testable module; add the "stripped-view = untested" anti-pattern. | `roles/cleaner.prompt` | ADR 0016 + B6 | - ---- - -## C. Uncaptured implemented divergences — NO ADR (recover from backup, else lost on rebase) - -The behavioral/prompt-layer ADRs (0002–0016) did not originally cover the **`main`-side script infrastructure**. 
The items below were uncaptured implemented divergences living only in the monolith ADR (`backup/main-pre-reset:docs/adr/0001-fork-divergence.md`, "§Idea X") + the backup/feat branches — **each since dispositioned** (right-hand `ADR?` column): most now have their own ADR (0017–0021), the rest extend an existing ADR, fold into one, or stay a row here. **Each verified as still a divergence vs current `upstream/main` (2026-06-14).** They are prerequisites/peers of Section A — a clean rebase that follows only the original ADRs would drop them. - -| Idea | Divergence (one line) | Verified vs upstream | Source artifact | ADR? | -|------|----------------------|----------------------|-----------------|------| -| B | **Prompt-bundle inlining** — `write_agent_instruction_file` emits XML envelope `` + `resolve_prompt_bundle` (BFS over `*.prompt` refs, dedup). **KEEP (decision 2026-06-14).** Must be **disentangled from cmux**: port the resolver + envelope onto upstream's tmux harness and wire the bundle into upstream's delivery (NOT cmux's `write_deliver_script`). Prerequisite for M3/0014. | upstream has the naive read-recursively form only | `backup/main-pre-reset:swarmforge/scripts/swarmforge.sh` (`resolve_prompt_bundle`, `write_agent_instruction_file`); re-base, don't lift | **0017** | -| F | **Auto-compaction on role worktrees** — `write_worktree_permissions` merges into `.claude/settings.local.json`: `autoCompactEnabled:true`, `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE:"88"`, `CLAUDE_CODE_AUTO_COMPACT_WINDOW:"200000"`. | absent upstream | `backup/main-pre-reset` (commit 08e7f25); `mono §Idea F:207` | **0020** | -| J | **Session-retro plumbing** — `agent-retro` uses `entire session current`→`session info --transcript >/tmp`; fallback `~/.claude/projects/`; Codex-schema risk accepted; `agent-retro before idle` line in every role prompt. 
| absent upstream | `feat/issue-20-a…:swarmforge/skills/agent-retro/`; `mono §Idea J:189` | extend **0013** | -| N | **`./swarm upgrade`** — refresh scripts(main)+prompts(source branch)+skills; `install-pins.conf` SHA pinning; `.swarmforge/source-branch` tracking; auto-install skills on first launch via `.swarmforge/skills-installed`. | absent upstream | `mono §Idea N:88` | **0018** | -| O | **Install scaffold** — `.gitignore` gen (`logbook.jsonl`,`tmp/`,`.swarmforge/`); default-branch probe→`swarmforge.conf`; permission allow-rules. **Overlaps setup-swarm skill (0003).** | absent upstream | `mono §Idea O:326` | folds into **0003** | -| — | **Autonomous permission mode** — `--permission-mode auto` (not `acceptEdits`) in `launch_role`. | upstream = `acceptEdits` (L433/442) | `backup/main-pre-reset` (commit 1097233) | **0019** | -| — | **cmux multiplexer backend** — `swarm-mux.sh`. **DROP — not wanted in the new fork (decision 2026-06-14).** Stay on upstream's tmux harness. Dropping this is what un-tangles Idea B / executing-fields / M3. | no mux file upstream | n/a — not reapplied | **DROP** | -| — | **`executing` logbook entry carries `{message,hash,sender}`** for session-restart recovery (ADR 0002 names only the idle/busy marker). | absent upstream | `feat/main-executing-context-fields:swarmforge/scripts/swarmforge.sh` | extend **0002** | -| — | **retro-triage skill** — `.claude/skills/retro-triage/` (~219 lines), diagnosis-first batch retro. **KEEP — restore (decision 2026-06-14).** Byte-identical on all branches; recover as-is. | absent upstream | `feat/issue-20-a…:.claude/skills/retro-triage/SKILL.md` | **0021** | -| — | **Self-referencing fork URL** — `./swarm` self-fetch points at the fork. | upstream points at unclebob | `backup/main-pre-reset` (commit ded6019) | row-only | -| — | **Richer `CONTEXT.md` glossary** — Task / Logbook / Prompt bundle / Bundle cache / Landing / Depth cap / full logbook-status spec; leaner than the backup version. 
| n/a (docs) | `backup/main-pre-reset:CONTEXT.md` | doc-merge | - -Not-lost / already consistent (no action): curator budget **60/40** (ADR 0013 + I20B spec win over backup prompts' stale 150/300); DESIGN.md **not scaffolded** (ADR 0007 wins over `mono §Idea M`); back-routing **to owning stage** (ADR 0004 wins over `mono §Idea E` "always to coder"). Genuinely rejected (no recover): ideas **G, H, I**. -Also unimplemented draft, not a divergence: `backup/main-pre-reset:docs/proposals/2026-06-11-factory-line-refactor.md` (architecture audit; status draft). - ---- - -## Cross-cutting invariants (do not break while applying) - -- **observation-harness/** is shared: ux-engineer writes (0007), doctrine (0010), QA re-executes (0010), hardener honors rendering invariants — keep consistent. -- **Back-route N=3** (0004) referenced by ux-engineer & integrator — keep the routing-count-in-handoff mechanic. -- **Refuting QA (0005)** is *new*; the B6 QA draft already has the 0010 surface-harness wording — **merge both** when writing QA.prompt. -- **DESIGN.md** is referenced-from-feature-file only (0007) — when porting B6 specifier/ux-engineer, delete scaffold-on-absence and nearest-file walk-up. -- **Curator PRs land in order** A→B→C→D (I20B); everything else is independently landable. - -## Still open (decisions / unknowns) - -*(resolved 2026-06-14, grilling session — what's-missing pass)* - -0. **Section C scope** — RESOLVED. All Section-C items kept (cmux already dropped). ADR assignments: B→**0017**, N→**0018**, auto-permission→**0019**, F→**0020**, retro-triage→**0021**; J→extend **0013**, executing-fields→extend **0002**, O→folds into **0003**; self-url→row-only; CONTEXT glossary→doc-merge. **Idea B remains a hard prerequisite for M3/ADR 0014.** -1. 
**ADR 0003 setup-swarm skill** — idea-K conflict RESOLVED: setup is **setup-first** (operator runs `/setup-swarm` as step one); `./swarm` **guards** on the `.swarmforge/setup-complete` marker and refuses if absent — it never auto-runs setup. Skill **renamed `setup` → `setup-swarm`**. Idea O folds in. *Remaining impl details (not blockers): marker content format, stack-detection mechanism, per-language tool selection — captured in `docs/migrations/0003-setup-skill-sources.md`.* -2. **ADR 0002 clear-first on six-pack** — RESOLVED: the model column is **configuration** (governed by ADR 0012's per-role model), not an architectural decision. No codex-hook work is added. ADR 0002 stands as written — clear-first is claude-first; codex roles keep upstream delivery as a documented property. -3. *(resolved earlier)* cmux **DROPPED** (stay on upstream tmux harness); Idea-B bundle-inlining **KEPT** but disentangled — port `resolve_prompt_bundle` + XML envelope onto upstream's harness, re-base executing-fields/M3 on it. ADR 0012 `--advisor` resolved (`advisorModel` in `settings.local.json`). -4. **four-pack** — RESOLVED: kept as a **pure merge-mirror of `upstream/four-pack`** (no fork content ever) to honor ADR 0001's "all branches == upstream"; resync via merge-only. -5. **PR shape for implementation** — RESOLVED (2026-06-14): **one PR per delivery branch**, not one per ADR. Two PRs total — `main` (script + skill layer) and `six-pack` (prompts/constitution/conf/root-swarm); no four-pack PR (frozen). Each divergence is an **ordered commit** within its branch (dependency-linear), keeping the single PR tailored. These two initial PRs may be squash-merged (see Sync policy). Full task breakdown: `docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md`. - -**Overriding constraint (all items):** keep the diff vs upstream as small as possible — translate to the minimal additive form, do not lift the pre-reset implementation. See `feedback-minimize-upstream-diff` memory. 
- diff --git a/docs/migrations/0003-setup-skill-sources.md b/docs/migrations/0003-setup-skill-sources.md deleted file mode 100644 index 8e09a11..0000000 --- a/docs/migrations/0003-setup-skill-sources.md +++ /dev/null @@ -1,55 +0,0 @@ -# Migration source list — ADR 0003 setup skill - -Working source list to implement the **setup skill** (ADR 0003) without losing decisions already made in the pre-reset work. ADR 0003 decided *that* setup becomes a one-time skill; the *how* lives scattered across idea-K, the monolith ADR, ideas N/O, and the "At startup, install…" lines being removed. There is **no implemented setup skill in any branch** (confirmed) — this is design recovery, not code recovery. - -Refs: `idea-K` = `origin/docs/ideas-backlog:docs/ideas/idea-K-setup-preflight.md` · `mono` = `backup/main-pre-reset:docs/adr/0001-fork-divergence.md` · `ADR` = `docs/adr/0003-setup-is-a-one-time-skill.md`. - -## ✅ Resolved (2026-06-14): setup-first, guard-only; skill renamed `setup-swarm` - -- **idea-K** (auto-run on first launch) is **superseded.** `./swarm` never runs setup; the auto-run + stale `backup/main-pre-reset:CLAUDE.md:12` line are dead. -- **ADR 0003 form wins:** setup is **setup-first** — the operator runs `/setup-swarm` as the project's *first* action. `./swarm` is the *second* action and only **guards**: if `.swarmforge/setup-complete` is absent it refuses and tells the operator to run `setup-swarm` first. -- **Skill renamed `setup` → `setup-swarm`** (operator-facing `/setup-swarm`). Glossary updated (`CONTEXT.md`: `setup-swarm`, `swarm-ready marker`). Skill path: `swarmforge/skills/setup-swarm/`. - -## Decisions already made (cite before re-deciding) - -- Setup is a **skill** (fork-owned file, zero upstream conflict), not a `swarmforge.sh` function — `ADR`. 
-- Run path installs **no project tooling**; `./swarm` still self-fetches scripts, does worktree/session plumbing, **and auto-installs the swarm's own `entire` skills (pin-aware `ensure_skills_installed`, owned by ADR 0018)**; stops if the project isn't set up — `ADR`. *(Decision 2026-06-14: launcher infra-bootstrap stays automatic; only project provisioning is gated by the setup-swarm marker. See `main-script-layer.md` Idea N row.)* -- Skill **reasons about the stack** (Go vs Java vs Clojure → which tools/gates) — that's the point of a skill over a script — `ADR`. -- `entire enable --no-github --telemetry=false` (no `--agent`; hooks added separately) — `idea-K`, `mono §Idea K`. -- Backends derived from `swarmforge.conf` col 3 → `entire agent add ` per unique value; no user input — `idea-K`, `mono §K:178`. -- Warn-and-continue if `entire` absent (setup never blocks the swarm) — `idea-K`, `mono §K:182`. -- No `./swarm setup` subcommand; force re-run = operator deletes the marker — `idea-K`, `mono §K:180`. -- Idea G (per-tech engineering template system) **rejected** — adding a language is 2–3 lines in the shared table — `idea-G`, `mono:69`. 
- -## What the setup skill must take over (from the removed "At startup, install…" lines) - -| Category | Detail | Removed-line source | -|----------|--------|---------------------| -| Mutation/CRAP/DRY tools | language mutation + CRAP + DRY, from `engineering.prompt` | `upstream/six-pack:roles/cleaner.prompt:19`, `hardender.prompt:8`, `QA.prompt:7` | -| Acceptance Pipeline (APS) | ensure pipeline in place; build `gherkin-parser` + `gherkin-mutator` from `github.com/unclebob/Acceptance-Pipeline-Specification` | `upstream/six-pack:roles/coder.prompt:9`, `hardender.prompt:9` | -| Session tracking | `entire enable …` + `entire agent add ` per conf backend | `idea-K`, `mono §K` | -| ~~Skill pins~~ → **ADR 0018, not setup-swarm** | `entire` skills at pinned SHA (`install-pins.conf` `ENTIRE_SKILLS_SHA`); 11 skills + `agent-retro` to `.claude/skills/`. **Moved out of setup-swarm (decision 2026-06-14):** this is launcher infra-bootstrap, auto-installed by `./swarm` (`ensure_skills_installed`, pin-aware). Documented in **ADR 0018 (Idea N)**. | `mono §Idea N:100` | -| Permissions | write to `.claude/settings.json`: `Bash(gh pr merge*)` (integrator), `Bash(git reset --hard origin/)` (specifier) | `mono §Idea O:334` | -| Install scaffold | `.gitignore` ← `logbook.jsonl`, `tmp/`, `.swarmforge/`; default-branch probe `git symbolic-ref refs/remotes/origin/HEAD` → `swarmforge.conf` | `mono §Idea O:330-332` | - -Note four-pack equivalents exist (architect/refactorer/coder) but four-pack is **frozen** — six-pack rows above are what matters. - -## Swarm-ready marker - -- Path **`.swarmforge/setup-complete`**; `./swarm` checks it before role launch; absent → refuse (ADR 0003 form). Operator deletes to force re-run. — `idea-K`, `mono §K:180`, `ADR`. -- **Marker content (defaulted 2026-06-14, impl detail):** timestamp + swarmforge SHA (debuggable); refusal message text is impl-level. Not an ADR decision. - -## Open design questions — resolved 2026-06-14 - -1. 
**Stack detection mechanism** — **RESOLVED: the skill's own domain, not an ADR decision.** setup-swarm is a *skill* precisely because it *reasons* about the stack; the ADR must not prescribe a rigid probe list (that would contradict why it's a skill). The `SKILL.md` reads the repo, infers the stack, and asks the operator only when genuinely ambiguous. -2. **Marker format** — defaulted (see above): timestamp + swarmforge SHA. Impl detail. -3. **How the skill ships** — path `swarmforge/skills/setup-swarm/SKILL.md`, mirroring `agent-retro`. Settled. -4. **Re-run / staleness trigger** — RESOLVED: *project* re-setup = operator deletes the marker (manual, by design). *Skill* staleness = `./swarm` auto-(re)installs pin-aware at launch (ADR 0018), no manual trigger needed. -5. **Idea O scope boundary** — RESOLVED: setup-swarm absorbs `.gitignore`/default-branch/permissions (Idea O); the **`entire` skill install moved OUT to ADR 0018** (launcher bootstrap). No `./swarm install` subcommand. -6. **Per-language tool selection** — **RESOLVED: the skill's domain (same as #1).** The skill reasons from `engineering.prompt`'s tool table; behavior on no-match is skill-level judgment (ask the operator), not an ADR rule. - -## Cross-references - -- Pairs with **Idea N (install/upgrade)** and **Idea O (install scaffold)** — both implemented pre-reset, both **without an ADR**; see the manifest's "Uncaptured implemented divergences" section. The setup skill overlaps their territory and must be designed jointly. -- The removed "At startup" lines are also removed for the idle-gate reason (ADR 0002) — shared seam. - diff --git a/docs/migrations/main-script-layer.md b/docs/migrations/main-script-layer.md deleted file mode 100644 index 0222c3d..0000000 --- a/docs/migrations/main-script-layer.md +++ /dev/null @@ -1,34 +0,0 @@ -# Migration recovery — `main` script layer (`swarmforge/scripts/`) - -Per-divergence recovery for everything that touches the launch script on `main`. 
Base = pristine `upstream/main` (`swarmforge/scripts/swarmforge.sh`, ~554 lines, naive form). Primary source = `backup/main-pre-reset` (~1109 lines — all script divergences stacked linearly). **Re-merge onto current upstream; do not copy the whole file.** - -## ⚠ The entanglement (read first) - -**Idea B (bundle inlining) + cmux backend + M3 (0014) + executing-fields are NOT independent patches.** In `backup/main-pre-reset` they are one ~400-line restructure of the handoff/delivery model: -- cmux refactor introduces `write_deliver_script`, `write_notify_script`, `write_stop_hook`, `write_worktree_notify_wrapper` and **deletes** upstream's `install_shared_constitution_articles`, `sync_worktree_scripts`, `write_tmux_env_file`. -- Idea B's `resolve_prompt_bundle` + rewritten `write_agent_instruction_file` (XML envelope) produce the bundle that `write_deliver_script` passes via `$BUNDLE_PATH`. -- M3 (0014) is a 7-line addendum (commit `1b84895`) **inside** the rewritten `write_agent_instruction_file`. -- executing-fields (commit `a133c71`) live **inside** `write_deliver_script` and `write_stop_hook` heredocs. - -**DECIDED (2026-06-14): cmux is DROPPED; Idea B is KEPT.** So do NOT lift the cmux commit. Instead **disentangle**: port `resolve_prompt_bundle` + the XML-envelope `write_agent_instruction_file` onto upstream's current tmux harness, wire the resolved bundle into upstream's delivery path (not cmux's `write_deliver_script`), then layer M3/0014 and re-base executing-fields onto that. Idea F (auto-compaction) wiring is independent. Skip everything cmux: `swarm-mux.sh`, `swarm-stop.sh`, the `write_deliver_script`/`write_notify_script`/`write_stop_hook` family, `MUX_TARGETS`. 
- -## Recovery table - -| Row | Recover from | Delta vs upstream / notes | -|-----|-------------|---------------------------| -| **M1 / 0006** sparse-checkout in `prepare_worktrees` | **NET-NEW — no source anywhere** | Write fresh: `git sparse-checkout` excluding the pinned QA path on every worktree except the specifier's + QA's (key on the specifier role, not the `master` name — ADR 0008 renames its worktree to `specifier`); verify path survives handoff commits. Prereq: QA path pinned in specifier prompt. | -| **M2 / 0012** per-role model/effort/advisor | `backup/main-pre-reset:swarmforge.sh` `parse_config`(~L212), `launch_role`(~L870), arrays `ROLE_MODELS/EFFORTS/ADVISORS`(~L42); commits `93f8c5d`, `d467ab7` | `!= 4`→`< 4` + `key=value` loop; per-backend flag locals. **Advisor is NOT a `--advisor` flag** — `write_worktree_advisor` writes `advisorModel` to `.claude/settings.local.json`. ✅ resolves the "does claude --advisor exist" open item. | -| **Idea B** bundle inlining | `backup/main-pre-reset:swarmforge.sh` `resolve_prompt_bundle`(~L797)+`write_agent_instruction_file`(~L825); same on `feat/main-executing-context-fields`, `feat/issue-20-b` | Replace upstream's 2-line "read constitution recursively" heredoc with BFS resolver + XML `` envelope. **Prereq for M3.** Entangled with cmux (see above). | -| **M3 / 0014** append AGENTS.md + .agents/roles | `backup/main-pre-reset:swarmforge.sh` (commit `1b84895`, inside `write_agent_instruction_file`) | 7-line loop appending `AGENTS.md` + `.agents/roles/.md` `` blocks before envelope close + preamble sentence. Cannot land without Idea B. | -| **Idea F** auto-compaction | `backup/main-pre-reset:swarmforge.sh` `write_worktree_permissions`(~L679); commit `93f8c5d` | New fn → `.claude/settings.local.json`: `autoCompactEnabled:true`, `PCT_OVERRIDE:"88"`, `WINDOW:"200000"`; called from `prepare_worktrees`. Shares the file with `write_worktree_advisor` (M2) — both use read-modify-write python3. 
|
-| **executing-fields** | `feat/main-executing-context-fields:swarmforge.sh` (commit `a133c71`, clean +25/-10) | `executing` logbook entry carries `{message,hash,sender}`; inside `write_deliver_script` + `write_stop_hook` heredocs. Cherry-pick `a133c71` once cmux base is in. Partial-fills ADR 0002. |
-| **Idea N** upgrade/skills (ADR 0018) | `backup/main-pre-reset:swarmforge.sh` `install_skills`+`ensure_skills_installed`(~L946) + new file `swarmforge/scripts/install-pins.conf`; **`swarm` bootstrap** (root, runnable branches) commit `8994322` adds `upgrade`/`write_source_branch`/`download_from_main` | `swarmforge.sh` part lands on `main`; the `upgrade` subcommand + `.swarmforge/source-branch` live in the root `swarm` file which is **on six-pack/four-pack, not main**. **Decision (2026-06-14): `ensure_skills_installed` STAYS at launch — auto-(re)install the `entire` skills pin-aware, as before.** This is launcher infra-bootstrap (peer of self-fetch + worktree/session plumbing), explicitly allowed; it does NOT violate idle-gate/setup-first, which govern *role* behavior and *project* provisioning, not the launcher bootstrapping its own deps. Skill install is **owned by ADR 0018, not setup-swarm (0003)**. `./swarm upgrade` = explicit refresh of scripts(main) + prompts(`source-branch`) + forced skill reinstall (clears `skills-installed`). |
-| **Idea O** install scaffold | `backup/main-pre-reset:swarmforge.sh` `ensure_initial_gitignore`(~L105)+`ensure_runtime_git_excludes`(~L152)+`remove_nonessential_clone_files`(~L165) | `.gitignore`/excludes expansion is implemented (additive). **default-branch probe + permission allow-rules are NET-NEW** → fold into ADR 0003 setup skill. |
-| **auto-permission** (ADR 0019) | `backup/main-pre-reset:swarmforge.sh` `launch_role` (commit `1097233`) | `--permission-mode acceptEdits`→`auto` for claude+grok (upstream L433/442). **`auto` verified a real flag value** (Claude Code v2.1.177 choices: acceptEdits, auto, bypassPermissions, default, dontAsk, plan — unlike the phantom `--advisor`). **Decision (2026-06-14): keep `auto`.** Rationale for the ADR: roles run unattended → any prompt is a silent hang; `acceptEdits` still prompts on bash/tool calls. `dontAsk` (deterministic, allow-list-only) was considered and **rejected for allow-list maintenance burden** across every language/tool the swarm runs; `bypassPermissions` rejected (ignores all safety, worktrees aren't sandboxed). `auto` needs ~no config and ships safety rails (blocks force-push-to-main, mass-delete). Consequence: setup-swarm's allow-rules (Idea O) stay a **small, targeted, advisory** set, not a load-bearing whitelist. |
-| **cmux** | `backup/main-pre-reset:swarmforge/scripts/swarm-mux.sh` (175 lines, net-new) + `swarm-stop.sh`(66) + `swarmlog.sh`(16); + ~400 lines of `swarmforge.sh` restructure | Largest divergence; see entanglement. Source `swarm-mux.sh` at ~L169; `MUX_TARGETS` array; new write_* fns. |
-| **self-url** | root `swarm` bootstrap (commit `ded6019`, runnable branches) | `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"`. Not on `main`. |
-
-## What lands where
-- **`main` rebase needs:** M1, M2, M3, Idea B, F, O (gitignore part), auto-permission, executing-fields, cmux (`swarmforge.sh` + `swarm-mux.sh`/`swarm-stop.sh`/`swarmlog.sh`), `install-pins.conf`, `install_skills`.
-- **Root `swarm` bootstrap (six-pack/four-pack, NOT main):** Idea N `upgrade` subcommand, `source-branch`, self-url.
-
diff --git a/docs/migrations/six-pack-role-prompts.md b/docs/migrations/six-pack-role-prompts.md
deleted file mode 100644
index 6ab10b1..0000000
--- a/docs/migrations/six-pack-role-prompts.md
+++ /dev/null
@@ -1,56 +0,0 @@
-# Migration recovery — six-pack role prompts
-
-Per-role recovery for `swarmforge/roles/*.prompt`. Base = `upstream/six-pack`. **Re-merge deltas onto current upstream prompts; do not copy whole backup files** (they predate upstream and carry content ADRs reversed — see STRIP table). Primary source = `backup/six-pre-reset` unless noted.
-
-Universal add to **every** role prompt: idle-gate line `"Wait for a handoff. Do not act without one."` (0002) and `"Run agent-retro before going idle."` Back-routing (0004) general rule has **no backup source** — author fresh from ADR 0004 wherever a role needs it (structural finding → origin stage once; local → fix in place; single-finding back-once cap).
-
-## Existing roles — deltas
-
-| Role | Re-merge (recover-from `backup/six-pre-reset` unless noted) | STRIP / fix |
-|------|------------------------------------------------------------|-------------|
-| **coder** | idle-gate; UX-Intent read line (0007); handoff `notify cleaner`→`notify ux-engineer` (0007) | STRIP `## Acceptance Pipeline` block (upstream L8–11, the "At startup… APS" bullets) (0003) |
-| **QA** ⚠ | idle-gate; **0010** surface-harness: L13 "through the user interface only"→"through the project surface harness only" + Expected-bullet→assertion/`NOT AUTOMATED` rule + re-execute `observation-harness/` + route-back-if-missing; handoff →`notify integrator` (0008) | STRIP `## Startup Tools` (L7) (0003); `logbook.json`→keep upstream `logbook.jsonl`. **0005 refute posture has NO backup source — author fresh**, replacing L14 "Fix bugs found by the QA suite…" with structural→route-back / local→fix-in-place. Merge 0005 (new) + 0010 (backup) into one prompt. |
-| **cleaner** | idle-gate; **0016** boundary-file scan (>15 mutation sites → extract) + stripped-view-as-untested anti-pattern (cleanest source: `feat/baseline-scenarios-six`) | STRIP `At startup, install…` (L19) (0003) |
-| **hardender** | idle-gate; rendering-invariant property-test line (L18 — **unmanifested divergence**, see note) | STRIP `## Startup Tools` (L8–9) (0003). STRIP backup's `"merge all queued architect handoffs together"` — **unauthorized, no ADR**; keep upstream's "batch in sorted filename order". |
-| **specifier** ⚠ | idle-gate; **0008** worktree reset `git reset --hard origin/` via `git symbolic-ref` (recover from `feat/six-pack-pipeline-order-and-scaffold`, NOT backup); **0008** handoff L36 "merge the changes and ask the user"→"When the curator notifies you… ask the user for the next feature"; **0007** UX-Intent authoring; **0009** start from template + "seven"→**"eight"**; **0011** read dependency-manifest + propose-on-undeclared (recover from `backup`/`feat/issue-20-c`, NOT pipeline-order which dropped it) | STRIP DESIGN.md walk-up + scaffold-on-absence (0007); STRIP backup's `git merge --ff-only origin/master` startup (0008, also hardcodes `master`) |
-
-⚠ **QA and specifier are the complex merges** — multiple overlapping layers, several from different branches. Apply carefully.
-
-## STRIP / STALE table (backup content ADRs reversed)
-| Stale content | In | Reversed by |
-|---------------|-----|-------------|
-| DESIGN.md walk-up + scaffold | specifier, ux-engineer | ADR 0007 (reference-from-feature-file only) |
-| "seven header sections" | specifier | ADR 0009 (six-pack = eight) |
-| `git merge --ff-only origin/master` startup | specifier | ADR 0008 (specifier stops merging; `master` stale) |
-| "merge all queued architect handoffs together" | hardender | no ADR — keep upstream sorted-batch |
-| `logbook.json` | QA | upstream renamed → `logbook.jsonl` |
-| curator budgets 150/300 | curator | ADR 0013 + locked spec = 60/40 |
-
-## New roles (net-new files)
-
-### ux-engineer (ADR 0007) — recover `backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt` (≡ `origin/feat/obs-harness-six`; NOT pipeline-order/baseline which lack the `observation-harness/` commit step)
-Outline: identity+idle · skip if no `## UX Intent` (→notify cleaner) · UX-Intent verification across Visual Composition/Information Hierarchy/Interaction Feel/State Transitions by running the binary · fix rendering only (back-route to coder for model-state, N=3) · durable artifacts: golden snapshots + rendering invariants + `observation-harness/` scenarios via surface tool · run test suite · `## Visual quality standards` (AI-aesthetic anti-patterns, type hierarchy, WCAG 4.5:1/3:1) · notify cleaner.
-**STRIP:** DESIGN.md walk-up; make DESIGN.md fix-authority conditional on a feature-file reference (not tree discovery).
-
-### integrator (ADR 0008) — recover `backup/six-pre-reset:swarmforge/roles/integrator.prompt` (≡ `feat/issue-20-c`; NOT baseline-scenarios-six which still says "notify specifier")
-Outline: identity+idle · own landing, one PR/feature, autofix-lint-only · steps: receive from QA → branch `feat/` → `gh pr create` → watch CI → green: `gh pr merge --squash --delete-branch` + post-merge gate → **notify curator** → CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`) → agent-retro.
-**FIX (locked spec wins):** step 7 must add "Include the specifier handoff name and the post-merge master commit hash."
-
-### curator (ADR 0013/0014) — authoritative source = `feat/issue-20-b:docs/specs/issue-20-knowledge-promotion-loop.md` **PR C2 verbatim block** (branch `curator.prompt` artifacts have STALE 150/300 budgets — do not cargo-cult)
-Outline: identity+idle · only writes `AGENTS.md`+`.agents/` · sources `~/.claude/worklog/retros/*.md` · routing ladder (backlog→AGENTS.md≤60→roles≤40→references→skills-on-2nd→upstream→ledger) · ledger `date|session-id|role|failure-class|verdict|summary` · lifecycle (empty-run→pass-through, knowledge branch, self-merge PR with metric line, move retros to processed/, notify specifier) · 9-check per-item algorithm (scope→recurrence→non-inferable→rule-not-phenomenon→dup/contradiction→global-fix-routing→trigger-load-fit→evidence-pull→sizing).
-**Companion changes (locked spec, not on any branch):** specifier wait-on-curator (PR C4); `workflow.prompt` integrator→curator→specifier chain bullet (PR C5).
-
-## Final `swarmforge.conf` window order (recover `feat/issue-20-c` for 8 windows + curator from `backup/six-pre-reset`)
-```
-window specifier codex specifier # was: codex master (0008 moves specifier off master)
-window coder codex coder
-window ux-engineer codex ux-engineer # 0007: after coder
-window cleaner codex cleaner
-window architect codex architect
-window hardender codex hardender
-window QA codex QA
-window integrator codex integrator # 0008: after QA
-window curator codex curator # 0013: last (only in backup/six-pre-reset)
-```
-Note: all roles still on `codex` → clear-first (0002) inert until roles move to `claude` or codex hooks built (open item). `default_branch` is per-feature specifier logic, not a conf field.
-
diff --git a/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md b/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md
deleted file mode 100644
index 188b245..0000000
--- a/docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md
+++ /dev/null
@@ -1,1232 +0,0 @@
-# Fork Divergence Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Re-apply every documented SwarmForge fork divergence (ADRs 0001–0021 + manifest rows) on top of pristine `upstream`, as **two pull requests — one per delivery branch**: one PR on `main` (scripts + skills), one PR on `six-pack` (prompts + constitution + conf + root swarm). Each PR is the minimal additive diff vs upstream, built from ordered, per-divergence commits.
-
-**Architecture:** Two delivery branches. `main` carries scripts + skills + docs/ADRs; `six-pack` carries role prompts, constitution articles, templates, the fidelity manifest, `swarmforge.conf`, and the root `swarm` bootstrap. Every branch is kept identical to its `upstream/` and advanced by **merge**, never rebase (ADR 0001). **four-pack is frozen** — no fork content is ever applied to it (manifest decision 2026-06-14); it stays a pure merge-mirror of `upstream/four-pack`.
-
-**Tech Stack:** zsh (`swarmforge.sh` and the handoff scripts run under zsh — note `${=var}` word-splitting, `typeset -a/-A`, `${var:h}`/`${var:t}` modifiers), Python 3 (settings.local.json read-modify-write), Markdown skills (`SKILL.md`), `*.prompt` plain-text role/constitution files, Gherkin `.feature` templates, `gh` CLI.
-
----
-
-## Conventions (read before any task)
-
-**Two PRs, two branches.** Exactly one branch and one PR per delivery branch:
-- **PR 1 (MAIN)** — branch `feat/fork-divergences-main` off `origin/main`; all of the MAIN TRACK commits below; PR opened `--base main`.
-- **PR 2 (SIX-PACK)** — branch `feat/fork-divergences-six-pack` off `origin/six-pack`; all of the SIX-PACK TRACK commits below; PR opened `--base six-pack`.
-
-There is **no four-pack PR** (frozen). The two PRs are independent of each other and can proceed in parallel.
-
-**Commits.** Each divergence is one commit on its track branch, applied in the listed order (the order encodes the within-branch dependencies — e.g. the bundle commit precedes the knowledge-injection commit that extends it). One commit per divergence keeps the single PR reviewable and tailored. Do **not** create extra branches or PRs.
-
-**Baseline anchor.** The fork layer is re-applied onto a recorded pristine-upstream baseline (ADR 0001): `main` @ `d947f67` (tag `fork-base/2026-06-14-main`) and `six-pack` @ `cbd1697` (tag `fork-base/2026-06-14-six-pack`). As of 2026-06-14 `origin/main`/`origin/six-pack` equal these exactly, so branching off `origin/` == branching off the tag. The two implementation branches come off the real delivery branches, **not** off this docs branch. If `origin` has since advanced, branch off the tag instead so the diff stays measured against the recorded baseline.
-
-**Merge style.** Fork divergences are **squash-merged** (ADR 0001), so each of these two PRs lands as one clean commit on its delivery branch. Upstream syncs, by contrast, are history-preserving merges (never squashed/rebased — keep upstream's story). A landed commit is never rewritten.
-
-**Pushing.** **Never** push `main`, `six-pack`, or `upstream` directly without explicit request — push only the two feature branches. `gh` defaults to the `unclebob` upstream remote — always pass `--repo gabadi/swarm-forge`.
-
-**Minimize-diff rule (overriding constraint).** Translate each divergence to its smallest additive form vs current upstream. Do **not** lift whole files from the backup branches for existing files — re-merge the delta onto the *current* upstream file. Net-new files (new roles, templates, skills) may be recovered whole, but you MUST apply the STRIP/FIX edits called out per commit (the backup artifacts predate upstream and carry behavior the ADRs reversed).
-
-**Recovery sources.** Recover exact prior content with `git show :`. Key sources: `backup/main-pre-reset` (main script layer), `backup/six-pre-reset` (six-pack prompts/templates), `feat/issue-20-a-retro-skill-upgrade` (agent-retro + retro-triage skills), `feat/issue-20-b-bundle-knowledge-injection` (knowledge-promotion spec / curator), `feat/baseline-scenarios-six` (dependency-manifest, cleaner boundary scan), `feat/six-pack-pipeline-order-and-scaffold` (specifier worktree reset), `feat/issue-20-c-curator-six-pack` (8-window conf, integrator). Line numbers are approximate (`~L###`) — they drift; locate by function/section name, not by line.
-
-**Verification approach.** There is no bash unit-test harness in this repo. "Tests" are: (a) `shellcheck` on changed shell files where available, (b) `zsh -n ` syntax check, (c) a scratch-project smoke run of the generated artifact (e.g. inspect the bundle `write_agent_instruction_file` produces), and (d) `grep` assertions on prompt/skill text. Each commit states the concrete verification command and expected result. Verify after each commit; a whole-track verification runs before each PR is opened.
-
-**Commit message footer.** End every commit body with:
-```
-Co-Authored-By: Claude Opus 4.8 (1M context)
-```
-
----
-
-## Commit order (within each branch)
-
-**MAIN branch** (`feat/fork-divergences-main`) — commit in this order; the only hard dependency is C3→C2 (knowledge injection extends the bundle envelope). C1–C6, C8, C11 all edit `swarmforge.sh`, so a linear commit order avoids any in-file conflict:
-
-| # | ADR | What | `swarmforge.sh` region / new file |
-|---|-----|------|-----------------------------------|
-| C1 | 0019 | auto-permission | `launch_role` |
-| C2 | 0017 | bundle inlining | `write_agent_instruction_file` + new `resolve_prompt_bundle` |
-| C3 | 0014 | knowledge injection (**after C2**) | `write_agent_instruction_file` |
-| C4 | 0012 | per-role model/effort/advisor | `parse_config`, `launch_role` + new `write_worktree_advisor` |
-| C5 | 0020 | auto-compaction | `prepare_worktrees` + new `write_worktree_permissions` |
-| C6 | 0006 | QA holdout sparse-checkout | `prepare_worktrees` |
-| C7 | 0002-ext | executing-entry fields | handoff scripts (`swarmforge/scripts/*.sh`) |
-| C8 | 0018 | pinned skill install | new `install_skills`/`ensure_skills_installed` + new `install-pins.conf` |
-| C9 | 0013/J | agent-retro skill | new `swarmforge/skills/agent-retro/` |
-| C10 | 0021 | retro-triage skill | new `.claude/skills/retro-triage/` |
-| C11 | 0003 + O | setup-swarm skill + marker guard + scaffold | new `swarmforge/skills/setup-swarm/` + `swarmforge.sh` guard/gitignore |
-
-**SIX-PACK branch** (`feat/fork-divergences-six-pack`) — commit in this order; the order resolves the shared-file sequencing (`specifier.prompt`: D1,D3,D4,D5,D8,D9,D10 · `QA.prompt`: D1,D2,D3,D6,D7,D9 · `swarmforge.conf`: D8,D9,D10 · `workflow.prompt`: D10,D11):
-
-| # | ADR | What | Touches |
-|---|-----|------|---------|
-| D1 | 0002 | idle-gate + agent-retro line | all 6 role prompts |
-| D2 | 0003 | strip startup-install directives | coder, QA, cleaner, hardener |
-| D3 | 0004 | back-routing rule | role prompts |
-| D4 | 0009 | spec-header template + specifier | new `templates/feature.feature`, specifier |
-| D5 | 0011 | fidelity manifest + specifier | new `dependency-manifest.prompt`, specifier |
-| D6 | 0010 | surface harness | `engineering.prompt`, QA |
-| D7 | 0005 | refute QA | QA |
-| D8 | 0007 | UX engineer | new `ux-engineer.prompt`, coder, specifier, `swarmforge.conf` |
-| D9 | 0008 | integrator + specifier stops merging | new `integrator.prompt`, specifier, QA, `swarmforge.conf` |
-| D10 | 0013 | curator + chain rewiring | new `curator.prompt`, integrator, specifier, `workflow.prompt`, `swarmforge.conf` |
-| D11 | 0015 | platform-feasibility stop rule | `workflow.prompt` |
-| D12 | 0016 | cleaner boundary scan | cleaner |
-| D13 | — | hardener rendering invariants | hardener |
-| D14 | 0018 | root swarm upgrade + self-url | root `swarm` |
-
----
-
-# MAIN TRACK → PR 1
-
-## Setup: create the main branch
-
-- [ ] **Create the single branch for all MAIN commits**
-
-```bash
-git fetch origin && git switch -c feat/fork-divergences-main origin/main
-# If origin/main has advanced past the recorded baseline, branch off the tag instead:
-# git switch -c feat/fork-divergences-main fork-base/2026-06-14-main
-```
-All C1–C11 commits land on this one branch. Do not create per-commit branches. This PR is squash-merged (fork-divergence policy, ADR 0001).
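
The baseline-anchor rule (branch off `origin/main` only while it still equals the recorded tag, otherwise branch off the tag) can be checked mechanically instead of by eye. The following is a minimal sketch under that assumption; `choose_base` is a hypothetical helper introduced here for illustration and is not part of `swarmforge.sh` or the plan.

```bash
# Hypothetical helper (not in the repo): decide which ref to branch from.
# Arguments: live ref name, live ref SHA, baseline tag name, baseline tag SHA.
# Prints the live ref while it still points at the recorded baseline commit,
# otherwise prints the tag so the diff stays measured against the baseline.
choose_base() {
    live_ref=$1
    live_sha=$2
    tag=$3
    tag_sha=$4
    if [ "$live_sha" = "$tag_sha" ]; then
        printf '%s\n' "$live_ref"
    else
        printf '%s\n' "$tag"
    fi
}

# Usage, inside the fork checkout after `git fetch origin`:
#   base=$(choose_base origin/main "$(git rev-parse origin/main)" \
#       fork-base/2026-06-14-main "$(git rev-parse 'fork-base/2026-06-14-main^{commit}')")
#   git switch -c feat/fork-divergences-main "$base"
```

Peeling the tag with `^{commit}` keeps the comparison valid even if the baseline tag is annotated.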
-
----
-
-## C1: ADR 0019 — autonomous permission mode
-
-**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`launch_role`, the `claude)` and `grok)` arms, ~L433 / ~L442)
-
-- [ ] **Step 1: Locate the two launch arms**
-
-Run: `grep -n "permission-mode acceptEdits" swarmforge/scripts/swarmforge.sh`
-Expected: two hits inside `launch_role` — the `claude)` arm and the `grok)` arm.
-
-- [ ] **Step 2: Apply the edit**
-
-Replace `--permission-mode acceptEdits` with `--permission-mode auto` in both arms. (`auto` is a real Claude Code flag value — verified, unlike the phantom `--advisor`. Roles run unattended, so `acceptEdits` bash/tool prompts hang silently; `auto` ships rails — blocks force-push-to-main and mass-delete.)
-
-```bash
-sed -i '' 's/--permission-mode acceptEdits/--permission-mode auto/g' swarmforge/scripts/swarmforge.sh
-```
-
-- [ ] **Step 3: Verify**
-
-Run: `grep -c "permission-mode auto" swarmforge/scripts/swarmforge.sh; grep -c "acceptEdits" swarmforge/scripts/swarmforge.sh; zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-Expected: `2`, `0`, `SYNTAX_OK`.
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): autonomous permission mode for unattended roles (ADR 0019)"
-```
-
----
-
-## C2: ADR 0017 — prompt-bundle inlining
-
-**Files:** Modify `swarmforge/scripts/swarmforge.sh` (replace `write_agent_instruction_file` ~L389–413; add `resolve_prompt_bundle`)
-
-Upstream emits two naive "read recursively" lines. The fork pre-resolves the constitution + role prompt into one deduplicated XML envelope. **Disentangle from cmux:** port ONLY `resolve_prompt_bundle` + the envelope `write_agent_instruction_file`. Do NOT port `write_deliver_script`/`write_notify_script`/`write_stop_hook`/`MUX_TARGETS`.
-
-- [ ] **Step 1: Read the current naive function**
-
-Run: `grep -n "write_agent_instruction_file" swarmforge/scripts/swarmforge.sh`
-Confirm it emits `Read swarmforge/constitution.prompt, then read every file it refers to recursively...` and uses globals `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (all set upstream).
-
-- [ ] **Step 2: Add `resolve_prompt_bundle` above `write_agent_instruction_file`**
-
-```zsh
-resolve_prompt_bundle() {
-    local role="$1"
-    typeset -a bundle=()
-    typeset -A seen=()
-    typeset -a queue=("$CONSTITUTION_FILE" "$ROLES_DIR/${role}.prompt")
-    local file rel_path ref ref_abs
-
-    while (( ${#queue[@]} > 0 )); do
-        file="${queue[1]}"
-        shift queue
-
-        rel_path="${file#${WORKING_DIR}/}"
-        [[ ${+seen[$rel_path]} -eq 1 ]] && continue
-        [[ ! -f "$file" ]] && continue
-
-        seen[$rel_path]=1
-        bundle+=("$rel_path")
-
-        while IFS= read -r ref; do
-            [[ -z "$ref" ]] && continue
-            ref_abs="$WORKING_DIR/$ref"
-            [[ ${+seen[$ref]} -eq 0 ]] && queue+=("$ref_abs")
-        done < <(grep -oE 'swarmforge/[A-Za-z0-9_./-]+\.prompt' "$file" 2>/dev/null || true)
-    done
-
-    printf '%s\n' "${bundle[@]}"
-}
-```
-
-- [ ] **Step 3: Replace `write_agent_instruction_file` with the envelope form**
-
-```zsh
-write_agent_instruction_file() {
-    local role="$1"
-    local prompt_file="$2"
-    typeset -a bundle_files=()
-    local rel abs_path
-
-    while IFS= read -r rel; do
-        [[ -n "$rel" ]] && bundle_files+=("$rel")
-    done < <(resolve_prompt_bundle "$role")
-
-    {
-        printf '\n' "$role"
-        printf '\n'
-        printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below.\n'
-        printf '\n'
-        for rel in "${bundle_files[@]}"; do
-            abs_path="$WORKING_DIR/$rel"
-            [[ -f "$abs_path" ]] || continue
-            printf '\n' "$rel"
-            cat "$abs_path"
-            printf '\n\n'
-        done
-        printf '\n'
-    } > "$prompt_file"
-}
-```
-
-- [ ] **Step 4: Verify**
-
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-Then confirm the function references only `$CONSTITUTION_FILE`, `$ROLES_DIR`, `$WORKING_DIR` (set in upstream's init/`parse_config`). For a live check, run the swarm in a scratch dir and inspect a generated `$PROMPTS_DIR/.md` — it should be a single `` envelope with deduped `` blocks, no "read recursively" lines.
-Expected: `SYNTAX_OK` + a well-formed envelope.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): pre-resolve role prompt bundle into XML envelope (ADR 0017)"
-```
-
----
-
-## C3: ADR 0014 — `.agents/` knowledge injection (after C2)
-
-**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`write_agent_instruction_file`, as written by C2)
-
-- [ ] **Step 1: Update the preamble line**
-
-In `write_agent_instruction_file`, change the `` printf to:
-
-```zsh
-        printf 'This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files — all relevant instructions are already included below. Project knowledge files (AGENTS.md and your role file under .agents/roles/) are included below when present.\n'
-```
-
-- [ ] **Step 2: Add the knowledge loop**
-
-Add `knowledge` to the locals (`local rel abs_path knowledge`). Insert **after** the bundle-files `for` loop and **before** `printf '\n'`:
-
-```zsh
-        for knowledge in "AGENTS.md" ".agents/roles/${role}.md"; do
-            abs_path="$WORKING_DIR/$knowledge"
-            [[ -f "$abs_path" ]] || continue
-            printf '\n' "$knowledge"
-            cat "$abs_path"
-            printf '\n\n'
-        done
-```
-
-- [ ] **Step 3: Acceptance**
-
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-In a scratch project with `AGENTS.md` and `.agents/roles/coder.md`: every role's generated bundle carries `AGENTS.md`; only the coder's carries `.agents/roles/coder.md`; removing both produces bundles with no knowledge blocks and no errors.
-Expected: `SYNTAX_OK` + the per-role assertions hold.
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): inject AGENTS.md + .agents/roles into role bundle (ADR 0014)"
-```
-
----
-
-## C4: ADR 0012 — per-role model / effort / advisor
-
-**Files:** Modify `swarmforge/scripts/swarmforge.sh` (`parse_config`, `launch_role`; add arrays + `write_worktree_advisor`)
-
-- [ ] **Step 1: Declare the three arrays**
-
-Next to the existing `ROLES`/`AGENTS`/`SESSIONS` declarations, add:
-
-```zsh
-typeset -a ROLE_MODELS=()
-typeset -a ROLE_EFFORTS=()
-typeset -a ROLE_ADVISORS=()
-```
-
-- [ ] **Step 2: Relax field count + parse the kv tail in `parse_config`**
-
-Change `if (( ${#fields[@]} != 4 )); then` → `if (( ${#fields[@]} < 4 )); then`. After the `keyword/role/agent/worktree` assignments, add:
-
-```zsh
-    local role_model="" role_effort="" role_advisor="" kv key val kv_i
-    for (( kv_i = 5; kv_i <= ${#fields[@]}; kv_i++ )); do
-        kv="${fields[$kv_i]}"
-        key="${kv%%=*}"
-        val="${kv#*=}"
-        case "$key" in
-            model) role_model="$val" ;;
-            effort) role_effort="$val" ;;
-            advisor) role_advisor="$val" ;;
-        esac
-    done
-```
-
-Where the existing arrays are appended, add the parallel appends:
-
-```zsh
-    ROLE_MODELS+=("$role_model")
-    ROLE_EFFORTS+=("$role_effort")
-    ROLE_ADVISORS+=("$role_advisor")
-```
-
-- [ ] **Step 3: Add `write_worktree_advisor`**
-
-```zsh
-write_worktree_advisor() {
-    local worktree_path="$1"
-    local advisor_model="$2"
-    local settings_dir="$worktree_path/.claude"
-    local settings_file="$settings_dir/settings.local.json"
-
-    mkdir -p "$settings_dir"
-    SETTINGS_FILE="$settings_file" ADVISOR_MODEL="$advisor_model" python3 -c '
-import json, os
-p = os.environ["SETTINGS_FILE"]
-cfg = {}
-try:
-    with open(p) as f: cfg = json.load(f)
-except: pass
-cfg["advisorModel"] = os.environ["ADVISOR_MODEL"]
-with open(p, "w") as f: json.dump(cfg, f, indent=2)
-    '
-}
-```
-
-- [ ] **Step 4: Wire flags into `launch_role`**
-
-After the existing locals, add:
-
-```zsh
-    local role_model="${ROLE_MODELS[$index]}"
-    local role_effort="${ROLE_EFFORTS[$index]}"
-    local role_advisor="${ROLE_ADVISORS[$index]}"
-```
-
-After `write_agent_instruction_file "$role" "$prompt_file"`, add:
-
-```zsh
-    [[ -n "$role_advisor" ]] && write_worktree_advisor "$role_worktree" "$role_advisor"
-```
-
-In the `claude)` arm:
-
-```zsh
-            local claude_flags=""
-            [[ -n "$role_model" ]] && claude_flags+=" --model '$role_model'"
-            [[ -n "$role_effort" ]] && claude_flags+=" --effort '$role_effort'"
-```
-then insert `${claude_flags}` immediately after `claude` in `launch_cmd`. Apply the analogue for `copilot)` (`--model`/`--effort`) and `grok)` (`--model`/`--effort`); for `codex)` use `-c model="$role_model"` only when set.
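
The key=value tail parse from Step 2 can be exercised on its own, outside the launcher. The sketch below is written in portable shell rather than the zsh that `swarmforge.sh` uses, and `parse_role_tail` is a hypothetical name introduced for illustration. It assumes a conf line whose first four whitespace-separated fields are the keyword, role, agent, and worktree, with optional `model=`, `effort=`, and `advisor=` fields after them.

```bash
# Hypothetical stand-alone version of the kv-tail parse (portable sh, not zsh).
# Unknown keys are ignored, mirroring the case statement in parse_config.
parse_role_tail() {
    model=""
    effort=""
    advisor=""
    # Deliberate word-splitting: break the conf line into positional fields.
    set -- $1
    # Same lower bound as the relaxed field-count check: at least 4 fields.
    [ "$#" -ge 4 ] || return 1
    shift 4
    for kv in "$@"; do
        case "${kv%%=*}" in
            model)   model=${kv#*=} ;;
            effort)  effort=${kv#*=} ;;
            advisor) advisor=${kv#*=} ;;
        esac
    done
    printf 'model=%s effort=%s advisor=%s\n' "$model" "$effort" "$advisor"
}
```

A plain 4-field line yields three empty values, which is why the existing conf lines keep parsing unchanged.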
-
-- [ ] **Step 5: Verify**
-
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-Add a temporary conf line `window coder claude coder model=opus effort=high advisor=sonnet` and confirm `parse_config` accepts it; the existing 4-field lines still parse; `advisorModel` lands in the role worktree's `settings.local.json`.
-Expected: `SYNTAX_OK` + both 4-field and 7-field lines parse.
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): per-role model/effort/advisor in swarmforge.conf (ADR 0012)"
-```
-
----
-
-## C5: ADR 0020 — auto-compaction on role worktrees
-
-**Files:** Modify `swarmforge/scripts/swarmforge.sh` (add `write_worktree_permissions`; call in `prepare_worktrees`)
-
-- [ ] **Step 1: Add `write_worktree_permissions`**
-
-```zsh
-write_worktree_permissions() {
-    local worktree_path="$1"
-    local settings_dir="$worktree_path/.claude"
-    local settings_file="$settings_dir/settings.local.json"
-
-    mkdir -p "$settings_dir"
-    SETTINGS_FILE="$settings_file" python3 -c '
-import json, os
-p = os.environ["SETTINGS_FILE"]
-cfg = {}
-try:
-    with open(p) as f: cfg = json.load(f)
-except: pass
-cfg["autoCompactEnabled"] = True
-cfg.setdefault("env", {})
-cfg["env"]["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = "88"
-cfg["env"]["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = "200000"
-with open(p, "w") as f: json.dump(cfg, f, indent=2)
-    '
-}
-```
-
-- [ ] **Step 2: Call it from `prepare_worktrees`**
-
-Inside the per-role loop, after the `git worktree add` block (and after C4's advisor call site), add:
-
-```zsh
-    write_worktree_permissions "$worktree_path"
-```
-Both writers JSON-merge `settings.local.json`, so calling both is safe and order-independent.
-
-- [ ] **Step 3: Verify**
-
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-After a scratch run, a role worktree's `.claude/settings.local.json` contains `"autoCompactEnabled": true` and the two `env` overrides (alongside any `advisorModel`).
-Expected: `SYNTAX_OK` + merged JSON.
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): enable auto-compaction on role worktrees (ADR 0020)"
-```
-
----
-
-## C6: ADR 0006 — harness-enforced QA holdout (sparse-checkout)
-
-**NET-NEW — no source artifact.** Write fresh. **Files:** Modify `swarmforge/scripts/swarmforge.sh` (`prepare_worktrees`)
-
-- [ ] **Step 1: Identify the loop variables**
-
-Run: `grep -n "worktree add\|WORKTREE_NAMES\|ROLES\[" swarmforge/scripts/swarmforge.sh`
-Confirm the role variable in `prepare_worktrees`, the specifier worktree (`specifier`, not `master` — ADR 0008), and the QA role name.
-
-- [ ] **Step 2: Add a pinned QA-path constant**
-
-Near the top config constants (single source of truth, matches the specifier-authored path):
-
-```zsh
-QA_HOLDOUT_PATH="${SWARMFORGE_QA_HOLDOUT_PATH:-qa-e2e}"
-```
-
-- [ ] **Step 3: Add conditional sparse-checkout after `git worktree add`**
-
-Key on the **role** (not worktree name); exclude the holdout from every worktree except specifier's and QA's:
-
-```zsh
-    if [[ "$role" != "specifier" && "$role" != "QA" ]]; then
-        git -C "$worktree_path" sparse-checkout init --no-cone >/dev/null 2>&1
-        {
-            printf '/*\n'
-            printf '!/%s/\n' "$QA_HOLDOUT_PATH"
-        } > "$worktree_path/.git/info/sparse-checkout" 2>/dev/null \
-            || git -C "$worktree_path" sparse-checkout set --no-cone '/*' "!/${QA_HOLDOUT_PATH}/" >/dev/null 2>&1
-        git -C "$worktree_path" read-tree -mu HEAD >/dev/null 2>&1 || true
-    fi
-```
-(Substitute the real role-variable name from Step 1 for `$role`. The holdout stays in the commit/tree — only absent from disk — so it survives each role's handoff commit.)
-
-- [ ] **Step 4: Verify holdout invisibility + commit survival**
-
-In a scratch run with a committed `qa-e2e/`: coder/cleaner/architect/hardener worktrees have **no** `qa-e2e/` on disk; specifier + QA **do**; after a role's handoff commit, `git show HEAD:qa-e2e/` still resolves.
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-Expected: `SYNTAX_OK` + invisibility/survival hold.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh
-git commit -m "feat(swarmforge): sparse-checkout the QA holdout from shaping roles (ADR 0006)"
-```
-
----
-
-## C7: ADR 0002 (extend) — executing-entry context fields
-
-**Files:** Modify upstream's handoff scripts under `swarmforge/scripts/` (the script that writes the `executing` logbook entry + the notify + stop-hook paths)
-
-> ⚠ Reference commit `a133c71` is on the **cmux lineage** (its diff is inside `swarmforge.sh` heredocs that don't exist on pristine upstream). Do **not** cherry-pick — re-author the same field semantics onto upstream's separate handoff scripts.
-
-- [ ] **Step 1: Find the `executing` entry write site**
-
-Run: `grep -rn '"executing"\|status.*executing\|executing' swarmforge/scripts/`
-The write site is one of `receive-handoff.sh` / `complete-handoff.sh` / `handoff-lib.sh` / the deliver step. Read the intended semantics:
-Run: `git show a133c71`
-Expected: the entry must carry `{status, timestamp, message, hash, sender}` instead of `{status, timestamp}`.
-
-- [ ] **Step 2: Add the three fields**
-
-At the write site, extend the JSON object with: `message` (the task message text the delivery already passes), `hash` (the handoff commit hash in scope), `sender` (the sender role resolved from `sessions.tsv` by matching the sender worktree — mirror `notify-agent.sh`'s existing role resolution). Thread `sender` from `notify-agent.sh` → deliver step → stop-hook re-queue path, following upstream's existing argument-passing convention.
-
-- [ ] **Step 3: Verify**
-
-Run: `for f in swarmforge/scripts/*.sh; do zsh -n "$f" || echo "BAD: $f"; done; echo CHECKED`
-In a scratch run, trigger a delivery and inspect the `executing` line in `logbook.jsonl`.
-Expected: `CHECKED` + the entry carries non-empty `message`, `hash`, `sender`.
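
The manual inspection of the `executing` line can be turned into a repeatable check. This is a minimal sketch that assumes each `logbook.jsonl` line is a flat JSON object with string values for the three new fields; `has_context_fields` is a hypothetical helper, and the sample entries in the usage note are invented for illustration rather than taken from a real logbook.

```bash
# Hypothetical check (not in the repo): succeed only when the entry carries
# non-empty string values for message, hash, and sender.
has_context_fields() {
    entry=$1
    for key in message hash sender; do
        # Match "key": "<at least one character>" anywhere in the entry.
        printf '%s\n' "$entry" \
            | grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*\"[^\"]+\"" || return 1
    done
}

# Usage against the newest executing entry of a scratch run:
#   has_context_fields "$(grep '"executing"' logbook.jsonl | tail -n 1)" && echo FIELDS_OK
```

A grep-based check is deliberately loose; a stricter variant could parse the line with a JSON tool if one is available in the project.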
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add swarmforge/scripts
-git commit -m "feat(swarmforge): carry {message,hash,sender} in executing logbook entry (ADR 0002)"
-```
-
----
-
-## C8: ADR 0018 — pinned skill install (main half)
-
-The `upgrade` subcommand + `source-branch` + self-url live in the root `swarm` (six-pack, D14). This is the `main` script half: pin-aware, idempotent skill install at launch (launcher infra-bootstrap — allowed; does not violate idle-gate/setup-first).
-
-**Files:** Create `swarmforge/scripts/install-pins.conf`; modify `swarmforge/scripts/swarmforge.sh`
-
-- [ ] **Step 1: Create `install-pins.conf`**
-
-```bash
-cat > swarmforge/scripts/install-pins.conf <<'EOF'
-# Pinned external dependency versions for swarm install/upgrade.
-# Bump a SHA here and commit on main to pull in a newer version.
-
-# entireio/skills — installed to .claude/skills/ in the target project
-ENTIRE_SKILLS_SHA=4c9a02513c3ec6ebabd9a9dc6bd8240854a218ac
-EOF
-```
-Confirm the SHA against `backup/main-pre-reset:swarmforge/scripts/install-pins.conf` and bump if it has moved.
-
-- [ ] **Step 2: Add `install_skills` + `ensure_skills_installed`**
-
-Run: `git show backup/main-pre-reset:swarmforge/scripts/swarmforge.sh | grep -n "install_skills\|ensure_skills_installed"`
-Add `install_skills()` (sources `install-pins.conf`; copies the in-repo `agent-retro` skill into `.claude/skills/`; fetches entire's skills tarball at `$ENTIRE_SKILLS_SHA` into `.claude/skills/`; writes the SHA to `$STATE_DIR/skills-installed`; warns and continues if offline) and `ensure_skills_installed()` (returns early if the sentinel matches the pinned SHA, else calls `install_skills`). Use the canonical bodies from `backup/main-pre-reset` (`~L946`), kept additive.
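
The early-return contract described for `ensure_skills_installed` can be sketched as follows. This is a stand-in, not the canonical body from `backup/main-pre-reset`: `install_skills` is stubbed so that it only records the SHA, and passing the state directory and pinned SHA as arguments is an assumption made to keep the example self-contained.

```bash
# Stub standing in for the real installer (assumption): it only records the
# pinned SHA in the sentinel file and reports what it did.
install_skills() {
    printf '%s\n' "$2" > "$1/skills-installed"
    echo "installed $2"
}

# Skip the install when the sentinel already records the pinned SHA;
# otherwise (missing sentinel or different SHA) run the installer.
ensure_skills_installed() {
    state_dir=$1
    pinned_sha=$2
    sentinel="$state_dir/skills-installed"
    if [ -f "$sentinel" ] && [ "$(cat "$sentinel")" = "$pinned_sha" ]; then
        return 0
    fi
    install_skills "$state_dir" "$pinned_sha"
}
```

Comparing the sentinel content against the pin, rather than only testing that the file exists, is what lets a bumped SHA in `install-pins.conf` trigger a reinstall on the next launch.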
-
-- [ ] **Step 3: Call it in the launch flow**
-
-After config is parsed and `$STATE_DIR` is known, add:
-
-```zsh
-ensure_skills_installed
-```
-
-- [ ] **Step 4: Verify**
-
-Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK`
-A second launch is a no-op (sentinel matches); an offline launch warns rather than failing.
-Expected: `SYNTAX_OK` + idempotent re-run.
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add swarmforge/scripts/swarmforge.sh swarmforge/scripts/install-pins.conf
-git commit -m "feat(swarmforge): pin-aware idempotent skill install at launch (ADR 0018)"
-```
-
----
-
-## C9: ADR 0013 / Idea J — agent-retro skill (net-new)
-
-upstream/main has no `skills/` dir — this is a net-new add. Source = `feat/issue-20-a-retro-skill-upgrade:swarmforge/skills/agent-retro/`.
-
-**Files:** Create `swarmforge/skills/agent-retro/`
-
-- [ ] **Step 1: Recover the skill files**
-
-```bash
-for f in $(git ls-tree -r --name-only feat/issue-20-a-retro-skill-upgrade -- swarmforge/skills/agent-retro); do
-    mkdir -p "$(dirname "$f")"
-    git show "feat/issue-20-a-retro-skill-upgrade:$f" > "$f"
-done
-```
-
-- [ ] **Step 2: Verify the four locked behaviors**
-
-```bash
-grep -c "pending-curation" swarmforge/skills/agent-retro/SKILL.md  # >= 1
-grep -ci "scope" swarmforge/skills/agent-retro/SKILL.md  # >= 2 (tag + table column)
-grep -ci "capture" swarmforge/skills/agent-retro/SKILL.md  # >= 1
-grep -c "session info --transcript\|.claude/projects" swarmforge/skills/agent-retro/SKILL.md  # >= 1
-```
-Expected: all thresholds met. If any is 0, re-check the source branch.
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add swarmforge/skills/agent-retro
-git commit -m "feat(swarmforge): add agent-retro skill — scoped, capture-first, autonomous (ADR 0013)"
-```
-
----
-
-## C10: ADR 0021 — retro-triage skill (net-new, byte-identical)
-
-Lives under `.claude/skills/` (operator-invoked), distinct from `swarmforge/skills/`. **Files:** Create `.claude/skills/retro-triage/SKILL.md`
-
-- [ ] **Step 1: Recover byte-identical**
-
-```bash
-mkdir -p .claude/skills/retro-triage
-git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md > .claude/skills/retro-triage/SKILL.md
-```
-
-- [ ] **Step 2: Verify**
-
-```bash
-git diff --no-index <(git show feat/issue-20-a-retro-skill-upgrade:.claude/skills/retro-triage/SKILL.md) .claude/skills/retro-triage/SKILL.md && echo IDENTICAL
-wc -l .claude/skills/retro-triage/SKILL.md
-```
-Expected: `IDENTICAL`, ~219 lines.
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add .claude/skills/retro-triage
-git commit -m "feat: restore retro-triage skill (ADR 0021)"
-```
-
----
-
-## C11: ADR 0003 + Idea O — setup-swarm skill, marker guard, scaffold
-
-NET-NEW skill design (no backup artifact). **Files:** Create `swarmforge/skills/setup-swarm/SKILL.md`; modify `swarmforge/scripts/swarmforge.sh`
-
-- [ ] **Step 1: Read the design recovery doc**
-
-Run: `cat docs/migrations/0003-setup-skill-sources.md`
-Confirm: setup is **setup-first** (operator runs `/setup-swarm` first); `./swarm` only **guards** on `.swarmforge/setup-complete` and refuses if absent (never auto-runs setup); skill named `setup-swarm`; Idea O folds in; the `entire` skill pins are NOT here (that is C8).
-
-- [ ] **Step 2: Author `setup-swarm/SKILL.md`**
-
-Mirror `agent-retro`'s SKILL.md shape; cover, per the design doc:
-- **Stack detection** (reason about the language → which quality tools/gates to install — *why* setup is a skill, not a script; don't over-prescribe the mechanism).
-- Install the project's mutation/CRAP/DRY tools (those stripped from cleaner/hardener/QA) and APS `gherkin-parser`/`gherkin-mutator` (stripped from coder/hardener).
-- Session tracking: `entire enable --no-github --telemetry=false`, then `entire agent add ` per unique backend in `swarmforge.conf` column 3; warn-and-continue if `entire` absent.
-- Permission allow-rules to `.claude/settings.json` (`Bash(gh pr merge*)` for integrator, `Bash(git reset --hard origin/*)` for specifier) — a small, advisory set, not a load-bearing whitelist (ADR 0019 `auto` already ships rails). -- Scaffold: ensure `.gitignore` covers `logbook.jsonl`, `tmp/`, `.swarmforge/`; probe the default branch (`git symbolic-ref refs/remotes/origin/HEAD`) and record it for the specifier's per-feature reset. -- Emit the swarm-ready marker `.swarmforge/setup-complete` (content: timestamp + swarmforge SHA — impl detail). - -- [ ] **Step 3: Add the marker guard to `swarmforge.sh`** - -Early in the launch flow (before any role launch; distinct from the `ensure_skills_installed` launcher bootstrap), add: - -```zsh -if [[ ! -f "$STATE_DIR/setup-complete" ]]; then - echo -e "${RED}Error:${RESET} project is not swarm-ready. Run /setup-swarm first." >&2 - exit 1 -fi -``` -The guard never runs setup; it only refuses. - -- [ ] **Step 4: Expand the gitignore/excludes scaffold (Idea O)** - -In `ensure_initial_gitignore`, add `logbook.jsonl`, `tmp/` (plus backup's `swarmtools/`/`logs/`/`agent_context/` if still relevant) — each as an idempotent `grep -qx || append` block and in the initial-creation heredoc. In `ensure_runtime_git_excludes`, expand the `for pattern in ...` loop to the same set. Add `remove_nonessential_clone_files` (recover from `backup/main-pre-reset`) and call it once in the init flow. - -- [ ] **Step 5: Verify** - -Run: `zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK` -Launching without the marker exits with "Run /setup-swarm first"; creating `.swarmforge/setup-complete` lets launch proceed; running twice doesn't duplicate `.gitignore` lines. -Expected: `SYNTAX_OK` + guard + idempotent gitignore. 
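
The idempotent `grep -qx || append` block from Step 4 might look like the sketch below. `append_ignore_pattern` is a hypothetical helper name, not a function in `swarmforge.sh`, and the `-F` flag is an addition here so that `.` in a pattern is matched literally.

```bash
# Append a pattern to a gitignore-style file only if no line matches it exactly.
# append_ignore_pattern is a hypothetical helper name used for illustration.
append_ignore_pattern() {
  local file="$1" pattern="$2"
  touch "$file"
  # -q quiet, -x whole-line match, -F literal string (no regex metacharacters).
  grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}
```

Calling it twice with the same pattern leaves a single line in the file, which is the property the Step 5 check ("running twice doesn't duplicate `.gitignore` lines") relies on.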
- -- [ ] **Step 6: Commit** - -```bash -git add swarmforge/skills/setup-swarm swarmforge/scripts/swarmforge.sh -git commit -m "feat(swarmforge): setup-swarm skill + swarm-ready marker guard + scaffold (ADR 0003, Idea O)" -``` - ---- - -## Finalize PR 1 (MAIN) - -- [ ] **Step 1: Whole-track verification** - -```bash -zsh -n swarmforge/scripts/swarmforge.sh && echo SYNTAX_OK -git diff --stat origin/main # review: only intended files changed, all additive -``` -Expected: `SYNTAX_OK`; the diff touches only `swarmforge/scripts/*`, `swarmforge/skills/*`, `.claude/skills/retro-triage/*` — no role prompts, no conf (those are PR 2). - -- [ ] **Step 2: Push the branch** - -```bash -git push -u origin feat/fork-divergences-main -``` - -- [ ] **Step 3: Open the single PR** - -```bash -gh pr create --base main --repo gabadi/swarm-forge \ - --title "feat: fork divergences — main script + skill layer" \ - --body "Re-applies the main-side fork divergences on pristine upstream, one commit per ADR: 0019 auto-permission, 0017 bundle inlining, 0014 knowledge injection, 0012 per-role config, 0020 auto-compaction, 0006 QA holdout, 0002 executing-fields, 0018 skill install, 0013 agent-retro, 0021 retro-triage, 0003 setup-swarm + Idea O. cmux dropped; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Sections A + C)." -``` - ---- - -# SIX-PACK TRACK → PR 2 - -## Setup: create the six-pack branch - -- [ ] **Create the single branch for all SIX-PACK commits** - -```bash -git fetch origin && git switch -c feat/fork-divergences-six-pack origin/six-pack -# If origin/six-pack has advanced past the recorded baseline, branch off the tag instead: -# git switch -c feat/fork-divergences-six-pack fork-base/2026-06-14-six-pack -``` -All D1–D14 commits land on this one branch. This PR is squash-merged (fork-divergence policy, ADR 0001). 
- ---- - -## D1: ADR 0002 — idle-gate + agent-retro line (all roles) - -**Files:** Modify `swarmforge/roles/{specifier,coder,cleaner,architect,hardender,QA}.prompt` - -- [ ] **Step 1: Add the idle-gate line** - -After the `You are the .` opening of each of the six prompts, insert a blank line then: - -``` -Wait for a handoff. Do not act without one. -``` - -- [ ] **Step 2: Add the agent-retro line** - -As the last bullet of each role's Handoff section: - -``` -- Run `agent-retro` before going idle. -``` - -- [ ] **Step 3: Verify** - -```bash -for r in specifier coder cleaner architect hardender QA; do - grep -q "Wait for a handoff. Do not act without one." "swarmforge/roles/$r.prompt" || echo "MISSING idle-gate: $r" - grep -q "agent-retro\` before going idle" "swarmforge/roles/$r.prompt" || echo "MISSING retro: $r" -done; echo CHECKED -``` -Expected: only `CHECKED`. - -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/roles -git commit -m "feat(roles): idle-gate + agent-retro-before-idle on every role (ADR 0002)" -``` - ---- - -## D2: ADR 0003 — strip startup-install directives - -Install work moves to the setup-swarm skill (C11). **Files:** Modify `swarmforge/roles/{coder,QA,cleaner,hardender}.prompt` - -- [ ] **Step 1: Strip the directives** - -- `coder.prompt`: remove the entire `## Acceptance Pipeline` block (the "At startup, make sure the normal acceptance pipeline …" bullets, ~L8–14). -- `QA.prompt`: remove the `## Startup Tools` section (~L6–7). -- `cleaner.prompt`: remove the "At startup, install the language mutation, CRAP, and DRY tools …" line (~L19). -- `hardender.prompt`: remove the `## Startup Tools` section + APS build line (~L7–10). - -- [ ] **Step 2: Verify** - -```bash -grep -rn "At startup" swarmforge/roles/ ; echo "--- (expect no startup-install directives remain)" -``` -Expected: no remaining "At startup, install/make-ready" directives. 
- -- [ ] **Step 3: Commit** - -```bash -git add swarmforge/roles -git commit -m "refactor(roles): remove startup install directives — moved to setup-swarm (ADR 0003)" -``` - ---- - -## D3: ADR 0004 — back-routing rule - -No backup source — author fresh from ADR 0004. **Files:** Modify the rework-owning role prompts (coder, cleaner, architect, hardender, QA) - -- [ ] **Step 1: Read the ADR for the exact mechanic** - -Run: `cat docs/adr/0004-rework-routes-back.md` -Confirm: structural finding (re-opens an earlier stage's job) → routes to that origin stage, carried in the handoff; local work stays with the finder; a single finding bounces back at most once; a feature tolerates N=3 cycles total (routing count in the handoff trail); on exceeding, stop and ask the user. - -- [ ] **Step 2: Insert a `## Rework Routing` section before each role's Handoff** - -``` -## Rework Routing -- A structural finding — one that re-opens an earlier stage's decision (an ambiguous or missing spec, a weak or missing test, a design that cannot hold the required behavior) — routes back to the stage that owns that decision, carried in the handoff. -- Local work you can resolve without re-opening an earlier decision stays with you; fix it in place. -- A single finding bounces back at most once. A feature tolerates at most three back-route cycles total (N=3), tracked by the routing count in the handoff trail. On the fourth, stop and ask the user. -``` - -- [ ] **Step 3: Verify** - -```bash -for r in coder cleaner architect hardender QA; do grep -q "## Rework Routing" "swarmforge/roles/$r.prompt" || echo "MISSING: $r"; done; echo CHECKED -``` -Expected: only `CHECKED`. 
- -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/roles -git commit -m "feat(roles): structural-finding back-routing with N=3 cap (ADR 0004)" -``` - ---- - -## D4: ADR 0009 — spec-header template + specifier wiring - -**Files:** Create `swarmforge/templates/feature.feature`; modify `swarmforge/roles/specifier.prompt` - -- [ ] **Step 1: Recover the template** - -```bash -mkdir -p swarmforge/templates -git show backup/six-pre-reset:swarmforge/templates/feature.feature > swarmforge/templates/feature.feature -``` -Confirm all eight comment sections: `TRACKING`, `CONTRACT`, `CONSTRAINTS`, `SEQUENCING`, `NFR`, `SIDE EFFECTS`, `SCOPE`, `UX INTENT`. - -- [ ] **Step 2: Wire the specifier** - -In Feature Workflow phase 1: start from the template and address all eight header sections (several may resolve to `none` — a deliberate answer) before scenarios. Change any "seven" header-count wording to **"eight"** / "all". - -- [ ] **Step 3: Verify** - -```bash -grep -c "^ # \(TRACKING\|CONTRACT\|CONSTRAINTS\|SEQUENCING\|NFR\|SIDE EFFECTS\|SCOPE\|UX INTENT\)" swarmforge/templates/feature.feature # 8 -grep -n "template\|eight" swarmforge/roles/specifier.prompt -grep -c "seven" swarmforge/roles/specifier.prompt # 0 -``` -Expected: 8 sections; specifier references the template + "eight"; no "seven". - -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/templates/feature.feature swarmforge/roles/specifier.prompt -git commit -m "feat(spec): 8-section feature template; specifier starts from it (ADR 0009)" -``` - ---- - -## D5: ADR 0011 — fidelity manifest + specifier check - -**Files:** Create `swarmforge/dependency-manifest.prompt`; modify `swarmforge/roles/specifier.prompt` - -- [ ] **Step 1: Recover the manifest (with its Rules section)** - -```bash -git show feat/baseline-scenarios-six:swarmforge/dependency-manifest.prompt > swarmforge/dependency-manifest.prompt -``` -⚠ From `feat/baseline-scenarios-six`, NOT `obs-harness-six` (which over-deleted the Rules section). 
Confirm the 3 tier defs, a `Rules for every declared dependency:` section, and a `## Dependencies` body of `(none)`. - -- [ ] **Step 2: Wire the specifier** - -Add a `## Dependency Manifest` instruction before Feature Workflow: read the manifest before scenarios; on a scenario touching an undeclared external system → stop, propose name/tier/implementation/gaps, wait for approval before adding the entry; never write scenarios resting on an undeclared dependency or a declared gap. Recover exact wording from `backup/six-pre-reset:.../specifier.prompt` or `feat/issue-20-c:.../specifier.prompt` (NOT pipeline-order, which dropped it). - -- [ ] **Step 3: Verify** - -```bash -grep -ci "tier" swarmforge/dependency-manifest.prompt # >= 3 -grep -q "Rules for every declared dependency" swarmforge/dependency-manifest.prompt && echo RULES_OK -grep -q "dependency-manifest" swarmforge/roles/specifier.prompt && echo SPECIFIER_WIRED -``` -Expected: tiers present, `RULES_OK`, `SPECIFIER_WIRED`. - -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/dependency-manifest.prompt swarmforge/roles/specifier.prompt -git commit -m "feat(spec): dependency fidelity manifest + specifier propose-on-undeclared (ADR 0011)" -``` - ---- - -## D6: ADR 0010 — surface harness (engineering article + QA) - -**Files:** Modify `swarmforge/constitution/articles/engineering.prompt`, `swarmforge/roles/QA.prompt` - -- [ ] **Step 1: Add the surface-tool table to `engineering.prompt`** - -Recover the table + context-driven acquisition rule from `backup/six-pre-reset:swarmforge/constitution/articles/engineering.prompt` and merge onto current upstream (a `## Surface Tools` section: tmux/PTY · Playwright · HTTP client · ingress event-injection; live-verification roles pick the minimal sufficient tool per surface). - -- [ ] **Step 2: Edit QA for surface-harness verification** - -In `QA.prompt`: -- Replace "through the user interface only" → "through the project surface harness only". 
-- Add: every Expected bullet maps to a harness assertion, or is `NOT AUTOMATED — `; asserting constants/config never satisfies a behavioral assertion. -- Add: re-execute the committed `observation-harness/` scenarios before final verification; a user-facing surface with no scenarios routes back (per D3). -- Add the per-surface **baseline scenario** requirement (idle stability / no console errors / no-op event = no state change). - -- [ ] **Step 3: Verify** - -```bash -grep -qi "surface" swarmforge/constitution/articles/engineering.prompt && echo ENG_OK -grep -q "project surface harness only" swarmforge/roles/QA.prompt && echo QA_SURFACE_OK -grep -q "observation-harness" swarmforge/roles/QA.prompt && echo QA_OBS_OK -grep -c "user interface only" swarmforge/roles/QA.prompt # 0 -``` -Expected: `ENG_OK`, `QA_SURFACE_OK`, `QA_OBS_OK`, zero "user interface only". - -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/constitution/articles/engineering.prompt swarmforge/roles/QA.prompt -git commit -m "feat(qa): declared surface-harness verification + baseline scenarios (ADR 0010)" -``` - ---- - -## D7: ADR 0005 — refuting QA posture - -No backup source for the refute posture — author fresh; merge with D6's surface wording. **Files:** Modify `swarmforge/roles/QA.prompt` - -- [ ] **Step 1: Replace the confirm posture with refute** - -Replace the "Fix bugs found by the QA suite or final verification." line and surrounding confirm framing with: - -``` -- Assume the build does not meet the spec and the acceptance tests are too weak to notice, until proven otherwise. Attack the specified contract — try to make it fail within the spec — rather than run a checklist and confirm. -- Stay bounded by the spec: a gap the spec never settled is not a QA pass/fail; route it back to the specifier (per Rework Routing). -- Enforce conversion fidelity: a QA procedure converted into an executable script must encode the procedure's full intent. 
A green script that asserts nothing is test theater and is itself a defect. -- A structural finding (weak/missing test, ambiguous spec) routes back; a local defect you can fix without re-opening an earlier stage you fix in place. -``` - -- [ ] **Step 2: Confirm against the ADR** - -Run: `cat docs/adr/0005-qa-refutes-not-confirms.md` -Ensure the text matches the ADR's intent (refute, spec-bounded, conversion fidelity / no test theater). - -- [ ] **Step 3: Verify** - -```bash -grep -qi "assume the build does not meet the spec" swarmforge/roles/QA.prompt && echo REFUTE_OK -grep -ci "test theater\|asserts nothing" swarmforge/roles/QA.prompt # >= 1 -grep -c "Fix bugs found by the QA suite" swarmforge/roles/QA.prompt # 0 -``` -Expected: `REFUTE_OK`, conversion-fidelity line present, old confirm line gone. - -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/roles/QA.prompt -git commit -m "feat(qa): refute posture — attack the contract, no test theater (ADR 0005)" -``` - ---- - -## D8: ADR 0007 — UX Engineer role - -**Files:** Create `swarmforge/roles/ux-engineer.prompt`; modify `swarmforge/roles/coder.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/swarmforge.conf` - -- [ ] **Step 1: Recover the ux-engineer role** - -```bash -git show backup/six-pre-reset:swarmforge/roles/ux-engineer.prompt > swarmforge/roles/ux-engineer.prompt -``` -⚠ From `backup/six-pre-reset` (≡ `origin/feat/obs-harness-six`), NOT pipeline-order/baseline (they lack the `observation-harness/` commit step). **STRIP** DESIGN.md scaffold-on-absence + walk-up; make DESIGN.md fix-authority conditional on a feature-file reference, not tree discovery. Ensure it carries: the idle-gate line, the N=3 back-route to coder, the `observation-harness/` commit step, golden snapshots + rendering invariants, the `## Visual quality standards` block (WCAG 4.5:1 / 3:1), notify→cleaner. 
- -- [ ] **Step 2: Wire coder + specifier** - -- `coder.prompt`: add a "read the feature's `## UX Intent` and implement from it alongside the Gherkin" line; change handoff `notify the cleaner` → `notify the ux-engineer`. -- `specifier.prompt`: add UX INTENT authoring (it authors the feature file's `## UX Intent` section — concrete observable statements across Visual Composition / Information Hierarchy / Interaction Feel / State Transitions). STRIP any DESIGN.md scaffold/walk-up here too (reference-from-feature-file only). - -- [ ] **Step 3: Add the conf window after coder** - -In `swarmforge.conf`, after the coder line: -``` -window ux-engineer codex ux-engineer -``` - -- [ ] **Step 4: Verify** - -```bash -grep -q "Wait for a handoff" swarmforge/roles/ux-engineer.prompt && echo UX_IDLE_OK -grep -q "observation-harness" swarmforge/roles/ux-engineer.prompt && echo UX_OBS_OK -grep -c "scaffold" swarmforge/roles/ux-engineer.prompt # 0 -grep -q "notify the ux-engineer" swarmforge/roles/coder.prompt && echo CODER_OK -grep -q "window ux-engineer" swarmforge/swarmforge.conf && echo CONF_OK -``` -Expected: `UX_IDLE_OK`, `UX_OBS_OK`, zero scaffold, `CODER_OK`, `CONF_OK`. - -- [ ] **Step 5: Commit** - -```bash -git add swarmforge/roles/ux-engineer.prompt swarmforge/roles/coder.prompt swarmforge/roles/specifier.prompt swarmforge/swarmforge.conf -git commit -m "feat(roles): UX Engineer after coder; UX Intent authoring + read (ADR 0007)" -``` - ---- - -## D9: ADR 0008 — integrator role + specifier stops merging - -**Files:** Create `swarmforge/roles/integrator.prompt`; modify `swarmforge/roles/specifier.prompt`, `swarmforge/roles/QA.prompt`, `swarmforge/swarmforge.conf` - -- [ ] **Step 1: Recover the integrator role + apply the FIX** - -```bash -git show backup/six-pre-reset:swarmforge/roles/integrator.prompt > swarmforge/roles/integrator.prompt -``` -⚠ From `backup/six-pre-reset` (≡ `feat/issue-20-c`), NOT baseline-scenarios-six (still says "notify specifier"). 
**FIX step 7** to: `Notify the curator that the feature has landed. Include the specifier handoff name and the post-merge master commit hash.` Confirm: one PR/feature, autofix-lint-only, branch → `gh pr create` → watch CI → green `gh pr merge --squash --delete-branch` + post-merge gate, CI-red routing (tests→coder, coverage/CRAP/DRY→cleaner, arch→architect; autofix doesn't count; N=3 then `FAILED: depth cap reached`), idle-gate line, agent-retro line. - -- [ ] **Step 2: Specifier stops merging + per-feature reset** - -In `specifier.prompt`: -- Drop the merge step (upstream's "merge the changes and ask the user", ~L36); replace the completion line with a placeholder D10 finalizes — for now: "When the work is landed, ask the user for the next feature to add." -- Add the per-feature worktree reset: on receiving a handoff, `git reset --hard "origin/$(git symbolic-ref refs/remotes/origin/HEAD | sed 's|refs/remotes/origin/||')"` in the specifier's own worktree (recover the exact form from `feat/six-pack-pipeline-order-and-scaffold`). STRIP any `git merge --ff-only origin/master` startup line. - -- [ ] **Step 3: QA hands off to integrator + conf windows** - -- `QA.prompt`: change the final handoff to `notify the integrator` (replacing the broadcast list). -- `swarmforge.conf`: change line 1 `window specifier codex master` → `window specifier codex specifier`; insert after QA: `window integrator codex integrator`. 
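
For the per-feature reset in Step 2, note that if `refs/remotes/origin/HEAD` is unset, the inline command substitution yields an empty string and the reset target becomes `origin/`. A guarded probe could be sketched as follows; both function names are hypothetical, and the exact form to ship should still be recovered from `feat/six-pack-pipeline-order-and-scaffold`.

```bash
# default_branch_from_ref is a hypothetical helper: it performs only the
# string step (same effect as the sed expression in Step 2).
default_branch_from_ref() {
  # refs/remotes/origin/main -> main
  printf '%s\n' "${1#refs/remotes/origin/}"
}

# Probe origin/HEAD; print nothing and return non-zero if it is not set.
probe_default_branch() {
  local ref
  ref="$(git symbolic-ref --quiet refs/remotes/origin/HEAD 2>/dev/null)" || return 1
  default_branch_from_ref "$ref"
}
```

A caller can then refuse to reset when the probe fails, for example `branch="$(probe_default_branch)" && git reset --hard "origin/$branch"`.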
- -- [ ] **Step 4: Verify** - -```bash -grep -q "Notify the curator" swarmforge/roles/integrator.prompt && echo INT_FIX_OK -grep -q "post-merge master commit hash" swarmforge/roles/integrator.prompt && echo INT_HASH_OK -grep -q "notify the integrator" swarmforge/roles/QA.prompt && echo QA_INT_OK -grep -q "symbolic-ref" swarmforge/roles/specifier.prompt && echo SPEC_RESET_OK -grep -q "window specifier codex specifier" swarmforge/swarmforge.conf && echo CONF_SPEC_OK -grep -q "window integrator" swarmforge/swarmforge.conf && echo CONF_INT_OK -grep -c "codex master" swarmforge/swarmforge.conf # 0 -``` -Expected: all six `*_OK`, zero `codex master`. - -- [ ] **Step 5: Commit** - -```bash -git add swarmforge/roles/integrator.prompt swarmforge/roles/specifier.prompt swarmforge/roles/QA.prompt swarmforge/swarmforge.conf -git commit -m "feat(roles): terminal integrator; specifier stops merging, runs own worktree (ADR 0008)" -``` - ---- - -## D10: ADR 0013 — curator role + chain rewiring - -Authoritative source = the locked spec's PR-C2 block (budgets **60/40**, NOT the stale 150/300 on artifact branches). **Files:** Create `swarmforge/roles/curator.prompt`; modify `swarmforge/roles/integrator.prompt`, `swarmforge/roles/specifier.prompt`, `swarmforge/constitution/articles/workflow.prompt`, `swarmforge/swarmforge.conf` - -- [ ] **Step 1: Extract the curator from the locked spec** - -Run: `git show feat/issue-20-b-bundle-knowledge-injection:docs/specs/issue-20-knowledge-promotion-loop.md` -Copy the **PR-C2 verbatim block** into `swarmforge/roles/curator.prompt`. 
Confirm: idle-gate; writes only `AGENTS.md` + `.agents/`; sources `~/.claude/worklog/retros/*.md`; the routing ladder (enforcement-gate backlog → AGENTS.md ≤60 → role files ≤40 → references → skills-on-2nd → upstream → ledger); ledger line `date | session-id | role | failure-class | verdict | summary`; lifecycle (empty-run pass-through, knowledge branch, self-merging PR with metric line, move retros to `processed/`, notify specifier); 9-check per-item algorithm. **Budgets must read 60 and 40.** - -- [ ] **Step 2: Rewire the chain** - -- `integrator.prompt`: confirm step 7 notifies the curator (done in D9); fix if drifted. -- `specifier.prompt`: change the wait line to "When the **curator** notifies you that the job is complete, run the per-feature reset, then ask the user for the next feature. The curator's handoff means the knowledge PR for the previous feature has already landed." -- `workflow.prompt`: append: "The landing chain is integrator → curator → specifier. The curator promotes retro knowledge before the specifier is released; an empty curation run notifies the specifier immediately — the pipeline never stalls on the curator." -- `swarmforge.conf`: append last: `window curator codex curator`. - -- [ ] **Step 3: Verify** - -```bash -grep -q "Wait for a handoff" swarmforge/roles/curator.prompt && echo CUR_IDLE_OK -grep -Eq "60" swarmforge/roles/curator.prompt && grep -Eq "40" swarmforge/roles/curator.prompt && echo BUDGETS_OK -grep -c "150\|300" swarmforge/roles/curator.prompt # 0 -grep -q "When the curator notifies you" swarmforge/roles/specifier.prompt && echo SPEC_CUR_OK -grep -qi "integrator.*curator.*specifier" swarmforge/constitution/articles/workflow.prompt && echo WF_OK -grep -c "^window" swarmforge/swarmforge.conf # 9 -``` -Expected: `CUR_IDLE_OK`, `BUDGETS_OK`, zero 150/300, `SPEC_CUR_OK`, `WF_OK`, and **9** windows (specifier, coder, ux-engineer, cleaner, architect, hardender, QA, integrator, curator). 
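
The `grep -c "^window"` check counts windows but does not confirm their order. A small order check could be sketched as below. `conf_window_order` is a hypothetical helper, and it assumes the window name is column 2 of each `window` line, which is inferred from the conf lines quoted in this plan rather than from a documented format.

```bash
# Print the window names from a swarmforge.conf-style file on one line,
# in file order. Assumes each entry looks like: window NAME BACKEND ROLE
# (layout inferred from this plan). Non-window lines are ignored.
conf_window_order() {
  awk '$1 == "window" { printf "%s%s", sep, $2; sep = " " } END { print "" }' "$1"
}
```

The result can be compared against the expected chain, for example `[ "$(conf_window_order swarmforge/swarmforge.conf)" = "specifier coder ux-engineer cleaner architect hardender QA integrator curator" ] && echo ORDER_OK`.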
- -- [ ] **Step 4: Commit** - -```bash -git add swarmforge/roles/curator.prompt swarmforge/roles/specifier.prompt swarmforge/constitution/articles/workflow.prompt swarmforge/swarmforge.conf -git commit -m "feat(roles): terminal curator; integrator->curator->specifier chain (ADR 0013)" -``` - ---- - -## D11: ADR 0015 — platform-feasibility stop rule - -**Files:** Modify `swarmforge/constitution/articles/workflow.prompt` - -- [ ] **Step 1: Add the stop rule** - -Append to `workflow.prompt`: - -``` -## Platform Feasibility -- When the spec and the platform conflict — the spec calls for a capability the target platform does not provide — stop and report instead of working around it. A workaround comment ("we can't do X here, so we do Y") is a defect, not a resolution. Wait for a spec revision. -``` - -- [ ] **Step 2: Verify** - -```bash -grep -qi "platform" swarmforge/constitution/articles/workflow.prompt && grep -qi "workaround.*defect" swarmforge/constitution/articles/workflow.prompt && echo OK -``` -Expected: `OK`. - -- [ ] **Step 3: Commit** - -```bash -git add swarmforge/constitution/articles/workflow.prompt -git commit -m "feat(workflow): platform-feasibility stop rule (ADR 0015)" -``` - ---- - -## D12: ADR 0016 — cleaner boundary-file scan - -**Files:** Modify `swarmforge/roles/cleaner.prompt` - -- [ ] **Step 1: Add the boundary-file rule** - -Recover the cleanest wording from `feat/baseline-scenarios-six:swarmforge/roles/cleaner.prompt`. After the ">100 mutation sites → split" rule, add: - -``` -- Also run the mutation scan/count mode on boundary files (the environmentally unsuitable modules excluded from the test tools). If a boundary file exceeds ~15 mutation sites, it holds implementation logic, not adaptation — extract that logic to a testable module before handoff. -- Treat a test that asserts only a stripped or simplified view of output (e.g. ANSI-stripped text when the real output carries escape codes) as not covering the un-stripped behavior. 
Add coverage for the full output. -``` - -- [ ] **Step 2: Verify** - -```bash -grep -qi "boundary" swarmforge/roles/cleaner.prompt && grep -q "15" swarmforge/roles/cleaner.prompt && echo BOUNDARY_OK -grep -qi "stripped" swarmforge/roles/cleaner.prompt && echo STRIPPED_OK -``` -Expected: `BOUNDARY_OK`, `STRIPPED_OK`. - -- [ ] **Step 3: Commit** - -```bash -git add swarmforge/roles/cleaner.prompt -git commit -m "feat(cleaner): boundary-file mutation scan at ~15 sites; stripped-view anti-pattern (ADR 0016)" -``` - ---- - -## D13: hardener rendering-invariant property tests (manifest row, no ADR) - -Unmanifested divergence found in audit; consistent with ADR 0007/0010. **Files:** Modify `swarmforge/roles/hardender.prompt` - -- [ ] **Step 1: Add the rendering-invariant line** - -Recover the exact text from `backup/six-pre-reset:swarmforge/roles/hardender.prompt` (~L18) and merge it in (don't lift the whole file). Rule: for pure rendering functions (state → string, no side effects), add property tests asserting structural invariants — required elements present per state, character set bounded to the declared vocabulary, mutually exclusive states never co-rendered. Confirm D2 already stripped Startup Tools and the unauthorized "merge all queued architect handoffs together" line is absent (keep upstream's sorted-filename batch). - -- [ ] **Step 2: Verify** - -```bash -grep -qi "rendering" swarmforge/roles/hardender.prompt && grep -qi "property test\|invariant" swarmforge/roles/hardender.prompt && echo OK -grep -c "merge all queued architect handoffs" swarmforge/roles/hardender.prompt # 0 -``` -Expected: `OK`, zero unauthorized merge-all line. - -- [ ] **Step 3: Commit** - -```bash -git add swarmforge/roles/hardender.prompt -git commit -m "feat(hardener): property tests for pure rendering functions (manifest row)" -``` - ---- - -## D14: ADR 0018 — root `swarm` upgrade subcommand + self-url - -The main script half (skill install) is C8. This is the runnable-branch half. 
**Files:** Modify the root `swarm` bootstrap (exists on `six-pack`) - -- [ ] **Step 1: Inspect current + recover the target deltas** - -Run: `git show origin/six-pack:swarm | head -60` -Run: `git show 8994322:swarm 2>/dev/null | head -120` (adds `upgrade`/`write_source_branch`/`download_from_main`) and `git show ded6019:swarm 2>/dev/null | head -40` (self-url). -Merge the minimal deltas onto the current six-pack root `swarm`: -- `SCRIPTS_REPO="${SWARMFORGE_SCRIPTS_REPO:-gabadi/swarm-forge}"` (self-referencing; replaces hardcoded `unclebob/swarm-forge`). -- `download_from_main` (refresh scripts + skills from `main`). -- `write_source_branch` (record the runnable source branch in `.swarmforge/source-branch`). -- The `upgrade` subcommand: refresh scripts(main) + prompts(`source-branch`) + force skill reinstall (clear `.swarmforge/skills-installed`). - -- [ ] **Step 2: Verify** - -```bash -grep -q "gabadi/swarm-forge" swarm && echo SELF_URL_OK -{ grep -q "upgrade)" swarm || grep -q '"upgrade"' swarm; } && echo UPGRADE_OK -{ zsh -n swarm 2>/dev/null || bash -n swarm; } && echo SYNTAX_OK -``` -Expected: `SELF_URL_OK`, `UPGRADE_OK`, `SYNTAX_OK`. - -- [ ] **Step 3: Commit** - -```bash -git add swarm -git commit -m "feat(swarm): self-url + upgrade subcommand with source-branch tracking (ADR 0018)" -``` - ---- - -## Finalize PR 2 (SIX-PACK) - -- [ ] **Step 1: Whole-track verification** - -```bash -grep -c "^window" swarmforge/swarmforge.conf # 9, in order -for r in specifier coder ux-engineer cleaner architect hardender QA integrator curator; do - test -f "swarmforge/roles/$r.prompt" || echo "MISSING role file: $r" -done; echo ROLES_CHECKED -git diff --stat origin/six-pack # review: only prompts/articles/templates/conf/swarm changed -``` -Expected: 9 windows; all 9 role files present; `ROLES_CHECKED`; the diff touches only six-pack-owned files. 
- -- [ ] **Step 2: Push the branch** - -```bash -git push -u origin feat/fork-divergences-six-pack -``` - -- [ ] **Step 3: Open the single PR** - -```bash -gh pr create --base six-pack --repo gabadi/swarm-forge \ - --title "feat: fork divergences — six-pack prompts + constitution + conf" \ - --body "Re-applies the six-pack fork divergences on pristine upstream, one commit per ADR: 0002 idle-gate, 0003 startup-strip, 0004 back-routing, 0009 spec header, 0011 fidelity manifest, 0010 surface harness, 0005 refute QA, 0007 UX engineer, 0008 integrator, 0013 curator, 0015 platform-feasibility, 0016 cleaner boundary scan, hardener invariants, 0018 root swarm upgrade. Final pipeline: specifier→coder→ux-engineer→cleaner→architect→hardener→QA→integrator→curator (9 windows). DESIGN.md reference-only; curator budgets 60/40; four-pack frozen. See docs/superpowers/plans/2026-06-14-fork-divergence-implementation.md and docs/fork-change-manifest.md (Section B)." -``` - ---- - -## Out of scope (explicitly NOT implemented) - -- **four-pack PR** — frozen (manifest 2026-06-14): pure merge-mirror of `upstream/four-pack`. The issue-20 spec's "PR D on four-pack" is **dropped**. -- **cmux multiplexer** (`swarm-mux.sh`, `write_deliver_script`/`write_notify_script`/`write_stop_hook`, `MUX_TARGETS`) — DROPPED; stay on upstream's tmux harness. -- **Ideas G, H, I** — genuinely rejected, no recovery. -- **DESIGN.md scaffolding** — ADR 0007 wins: reference-from-feature-file only; recovered roles STRIP scaffold-on-absence + walk-up. -- **curator budgets 150/300** — superseded by the locked spec's 60/40. 
- ---- - -## Self-Review - -**Spec coverage** (manifest sections A/B/C + cross-cutting): -- Section A (main → PR 1): 0006→C6, 0012→C4, 0014→C3, 0013-skill→C9, 0003→C11 ✓ -- Section B (six-pack → PR 2): 0002→D1, 0009→D4, 0011→D5, 0010→D6, 0005→D7, 0004→D3, 0007→D8, 0008→D9, 0013→D10, 0015→D11, 0016→D12, hardener-row→D13 ✓ -- Section C (uncaptured): B/0017→C2, F/0020→C5, J→C9, N/0018→C8+D14, O→C11, auto-permission/0019→C1, executing-fields→C7, retro-triage/0021→C10, self-url→D14 ✓ -- Cross-cutting: observation-harness shared (D6 QA re-exec, D8 ux-engineer writes, D13 hardener honors); N=3 back-route (D3, carried by D8/D9); refute+surface QA merged across D6→D7; DESIGN.md reference-only (D8); curator chain order (D10) ✓ - -**Structure:** exactly two branches (`feat/fork-divergences-main`, `feat/fork-divergences-six-pack`), one PR each; per-divergence commits in a linear, dependency-correct order on each branch; no per-ADR branches, no four-pack PR. - -**Within-branch ordering:** MAIN — only hard dep is C3 after C2; all `swarmforge.sh` commits are linear so no in-file conflict. SIX-PACK — D1:` + specific STRIP/FIX deltas; verification commands are concrete with expected output. - -**Naming consistency:** `resolve_prompt_bundle`, `write_agent_instruction_file`, `write_worktree_advisor`, `write_worktree_permissions`, `ensure_skills_installed`, `install_skills` consistent across C2/C3/C4/C5/C8/C11; markers `.swarmforge/setup-complete` / `.swarmforge/skills-installed` consistent; conf window names match across D8/D9/D10. - -**Known soft spots to confirm during execution (not blockers):** -- C2/C3/C4 line numbers drift — locate by function name. -- C7 executing-fields: find the actual executing-entry write site in the upstream handoff scripts (NOT a `swarmforge.sh` heredoc as on the cmux lineage). -- C6 QA holdout path (`qa-e2e`) must match the specifier-authored path — keep the one `QA_HOLDOUT_PATH` constant as the single source of truth. 
-- D10 curator: budgets are 60/40 from the locked spec, not 150/300. diff --git a/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md b/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md deleted file mode 100644 index d8c1b1d..0000000 --- a/docs/superpowers/specs/2026-06-14-adr0002-clear-first-delivery-design.md +++ /dev/null @@ -1,103 +0,0 @@ -# ADR 0002 Clear-First Delivery Engine — Design Spec - -**Date:** 2026-06-14 -**Branch:** feat/fork-divergences-main → PR #31 -**Status:** Approved for TDD implementation - -## Problem - -ADR 0002 specifies a clear-first delivery engine for claude-backend roles. When the engine was designed, the cmux multiplexer provided the Stop hook. When cmux was dropped, the engine was lost. The executing-fields pending item was addressed (commit `7af75c3`) but the engine itself was never built. Upstream types each handoff directly into the terminal with no context clear; the fork requires `/clear` → re-inject bundle → deliver task. - -## Scope - -Claude backend only. `codex`/`grok`/`copilot` keep upstream delivery (direct `notify-agent.sh` tmux path) unchanged. The `claude`/`codex` choice is a per-role config knob (ADR 0012); this ADR only covers the claude path. - -## Shared State - -**Pending queue:** `.swarmforge/handoffs/queue/pending//` -- Files named `---.txt` -- Content: full protocol message (same envelope `send-handoff.sh` builds) -- Written by sender; drained by Stop hook - -**Busy marker:** `.swarmforge/.busy` -- Present = role is executing; absent = idle and accepting delivery -- Created atomically with zsh `noclobber` (`set -C; > file`); only the winner delivers - -## Delivery Function (`handoff-lib.sh`) - -``` -handoff_clear_first_deliver -``` - -1. Look up target tmux session from `.swarmforge/sessions.tsv` -2. Read socket from `.swarmforge/tmux-socket`; also try `TMUX` env var fallback (same as `notify-agent.sh`) -3. Send `/clear\n` to tmux session; `sleep 1` -4. 
If `.swarmforge/prompts/.md` exists: send its content + C-m + C-j; `sleep 0.5` -5. Send protocol message content + C-m + C-j - -**No logbook write here.** The delivery function is called from both the sender (idle path) and the Stop hook (busy path). The sender's `$PWD` is the wrong worktree for the receiver's logbook. Instead: -- **Stop hook writes `executing`** to `$PWD/logbook.jsonl` (correct — hook runs in receiver's worktree) before calling this function -- **Idle path** gets its `executing` entry from `complete-handoff.sh` (called by the agent after receipt) — same as upstream - -## Idle Path (`send-handoff.sh`) - -After building the protocol message, replace the direct `notify-agent.sh ` call with: - -1. Look up target agent type from `.swarmforge/sessions.tsv` (agent column) -2. If not `claude`: fall back to existing `notify-agent.sh "$TARGET" --file "$ARCHIVE_FILE"` (unchanged) -3. If `claude`: - a. Write message to pending queue dir - b. Attempt atomic `set -C; > .swarmforge/.busy` - c. If succeeded (was idle): call `handoff_clear_first_deliver` → role is now busy - d. If failed (already busy): message stays in pending queue; Stop hook drains it - -## Busy Path (`swarm-stop.sh` — new Stop hook) - -Receives JSON on stdin from Claude Code (`session_id`, `cwd`, `hook_event_name`). - -1. Read `SWARMFORGE_ROLE`; if unset, exit 0 (not a swarmforge role) -2. Derive `project_dir` from `cwd` field in stdin JSON (or git fallback) -3. Re-check pending queue — **do NOT noclobber here** (in the normal busy-path the marker is already set from the delivery that started this task; noclobber would cause the hook to bail and never drain the queue) -4. If queue non-empty: `touch .busy` (ensure busy, idempotent), write `executing` logbook entry to `$PWD/logbook.jsonl` (hook runs in receiver's worktree), call `handoff_clear_first_deliver`, remove pending file, exit 0 (marker stays = busy) -5. 
If queue empty: delete `.busy` marker (role goes idle), exit 0 - -The ADR's "re-check before declaring idle" race closure: between the `ls` returning empty and the `rm .busy`, a sender may have written to the pending dir. That sender's `noclobber` wins (`.busy` was just deleted) and delivers immediately. No message is lost. - -## Settings Wiring (`write_worktree_settings`) - -Third parameter: `stop_script` (absolute path to `swarm-stop.sh`). - -Python RMW adds to `.claude/settings.local.json`: -```json -{ - "hooks": { - "Stop": [{"matcher": "", "hooks": [{"type": "command", "command": ""}]}] - } -} -``` - -Called for ALL claude roles in `launch_role` (not just advisor-having ones): -```sh -write_worktree_settings "$role_worktree" "$role_advisor" "$role_script_dir/swarm-stop.sh" -``` - -Non-claude roles: existing call pattern (advisor only, no stop script). - -## Launch Change (PR comment 2 resolution) - -Drop the positional `"$(cat '$prompt_file')"` from the claude `launch_cmd`. The session starts with `--append-system-prompt-file` (system prompt, survives `/clear`), then waits idle. The first task arrives via clear-first delivery which re-injects the bundle as the first conversational message. - -## Presence Ping Exclusion - -Upstream's startup "I'm awake" ping uses `message type: presence`. The Stop hook must not deliver presence messages via the clear-first path. In practice: the pending queue only contains messages put there by `send-handoff.sh` (which only handles `handoff` and `resend-request` types). Presence pings are not routed through `send-handoff.sh`, so they never enter the pending queue. No special check needed. - -## Test Checkpoints (TDD) - -1. `handoff_clear_first_deliver` — mock tmux, verify call sequence: clear → sleep 1 → bundle → message (no logbook write in this function) -2. `send-handoff.sh` idle path — claude target, no `.busy`: pending file written, `.busy` created, delivery called -3. 
`send-handoff.sh` busy path — claude target, `.busy` pre-exists: pending file written, no delivery -4. `send-handoff.sh` non-claude — codex target: no pending file, `notify-agent.sh` called directly -5. `swarm-stop.sh` queue non-empty — pending file present: executing entry written to logbook, delivery called, pending file removed, `.busy` stays -6. `swarm-stop.sh` queue empty — no pending file: `.busy` deleted -7. `swarm-stop.sh` with queue non-empty AND `.busy` already set (normal busy-path case) — hook delivers, does NOT bail on pre-existing marker -8. `write_worktree_settings` with stop_script — resulting JSON has `hooks.Stop` entry with correct command From bb24a15d40e51e4788cb2ae212e8900a1b9ebf19 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Mon, 15 Jun 2026 20:11:47 -0300 Subject: [PATCH 48/67] feat(fork): apply fork divergences on top of upstream Babashka port MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Changes vs upstream (ADR-0002, ADR-0018): - handoffd.bb notify!: clear-first delivery — /clear then /swarm-persona combined with wake-message as a single turn - swarmforge.sh: drop send_initial_prompt (violates ADR-0002 idle gate); update check_helper_scripts; remove swarm-stop.sh hook registration; restore send_initial_grok_prompt verbatim (grok-only, unchanged) - handoffs.prompt: run agent-retro before done_with_current.sh - workflow.prompt Idle Gate: wait for handoff only (retro now per-task) - Delete swarm-handoff and swarm-stop.sh: redundant with daemon protocol and inbox state machine Co-Authored-By: Claude Sonnet 4.6 --- .../constitution/articles/handoffs.prompt | 2 +- .../constitution/articles/workflow.prompt | 1 - swarmforge/scripts/handoffd.bb | 30 +- swarmforge/scripts/swarm-handoff | 295 ------------------ swarmforge/scripts/swarm-stop.sh | 50 --- swarmforge/scripts/swarmforge.sh | 14 +- 6 files changed, 30 insertions(+), 362 deletions(-) delete mode 100755 swarmforge/scripts/swarm-handoff delete mode 
100755 swarmforge/scripts/swarm-stop.sh diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index ee66b86..7a9d9c3 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -75,7 +75,7 @@ message: current batch in helper-delivered order. - Use only the task information printed by the helper scripts. - If a tmux wake-up arrives while already working on a task, ignore it. -- When the task or batch is fully complete, run `done_with_current.sh`. +- When the task or batch is fully complete, run `agent-retro`, then run `done_with_current.sh`. - `note` handoffs are tasks too; after reading or acting on a note, run `done_with_current.sh` before accepting any other handoff. - If `done_with_current.sh` prints `TASK: `, treat the printed `PAYLOAD` diff --git a/swarmforge/constitution/articles/workflow.prompt b/swarmforge/constitution/articles/workflow.prompt index 95405d5..d92cc4a 100644 --- a/swarmforge/constitution/articles/workflow.prompt +++ b/swarmforge/constitution/articles/workflow.prompt @@ -14,5 +14,4 @@ - If the expected git layout or assigned worktree is missing, stop and report instead of silently working in the wrong place. ## Idle Gate -- Before going idle, run `agent-retro`. - Wait for a handoff. Do not act without one. diff --git a/swarmforge/scripts/handoffd.bb b/swarmforge/scripts/handoffd.bb index ba93225..f758168 100755 --- a/swarmforge/scripts/handoffd.bb +++ b/swarmforge/scripts/handoffd.bb @@ -92,17 +92,25 @@ ".swarmforge" "handoffs" "inbox" "new" filename)) (defn notify! [socket session] - (let [send-text (sh "tmux" "-S" socket "send-keys" "-t" session "-l" wake-message) - _ (Thread/sleep 150) - send-carriage-return (sh "tmux" "-S" socket "send-keys" "-t" session "C-m") - _ (Thread/sleep 50) - send-line-feed (sh "tmux" "-S" socket "send-keys" "-t" session "C-j")] - (when-not (zero? 
(:exit send-text)) - (throw (ex-info "tmux send text failed" send-text))) - (when-not (zero? (:exit send-carriage-return)) - (throw (ex-info "tmux send carriage return failed" send-carriage-return))) - (when-not (zero? (:exit send-line-feed)) - (throw (ex-info "tmux send line feed failed" send-line-feed))))) + (letfn [(send! [text] + (let [r (sh "tmux" "-S" socket "send-keys" "-t" session "-l" text)] + (when-not (zero? (:exit r)) + (throw (ex-info (str "tmux send failed: " text) r))))) + (enter! [] + (let [cr (sh "tmux" "-S" socket "send-keys" "-t" session "C-m") + _ (Thread/sleep 50) + lf (sh "tmux" "-S" socket "send-keys" "-t" session "C-j")] + (when-not (zero? (:exit cr)) + (throw (ex-info "tmux send carriage return failed" cr))) + (when-not (zero? (:exit lf)) + (throw (ex-info "tmux send line feed failed" lf)))))] + (send! "/clear") + (Thread/sleep 500) + (enter!) + (Thread/sleep 2000) + (send! (str "/swarm-persona " wake-message)) + (Thread/sleep 150) + (enter!))) (defn move-with-collision [source target-dir] (fs/create-dirs target-dir) diff --git a/swarmforge/scripts/swarm-handoff b/swarmforge/scripts/swarm-handoff deleted file mode 100755 index ef7c60d..0000000 --- a/swarmforge/scripts/swarm-handoff +++ /dev/null @@ -1,295 +0,0 @@ -#!/usr/bin/env zsh -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -source "$SCRIPT_DIR/handoff-lib.sh" - -usage() { - echo "Usage: swarm-handoff" >&2 - echo "Usage: swarm-handoff send --file [--sender ] [--priority NN]" >&2 - echo " swarm-handoff receive --file [--receiver ]" >&2 - echo " swarm-handoff complete --file " >&2 - echo " swarm-handoff --file " >&2 -} - -find_project_dir() { - local git_common_dir worktree_root - - if worktree_root=$(git -C "$SCRIPT_DIR" rev-parse --show-toplevel 2>/dev/null); then - if [[ -f "$worktree_root/.swarmforge/sessions.tsv" && -f "$worktree_root/.swarmforge/tmux-socket" ]]; then - echo "$worktree_root" - return 0 - fi - fi - - if git_common_dir=$(git -C "$SCRIPT_DIR" 
rev-parse --git-common-dir 2>/dev/null); then - if [[ "$git_common_dir" != /* ]]; then - git_common_dir="$(cd "$SCRIPT_DIR/$git_common_dir" && pwd)" - fi - local project_dir="${git_common_dir:h}" - if [[ -f "$project_dir/.swarmforge/sessions.tsv" ]]; then - echo "$project_dir" - return 0 - fi - fi - - echo "${SCRIPT_DIR:h:h}" -} - -request_field() { - local field="$1" - local file="$2" - local line - - line="$(grep -m 1 -E "^${field}: " "$file" || true)" - [[ -n "$line" ]] || return 1 - printf '%s\n' "${line#*: }" -} - -safe_request_path() { - local path="$1" - - [[ -n "$path" ]] || return 1 - [[ "$path" != /* ]] || return 1 - [[ "$path" != "." ]] || return 1 - [[ "$path" != ".." ]] || return 1 - [[ "$path" != "../"* ]] || return 1 - [[ "$path" != *"/../"* ]] || return 1 - [[ "$path" != *"/.." ]] || return 1 - return 0 -} - -run_request_file() { - local project_dir request_file command target file priority sender receiver archive_dir archive_file exit_status - local -a args - - project_dir="$(find_project_dir)" - request_file="$project_dir/.swarmforge/notify/request" - archive_dir="$project_dir/.swarmforge/notify/archive" - - if [[ ! -f "$request_file" ]]; then - echo "Notify request file not found: $request_file" >&2 - exit 1 - fi - - command="$(request_field command "$request_file")" || { - echo "Notify request missing command: $request_file" >&2 - exit 1 - } - file="$(request_field file "$request_file")" || { - echo "Notify request missing file: $request_file" >&2 - exit 1 - } - priority="$(request_field priority "$request_file" || true)" - sender="$(request_field sender "$request_file" || true)" - receiver="$(request_field receiver "$request_file" || true)" - - if ! 
safe_request_path "$file"; then - echo "Notify request file must be a safe relative path: $file" >&2 - exit 1 - fi - - case "$command" in - send) - target="$(request_field target "$request_file")" || { - echo "Notify send request missing target: $request_file" >&2 - exit 1 - } - args=("$SCRIPT_DIR/send-handoff.sh" "$target" "--file" "$file") - [[ -z "$sender" ]] || args+=("--sender" "$sender") - [[ -z "$priority" ]] || args+=("--priority" "$priority") - ;; - receive) - args=("$SCRIPT_DIR/receive-handoff.sh" "--file" "$file") - [[ -z "$receiver" ]] || args+=("--receiver" "$receiver") - ;; - complete) - args=("$SCRIPT_DIR/complete-handoff.sh" "--file" "$file") - ;; - *) - echo "Unknown notify request command: $command" >&2 - exit 1 - ;; - esac - - set +e - "${args[@]}" - exit_status=$? - set -e - - if [[ "$exit_status" -eq 0 ]]; then - mkdir -p "$archive_dir" - archive_file="$archive_dir/$(date '+%Y%m%d-%H%M%S')-$$.request" - mv "$request_file" "$archive_file" - echo "Archived notify request: $archive_file" - fi - - exit "$exit_status" -} - -resolve_session() { - local target="${1:l}" - local sessions_file="$2" - local index role session display agent - - while IFS=$'\t' read -r index role session display agent; do - if [[ "$target" == "${index:l}" || "$target" == "${role:l}" ]]; then - echo "$session" - return 0 - fi - done < "$sessions_file" - - return 1 -} - -# _do_deliver — clear-first delivery for a specific target. -# Sends /clear then the message; persona is re-loaded via swarm-persona skill. -_do_deliver() { - local target="$1" - local message_file="$2" - local project_dir="$3" - local sessions_file="$4" - local tmux_socket_file="$5" - local tmux_env_file="$6" - - local target_session - target_session=$(resolve_session "$target" "$sessions_file") || { - echo "Unknown target: $target" >&2 - return 1 - } - - if [[ ! 
-f "$message_file" ]]; then - echo "Message file not found: $message_file" >&2 - return 1 - fi - local message - message="$(< "$message_file")" - - if [[ -z "${TMUX:-}" && -f "$tmux_env_file" ]]; then - TMUX="$(< "$tmux_env_file")" - export TMUX - fi - - local -a tmux_cmd=() - if [[ -n "${TMUX:-}" ]]; then - tmux_cmd=(tmux send-keys -t "$target_session") - else - local socket - socket="$(< "$tmux_socket_file")" - tmux_cmd=(tmux -S "$socket" send-keys -t "$target_session") - fi - - # Wait for Claude Code to finish startup before sending keystrokes. - sleep 3 - - # Use 'Enter' (tmux named key) not C-m/C-j — tmux extended-keys translates - # named keys through the kitty keyboard protocol when the pane requests it. - # C-m sends raw \x0d which Claude Code in kitty mode ignores. - - # Clear context - "${tmux_cmd[@]}" -l -- '/clear' - sleep 0.5 - "${tmux_cmd[@]}" Enter - sleep 2.0 - - # Reload persona — context was wiped by /clear - "${tmux_cmd[@]}" -l -- '/swarm-persona' - sleep 0.5 - "${tmux_cmd[@]}" Enter - sleep 2.0 - - # Send protocol message (newlines flattened to avoid multi-line input mode) - local flat_message="${message//$'\n'/ }" - "${tmux_cmd[@]}" -l -- "$flat_message" - sleep 0.15 - "${tmux_cmd[@]}" Enter -} - -if [[ $# -eq 0 ]]; then - run_request_file -fi - -if [[ $# -gt 0 ]]; then - case "$1" in - deliver) - shift - if [[ $# -ne 3 || "${2:-}" != "--file" ]]; then - usage - exit 1 - fi - TARGET="$1" - MESSAGE_FILE="$3" - PROJECT_DIR="$(find_project_dir)" - SESSIONS_FILE="$PROJECT_DIR/.swarmforge/sessions.tsv" - TMUX_SOCKET_FILE="$PROJECT_DIR/.swarmforge/tmux-socket" - TMUX_ENV_FILE="$PROJECT_DIR/.swarmforge/tmux-env" - if [[ ! 
-f "$SESSIONS_FILE" ]]; then - echo "Sessions file not found: $SESSIONS_FILE" >&2 - exit 1 - fi - _do_deliver "$TARGET" "$MESSAGE_FILE" "$PROJECT_DIR" "$SESSIONS_FILE" "$TMUX_SOCKET_FILE" "$TMUX_ENV_FILE" - exit 0 - ;; - send) - shift - exec "$SCRIPT_DIR/send-handoff.sh" "$@" - ;; - receive) - shift - exec "$SCRIPT_DIR/receive-handoff.sh" "$@" - ;; - resend) - shift - exec "$SCRIPT_DIR/resend-handoff.sh" "$@" - ;; - complete) - shift - exec "$SCRIPT_DIR/complete-handoff.sh" "$@" - ;; - esac -fi - -if [[ $# -ne 3 || "${2:-}" != "--file" ]]; then - usage - exit 1 -fi - -TARGET="$1" -MESSAGE_FILE="$3" - -PROJECT_DIR="$(find_project_dir)" -SESSIONS_FILE="$PROJECT_DIR/.swarmforge/sessions.tsv" -TMUX_SOCKET_FILE="$PROJECT_DIR/.swarmforge/tmux-socket" -TMUX_ENV_FILE="$PROJECT_DIR/.swarmforge/tmux-env" - -if [[ ! -f "$SESSIONS_FILE" ]]; then - echo "Sessions file not found: $SESSIONS_FILE" >&2 - exit 1 -fi - -TARGET_SESSION=$(resolve_session "$TARGET" "$SESSIONS_FILE") || { - echo "Unknown target: $TARGET" >&2 - exit 1 -} - -if [[ ! -f "$MESSAGE_FILE" ]]; then - echo "Message file not found: $MESSAGE_FILE" >&2 - exit 1 -fi -MESSAGE="$(< "$MESSAGE_FILE")" - -# Queue the message in pending/ -PENDING_DIR="$(handoff_pending_dir "$PROJECT_DIR" "$TARGET")" -mkdir -p "$PENDING_DIR" -PENDING_FILE="$PENDING_DIR/$(handoff_id_timestamp)-pending.txt" -printf '%s' "$MESSAGE" > "$PENDING_FILE" - -# Check busy marker; if busy, queue will be drained by Stop hook -BUSY_FILE="$(handoff_busy_file "$PROJECT_DIR" "$TARGET")" -if ! 
( set -C; > "$BUSY_FILE" ) 2>/dev/null; then - exit 0 -fi - -_do_deliver "$TARGET" "$MESSAGE_FILE" "$PROJECT_DIR" "$SESSIONS_FILE" "$TMUX_SOCKET_FILE" "$TMUX_ENV_FILE" - -rm -f "$PENDING_FILE" diff --git a/swarmforge/scripts/swarm-stop.sh b/swarmforge/scripts/swarm-stop.sh deleted file mode 100755 index b3ab809..0000000 --- a/swarmforge/scripts/swarm-stop.sh +++ /dev/null @@ -1,50 +0,0 @@ -#!/usr/bin/env zsh -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -source "$SCRIPT_DIR/handoff-lib.sh" - -_swarm_stop_main() { - local role="${SWARMFORGE_ROLE:-}" - if [[ -z "$role" ]]; then - return 0 - fi - - local stdin_json="" - if [[ ! -t 0 ]]; then - stdin_json="$(cat)" - fi - - local cwd="" - if [[ -n "$stdin_json" ]]; then - cwd="$(printf '%s' "$stdin_json" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("cwd",""))' 2>/dev/null || true)" - fi - - local project_dir="" - if [[ -n "$cwd" && -d "$cwd" ]]; then - project_dir="$(handoff_project_dir_from "$cwd")" - else - project_dir="$(handoff_project_dir)" - fi - - local pending_dir busy_file - pending_dir="$(handoff_pending_dir "$project_dir" "$role")" - busy_file="$(handoff_busy_file "$project_dir" "$role")" - - if [[ ! 
-d "$pending_dir" ]] || [[ -z "$(ls -A "$pending_dir" 2>/dev/null)" ]]; then - rm -f "$busy_file" - return 0 - fi - - local pending_name pending_file pending_content - pending_name="$(ls "$pending_dir" | sort | head -1)" - pending_file="$pending_dir/$pending_name" - - touch "$busy_file" - - "$SCRIPT_DIR/swarm-handoff" deliver "$role" --file "$pending_file" - - rm -f "$pending_file" -} - -_swarm_stop_main diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index d7ae4bb..7bd0e36 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -563,15 +563,18 @@ write_persona_skill_file() { } > "$skill_file" } -send_initial_prompt() { +send_initial_grok_prompt() { local session="$1" local display="$2" + local prompt_file="$3" ( sleep 3 - tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" -l -- 'Invoke your swarm-persona skill to load your role and begin.' - sleep 0.5 + tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" -l -- "$(< "$prompt_file")" + sleep 0.15 tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-m + sleep 0.05 + tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-j ) &! 
} @@ -598,7 +601,7 @@ launch_role() { case "$agent" in claude) - write_worktree_settings "$role_worktree" "$role_advisor" "$role_script_dir/swarm-stop.sh" + write_worktree_settings "$role_worktree" "$role_advisor" local claude_flags="" [[ -n "$role_model" ]] && claude_flags+=" --model ${(q)role_model}" [[ -n "$role_effort" ]] && claude_flags+=" --effort ${(q)role_effort}" @@ -637,6 +640,9 @@ launch_role() { fi tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" "$launch_cmd" Enter + if [[ "$agent" == "grok" ]]; then + send_initial_grok_prompt "$session" "$display" "$prompt_file" + fi echo -e " ${CYAN}[${display}]${RESET} started in session ${session}" } From 95443d0950fb2761dbf56d596c9db5e023abd766 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Tue, 16 Jun 2026 21:43:42 -0300 Subject: [PATCH 49/67] enter-fix --- docs/tool-analysis-crap-dry-mutation.md | 547 ++++++++++++++++++++++++ swarmforge/scripts/handoffd.bb | 10 +- swarmforge/scripts/swarmforge.sh | 4 +- 3 files changed, 551 insertions(+), 10 deletions(-) create mode 100644 docs/tool-analysis-crap-dry-mutation.md diff --git a/docs/tool-analysis-crap-dry-mutation.md b/docs/tool-analysis-crap-dry-mutation.md new file mode 100644 index 0000000..0b3e2ce --- /dev/null +++ b/docs/tool-analysis-crap-dry-mutation.md @@ -0,0 +1,547 @@ +# Code Quality Tool Analysis: CRAP / DRY / Mutation + +**Status:** Research complete — decision pending +**Scope:** JavaScript/TypeScript, Python, Rust source targets +**Goal:** Determine what to reuse, what to build, and in what language + +--- + +## 1. Background + +The engineering constitution (`swarmforge/constitution/articles/engineering.prompt`) mandates +procuring CRAP, mutation, and DRY tools from Uncle Bob's repositories on startup. +Those repositories (`github.com/unclebob/{crap,dry,mutate}4{go,clj,java}`) only cover +Go, Clojure, and Java. This document defines what to do for the remaining stacks. + +--- + +## 2. 
+ Tool Family Definitions + +### 2.1 CRAP (Change Risk Anti-Pattern) + +Measures per-function risk as a function of complexity and test coverage. + +**Formula:** `CRAP(m) = CC(m)² × (1 - cov(m)/100)³ + CC(m)` + +Where: +- `CC(m)` = cyclomatic complexity of function `m` (decision points + 1) +- `cov(m)` = percentage of the function's branches (or lines) covered by tests + +**Decision points counted:** `if`, `else if`, ternary (`?:`), `&&`, `||`, `??`, +`for`, `for…in`, `for…of`, `while`, `do…while`, `catch`, each `switch case`. + +**Interpretation:** +- Score 1 = perfect (CC=1, 100% coverage) +- Score ≥ 30 = conventionally "crappy" (high complexity AND low coverage) +- A function with CC=10 and 0% coverage scores 110; the same function at 100% scores 10 + +### 2.2 DRY (Don't Repeat Yourself) + +Detects structurally duplicated code using AST subtree hashing and Jaccard similarity. +Catches semantic duplication (same logic, different variable names), not just copy-paste. + +**Uncle Bob's algorithm:** +1. Parse source file to AST +2. Walk every node, collecting all subtrees at every nesting depth +3. Normalize each subtree: replace all identifier names and literal values with + `_ID` and `_LIT` (operators, control-flow keywords, and structure are preserved) +4. Serialize and hash each normalized subtree +5. Build inverted index: hash → list of (file, function, subtree) tuples +6. For any two functions sharing at least one hash, compute Jaccard similarity: + `|shared hashes| / |union of hashes|` +7. Report pairs above the similarity threshold (default 0.82) + +**Qualification gates:** functions must have ≥ 4 source lines AND ≥ 20 normalized AST nodes. +Below this threshold the signal-to-noise ratio collapses. + +**Key property:** Two functions that do the same thing with completely different variable +names will match. 
Two functions with an identical copy-paste block surrounded by different +code will not match at the function level (only the block would match if it were extracted). + +### 2.3 Mutation Testing + +Verifies test suite quality by injecting deliberate faults into source code and checking +whether tests detect them. A mutant that survives (tests still pass) indicates a gap in +test assertions. + +**Uncle Bob's operator set:** + +| Category | Mutations | +|----------|-----------| +| Arithmetic | `+` ↔ `-`, `*` → `/` | +| Comparison | `>` ↔ `>=`, `<` ↔ `<=` | +| Equality | `==` ↔ `!=` | +| Boolean | `true` ↔ `false` | +| Logical | `&&` ↔ `||` | +| Constant | `0` ↔ `1` (inline, in expressions) | +| Unary | remove `-a` → `a`, remove `!a` → `a` | +| Null | replace return value with `null` / `None` | + +**Key innovations in Uncle Bob's implementation:** +- **Embedded manifest:** function hashes stored in source file footer comments; + enables differential reruns — only functions changed since last run are retested +- **Coverage gating:** only lines covered by at least one test are mutated; + avoids wasting time on dead code +- **Parallel workers:** isolated copies per worker for concurrent mutation + +--- + +## 3. Uncle Bob Reference Tool Interfaces + +### 3.1 crap4go CLI + +``` +crap4go [--test-command ] [--max-workers ] [path-fragment ...] +``` + +| Flag | Default | Description | +|------|---------|-------------| +| `--test-command` | `go test ./... -coverprofile=target/coverage/coverage.out` | Coverage command; tool appends `-coverprofile` unless `{coverprofile}` placeholder present | +| `--max-workers` | half of logical CPUs | Parallel source-file analysis workers | + +Positional arguments: path fragments — only files whose path contains a fragment are included. +No arguments = all files. + +**Behavior:** deletes stale coverage → runs test command → parses coverage + AST → +computes CRAP per function. 
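The final step of that behavior, computing CRAP per function, follows the formula from §2.1. A minimal sketch in Python is given here for illustration only; the names `crap_score` and `sort_worst_first` are hypothetical and are not part of crap4go, and the input validation is an assumption rather than documented tool behavior.

```python
def crap_score(cc: int, coverage_pct: float) -> float:
    """CRAP(m) = CC(m)^2 * (1 - cov(m)/100)^3 + CC(m)."""
    if cc < 1:
        # Cyclomatic complexity is decision points + 1, so it is never below 1.
        raise ValueError("cyclomatic complexity must be at least 1")
    if not 0.0 <= coverage_pct <= 100.0:
        raise ValueError("coverage must be a percentage in [0, 100]")
    uncovered = 1.0 - coverage_pct / 100.0
    return cc ** 2 * uncovered ** 3 + cc


def sort_worst_first(rows):
    """Order (name, cc, coverage_pct) rows by descending CRAP score."""
    return sorted(rows, key=lambda row: crap_score(row[1], row[2]), reverse=True)


if __name__ == "__main__":
    rows = [("simple", 1, 100.0), ("tangled", 10, 0.0), ("tested", 10, 100.0)]
    for name, cc, cov in sort_worst_first(rows):
        print(f"{name:10s} CC={cc:3d} cov={cov:5.1f}% CRAP={crap_score(cc, cov):.1f}")
```

Because the uncovered fraction is cubed, full coverage collapses the score to the bare complexity, while zero coverage adds the full squared complexity on top of it.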
+ +**Skips:** `_test.go`, `target/`, `vendor/`, `.git/` + +**stdout:** +``` +CRAP Report +=========== +Function Package CC Cov% CRAP +------------------------------------------------------------------------------------- +Widget.Run widget 12 45.0% 130.2 +simple widget 1 100.0% 1.0 +``` +Sort: descending by CRAP score (worst first). N/A last. + +**Coverage file:** Go coverage profile (`target/coverage/coverage.out`) + +### 3.2 dry4go CLI + +``` +dry4go [options] [file-or-directory ...] +``` + +| Flag | Default | Description | +|------|---------|-------------| +| `--threshold` | `0.82` | Minimum Jaccard similarity to report | +| `--min-lines` | `4` | Minimum source lines in a candidate function | +| `--min-nodes` | `20` | Minimum normalized AST nodes | +| `--format` | `text` | `text` or `json` | +| `--json` | — | Alias for `--format json` | + +Directories are recursed; `.git`, `vendor`, `target` excluded. + +**Text output:** +``` +DUPLICATE score=0.89 + internal/billing/invoice.go:12-25 + internal/billing/receipt.go:30-44 +``` + +**JSON output:** +```json +{ + "candidates": [ + { + "score": 0.8909090909090909, + "left": {"file": "internal/billing/invoice.go", "start_line": 12, "end_line": 25}, + "right": {"file": "internal/billing/receipt.go", "start_line": 30, "end_line": 44}, + "left_nodes": 88, + "right_nodes": 91 + } + ] +} +``` + +### 3.3 mutate4go CLI + +``` +mutate4go [flags] path/to/file.go +``` + +One source file per invocation. + +| Flag | Default | Description | +|------|---------|-------------| +| `--scan` | false | Count mutation sites vs. 
manifest only; no tests | +| `--update-manifest` | false | Rewrite footer manifest without running mutations | +| `--lines ` | — | Restrict to specific line numbers | +| `--since-last-run` | false | Differential: test only changed functions | +| `--mutate-all` | false | Force full mutation ignoring manifest | +| `--reuse-coverage` | false | Skip coverage regeneration | +| `--mutation-warning` | 50 | Warn if mutation count exceeds threshold | +| `--timeout-factor` | — | Multiplier for per-mutation test timeout | +| `--test-command` | `go test ./...` | Override test command | +| `--max-workers` | 1 | Parallel mutation workers | +| `--verbose` | false | Log actions to stderr | + +**Behavior:** +1. Generate coverage → run baseline tests (establish timeout) +2. For each covered mutation site: apply → run tests → restore → record +3. Default to differential mode if footer manifest exists +4. Write updated manifest to end of source file + +**Manifest format:** embedded in source file footer comments; contains last-test-date, +per-function ID, line span, normalized-source hash. + +--- + +## 4. Coverage Data Interface — LCOV Format + +All CRAP implementations (existing and proposed) consume LCOV tracefile format. +This is the universal coverage interchange format produced by Jest, Vitest, c8, nyc, +Istanbul, pytest-cov, coverage.py, cargo-llvm-cov, and cargo-tarpaulin. 
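As a rough illustration of how a CRAP tool might consume such a tracefile, the sketch below derives per-function branch coverage from `FN` and `BRDA` records (both described in the rest of this section). It is an assumption-laden sketch, not the implementation of any tool named here: the function name `branch_coverage_by_function` is hypothetical, a `taken` value of `-` is counted as a found-but-not-hit branch, and a function without an end line is treated as open-ended.

```python
def branch_coverage_by_function(lcov_text):
    """Return {(source_file, function_name): branch coverage % or None}."""
    results = {}
    source_file, functions, branches = None, [], []

    for raw in lcov_text.splitlines():
        record = raw.strip()
        if record.startswith("SF:"):
            # A new source file section starts; reset per-file state.
            source_file, functions, branches = record[3:], [], []
        elif record.startswith("FN:"):
            fields = record[3:].split(",")
            start = int(fields[0])
            if len(fields) >= 3 and fields[1].isdigit():
                end, name = int(fields[1]), ",".join(fields[2:])
            else:
                # No end line recorded: assume the function is open-ended,
                # which over-approximates the branches attributed to it.
                end, name = float("inf"), ",".join(fields[1:])
            functions.append((start, end, name))
        elif record.startswith("BRDA:"):
            fields = record[5:].split(",")
            # First field is the line number, last field is the taken count.
            branches.append((int(fields[0]), fields[-1]))
        elif record == "end_of_record":
            for start, end, name in functions:
                taken = [t for line, t in branches if start <= line <= end]
                hit = sum(1 for t in taken if t != "-" and int(t) > 0)
                # Functions with no branch records get None, not 0 or 100.
                results[(source_file, name)] = (
                    100.0 * hit / len(taken) if taken else None
                )
    return results
```

Returning `None` for branch-free functions leaves the policy decision (fall back to line coverage, or treat as fully covered) to the caller instead of hiding it in the parser.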
+ +### 4.1 Record Types + +| Record | Syntax | Meaning | +|--------|--------|---------| +| `TN` | `TN:` | Test name (optional; may be empty) | +| `SF` | `SF:` | Opens a source file section | +| `FN` | `FN:[,],` | Function declaration (end_line optional) | +| `FNDA` | `FNDA:,` | Function execution count | +| `FNF` | `FNF:` | Total functions found | +| `FNH` | `FNH:` | Functions with count > 0 | +| `BRDA` | `BRDA:,[e],,` | Branch edge data | +| `BRF` | `BRF:` | Total branch records | +| `BRH` | `BRH:` | Branch records with taken > 0 | +| `DA` | `DA:,[,]` | Line execution count | +| `LH` | `LH:` | Lines with count > 0 | +| `LF` | `LF:` | Total lines found | +| `end_of_record` | `end_of_record` | Closes a source file section | + +### 4.2 BRDA Field Detail + +``` +BRDA:,[e],, +``` +- `line_number`: 1-based line of the branching statement +- `e` prefix (LCOV 2.x): exception-handling branch +- `block`: integer from 0; groups branches of the same conditional +- `branch`: edge index (0 = false/left path, 1 = true/right path for a simple `if`) +- `taken`: `-` (never evaluated / dead code) OR integer execution count + +**Critical distinction:** `taken=0` means the branch was evaluated but never taken. +`taken=-` means the branch was never reached at all. + +### 4.3 Line vs Branch Coverage in CRAP + +The formula requires `cov(m)` — "how much of function m is tested." 
There are two interpretations: + +**Line coverage (DA records):** +- A line with `count > 0` is "covered" even if only one side of its branch was tested +- Can overstate coverage significantly on functions with compound conditionals +- This is what **crap4js currently uses** (reads only `DA:` records) + +**Branch coverage (BRDA records):** +- Counts each outgoing edge of each conditional independently +- `cov(m) = BRH_in_function / BRF_in_function × 100` +- More accurate to Uncle Bob's intent (he measures decision coverage) +- This is what the **proposed Python implementation should use** + +### 4.4 Minimal Valid Example + +``` +TN:example +SF:/project/src/foo.py +FN:5,25,compute_score +FNDA:3,compute_score +FNF:1 +FNH:1 +BRDA:8,0,0,2 +BRDA:8,0,1,1 +BRDA:15,0,0,- +BRDA:15,0,1,3 +BRF:4 +BRH:3 +DA:5,1 +DA:6,3 +DA:8,3 +DA:9,2 +DA:15,3 +LH:5 +LF:5 +end_of_record +``` + +--- + +## 5. Decision Matrix by Stack + +### 5.1 Mutation Testing + +| Stack | Decision | Tool | Rationale | +|-------|----------|------|-----------| +| JS/TS | **Reuse** | StrykerJS | Mature, AST-based, coverage-gated per-test, mutation switching (single suite run) | +| Python | **Reuse** | mutmut | Mature, widely adopted, `--use-coverage` flag for coverage gating | +| Rust | **Reuse** | cargo-mutants | Active, `--in-diff` for PR-scoped runs, most operator categories covered | + +**Rationale shared across all stacks:** The bottleneck is always test execution time. +A Rust reimplementation of the harness cannot make tests run faster — it would only +be orchestrating the same subprocesses. The existing tools are the right level of abstraction. 
+ +**Known gap — cargo-mutants vs Uncle Bob spec:** +- Inline constant swapping (`0↔1` inside expressions) is not implemented; + cargo-mutants replaces entire function bodies with type defaults instead +- No coverage gating (all code is mutated regardless of coverage) +- Differential runs work via `--in-diff` / `--git-base` (different mechanism, same outcome) + +### 5.2 CRAP + +| Stack | Decision | Tool | Rationale | +|-------|----------|------|-----------| +| JS/TS | **Reuse + patch** | crap4js | Correct formula, lean (1,732 LOC), two known gaps (see below) | +| Python | **Write** | ~200-line Python script | Nothing exists; Python's own `ast` module is the right parser | +| Rust | **Reuse** | cargo-crap | Correct formula, reads lcov.info from llvm-cov/tarpaulin; pre-1.0 but functional | + +**Known gaps in crap4js:** + +1. **Line coverage instead of branch coverage** — reads `DA:` records only; + ignores `BRDA:` records. A function with `if (a && b)` where only the + `false` short-circuit path is tested shows as "covered". This understates CRAP + on functions with compound conditionals. Fix: extend `coverage.ts` to parse + `BRDA:` records and compute `BRH/BRF` per function line range. + +2. **Optional chaining `?.` inflates CC** — every `user?.profile?.name` adds +1 to + cyclomatic complexity per `?.`. Uncle Bob's spec (written for Go/Java/Clojure) has + no such operator. Modern TypeScript code using `?.` for defensive access gets + artificially high CC scores on simple property accessor chains. + Fix: remove `MemberExpression`/`CallExpression` with `optional=true` from + `DECISION_PREDICATES` in `complexity.ts:594-601`. 
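
The fix proposed for gap 1 (computing `BRH/BRF` per function line range) can be sketched in Python, the language proposed above for the Python CRAP script. This is a minimal illustration, not crap4js code: `FunctionBranches`, `branch_coverage_by_function`, and `crap` are hypothetical names, the sketch assumes `FN:` records carry the optional `end_line`, and returning `None` for a function with no branches is an assumption the source does not settle. The scoring function uses the standard CRAP formula, `comp² × (1 − cov/100)³ + comp`.

```python
from dataclasses import dataclass


@dataclass
class FunctionBranches:
    name: str
    start: int
    end: int
    found: int = 0  # BRF restricted to this function's line range
    hit: int = 0    # BRH restricted to this function's line range

    @property
    def coverage_pct(self):
        # Assumption: no branches means "undefined"; the caller picks a fallback.
        return None if self.found == 0 else 100.0 * self.hit / self.found


def branch_coverage_by_function(lcov_text):
    """Return {(source_file, function_name): FunctionBranches} from LCOV text."""
    result = {}
    source_file, functions, branches = None, [], []
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            source_file, functions, branches = line[3:], [], []
        elif line.startswith("FN:"):
            parts = line[3:].split(",")
            # Only FN records that include end_line can be attributed here.
            if len(parts) >= 3 and parts[1].isdigit():
                name = ",".join(parts[2:])
                functions.append(FunctionBranches(name, int(parts[0]), int(parts[1])))
        elif line.startswith("BRDA:"):
            line_no, _block, _branch, taken = line[5:].split(",", 3)
            branches.append((int(line_no), taken))
        elif line == "end_of_record":
            # Attribute branches once the whole section is read, so the
            # relative order of FN and BRDA records does not matter.
            for fn in functions:
                for line_no, taken in branches:
                    if fn.start <= line_no <= fn.end:
                        fn.found += 1
                        # "-" (never reached) and "0" (never taken) are both misses.
                        if taken not in ("-", "0"):
                            fn.hit += 1
                result[(source_file, fn.name)] = fn
            source_file, functions, branches = None, [], []
    return result


def crap(complexity, coverage_pct):
    """CRAP score for one function; coverage_pct is in the range 0..100."""
    uncovered = 1.0 - coverage_pct / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity
```

Fed the minimal example from §4.4, `compute_score` ends up with 4 branches found and 3 hit, that is 75% branch coverage, even though every `DA:` line in that example has a non-zero count.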
+ +**Known gap in cargo-crap:** +- Pre-1.0 (v0.2.x); no historical trending, no per-PR delta, no IDE integration +- Requires `lcov.info` as intermediate; no native `llvm-cov` JSON support + +### 5.3 DRY + +| Stack | Decision | Tool | Rationale | +|-------|----------|------|-----------| +| JS/TS | **Write new** | Rust binary (see §6) | No AST-subtree-Jaccard tool exists; jscpd v5 is token-sequence only | +| Python | **Write new** | Rust binary (see §6) | Nothing exists | +| Rust | **Write new** | Rust binary (see §6) | cargo-dupes is function-level only, 4 stars, pre-production | + +**Why not jscpd v5 for JS/TS?** +jscpd v5 (Rust, 5.1k stars, 150+ languages) is production-ready and catches copy-paste +duplication effectively. It uses tokenization + Rabin-Karp rolling hash over token +sequences. However it detects a different class of duplicates than Uncle Bob's algorithm: + +| Scenario | jscpd v5 | Uncle Bob DRY | +|----------|----------|---------------| +| Copy-paste block (identical variable names) | Detects | Detects | +| Same logic, different variable names | Misses | **Detects** | +| Partial block match above threshold | Detects | May miss (function-level Jaccard) | + +For a monorepo with multiple services, the Uncle Bob variant catches cases where +two teams independently implemented the same logic with different names — jscpd would not. +These are complementary tools; the decision to use Uncle Bob's algorithm is a deliberate +choice, not a gap in jscpd. + +--- + +## 6. Proposed New Tool: DRY Rust Binary + +### 6.1 Scope + +A single Rust binary implementing Uncle Bob's DRY algorithm for JS/TS, Python, and Rust +source targets. One binary, one CLI, one output format. 
+ +### 6.2 AST Parsers + +| Target | Parser | Rationale | +|--------|--------|-----------| +| JS/TypeScript | OXC (`oxc_parser` crate) | 21.6k stars, 3-5x faster than alternatives, spec-compliant, semantic enrichment (scope/binding) | +| Python | `tree-sitter-python` | Standard; Python's own `ast` module has better fidelity but requires CPython subprocess | +| Rust | `syn` crate | The canonical Rust AST parser; better semantic fidelity than tree-sitter-rust for Rust targets | + +### 6.3 Algorithm (per target language) + +1. Walk all source files; for each function/method/closure: + a. Parse to language-native AST + b. Collect all subtrees at every nesting depth + c. Normalize each subtree: replace identifiers → `_ID`, literals → `_LIT` + d. Hash each normalized subtree (FNV-1a or xxHash — fast, collision-resistant) + e. Record `(file, fn_name, start_line, end_line, set)` per function +2. Build inverted index: `hash → Vec` +3. Generate candidate pairs: any two functions sharing ≥ 1 hash +4. For each candidate pair: compute Jaccard `|A ∩ B| / |A ∪ B|` +5. Report pairs with Jaccard ≥ threshold (default 0.82), where both functions + pass the qualification gates (≥ 4 lines, ≥ 20 AST nodes) + +### 6.4 Proposed CLI + +``` +dry [options] [path ...] +``` + +| Flag | Default | Description | +|------|---------|-------------| +| `--threshold` | `0.82` | Minimum Jaccard similarity to report | +| `--min-lines` | `4` | Minimum source lines to qualify | +| `--min-nodes` | `20` | Minimum normalized AST nodes to qualify | +| `--lang` | auto-detect | `js`, `ts`, `py`, `rs` — force language | +| `--format` | `text` | `text` or `json` | +| `--exclude` | — | Glob patterns to exclude | + +Output format mirrors dry4go exactly (text and JSON) for drop-in compatibility. 
+ +### 6.5 Distribution + +- Static binary (no runtime dependency) via `cargo build --release` +- Can be compiled to WASM via `wasm-pack` for npm distribution if needed +- Single binary covers all three target languages + +### 6.6 Scaling (validated) + +At the worst-case Python monorepo service (scoring: 491 files, ~684 functions, ~350 +qualifying after gates): ~61,000 pairs, ~6 million hash comparisons — runs in under +1 second in Python; Rust will be significantly faster. +For JS monorepos with ~9,430 production files, the qualification gates reduce the +candidate set substantially; the inverted index further eliminates non-pairs cheaply. +No MinHash, LSH, or approximation needed at these scales. + +--- + +## 7. Existing Tool Interfaces (for integration) + +### 7.1 StrykerJS (JS/TS Mutation) + +**Config file:** `stryker.config.mjs` + +```js +export default { + testRunner: 'jest', // 'jest' | 'karma' | 'mocha' | 'command' + coverageAnalysis: 'perTest', // 'off' | 'all' | 'perTest' + thresholds: { + high: 80, // green above this + low: 60, // yellow between low and high; red below low + break: null, // exit 1 if score falls below; null = never fail + }, + mutate: ['src/**/*.ts'], + excludedMutations: [], // mutator names or category names +}; +``` + +`coverageAnalysis: 'perTest'` is the correct setting — it gates each mutant to only +the tests that cover its line, equivalent to Uncle Bob's coverage-gated approach. + +**Key mutator categories:** ArithmeticOperator, EqualityOperator, LogicalOperator, +BooleanLiteral, BlockStatement, ConditionalExpression, UnaryOperator, UpdateOperator, +OptionalChaining, StringLiteral, ArrayDeclaration. 
+ +### 7.2 mutmut (Python Mutation) + +**Run:** +``` +mutmut run [--use-coverage] [--paths-to-mutate src/] [--disable-mutation-types <types>] +``` + +**Mutation types available:** `operator`, `keyword`, `number`, `name`, `string`, +`fstring`, `argument`, `or_test`, `and_test`, `lambdef`, `expr_stmt`, `decorator`, +`annassign` + +**Cache:** `.mutmut-cache` (SQLite, project root) — delete to reset + +**Coverage gating:** `--use-coverage` flag reads coverage.py data; restricts mutations +to covered lines (equivalent to Uncle Bob's coverage gating) + +**Differential runs:** `--use-patch-file <patch_file>` — mutate only lines in the patch + +**Results:** +``` +mutmut results # print all results +mutmut result-ids survived # list surviving mutant IDs +mutmut show <id> # show diff for a specific mutant +``` + +**Exit codes:** 0 = all killed, 1 = survivors exist or fatal error + +### 7.3 cargo-mutants (Rust Mutation) + +**Key flags:** +``` +cargo mutants [--in-diff <file>] [--git-base <ref>] [--jobs <n>] +``` + +- `--in-diff <file>`: mutate only lines present in a unified diff +- `--git-base <ref>`: auto-generate diff from `git diff <ref>..HEAD` +- `--jobs <n>`: parallel workers + +**Operator gaps vs Uncle Bob:** inline constant swapping (`0↔1`) not supported; +no coverage gating. + +### 7.4 cargo-crap (Rust CRAP) + +**Workflow:** +``` +cargo llvm-cov --lcov --output-path lcov.info +cargo crap --lcov lcov.info +``` + +Also accepts tarpaulin output. Does not natively read `llvm-cov` JSON. + +--- + +## 8. Open Questions + +1. **Should crap4js be patched or left as-is?** + The two gaps (line-vs-branch coverage, `?.` CC inflation) affect score accuracy. + Patching is low-effort (~50 LOC changes). Decision: patch or accept the deviation? + +2. **Should jscpd v5 be used alongside the proposed DRY binary?** + They catch different things (copy-paste vs semantic duplication). Running both adds + signal but also adds tooling surface area. Decision: one or both? + +3. 
**Should the Python CRAP script use branch or line coverage?** + Branch coverage is more accurate to Uncle Bob's intent but requires lcov to emit + `BRDA:` records. Neither pytest-cov nor coverage.py measures branches by default; + enable it with `--cov-branch` (pytest-cov) or `branch = True` in `.coveragerc` + (coverage.py). This is a one-time config addition, low risk. + +4. **What is the output format / threshold for the DRY binary?** + dry4go's threshold of 0.82 was tuned for Go. JS and Python have different idiom + frequencies — the threshold may need calibration on real codebases. + +5. **Should cargo-mutants' missing inline constant swapping be compensated?** + It can be supplemented by a custom mutant configuration. Is the gap material? + +6. **Where do the tools live?** + - Fork and patch crap4js in the addi org? Or submit upstream? + - Is the DRY Rust binary a new repo in the org? Which team owns it? + +7. **CI integration strategy:** + - Run on every commit? Every PR? Only on changed files? + - What are the thresholds that block merge vs warn only? + - Who owns the baseline — per-repo or org-wide? + +--- + +## 9. Knowns + +- crap4js exists at `/Users/gabadi/workspace/addi/crap4js` — audited, correct formula, + production-quality, two patchable gaps +- Uncle Bob's DRY algorithm is O(n²) over qualifying fragments; at all measured repo + scales (worst case: ~350 fragments), brute-force Jaccard runs in under 1 second +- LCOV is the universal coverage interchange format — all major test runners for + JS/TS, Python, and Rust produce it +- Mutation testing's bottleneck is always test execution time; the harness language + is irrelevant to performance +- Rust's OXC parser (21.6k stars, v1.70.0) is production-grade for JS/TS AST work + and WASM-distributable +- Go tree-sitter bindings require CGO, breaking the zero-dependency binary story + +## 10. 
Unknowns + +- Whether the `?.` CC inflation in crap4js is material in practice on the JS monorepo + (depends on how heavily optional chaining is used) +- Whether dry4go's 0.82 Jaccard threshold is appropriate for JS/Python codebases +- Whether `syn` (Rust AST) or `tree-sitter-rust` is the better choice for DRY + analysis of Rust source (syn has higher fidelity but is Rust-only; tree-sitter-rust + is consistent with the other language parsers) +- Actual build time of the proposed DRY Rust binary and whether WASM compilation + is needed or if a native binary suffices for all CI environments +- Whether cargo-crap's pre-1.0 status is a blocker or acceptable for internal use diff --git a/swarmforge/scripts/handoffd.bb b/swarmforge/scripts/handoffd.bb index f758168..96137f5 100755 --- a/swarmforge/scripts/handoffd.bb +++ b/swarmforge/scripts/handoffd.bb @@ -97,13 +97,9 @@ (when-not (zero? (:exit r)) (throw (ex-info (str "tmux send failed: " text) r))))) (enter! [] - (let [cr (sh "tmux" "-S" socket "send-keys" "-t" session "C-m") - _ (Thread/sleep 50) - lf (sh "tmux" "-S" socket "send-keys" "-t" session "C-j")] - (when-not (zero? (:exit cr)) - (throw (ex-info "tmux send carriage return failed" cr))) - (when-not (zero? (:exit lf)) - (throw (ex-info "tmux send line feed failed" lf)))))] + (let [r (sh "tmux" "-S" socket "send-keys" "-t" session "Enter")] + (when-not (zero? (:exit r)) + (throw (ex-info "tmux send Enter failed" r)))))] (send! "/clear") (Thread/sleep 500) (enter!) 
diff --git a/swarmforge/scripts/swarmforge.sh b/swarmforge/scripts/swarmforge.sh index 7bd0e36..06f8828 100755 --- a/swarmforge/scripts/swarmforge.sh +++ b/swarmforge/scripts/swarmforge.sh @@ -572,9 +572,7 @@ send_initial_grok_prompt() { sleep 3 tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" -l -- "$(< "$prompt_file")" sleep 0.15 - tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-m - sleep 0.05 - tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" C-j + tmux -S "$TMUX_SOCKET" send-keys -t "$(tmux_agent_target "$session" "$display")" Enter ) &! } From fddf7301964b5291bb5180c5207413045fa4cc1c Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Tue, 16 Jun 2026 23:19:51 -0300 Subject: [PATCH 50/67] feat(fork): migrate ADR divergences to fork.bb, port to Babashka format Extract all fork-specific ADR implementations (0006, 0012, 0017, 0018, 0019, 0020, 0021) into swarmforge/scripts/fork.bb, loaded at runtime via (load-file ...) before -main. This isolates fork additions from upstream's swarmforge.bb to minimize future merge conflicts. swarmforge.bb receives targeted additions only: kv-field parsing for model/ effort/advisor per role (ADR 0012), permission-mode auto, load-file hook, and forward-declares for fork.bb symbols to satisfy SCI analysis phase. Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/fork.bb | 191 +++++++++++++++++++++++++++++++ swarmforge/scripts/swarmforge.bb | 57 +++++++-- 2 files changed, 237 insertions(+), 11 deletions(-) create mode 100644 swarmforge/scripts/fork.bb diff --git a/swarmforge/scripts/fork.bb b/swarmforge/scripts/fork.bb new file mode 100644 index 0000000..eb695fb --- /dev/null +++ b/swarmforge/scripts/fork.bb @@ -0,0 +1,191 @@ +;; Fork-specific extensions loaded into swarmforge namespace via (load-file ...). +;; No ns declaration — evaluated in swarmforge namespace at load time. 
+;; Add new ADR implementations here to minimize future swarmforge.bb merge conflicts. + +(require '[cheshire.core :as json]) + +;;; ADR 0020 + 0012: Worktree settings (auto-compaction + advisor model) + +(defn write-worktree-settings! + "Write .claude/settings.local.json with auto-compaction keys and optional advisor model." + ([worktree-path] (write-worktree-settings! worktree-path "")) + ([worktree-path advisor-model] + (let [settings-dir (fs/path worktree-path ".claude") + settings-file (fs/path settings-dir "settings.local.json")] + (fs/create-dirs settings-dir) + (let [cfg (try (json/parse-string (slurp (str settings-file)) true) + (catch Exception _ {})) + cfg (-> cfg + (assoc :autoCompactEnabled true) + (assoc-in [:env :CLAUDE_AUTOCOMPACT_PCT_OVERRIDE] "88") + (assoc-in [:env :CLAUDE_CODE_AUTO_COMPACT_WINDOW] "200000")) + cfg (if (seq advisor-model) + (assoc cfg :advisorModel advisor-model) + cfg)] + (spit (str settings-file) (json/generate-string cfg {:pretty true})))))) + +;;; ADR 0017: Inlined prompt bundle + swarm-persona skill + +(defn resolve-prompt-bundle + "Collect all .prompt files referenced transitively from constitution + role prompt." + [working-dir constitution-file roles-dir role] + (let [working-dir-str (str working-dir)] + (loop [queue [(str constitution-file) (str (fs/path roles-dir (str role ".prompt")))] + seen #{} + bundle []] + (if-let [file (first queue)] + (let [rel (str/replace-first file (str working-dir-str "/") "")] + (if (or (contains? seen rel) (not (fs/exists? (fs/path file)))) + (recur (rest queue) seen bundle) + (let [content (slurp file) + refs (->> (re-seq #"swarmforge/[A-Za-z0-9_./-]+\.prompt" content) + distinct + (map #(str working-dir-str "/" %)) + (remove #(contains? seen (str/replace-first % (str working-dir-str "/") "")))) + article-files (when (str/ends-with? file "constitution.prompt") + (let [articles-dir (fs/path (str/replace file "constitution.prompt" "constitution/articles"))] + (when (fs/exists? 
articles-dir) + (->> (fs/list-dir articles-dir) + (filter #(str/ends-with? (str (fs/file-name %)) ".prompt")) + (map str) + (remove #(contains? seen (str/replace-first % (str working-dir-str "/") ""))))))) + new-queue (concat (rest queue) refs article-files)] + (recur new-queue (conj seen rel) (conj bundle rel))))) + bundle)))) + +(defn write-persona-skill-file! + "Create .claude/skills/swarm-persona/SKILL.md with bundled role+constitution." + [ctx role worktree-path] + (let [working-dir (:working-dir ctx) + skill-dir (fs/path worktree-path ".claude" "skills" "swarm-persona") + skill-file (fs/path skill-dir "SKILL.md") + bundle-files (resolve-prompt-bundle working-dir (:constitution-file ctx) (:roles-dir ctx) role) + knowledge-files ["AGENTS.md" (str ".agents/roles/" role ".md")]] + (fs/create-dirs skill-dir) + (spit (str skill-file) + (str "---\n" + "name: swarm-persona\n" + "description: Load this agent's SwarmForge role, constitution, and operating instructions\n" + "---\n\n" + "\n" + "\n" + "This prompt bundle is pre-resolved. Do not open or re-read any swarmforge/*.prompt files" + " — all relevant instructions are already included below. Project knowledge files" + " (AGENTS.md and your role file under .agents/roles/) are included below when present.\n" + "\n" + (apply str + (for [rel bundle-files + :let [abs (fs/path (str working-dir) rel)] + :when (fs/exists? abs)] + (str "\n" (slurp (str abs)) "\n\n"))) + (apply str + (for [rel knowledge-files + :let [abs (fs/path (str working-dir) rel)] + :when (fs/exists? abs)] + (str "\n" (slurp (str abs)) "\n\n"))) + "\n")))) + +;; Override upstream's write-agent-instruction-file! to use swarm-persona skill pointer. +(defn write-agent-instruction-file! [role prompt-file] + (spit (str prompt-file) + (str "You are the " role " in a SwarmForge multi-agent development swarm. " + "Your full role, constitution, and operating instructions are in your swarm-persona skill. 
" + "Invoke the swarm-persona skill at the start of every session and before responding to any handoff.\n"))) + +;;; ADR 0006: Sparse checkout to hide QA holdout from non-QA/specifier worktrees + +(defn sparse-checkout-setup! + "Configure sparse checkout to exclude qa-holdout-path for non-QA/specifier roles." + [worktree-path qa-holdout-path role] + (when-not (#{"specifier" "QA"} role) + (process/sh {:continue true} "git" "-C" (str worktree-path) "sparse-checkout" "init" "--no-cone") + (let [git-dir-res (process/sh {:continue true} "git" "-C" (str worktree-path) "rev-parse" "--git-dir") + git-dir (str/trim (:out git-dir-res)) + git-dir-path (if (fs/absolute? (fs/path git-dir)) + (fs/path git-dir) + (fs/path worktree-path git-dir)) + sparse-file (fs/path git-dir-path "info" "sparse-checkout")] + (fs/create-dirs (fs/parent sparse-file)) + (spit (str sparse-file) (str "/*\n!/" qa-holdout-path "/\n"))) + (process/sh {:continue true} "git" "-C" (str worktree-path) "read-tree" "-mu" "HEAD"))) + +;;; ADR 0018: Skill installation + +(defn- parse-pins-file [pins-file] + (into {} (for [line (str/split-lines (slurp (str pins-file))) + :let [line (str/trim line)] + :when (and (seq line) + (not (str/starts-with? line "#")) + (str/includes? line "=")) + :let [sep (str/index-of line "=")] + :when sep] + [(str/trim (subs line 0 sep)) (str/trim (subs line (inc sep)))]))) + +(defn install-skills! + "Install local skills and pinned entire skills into .claude/skills/." + [ctx] + (let [pins-file (fs/path (:script-dir ctx) "install-pins.conf")] + (when (fs/exists? pins-file) + (let [pins (parse-pins-file pins-file) + entire-sha (get pins "ENTIRE_SKILLS_SHA") + skills-src (fs/path (:script-dir ctx) ".." "skills") + skills-dst (fs/path (:working-dir ctx) ".claude" "skills")] + (println (str cyan "Installing skills..." reset)) + (fs/create-dirs (:state-dir ctx)) + (fs/create-dirs skills-dst) + (when (fs/exists? 
skills-src) + (doseq [skill-dir (->> (fs/list-dir skills-src) (filter fs/directory?))] + (let [skill-name (str (fs/file-name skill-dir)) + dst (fs/path skills-dst skill-name)] + (when (fs/exists? dst) (fs/delete-tree dst)) + (fs/copy-tree skill-dir dst) + (println (str " " green "✓" reset " " skill-name " (local)"))))) + (when entire-sha + (let [tmp-dir (str (fs/create-temp-dir)) + url (str "https://github.com/entireio/skills/archive/" entire-sha ".tar.gz") + result (process/sh {:continue true} "sh" "-c" + (str "curl -fsSL " (sq url) " | tar -xz --strip-components=1 -C " (sq tmp-dir)))] + (if (zero? (:exit result)) + (do + (let [skills-extracted (fs/path tmp-dir "skills")] + (when (fs/exists? skills-extracted) + (doseq [skill-dir (->> (fs/list-dir skills-extracted) (filter fs/directory?))] + (let [skill-name (str (fs/file-name skill-dir)) + dst (fs/path skills-dst skill-name)] + (when (fs/exists? dst) (fs/delete-tree dst)) + (fs/copy-tree skill-dir dst))))) + (fs/delete-tree tmp-dir) + (println (str " " green "✓" reset " entire skills (" (subs entire-sha 0 8) ")")) + (spit (str (fs/path (:state-dir ctx) "skills-installed")) entire-sha)) + (do + (fs/delete-tree tmp-dir) + (println (str " " yellow "⚠" reset " entire skills unavailable (no network?) — proceeding without them")))))))))) + +(defn ensure-skills-installed! + "Install skills if pins changed or first run." + [ctx] + (let [pins-file (fs/path (:script-dir ctx) "install-pins.conf")] + (when (fs/exists? pins-file) + (let [pins (parse-pins-file pins-file) + entire-sha (get pins "ENTIRE_SKILLS_SHA") + sentinel (fs/path (:state-dir ctx) "skills-installed")] + (when-not (and (fs/exists? sentinel) + (= entire-sha (str/trim (slurp (str sentinel))))) + (install-skills! ctx)))))) + +;;; ADR 0021: Curator skill links + +(defn link-curator-skills! + "Symlink .agents/skills/* into .claude/skills/." 
+ [target-path] + (let [agents-skills-dir (fs/path target-path ".agents" "skills") + claude-skills-dir (fs/path target-path ".claude" "skills")] + (when (fs/exists? agents-skills-dir) + (fs/create-dirs claude-skills-dir) + (doseq [skill-dir (->> (fs/list-dir agents-skills-dir) (filter fs/directory?))] + (let [skill-name (str (fs/file-name skill-dir)) + link (fs/path claude-skills-dir skill-name)] + (when-not (fs/exists? link) + (process/sh {:continue true} "ln" "-sfn" + (str "../../.agents/skills/" skill-name) + (str link)))))))) diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index e8cbb57..f858cb0 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -14,6 +14,12 @@ (def bold "\u001b[1m") (def reset "\u001b[0m") + +;; Forward-declare symbols defined in fork.bb (loaded at runtime via load-file before -main). +(declare sparse-checkout-setup! link-curator-skills! + write-persona-skill-file! write-worktree-settings! + ensure-skills-installed!) + (defn sh [& args] (apply process/sh args)) @@ -141,11 +147,15 @@ (if (or (str/blank? line) (str/starts-with? line "#")) (recur (next lines) rows roles worktrees) (let [fields (str/split line #"\s+")] - (when (or (< (count fields) 4) (> (count fields) 5)) + (when (< (count fields) 4) (fail! (str red "Error:" reset " Invalid config line " line-no ": " line))) - (let [[keyword role agent worktree receive-mode] fields + (let [[keyword role agent worktree receive-mode & kv-fields] fields agent (str/lower-case agent) - receive-mode (or receive-mode "task")] + receive-mode (or receive-mode "task") + kv-map (into {} (for [kv kv-fields + :let [sep (str/index-of kv "=")] + :when sep] + [(subs kv 0 sep) (subs kv (inc sep))]))] (when-not (= "window" keyword) (fail! (str red "Error:" reset " Unknown config directive on line " line-no ": " keyword))) (when (str/includes? 
role "_") @@ -171,7 +181,10 @@ :display-name (display-name-for-role role) :worktree-name worktree :worktree-path worktree-path - :receive-mode receive-mode}] + :receive-mode receive-mode + :model (get kv-map "model" "") + :effort (get kv-map "effort" "") + :advisor (get kv-map "advisor" "")}] (recur (next lines) (conj rows row) (conj roles role) @@ -244,7 +257,8 @@ :when (not (#{"none" "master"} worktree-name))] (when-not (or (fs/exists? (fs/path worktree-path ".git")) (fs/directory? (fs/path worktree-path ".git"))) - (sh "git" "-C" (str (:working-dir ctx)) "worktree" "add" "--force" "-B" branch-name (str worktree-path) "HEAD")))) + (sh "git" "-C" (str (:working-dir ctx)) "worktree" "add" "--force" "-B" branch-name (str worktree-path) "HEAD")) + (sparse-checkout-setup! worktree-path (:qa-holdout-path ctx) (:role row)))) (defn prepare-handoff-dirs! [ctx] (doseq [row (:roles ctx) @@ -271,7 +285,8 @@ (fs/copy (:sessions-file ctx) (fs/path role-state-dir "sessions.tsv") {:replace-existing true}) (fs/copy (:roles-file ctx) (fs/path role-state-dir "roles.tsv") {:replace-existing true}) (fs/copy (:tmux-socket-file ctx) (fs/path role-state-dir "tmux-socket") {:replace-existing true}) - (fs/copy (:tmux-env-file ctx) (fs/path role-state-dir "tmux-env") {:replace-existing true})))) + (fs/copy (:tmux-env-file ctx) (fs/path role-state-dir "tmux-env") {:replace-existing true}) + (link-curator-skills! worktree-path)))) (defn check-dependency! [command] (when-not (command-exists? command) @@ -307,10 +322,22 @@ (write-agent-instruction-file! 
role prompt-file) (cond-> (str base (case agent - "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) " --permission-mode acceptEdits -n " (sq (str "SwarmForge " display)) " \"$(cat " (sq (str prompt-file)) ")\"") - "codex" (str "codex -C " (sq (str role-worktree)) " \"$(cat " (sq (str prompt-file)) ")\"") - "copilot" (str "copilot -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " -i \"$(cat " (sq (str prompt-file)) ")\"") - "grok" (str "grok --cwd " (sq (str role-worktree)) " --permission-mode acceptEdits --rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\""))) + "claude" (str "claude" + (when (seq (:model row)) (str " --model " (sq (:model row)))) + (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) + " --append-system-prompt-file " (sq (str prompt-file)) + " --permission-mode auto -n " (sq (str "SwarmForge " display))) + "codex" (str "codex" + (when (seq (:model row)) (str " -c model=" (sq (:model row)))) + " -C " (sq (str role-worktree)) " \"$(cat " (sq (str prompt-file)) ")\"") + "copilot" (str "copilot" + (when (seq (:model row)) (str " --model " (sq (:model row)))) + (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) + " -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " -i \"$(cat " (sq (str prompt-file)) ")\"") + "grok" (str "grok" + (when (seq (:model row)) (str " --model " (sq (:model row)))) + (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) + " --cwd " (sq (str role-worktree)) " --permission-mode auto --rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\""))) (= index 0) (str "; exit_code=$?; SWARMFORGE_TERMINAL_BACKEND=" (sq (:terminal-backend ctx)) " nohup " (sq (str (fs/path (:script-dir ctx) "swarm-cleanup.sh"))) @@ -324,6 +351,8 @@ display (:display-name row) prompt-file (fs/path (:prompts-dir ctx) (str (:role row) ".md")) command (launch-command ctx 
index row)] + (write-persona-skill-file! ctx (:role row) (:worktree-path row)) + (write-worktree-settings! (:worktree-path row) (or (:advisor row) "")) (sh "tmux" "-S" (:tmux-socket ctx) "send-keys" "-t" (tmux-agent-target display (:tmux-pane-base-index ctx) session) command "Enter") @@ -430,7 +459,8 @@ :tmux-socket-file (fs/path state-dir "tmux-socket") :tmux-env-file (fs/path state-dir "tmux-env") :tmux-window-base-index 0 - :tmux-pane-base-index 0})) + :tmux-pane-base-index 0 + :qa-holdout-path (or (System/getenv "SWARMFORGE_QA_HOLDOUT_PATH") "qa-e2e")})) (defn prepare-ctx [ctx] (-> ctx @@ -456,6 +486,9 @@ (let [ctx (prepare-ctx ctx)] (check-backend-dependencies! ctx) (prepare-workspace! ctx) + (ensure-skills-installed! ctx) + (when-not (fs/exists? (fs/path (:state-dir ctx) "setup-complete")) + (fail! (str red "Error:" reset " project is not swarm-ready. Run /setup-swarm first."))) (prepare-worktrees! ctx) (prepare-handoff-dirs! ctx) (let [ctx (assoc ctx :terminal-backend (detect-terminal-backend))] @@ -518,4 +551,6 @@ "--test-agent-start-delay" (println (env-long "SWARMFORGE_AGENT_START_DELAY_MS" 1500)) (run-main! 
(or (first args) (System/getProperty "user.dir"))))) +(load-file (str (fs/parent *file*) "/fork.bb")) + (apply -main *command-line-args*) From 09502f866ebd207be5a1a5b2faa1ce3af4c07425 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Wed, 17 Jun 2026 00:10:50 -0300 Subject: [PATCH 51/67] fork(handoffs): remove startup awake notification Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/constitution/articles/handoffs.prompt | 15 --------------- 1 file changed, 15 deletions(-) diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index 310e68a..ed0a0dc 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -1,20 +1,5 @@ # Handoff Rules -## Startup Notification -- After reading the constitution and your role prompt at startup, send an awake - notification to the main branch agent, if any, unless you are that agent or - your role prompt says otherwise. -- Write a draft file with: - -```text -type: awake -to: -priority: 50 -``` - -- Run `swarm_handoff.sh `. -- This startup notification is only a presence signal; it does not replace any role-specific handoff rule. - ## Sending Handoffs - Write a draft handoff file with only structured headers, then run `swarm_handoff.sh `. - Use only these message types: From 599be8d02c0bc8a9de61f28cb84903510ffe9c2e Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Wed, 17 Jun 2026 00:52:30 -0300 Subject: [PATCH 52/67] feat(skill): add fork-upstream-sync skill Encodes the merge-base intersection technique for identifying real conflicts, the fork.bb migration pattern, and swarm-forge-specific rules (rtk prefix, gabadi/swarm-forge target, ADR style). 
Co-Authored-By: Claude Sonnet 4.6 --- .claude/skills/fork-upstream-sync/SKILL.md | 63 ++++++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100644 .claude/skills/fork-upstream-sync/SKILL.md diff --git a/.claude/skills/fork-upstream-sync/SKILL.md b/.claude/skills/fork-upstream-sync/SKILL.md new file mode 100644 index 0000000..143bc74 --- /dev/null +++ b/.claude/skills/fork-upstream-sync/SKILL.md @@ -0,0 +1,63 @@ +--- +name: fork-upstream-sync +description: Sync this fork with new commits from unclebob/swarm-forge upstream. Identifies real conflicts (only files both sides modified since merge base), checks whether fork changes are superseded, migrates divergences to fork.bb, and opens a PR to gabadi/swarm-forge. Use when upstream has new commits, user says "sync upstream", "upstream has changes", or "merge upstream". +--- + +# Fork Upstream Sync + +## Core rule: only intersecting files are real conflicts + +After fetching upstream, find what each side changed since the merge base: + +```bash +MERGE_BASE=$(rtk git merge-base HEAD upstream/main) +rtk git diff "$MERGE_BASE"..HEAD --name-only # our side +rtk git diff "$MERGE_BASE"..upstream/main --name-only # upstream side +``` + +Files only upstream changed → trivial forward merge, don't mention them. +Files only we changed → no conflict. +**Files in both lists → real conflicts. Analyze each one.** + +## Classify each real conflict + +For every intersecting file, answer in order: + +1. **Superseded?** Read upstream's new version and grep for the intent of our change. If upstream solved it already (even differently), our change is moot — take theirs. +2. **Non-overlapping edits?** If our edit and upstream's are on different lines, take both — no decision needed. +3. **Migration needed?** If upstream rewrote/replaced the file and we have substantive logic in it, extract our logic to `fork.bb` (see below). 
+ +## Migration pattern: fork.bb + +When upstream replaces a script file we've extended, move our logic into `swarmforge/scripts/fork.bb`. This file is loaded by `swarmforge.bb` via `(load-file ...)` and is 100% fork-owned — zero conflict surface on future syncs. + +- Extractable: self-contained functions (settings writers, skill installers, sparse-checkout setup, prompt bundle resolvers) +- Must stay in `swarmforge.bb`: config parsing for new fields, permission-mode flags, setup guards, load-file call itself +- Keep the upstream file edits to small, stable hook call sites only + +## Resolving constitution/articles conflicts + +These files get edited by both sides frequently. Always: +- Take upstream's structural/wording changes +- Preserve our fork-specific rule insertions (check `rtk git diff "$MERGE_BASE"..HEAD -- ` to see exactly what we added) +- Never remove an upstream rule unless there's an explicit ADR for it + +## Doing the merge + +```bash +rtk git checkout -b feat/upstream-sync +rtk git merge upstream/main +# Files to take entirely from upstream: +rtk git checkout --theirs && rtk git add +# Manual resolutions: edit file, then rtk git add +rtk git commit +rtk git push origin feat/upstream-sync +gh pr create -R gabadi/swarm-forge --title "..." --body "..." +``` + +**Never** open a PR against `unclebob/swarm-forge`. Always target `gabadi/swarm-forge`. +`gh` CLI defaults to upstream — always pass `-R gabadi/swarm-forge`. + +## After the PR + +If any migrated divergences aren't yet documented, add an ADR row or manifest entry. ADR house style: divergence + why only, no rejected-options section. 
From 2a42ec84d81df79e3a7d7bf0104685df8de7a3ac Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Thu, 18 Jun 2026 19:20:26 -0300 Subject: [PATCH 53/67] fix(sync): take upstream extra-args as-is; only intercept advisor=X MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Upstream's extra-cli-args passthrough already handles model/effort — no need to re-implement kv-map parsing for those. Only advisor=X requires special treatment since there is no --advisor CLI flag; it writes advisorModel to .claude/settings.local.json via write-worktree-settings! in fork.bb. parse-config now strips advisor= tokens before building extra-args (upstream approach), and launch-command is simplified back to upstream's structure with only --permission-mode auto preserved (ADR-0019). ADR-0012 updated to reflect the narrowed scope: advisor-only interception. Co-Authored-By: Claude Sonnet 4.6 --- README.md | 13 +++---- .../adr/0012-per-role-model-effort-advisor.md | 35 ++++++----------- swarmforge/scripts/swarmforge.bb | 38 +++++-------------- 3 files changed, 28 insertions(+), 58 deletions(-) diff --git a/README.md b/README.md index 2c7753b..3d27061 100644 --- a/README.md +++ b/README.md @@ -230,7 +230,7 @@ The durable handoff files and lifecycle headers replace the old logbook and rese `swarmforge/swarmforge.conf` defines the swarm window-by-window. Each line has this form: ```conf -window [task|batch] [extra-cli-args...] [key=value...] +window [task|batch] [extra-cli-args...] ``` The optional receive mode defaults to `task`. Use `batch` for roles that should consume all currently queued equal-priority handoffs as one batch. 
@@ -240,15 +240,14 @@ Any fields after the receive mode are passed directly to the agent CLI as additi ```conf window coder copilot wt-coder --yolo window architect claude wt-arch task --dangerously-skip-permissions +window senior-dev claude wt-senior task --model claude-opus-4-8 --effort high ``` -Optional per-role overrides are expressed as `key=value` pairs after the receive mode (or after extra args): +One special token is intercepted before passthrough: -| Key | Applies to | Effect | -|-----|-----------|--------| -| `model` | all backends | `claude`/`copilot`/`grok`: `--model ` · `codex`: `-c model=""` | -| `effort` | claude, copilot, grok | `--effort ` (skipped for codex) | -| `advisor` | claude only | written as `advisorModel` into the worktree's `.claude/settings.local.json` | +| Token | Applies to | Effect | +|-------|-----------|--------| +| `advisor=` | claude only | writes `advisorModel` into the worktree's `.claude/settings.local.json` instead of being passed as a CLI flag | You can define as many windows as your project needs. Each `role` maps to a corresponding prompt file at `swarmforge/roles/.prompt`, so a config containing `architect`, `coder`, `reviewer`, `research`, and `release` windows would expect: diff --git a/docs/adr/0012-per-role-model-effort-advisor.md b/docs/adr/0012-per-role-model-effort-advisor.md index 758e19d..430073f 100644 --- a/docs/adr/0012-per-role-model-effort-advisor.md +++ b/docs/adr/0012-per-role-model-effort-advisor.md @@ -2,34 +2,23 @@ status: accepted --- -# Per-role model, effort, and advisor in `swarmforge.conf` +# Per-role advisor model in `swarmforge.conf` -Different roles have different compute needs — the architect reasoning about design warrants a more capable model than the coder grinding through an implementation slice. Upstream's only per-role knob is the agent backend (`window `); model, effort, and advisor are absent. The fork adds **optional per-role overrides** without breaking any existing config. 
+Different roles benefit from different advisor models for in-editor suggestions. Upstream added generic `extra-cli-args` passthrough for per-role agent flags (model, effort, etc.), so those no longer need fork-specific handling. The one remaining fork addition is `advisor=X` — there is no `--advisor` CLI flag; it must be written into the worktree's `.claude/settings.local.json`. -**Syntax: an inline `key=value` tail on the window line.** The existing four fields parse exactly as before; any fields beyond position four are read as `key=value` pairs stored per role. Upstream rejects lines that are not exactly four fields, so this is a genuine parser change, but it is backward compatible — a four-field line still works untouched. +**Syntax: `advisor=` as an extra token on the window line.** It is intercepted before passthrough; all other extra tokens are forwarded to the agent CLI verbatim. ```conf -# before (still valid) -window coder claude coder +# model/effort are raw extra-args (upstream passthrough) +window specifier claude specifier task --model claude-opus-4-8 --effort xhigh advisor=claude-sonnet-4-6 +window coder claude coder task --model claude-sonnet-4-6 -# after (opt-in per role) -window specifier claude specifier model=opus effort=xhigh advisor=sonnet -window coder claude coder model=sonnet effort=high -window architect codex architect model=o3 +# advisor only (no model override needed) +window architect claude architect advisor=claude-sonnet-4-6 ``` -**Three keys, mapped to CLI flags per backend; unsupported keys are silently ignored:** +| Token | Applies to | Effect | +|-------|-----------|--------| +| `advisor=` | claude only | writes `advisorModel` into the worktree's `.claude/settings.local.json` — there is no `--advisor` CLI flag | -| Key | Applies to | Mapping | -|-----|-----------|---------| -| `model` | all backends | `claude`/`copilot`/`grok`: `--model ` · `codex`: `-c model=""` | -| `effort` | claude, copilot, grok | `--effort ` (codex has no 
effort flag — skipped) | -| `advisor` | claude only | written as `advisorModel` into the worktree's `.claude/settings.local.json` — there is **no** `--advisor` CLI flag (ignored for other backends) | - -**Per-role granularity, not per-backend.** Two `claude` roles can run different models; a global per-backend setting would throw away the value of the role abstraction. **No pre-populated values** ship in the runnable configs — those express topology (roles + worktrees), not opinions about model cost. The feature is fully opt-in: operators add keys only to the lines they care about. - -## Pending implementation - -- `main`: extend `parse_config` in `swarmforge.sh` to accept ≥4 fields and read the `key=value` tail into per-role maps; extend `launch_role` to append the mapped flags per backend when set. (Script lives on `main`; the conf grammar is exercised there.) -- `model`/`effort` map to real CLI flags; `advisor` does **not** — there is no `claude --advisor` flag. It is implemented by writing `advisorModel` into each worktree's `.claude/settings.local.json` (a `write_worktree_advisor` step that shares the read-modify-write seam with ADR 0020). Source: `backup/main-pre-reset:swarmforge.sh` `write_worktree_advisor`. -- Runnable config (`six-pack`) stays topology-only — no keys added. (four-pack is frozen per ADR 0001 / the change manifest.) +**Implementation:** `parse-config` strips `advisor=X` tokens from the trailing fields before building `extra-args`; the extracted value is stored as `:advisor` on the row. `write-worktree-settings!` in `fork.bb` writes it to settings.local.json at launch time. 
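The interception described above can be illustrated outside Babashka. This is a rough shell sketch of the same idea, not the `parse-config` code itself: the function name `split_advisor` and the globals `ADVISOR` and `EXTRA_ARGS` are hypothetical, and it assumes the receive mode (`task`/`batch`) has already been removed from the token list.

```bash
# Hypothetical sketch: separate advisor=<model> from passthrough tokens.
# The first advisor= token wins; every advisor= token is dropped from
# the passthrough list, mirroring the strip-then-forward behaviour.
split_advisor() {
  ADVISOR=""
  EXTRA_ARGS=""
  seen=0
  for tok in "$@"; do
    case "$tok" in
      advisor=*)
        if [ "$seen" -eq 0 ]; then
          ADVISOR="${tok#advisor=}"   # text after the "advisor=" prefix
          seen=1
        fi
        ;;
      *)
        # Append with a single space separator, no leading space.
        EXTRA_ARGS="${EXTRA_ARGS:+$EXTRA_ARGS }$tok"
        ;;
    esac
  done
}

# Usage:
#   split_advisor --model claude-opus-4-8 --effort xhigh advisor=claude-sonnet-4-6
#   echo "$ADVISOR"      # value destined for advisorModel
#   echo "$EXTRA_ARGS"   # tokens forwarded to the agent CLI
```

Matching on the `advisor=` prefix rather than on any token containing `=` is what lets flags such as `--flag=value` pass through untouched.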
diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index 2418615..9b0dd5b 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -157,14 +157,11 @@ receive-mode (if (#{"task" "batch"} (first trailing)) (first trailing) "task") - raw-trailing (if (#{"task" "batch"} (first trailing)) - (rest trailing) - trailing) - kv-map (into {} (for [kv raw-trailing - :let [sep (str/index-of kv "=")] - :when sep] - [(subs kv 0 sep) (subs kv (inc sep))])) - extra-arg-tokens (remove #(str/index-of % "=") raw-trailing) + extra-arg-tokens (if (#{"task" "batch"} (first trailing)) + (rest trailing) + trailing) + advisor (some #(when (str/starts-with? % "advisor=") (subs % 8)) extra-arg-tokens) + extra-arg-tokens (remove #(str/starts-with? % "advisor=") extra-arg-tokens) extra-args (when (seq extra-arg-tokens) (str/join " " extra-arg-tokens))] (when-not (= "window" keyword) @@ -194,9 +191,7 @@ :worktree-path worktree-path :receive-mode receive-mode :extra-args extra-args - :model (get kv-map "model" "") - :effort (get kv-map "effort" "") - :advisor (get kv-map "advisor" "")}] + :advisor (or advisor "")}] (recur (next lines) (conj rows row) (conj roles role) @@ -338,23 +333,10 @@ (write-agent-instruction-file! 
role prompt-file) (cond-> (str base (case agent - "claude" (str "claude" - (when (seq (:model row)) (str " --model " (sq (:model row)))) - (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) - " --append-system-prompt-file " (sq (str prompt-file)) - " --permission-mode auto -n " (sq (str "SwarmForge " display)) - " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") - "codex" (str "codex" - (when (seq (:model row)) (str " -c model=" (sq (:model row)))) - " -C " (sq (str role-worktree)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") - "copilot" (str "copilot" - (when (seq (:model row)) (str " --model " (sq (:model row)))) - (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) - " -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "-i \"$(cat " (sq (str prompt-file)) ")\"") - "grok" (str "grok" - (when (seq (:model row)) (str " --model " (sq (:model row)))) - (when (seq (:effort row)) (str " --effort " (sq (:effort row)))) - " --cwd " (sq (str role-worktree)) " --permission-mode auto " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\"")))) + "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) " --permission-mode auto -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") + "codex" (str "codex -C " (sq (str role-worktree)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") + "copilot" (str "copilot -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "-i \"$(cat " (sq (str prompt-file)) ")\"") + "grok" (str "grok --cwd " (sq (str role-worktree)) " --permission-mode auto " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\"")))) (= index 0) (str "; exit_code=$?; 
SWARMFORGE_TERMINAL_BACKEND=" (sq (:terminal-backend ctx)) " nohup " (sq (str (fs/path (:script-dir ctx) "swarm-cleanup.sh"))) From adef8c26c2221866c36afa5a99f36574513298c4 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Thu, 18 Jun 2026 19:25:54 -0300 Subject: [PATCH 54/67] fix(launch): make --permission-mode auto overridable via extra-args MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Hardcoded --permission-mode auto produced a duplicate flag when a user passed --permission-mode in extra-args. Now injected only when absent from extra-args — auto remains the fork default (ADR-0019) but extra-args can override it. Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.bb | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index 9b0dd5b..25ada67 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -329,14 +329,16 @@ base (str "export SWARMFORGE_ROLE=" (sq role) " && export PATH=" (sq (str role-script-dir)) ":$PATH" " && cd " (sq (str role-worktree)) - " && ")] + " && ") + permission-mode (when-not (str/includes? (or (:extra-args row) "") "--permission-mode") + " --permission-mode auto")] (write-agent-instruction-file! 
role prompt-file) (cond-> (str base (case agent - "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) " --permission-mode auto -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") + "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) permission-mode " -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") "codex" (str "codex -C " (sq (str role-worktree)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") "copilot" (str "copilot -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "-i \"$(cat " (sq (str prompt-file)) ")\"") - "grok" (str "grok --cwd " (sq (str role-worktree)) " --permission-mode auto " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\"")))) + "grok" (str "grok --cwd " (sq (str role-worktree)) permission-mode " " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\"")))) (= index 0) (str "; exit_code=$?; SWARMFORGE_TERMINAL_BACKEND=" (sq (:terminal-backend ctx)) " nohup " (sq (str (fs/path (:script-dir ctx) "swarm-cleanup.sh"))) From 5649837e0d388654026e048075b56ffb98a518b5 Mon Sep 17 00:00:00 2001 From: gabadi Date: Fri, 19 Jun 2026 03:52:09 -0300 Subject: [PATCH 55/67] fix(handoff): state-dir from roles.tsv + implement merge_and_process (#54) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(handoff): resolve state-dir from roles.tsv; add merge_and_process script #46: swarm_handoff.bb state-dir now resolves outbox via SWARMFORGE_ROLE + roles.tsv worktree-path lookup instead of JVM user.dir, matching the path handoffd.bb polls. Silent delivery failure when invoked outside the assigned worktree is eliminated. 
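For illustration only, the lookup this fix relies on can be sketched in shell. The sketch assumes `roles.tsv` is tab-separated with the role in column 1 and the worktree path in column 3, as the `swarm_handoff.bb` hunk in this patch reads it; the function name and the sample path in the usage comment are hypothetical.

```bash
# Hypothetical sketch: resolve a role's worktree path from roles.tsv.
#   $1 = role name (for example the value of SWARMFORGE_ROLE)
#   $2 = path to roles.tsv
# Prints column 3 of the first matching row; exits non-zero if the role is absent.
worktree_path_for_role() {
  awk -F '\t' -v role="$1" '
    $1 == role { print $3; found = 1; exit }
    END        { exit !found }
  ' "$2"
}

# Usage (the roles.tsv location here is a placeholder):
#   outbox="$(worktree_path_for_role "$SWARMFORGE_ROLE" path/to/roles.tsv)/.swarmforge/handoffs"
```

Resolving from the role rather than the current directory is what makes the result independent of where the script is invoked.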
#44: add merge_and_process.sh — agents were improvising (observed: --theirs) because the script referenced in git_handoff payloads did not exist. Script enforces --no-ff merge and stops with a clear error on conflict; never --theirs/--ours. Also adds constitution bullet: run swarm_handoff.sh from assigned worktree only. Co-Authored-By: Claude Sonnet 4.6 * feat(handoff): sync worktree to trunk before task delivery Eliminates per-role sync boilerplate: ready_for_next_task.bb and ready_for_next_batch.bb now git-fetch + reset --hard to origin/HEAD before printing TASK/BATCH. NO_TASK path is unaffected. This is the enforcement layer fix for #44 — sync happens structurally in the delivery mechanism, not as advisory text in 8 role prompts. Co-Authored-By: Claude Sonnet 4.6 * fix(merge_and_process): merge canonical commit, not branch tip Fetches the sender branch to ensure the commit is reachable, then merges the exact canonical-commit from the git_handoff payload. Previously merged origin/ tip, which could include commits pushed after the handoff was sent. Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- .../constitution/articles/handoffs.prompt | 1 + swarmforge/scripts/merge_and_process.sh | 25 +++++++++++++++++++ swarmforge/scripts/ready_for_next_batch.bb | 18 +++++++++++++ swarmforge/scripts/ready_for_next_task.bb | 18 +++++++++++++ swarmforge/scripts/swarm_handoff.bb | 11 +++++++- 5 files changed, 72 insertions(+), 1 deletion(-) create mode 100755 swarmforge/scripts/merge_and_process.sh diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index ed0a0dc..16f385c 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -1,6 +1,7 @@ # Handoff Rules ## Sending Handoffs +- Always run `swarm_handoff.sh` from your assigned worktree. Invoking it from another directory silently delivers to the wrong outbox. 
- Write a draft handoff file with only structured headers, then run `swarm_handoff.sh `. - Use only these message types: - `awake` diff --git a/swarmforge/scripts/merge_and_process.sh b/swarmforge/scripts/merge_and_process.sh new file mode 100755 index 0000000..2f698ca --- /dev/null +++ b/swarmforge/scripts/merge_and_process.sh @@ -0,0 +1,25 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Merge sender's registered branch into the current worktree. +# Called by agents when a git_handoff payload arrives. +# Usage: merge_and_process +# Never uses --theirs or --ours; on conflict: stops and reports. + +SENDER_ROLE="${1?Usage: merge_and_process }" +CANONICAL_COMMIT="${2?Usage: merge_and_process }" + +SENDER_BRANCH="swarmforge-${SENDER_ROLE}" + +echo "merge_and_process: fetching origin/${SENDER_BRANCH}..." +git fetch origin "${SENDER_BRANCH}" + +echo "merge_and_process: merging ${CANONICAL_COMMIT} (from ${SENDER_BRANCH})..." +if ! git merge --no-ff "${CANONICAL_COMMIT}"; then + echo "" >&2 + echo "CONFLICT: merge of ${CANONICAL_COMMIT} into $(git symbolic-ref --short HEAD) failed." >&2 + echo "Resolve conflicts manually (do NOT use --theirs or --ours), then re-run ready_for_next.sh." >&2 + exit 1 +fi + +echo "merge_and_process: done — $(git rev-parse --short=10 HEAD)" diff --git a/swarmforge/scripts/ready_for_next_batch.bb b/swarmforge/scripts/ready_for_next_batch.bb index e61e2fd..0c3203b 100755 --- a/swarmforge/scripts/ready_for_next_batch.bb +++ b/swarmforge/scripts/ready_for_next_batch.bb @@ -2,6 +2,7 @@ (ns ready-for-next-batch (:require [babashka.fs :as fs] + [clojure.java.shell :as sh] [clojure.string :as str])) (defn inbox-dir [] @@ -96,6 +97,22 @@ (println "BATCH_ITEM:" (inc index)) (print-task file)))) +(defn sync-to-trunk! [] + (let [fetch-result (sh/sh "git" "fetch" "origin")] + (when-not (zero? 
(:exit fetch-result)) + (binding [*out* *err*] + (println "WARNING: git fetch failed:" (str/trim (:err fetch-result)))))) + (let [branch-result (sh/sh "git" "symbolic-ref" "--short" "refs/remotes/origin/HEAD") + default-branch (when (zero? (:exit branch-result)) + (str/trim (:out branch-result)))] + (if default-branch + (let [reset-result (sh/sh "git" "reset" "--hard" default-branch)] + (when-not (zero? (:exit reset-result)) + (binding [*out* *err*] + (println "WARNING: git reset --hard" default-branch "failed:" (str/trim (:err reset-result)))))) + (binding [*out* *err*] + (println "WARNING: could not resolve default branch; skipping trunk sync"))))) + (defn fail! [status & lines] (binding [*out* *err*] (doseq [line lines] @@ -143,6 +160,7 @@ (set-header! target-file "dequeued_at" (timestamp)))) (when (empty? selected-files) (fail! 2 (str "AMBIGUOUS_TASK_STATE: no tasks selected for batch priority " batch-priority "."))) + (sync-to-trunk!) (print-batch batch-dir)))))))) (-main) diff --git a/swarmforge/scripts/ready_for_next_task.bb b/swarmforge/scripts/ready_for_next_task.bb index 3312621..b4a459a 100755 --- a/swarmforge/scripts/ready_for_next_task.bb +++ b/swarmforge/scripts/ready_for_next_task.bb @@ -2,6 +2,7 @@ (ns ready-for-next-task (:require [babashka.fs :as fs] + [clojure.java.shell :as sh] [clojure.string :as str])) (defn state-dir [] @@ -81,6 +82,22 @@ (println "PAYLOAD:") (print (body file)))) +(defn sync-to-trunk! [] + (let [fetch-result (sh/sh "git" "fetch" "origin")] + (when-not (zero? (:exit fetch-result)) + (binding [*out* *err*] + (println "WARNING: git fetch failed:" (str/trim (:err fetch-result)))))) + (let [branch-result (sh/sh "git" "symbolic-ref" "--short" "refs/remotes/origin/HEAD") + default-branch (when (zero? (:exit branch-result)) + (str/trim (:out branch-result)))] + (if default-branch + (let [reset-result (sh/sh "git" "reset" "--hard" default-branch)] + (when-not (zero? 
(:exit reset-result)) + (binding [*out* *err*] + (println "WARNING: git reset --hard" default-branch "failed:" (str/trim (:err reset-result)))))) + (binding [*out* *err*] + (println "WARNING: could not resolve default branch; skipping trunk sync"))))) + (defn fail! [status & lines] (binding [*out* *err*] (doseq [line lines] @@ -115,6 +132,7 @@ (fail! 2 (str "AMBIGUOUS_TASK_STATE: target in-process file already exists: " target-file))) (fs/move source-file target-file) (set-header! target-file "dequeued_at" (timestamp)) + (sync-to-trunk!) (print-task target-file)))))))) (-main) diff --git a/swarmforge/scripts/swarm_handoff.bb b/swarmforge/scripts/swarm_handoff.bb index 00dbb59..c8584ed 100755 --- a/swarmforge/scripts/swarm_handoff.bb +++ b/swarmforge/scripts/swarm_handoff.bb @@ -78,8 +78,17 @@ role (exit! 1 "Set SWARMFORGE_ROLE."))) +(defn worktree-path-for-role [role] + (let [lines (str/split-lines (slurp (str (roles-file)))) + line (some (fn [l] + (when (= role (first (str/split l #"\t"))) l)) + lines)] + (if line + (nth (str/split line #"\t") 2) + (exit! 1 (format "Role '%s' not found in roles.tsv" role))))) + (defn state-dir [] - (fs/path (System/getProperty "user.dir") ".swarmforge" "handoffs")) + (fs/path (worktree-path-for-role (sender-role)) ".swarmforge" "handoffs")) (defn timestamp [] (.format java.time.format.DateTimeFormatter/ISO_INSTANT From 36f0add58fe2e304b0be4d1e8d1567e62b6f2b0c Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Fri, 19 Jun 2026 04:06:59 -0300 Subject: [PATCH 56/67] fix(swarmforge): remove extra paren in grok launch-command case The grok case had )))) closing str-grok, case, str-base, AND cond->, leaving (= index 0) and the transform as bare let expressions. The transform's )))) then produced an unmatched ) at the top level, causing a parse error on ./swarm start. Fix: remove one ) so cond-> stays open through (= index 0) and the transform; the transform's )))) correctly closes str-transform, cond->, let, and defn. 
Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.bb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index 25ada67..3b01e2f 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -338,7 +338,7 @@ "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) permission-mode " -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") "codex" (str "codex -C " (sq (str role-worktree)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") "copilot" (str "copilot -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "-i \"$(cat " (sq (str prompt-file)) ")\"") - "grok" (str "grok --cwd " (sq (str role-worktree)) permission-mode " " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\"")))) + "grok" (str "grok --cwd " (sq (str role-worktree)) permission-mode " " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\""))) (= index 0) (str "; exit_code=$?; SWARMFORGE_TERMINAL_BACKEND=" (sq (:terminal-backend ctx)) " nohup " (sq (str (fs/path (:script-dir ctx) "swarm-cleanup.sh"))) From fe71357db9f6316e646c24b7dc4bb7bf1e6473ec Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Fri, 19 Jun 2026 04:11:02 -0300 Subject: [PATCH 57/67] fix(swarmforge): handle start subcommand in -main dispatch ./swarm start passes "start" as the first positional arg to swarmforge.bb. The default case treated any first arg as a working directory, so run-main! received "start" and looked for config at /start/swarmforge/swarmforge.conf. Add "start" to the case dispatch, routing it to run-main! with the JVM user.dir (same as bare ./swarm with no args). 
Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.bb | 1 + 1 file changed, 1 insertion(+) diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index 3b01e2f..611461c 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -561,6 +561,7 @@ (drop 2 args)) "--test-agent-start-delay" (println (env-long "SWARMFORGE_AGENT_START_DELAY_MS" 1500)) "--test-tmux-base-indexes" (test-tmux-base-indexes! (second args)) + "start" (run-main! (System/getProperty "user.dir")) (run-main! (or (first args) (System/getProperty "user.dir"))))) (load-file (str (fs/parent *file*) "/fork.bb")) From 3c6b70749438d476d283170aa7e1a84178f19741 Mon Sep 17 00:00:00 2001 From: gabadi Date: Sat, 20 Jun 2026 22:47:35 -0300 Subject: [PATCH 58/67] fix(handoffs): idle gate for notify! and reframe git_handoff routing rules (#58) * fix(handoffd): gate notify! on recipient idle state to prevent mid-task /clear Adds UserPromptSubmit/Stop hooks to each role's settings.local.json that write/remove a per-worktree agent-running marker. handoffd skips notify! when the marker is present, so a /clear is never injected into a running agent session. File delivery to inbox/new/ remains unconditional; the queued handoff is picked up naturally via done_with_current -> ready_for_next. Closes #56 Co-Authored-By: Claude Sonnet 4.6 * fix(handoffs): reframe git_handoff rules to separate forwarding from back-routing Replaces the single ambiguous "no functional change" block with two distinct rules: downstream forwarding requires a functional change; upstream back-routing (ADR 0004) always uses git_handoff with the sender's branch HEAD, even with no authored functional lines. Updates ADR 0004 to close the pending handoff mechanism implementation item. 
Closes #57 Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- docs/adr/0004-rework-routes-back.md | 5 ++++- swarmforge/constitution/articles/handoffs.prompt | 6 ++++-- swarmforge/scripts/fork.bb | 4 ++++ swarmforge/scripts/handoffd.bb | 3 ++- 4 files changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/adr/0004-rework-routes-back.md b/docs/adr/0004-rework-routes-back.md index 6add5f7..62e9893 100644 --- a/docs/adr/0004-rework-routes-back.md +++ b/docs/adr/0004-rework-routes-back.md @@ -15,4 +15,7 @@ The trigger is not only a defect. Any finding that an earlier stage's work must ## Pending implementation - How a finding is attributed to an origin stage (the line must be able to trace it back to the spec, test, or design that owns it). -- Where the rule lives in the role prompts (runnable change, `six-pack`). + +## Implementation notes + +- Back-routing always uses `git_handoff` with the sender's current branch HEAD as the commit, even when the sender authored no functional lines. Two distinct rules (forwarding vs. back-routing) replace the old single "no functional change" block in `swarmforge/constitution/articles/handoffs.prompt`. diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index 16f385c..ec2088a 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -30,10 +30,12 @@ task: commit: <10-character-commit-abbrev> ``` -- Do not send or forward a `git_handoff` when the received commit produces no - functional project change. Complete the inbound task instead. +- Do not send or forward a `git_handoff` downstream when the received commit + produces no functional project change. Complete the inbound task instead. - Treat manifest-only, audit-only, generated metadata, formatting-only, and other non-functional churn as no forwardable change. 
+- When routing work back upstream, always use `git_handoff` with your current + branch HEAD as the commit, even if you authored no functional lines. - Preserve the received task name when forwarding work for the same task. If the handoff starts new work, invent a short stable task name. - For `note`, write: diff --git a/swarmforge/scripts/fork.bb b/swarmforge/scripts/fork.bb index eb695fb..473e717 100644 --- a/swarmforge/scripts/fork.bb +++ b/swarmforge/scripts/fork.bb @@ -19,6 +19,10 @@ (assoc :autoCompactEnabled true) (assoc-in [:env :CLAUDE_AUTOCOMPACT_PCT_OVERRIDE] "88") (assoc-in [:env :CLAUDE_CODE_AUTO_COMPACT_WINDOW] "200000")) + marker-path (str worktree-path "/.swarmforge/agent-running") + cfg (-> cfg + (assoc-in [:hooks :UserPromptSubmit] [{:hooks [{:type "command" :command (str "touch " marker-path)}]}]) + (assoc-in [:hooks :Stop] [{:hooks [{:type "command" :command (str "rm -f " marker-path)}]}])) cfg (if (seq advisor-model) (assoc cfg :advisorModel advisor-model) cfg)] diff --git a/swarmforge/scripts/handoffd.bb b/swarmforge/scripts/handoffd.bb index 96137f5..c18b2de 100755 --- a/swarmforge/scripts/handoffd.bb +++ b/swarmforge/scripts/handoffd.bb @@ -141,7 +141,8 @@ (fs/create-dirs (fs/parent target)) (when-not (fs/exists? target) (spit (str target) (render-message (:headers delivered) (:body delivered)))) - (notify! socket (:session role-info))))) + (when-not (fs/exists? (fs/path (:worktree-path role-info) ".swarmforge" "agent-running")) + (notify! 
socket (:session role-info)))))) (move-with-collision path (fs/path (get-in roles [sender-role :worktree-path]) ".swarmforge" "handoffs" "sent")) From 583f3e0ad336122b3fa6e543629df0eff51ba1db Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 21 Jun 2026 02:03:36 -0300 Subject: [PATCH 59/67] feat(skills): add mattpocock/skills selective install with allowlist (#59) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pin mattpocock/skills and install only grill-with-docs, domain-modeling, and grilling — the skills needed for autonomous intentional-knowledge capture during the specifier phase. Uses MATTPOCOCK_SKILLS_INCLUDE in install-pins.conf to filter the archive, keeping unrelated skills out. Sentinel tracks mattpocock SHA separately from entireio so either can upgrade independently. Co-authored-by: Claude Sonnet 4.6 --- swarmforge/scripts/fork.bb | 37 +++++++++++++++++++++++++--- swarmforge/scripts/install-pins.conf | 4 +++ 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/swarmforge/scripts/fork.bb b/swarmforge/scripts/fork.bb index 473e717..f72a2de 100644 --- a/swarmforge/scripts/fork.bb +++ b/swarmforge/scripts/fork.bb @@ -126,12 +126,15 @@ [(str/trim (subs line 0 sep)) (str/trim (subs line (inc sep)))]))) (defn install-skills! - "Install local skills and pinned entire skills into .claude/skills/." + "Install local skills and pinned entire and mattpocock skills into .claude/skills/." [ctx] (let [pins-file (fs/path (:script-dir ctx) "install-pins.conf")] (when (fs/exists? pins-file) (let [pins (parse-pins-file pins-file) entire-sha (get pins "ENTIRE_SKILLS_SHA") + mattpocock-sha (get pins "MATTPOCOCK_SKILLS_SHA") + mattpocock-include (when-let [inc (get pins "MATTPOCOCK_SKILLS_INCLUDE")] + (set (map str/trim (str/split inc #",")))) skills-src (fs/path (:script-dir ctx) ".." "skills") skills-dst (fs/path (:working-dir ctx) ".claude" "skills")] (println (str cyan "Installing skills..." 
reset)) @@ -163,7 +166,28 @@ (spit (str (fs/path (:state-dir ctx) "skills-installed")) entire-sha)) (do (fs/delete-tree tmp-dir) - (println (str " " yellow "⚠" reset " entire skills unavailable (no network?) — proceeding without them")))))))))) + (println (str " " yellow "⚠" reset " entire skills unavailable (no network?) — proceeding without them")))))) + (when mattpocock-sha + (let [tmp-dir (str (fs/create-temp-dir)) + url (str "https://github.com/mattpocock/skills/archive/" mattpocock-sha ".tar.gz") + result (process/sh {:continue true} "sh" "-c" + (str "curl -fsSL " (sq url) " | tar -xz --strip-components=1 -C " (sq tmp-dir)))] + (if (zero? (:exit result)) + (do + (let [skills-extracted (fs/path tmp-dir "skills" "engineering")] + (when (fs/exists? skills-extracted) + (doseq [skill-dir (->> (fs/list-dir skills-extracted) (filter fs/directory?)) + :let [skill-name (str (fs/file-name skill-dir))] + :when (or (nil? mattpocock-include) (contains? mattpocock-include skill-name))] + (let [dst (fs/path skills-dst skill-name)] + (when (fs/exists? dst) (fs/delete-tree dst)) + (fs/copy-tree skill-dir dst))))) + (fs/delete-tree tmp-dir) + (println (str " " green "✓" reset " mattpocock skills (" (subs mattpocock-sha 0 8) ")")) + (spit (str (fs/path (:state-dir ctx) "mattpocock-skills-installed")) mattpocock-sha)) + (do + (fs/delete-tree tmp-dir) + (println (str " " yellow "⚠" reset " mattpocock skills unavailable (no network?) — proceeding without them")))))))))) (defn ensure-skills-installed! "Install skills if pins changed or first run." @@ -172,9 +196,14 @@ (when (fs/exists? pins-file) (let [pins (parse-pins-file pins-file) entire-sha (get pins "ENTIRE_SKILLS_SHA") - sentinel (fs/path (:state-dir ctx) "skills-installed")] + mattpocock-sha (get pins "MATTPOCOCK_SKILLS_SHA") + sentinel (fs/path (:state-dir ctx) "skills-installed") + mattpocock-sentinel (fs/path (:state-dir ctx) "mattpocock-skills-installed")] (when-not (and (fs/exists? 
sentinel) - (= entire-sha (str/trim (slurp (str sentinel))))) + (= entire-sha (str/trim (slurp (str sentinel)))) + (or (nil? mattpocock-sha) + (and (fs/exists? mattpocock-sentinel) + (= mattpocock-sha (str/trim (slurp (str mattpocock-sentinel))))))) (install-skills! ctx)))))) ;;; ADR 0021: Curator skill links diff --git a/swarmforge/scripts/install-pins.conf b/swarmforge/scripts/install-pins.conf index 2235b98..ff74631 100644 --- a/swarmforge/scripts/install-pins.conf +++ b/swarmforge/scripts/install-pins.conf @@ -3,3 +3,7 @@ # entireio/skills — installed to .claude/skills/ in the target project ENTIRE_SKILLS_SHA=4c9a02513c3ec6ebabd9a9dc6bd8240854a218ac + +# mattpocock/skills — selective install; only listed skills are copied +MATTPOCOCK_SKILLS_SHA=6eeb81b5fcfeeb5bd531dd47ab2f9f2bbea27461 +MATTPOCOCK_SKILLS_INCLUDE=grill-with-docs,domain-modeling,grilling From 1b4766070b9e7d8663f4e600c8cd013197e21033 Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 21 Jun 2026 02:46:00 -0300 Subject: [PATCH 60/67] fix(swarm): apply drywall execution learnings to handoff protocol (#61) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(swarm): apply drywall execution learnings to handoff protocol - merge_and_process: drop unconditional git fetch; SHA is already in the shared object store for local worktrees - handoffs: replace commit placeholder with positive generation command - handoffs: commit findings before routing back upstream - handoffs: send outbound handoff before agent-retro, not after Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): add task completion ordering rules to workflow.prompt Universal rules: send outbound handoff before agent-retro, commit findings before routing back. Placed in workflow.prompt so six-pack inherits them on next merge from main. 
Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): write draft before retro, queue after — not send before retro Correct order: write handoff draft → agent-retro (reviews draft) → swarm_handoff.sh (queues it) → done_with_current.sh. Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): correct task completion order — queue handoff then retro Order: swarm_handoff.sh (queue) → agent-retro → done_with_current.sh. Previous commit had it backwards (draft → retro → queue). Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): restore original phrasing for task completion ordering Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- swarmforge/constitution/articles/handoffs.prompt | 8 ++++---- swarmforge/constitution/articles/workflow.prompt | 4 ++++ swarmforge/scripts/merge_and_process.sh | 7 +------ 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/swarmforge/constitution/articles/handoffs.prompt b/swarmforge/constitution/articles/handoffs.prompt index ec2088a..9258b66 100644 --- a/swarmforge/constitution/articles/handoffs.prompt +++ b/swarmforge/constitution/articles/handoffs.prompt @@ -27,15 +27,15 @@ type: git_handoff to: [,...] priority: NN task: -commit: <10-character-commit-abbrev> +commit: $(git rev-parse --short=10 HEAD) ``` - Do not send or forward a `git_handoff` downstream when the received commit produces no functional project change. Complete the inbound task instead. - Treat manifest-only, audit-only, generated metadata, formatting-only, and other non-functional churn as no forwardable change. -- When routing work back upstream, always use `git_handoff` with your current - branch HEAD as the commit, even if you authored no functional lines. +- When routing work back upstream, commit your findings file first so the commit + carries context, then send `git_handoff` with that HEAD as the commit. - Preserve the received task name when forwarding work for the same task. If the handoff starts new work, invent a short stable task name. 
- For `note`, write: @@ -66,7 +66,7 @@ message: current batch in helper-delivered order. - Use only the task information printed by the helper scripts. - If a tmux wake-up arrives while already working on a task, ignore it. -- When the task or batch is fully complete, run `agent-retro`, then run `done_with_current.sh`. +- When the task or batch is fully complete, send any outbound handoff first, then run `agent-retro`, then run `done_with_current.sh`. - `note` handoffs are tasks too; after reading or acting on a note, run `done_with_current.sh` before accepting any other handoff. - If `done_with_current.sh` prints `TASK: `, treat the printed `PAYLOAD` diff --git a/swarmforge/constitution/articles/workflow.prompt b/swarmforge/constitution/articles/workflow.prompt index d92cc4a..ee61199 100644 --- a/swarmforge/constitution/articles/workflow.prompt +++ b/swarmforge/constitution/articles/workflow.prompt @@ -13,5 +13,9 @@ ## Failure Conditions - If the expected git layout or assigned worktree is missing, stop and report instead of silently working in the wrong place. +## Task Completion +- When work is done, send any outbound handoff first, then run `agent-retro`. +- When routing back upstream, commit your findings file before sending `git_handoff` so the commit carries context. + ## Idle Gate - Wait for a handoff. Do not act without one. diff --git a/swarmforge/scripts/merge_and_process.sh b/swarmforge/scripts/merge_and_process.sh index 2f698ca..4da7043 100755 --- a/swarmforge/scripts/merge_and_process.sh +++ b/swarmforge/scripts/merge_and_process.sh @@ -9,12 +9,7 @@ set -euo pipefail SENDER_ROLE="${1?Usage: merge_and_process }" CANONICAL_COMMIT="${2?Usage: merge_and_process }" -SENDER_BRANCH="swarmforge-${SENDER_ROLE}" - -echo "merge_and_process: fetching origin/${SENDER_BRANCH}..." -git fetch origin "${SENDER_BRANCH}" - -echo "merge_and_process: merging ${CANONICAL_COMMIT} (from ${SENDER_BRANCH})..." 
+echo "merge_and_process: merging ${CANONICAL_COMMIT} (from ${SENDER_ROLE})..." if ! git merge --no-ff "${CANONICAL_COMMIT}"; then echo "" >&2 echo "CONFLICT: merge of ${CANONICAL_COMMIT} into $(git symbolic-ref --short HEAD) failed." >&2 From 2514f443a4378fcd2f8723602dfc54a8331ddabf Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 21 Jun 2026 03:08:40 -0300 Subject: [PATCH 61/67] fix(fork): search all mattpocock skill subdirs, not just engineering/ (#63) grilling lives in skills/productivity/, not skills/engineering/. Search all top-level subdirs under skills/ so MATTPOCOCK_SKILLS_INCLUDE works regardless of which subdir the skill is in. Co-authored-by: Claude Sonnet 4.6 --- swarmforge/scripts/fork.bb | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/swarmforge/scripts/fork.bb b/swarmforge/scripts/fork.bb index f72a2de..e90553f 100644 --- a/swarmforge/scripts/fork.bb +++ b/swarmforge/scripts/fork.bb @@ -174,9 +174,10 @@ (str "curl -fsSL " (sq url) " | tar -xz --strip-components=1 -C " (sq tmp-dir)))] (if (zero? (:exit result)) (do - (let [skills-extracted (fs/path tmp-dir "skills" "engineering")] - (when (fs/exists? skills-extracted) - (doseq [skill-dir (->> (fs/list-dir skills-extracted) (filter fs/directory?)) + (let [skills-root (fs/path tmp-dir "skills")] + (when (fs/exists? skills-root) + (doseq [subdir (->> (fs/list-dir skills-root) (filter fs/directory?)) + skill-dir (->> (fs/list-dir subdir) (filter fs/directory?)) :let [skill-name (str (fs/file-name skill-dir))] :when (or (nil? mattpocock-include) (contains? 
mattpocock-include skill-name))] (let [dst (fs/path skills-dst skill-name)] From 801b4862ea148a9616adf2268f558001bc6d9c43 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 21 Jun 2026 03:39:30 -0300 Subject: [PATCH 62/67] fix(swarm): drop initial user message for claude agents on startup Passing the prompt file as both --append-system-prompt-file and a positional arg caused Claude to respond at startup before any handoff arrived. Drop the positional arg; the system prompt is sufficient and the agent now stays idle until handoffd delivers the first wake-up. Co-Authored-By: Claude Sonnet 4.6 --- swarmforge/scripts/swarmforge.bb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/swarmforge/scripts/swarmforge.bb b/swarmforge/scripts/swarmforge.bb index 611461c..191fb17 100755 --- a/swarmforge/scripts/swarmforge.bb +++ b/swarmforge/scripts/swarmforge.bb @@ -335,7 +335,7 @@ (write-agent-instruction-file! role prompt-file) (cond-> (str base (case agent - "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) permission-mode " -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") + "claude" (str "claude --append-system-prompt-file " (sq (str prompt-file)) permission-mode " -n " (sq (str "SwarmForge " display)) " " (extra-args-prefix row)) "codex" (str "codex -C " (sq (str role-worktree)) " " (extra-args-prefix row) "\"$(cat " (sq (str prompt-file)) ")\"") "copilot" (str "copilot -C " (sq (str role-worktree)) " --name " (sq (str "SwarmForge " display)) " " (extra-args-prefix row) "-i \"$(cat " (sq (str prompt-file)) ")\"") "grok" (str "grok --cwd " (sq (str role-worktree)) permission-mode " " (extra-args-prefix row) "--rules \"$(cat " (sq (str prompt-file)) ")\" --verbatim \"$(cat " (sq (str prompt-file)) ")\""))) From 5a8122738a73b4986cca1ea689ebc4ca3f490a96 Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 21 Jun 2026 05:17:22 -0300 Subject: [PATCH 63/67] docs(adr): specifier gates on 
frontier intent, not formal spec (#64) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ADR 0022 — moves the human gate from full spec review to a conversational frontier brief. Everything after confirmation is autonomous. Co-authored-by: Claude Sonnet 4.6 --- docs/adr/0022-specifier-frontier-gate.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) create mode 100644 docs/adr/0022-specifier-frontier-gate.md diff --git a/docs/adr/0022-specifier-frontier-gate.md b/docs/adr/0022-specifier-frontier-gate.md new file mode 100644 index 0000000..faaca2f --- /dev/null +++ b/docs/adr/0022-specifier-frontier-gate.md @@ -0,0 +1,17 @@ +--- +status: accepted +--- + +# Specifier gates on frontier intent, not on formal spec + +Upstream gates human review on the full formal spec — Gherkin + QA suite — before handing off to the coder. The specifier writes everything, then asks for approval. + +The fork moves the gate earlier: the specifier drafts a **frontier brief** (one-sentence intent, 2–4 prose scenarios, and explicit fog exclusions) and gets confirmation before generating any formal artifact. Everything after confirmation — `grill-with-docs`, Gherkin with headers, QA suite, handoff — runs autonomously. + +**Why:** Gherkin and its header sections are a mechanical encoding of confirmed intent, not a new decision. Reviewing the encoding adds no human judgment — only friction. The frontier brief is the decision point. The formal spec is how that decision is expressed. + +**Fog of war:** the brief names only what is knowable and decidable at the frontier. Uncertain items are listed as "not in scope" — explicitly excluded, not silently omitted — and become `Does NOT:` entries in the Gherkin SCOPE header. The specifier does not plan past the fog. + +**Contradiction rule:** during autonomous generation, the specifier may add scenarios that are natural consequences of confirmed behavior. 
The only trigger to surface to the user is a contradiction or mismatch — something that makes the confirmed brief inconsistent or impossible to implement as specified. + +**Handoff** is fully automatic after brief confirmation; there is no second human gate. From 7a809d1c70b3f44e135f69bbfae0ed052127dffe Mon Sep 17 00:00:00 2001 From: gabadi Date: Sun, 21 Jun 2026 13:00:34 -0300 Subject: [PATCH 64/67] fix(swarm): apply drywall execution learnings to main branch (#66) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(swarm): include remedy in commit-abbrev validation error Agents receiving the 10-char rejection had no guidance on how to fix it, causing repeated rediscovery of git rev-parse --short=10 each session. Co-Authored-By: Claude Sonnet 4.6 * fix(fork): propagate swarm allow-rules into worktree settings.local.json Worktrees live outside the main project directory tree, so Claude Code never finds the project's .claude/settings.json. write-worktree-settings! now writes the integrator/specifier allow-rules into each worktree's settings.local.json on every launch, including restarts. 
Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): apply drywall execution learnings to main branch - workflow.prompt: act don't narrate in autonomous sessions - fork.bb: clarify swarm-persona is pre-loaded by harness, not re-invokable Co-Authored-By: Claude Sonnet 4.6 * fix(swarm): simplify agent instruction and drop autonomous sessions rule - fork.bb: drop swarm-persona invoke directive — harness already sends it via wake message; telling the agent to invoke it again causes double-load - workflow.prompt: revert autonomous sessions section — too vague and not actionable as a constitution rule Co-Authored-By: Claude Sonnet 4.6 * test(fork): add fork.bb extension test runner - test/fork_runner.bb: standalone bb script verifying all fork.bb overrides (write-agent-instruction-file!, write-worktree-settings!, write-persona-skill-file!, link-curator-skills!) exist and produce correct output - bb.edn: add fork-test task (run: bb fork-test) Co-Authored-By: Claude Sonnet 4.6 * ci: add GitHub Actions workflow with pinned action hashes Runs upstream tests (bb test) and fork extension tests (bb fork-test) on push/PR to main and six-pack. Babashka v1.12.218 pinned by version URL; actions/checkout pinned to commit hash. 
Co-Authored-By: Claude Sonnet 4.6 * ci: install zsh — scripts use #!/usr/bin/env zsh Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- .github/workflows/ci.yml | 27 +++++++++ bb.edn | 4 +- swarmforge/scripts/fork.bb | 7 ++- swarmforge/scripts/swarm_handoff.bb | 2 +- test/fork_runner.bb | 88 +++++++++++++++++++++++++++++ 5 files changed, 124 insertions(+), 4 deletions(-) create mode 100644 .github/workflows/ci.yml create mode 100644 test/fork_runner.bb diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml new file mode 100644 index 0000000..a925e70 --- /dev/null +++ b/.github/workflows/ci.yml @@ -0,0 +1,27 @@ +name: CI + +on: + push: + branches: [main, six-pack] + pull_request: + +jobs: + test: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 + + - name: Install babashka + run: | + curl -fsSLO https://github.com/babashka/babashka/releases/download/v1.12.218/babashka-1.12.218-linux-amd64-static.tar.gz + tar xzf babashka-1.12.218-linux-amd64-static.tar.gz + sudo mv bb /usr/local/bin/ + + - name: Install dependencies + run: sudo apt-get install -y tmux zsh + + - name: Run upstream tests + run: bb test + + - name: Run fork extension tests + run: bb fork-test diff --git a/bb.edn b/bb.edn index def1fe3..1b99978 100644 --- a/bb.edn +++ b/bb.edn @@ -7,4 +7,6 @@ (require 'swarmforge.script-test) (let [{:keys [fail error]} (clojure.test/run-tests 'swarmforge.handoff-test 'swarmforge.script-test)] - (System/exit (+ fail error))))}}} + (System/exit (+ fail error))))} + fork-test {:doc "Run fork extension tests (fork.bb overrides)" + :task (shell "bb" "test/fork_runner.bb")}}} diff --git a/swarmforge/scripts/fork.bb b/swarmforge/scripts/fork.bb index e90553f..ac6e3eb 100644 --- a/swarmforge/scripts/fork.bb +++ b/swarmforge/scripts/fork.bb @@ -23,6 +23,10 @@ cfg (-> cfg (assoc-in [:hooks :UserPromptSubmit] [{:hooks [{:type "command" :command (str "touch " marker-path)}]}]) 
(assoc-in [:hooks :Stop] [{:hooks [{:type "command" :command (str "rm -f " marker-path)}]}])) + swarm-allow ["Bash(gh pr merge*)" "Bash(git reset --hard origin/*)"] + existing-allow (get-in cfg [:permissions :allow] []) + cfg (assoc-in cfg [:permissions :allow] + (into existing-allow (remove (set existing-allow) swarm-allow))) cfg (if (seq advisor-model) (assoc cfg :advisorModel advisor-model) cfg)] @@ -93,8 +97,7 @@ (defn write-agent-instruction-file! [role prompt-file] (spit (str prompt-file) (str "You are the " role " in a SwarmForge multi-agent development swarm. " - "Your full role, constitution, and operating instructions are in your swarm-persona skill. " - "Invoke the swarm-persona skill at the start of every session and before responding to any handoff.\n"))) + "Your full role, constitution, and operating instructions are in your swarm-persona skill.\n"))) ;;; ADR 0006: Sparse checkout to hide QA holdout from non-QA/specifier worktrees diff --git a/swarmforge/scripts/swarm_handoff.bb b/swarmforge/scripts/swarm_handoff.bb index c8584ed..8c8fa00 100755 --- a/swarmforge/scripts/swarm_handoff.bb +++ b/swarmforge/scripts/swarm_handoff.bb @@ -220,7 +220,7 @@ (cond (str/blank? commit) [nil "Missing required header 'commit' for git_handoff."] (not (re-matches #"[0-9a-fA-F]{10}" commit)) - [nil (format "Header 'commit' must be exactly 10 hexadecimal characters; got '%s'." commit)] + [nil (format "Header 'commit' must be exactly 10 hexadecimal characters; got '%s'. Run: git rev-parse --short=10 HEAD" commit)] :else (canonical-commit commit)) [nil nil]) git-errors (cond-> [] diff --git a/test/fork_runner.bb b/test/fork_runner.bb new file mode 100644 index 0000000..fc89a25 --- /dev/null +++ b/test/fork_runner.bb @@ -0,0 +1,88 @@ +#!/usr/bin/env bb +;; Fork extension tests — verify fork.bb overrides exist and produce correct output. 
+;; Run: bb test/fork_runner.bb + +(require '[babashka.fs :as fs] + '[babashka.process :as process] + '[clojure.string :as str]) + +;; Stubs for swarmforge.bb constants used only in install-skills! (not under test here). +(def cyan "") (def green "") (def yellow "") (def reset "") +(defn sq [v] (str "'" v "'")) + +(load-file (str (fs/cwd) "/swarmforge/scripts/fork.bb")) + +(def failures (atom [])) + +(defn check [label ok?] + (if ok? + (println (str " ok " label)) + (do (println (str " FAIL " label)) + (swap! failures conj label)))) + +;;; write-agent-instruction-file! + +(let [tmp (str (fs/create-temp-file {:prefix "test-instr" :suffix ".md"}))] + (write-agent-instruction-file! "coder" tmp) + (let [content (slurp tmp)] + (check "agent-instruction: contains role identity" + (str/includes? content "You are the coder in a SwarmForge multi-agent development swarm.")) + (check "agent-instruction: points to swarm-persona skill" + (str/includes? content "swarm-persona skill")) + (check "agent-instruction: no Invoke directive (double-load guard)" + (not (str/includes? content "Invoke")))) + (fs/delete (fs/path tmp))) + +;;; write-worktree-settings! + +(let [tmp (str (fs/create-temp-dir {:prefix "test-wt-"}))] + (write-worktree-settings! tmp) + (let [content (slurp (str (fs/path tmp ".claude" "settings.local.json")))] + (check "worktree-settings: autoCompactEnabled" (str/includes? content "autoCompactEnabled")) + (check "worktree-settings: CLAUDE_AUTOCOMPACT_PCT_OVERRIDE" (str/includes? content "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE")) + (check "worktree-settings: CLAUDE_CODE_AUTO_COMPACT_WINDOW" (str/includes? content "CLAUDE_CODE_AUTO_COMPACT_WINDOW")) + (check "worktree-settings: UserPromptSubmit hook" (str/includes? content "UserPromptSubmit")) + (check "worktree-settings: Stop hook" (str/includes? content "Stop")) + (check "worktree-settings: gh pr merge allow rule" (str/includes? content "gh pr merge")) + (check "worktree-settings: git reset allow rule" (str/includes? 
content "git reset --hard origin/"))) + (fs/delete-tree (fs/path tmp))) + +;;; write-persona-skill-file! (exercises resolve-prompt-bundle transitively) + +(let [root (str (fs/create-temp-dir {:prefix "test-persona-root-"})) + wt (str (fs/create-temp-dir {:prefix "test-persona-wt-"}))] + (fs/create-dirs (fs/path root "swarmforge" "constitution" "articles")) + (spit (str (fs/path root "swarmforge" "constitution.prompt")) "# Constitution\n") + (spit (str (fs/path root "swarmforge" "constitution" "articles" "workflow.prompt")) "# Workflow\n") + (fs/create-dirs (fs/path root "swarmforge" "roles")) + (spit (str (fs/path root "swarmforge" "roles" "coder.prompt")) "# Coder\n") + (let [ctx {:working-dir (fs/path root) + :constitution-file (fs/path root "swarmforge" "constitution.prompt") + :roles-dir (fs/path root "swarmforge" "roles")} + skill-file (str (fs/path wt ".claude" "skills" "swarm-persona" "SKILL.md"))] + (write-persona-skill-file! ctx "coder" wt) + (let [content (slurp skill-file)] + (check "persona-skill: SKILL.md created" (fs/exists? (fs/path skill-file))) + (check "persona-skill: name: swarm-persona" (str/includes? content "name: swarm-persona")) + (check "persona-skill: bundles role file" (str/includes? content "swarmforge/roles/coder.prompt")) + (check "persona-skill: bundles constitution article" (str/includes? content "swarmforge/constitution")))) + (fs/delete-tree (fs/path root)) + (fs/delete-tree (fs/path wt))) + +;;; link-curator-skills! + +(let [tmp (str (fs/create-temp-dir {:prefix "test-curator-"}))] + (fs/create-dirs (fs/path tmp ".agents" "skills" "my-skill")) + (spit (str (fs/path tmp ".agents" "skills" "my-skill" "SKILL.md")) "test\n") + (link-curator-skills! tmp) + (check "link-curator: symlink created in .claude/skills/" + (fs/exists? (fs/path tmp ".claude" "skills" "my-skill"))) + (fs/delete-tree (fs/path tmp))) + +;;; Report + +(println) +(if (empty? 
@failures) + (do (println (str "All " "fork.bb extension tests passed.")) (System/exit 0)) + (do (println (str (count @failures) " failure(s): " (str/join ", " @failures))) + (System/exit 1))) From b4a48231dcd380f3feae8f544cd65587de739998 Mon Sep 17 00:00:00 2001 From: 2-gabadi Date: Sun, 21 Jun 2026 23:33:09 -0300 Subject: [PATCH 65/67] docs(plan): rewrite tool analysis as actionable plan with current status - Add status table showing done/todo per stack and tool family - Separate what-exists (crap4js v0.1.0, drywall v0.1.0, cargo-crap) from what-to-build (crap4py, mutate binary, mutate4py) - Document install commands for released tools - Move reference material (LCOV format, Uncle Bob CLIs) to appendix - Record key decisions and rationale (why port mutation, why crap4py is new not a port, why one Rust binary covers JS/TS + Rust) - Flag missing pieces: constitution wiring, repo locations, CI pattern Co-Authored-By: Claude Sonnet 4.6 --- docs/tool-analysis-crap-dry-mutation.md | 602 ++++++------------------ 1 file changed, 133 insertions(+), 469 deletions(-) diff --git a/docs/tool-analysis-crap-dry-mutation.md b/docs/tool-analysis-crap-dry-mutation.md index 0b3e2ce..48b39b7 100644 --- a/docs/tool-analysis-crap-dry-mutation.md +++ b/docs/tool-analysis-crap-dry-mutation.md @@ -1,547 +1,211 @@ -# Code Quality Tool Analysis: CRAP / DRY / Mutation +# Code Quality Tools: CRAP / DRY / Mutation -**Status:** Research complete — decision pending -**Scope:** JavaScript/TypeScript, Python, Rust source targets -**Goal:** Determine what to reuse, what to build, and in what language +**Scope:** JavaScript/TypeScript, Python, Rust source targets +**Goal:** Cover all three tool families for all three stacks, matching Uncle Bob's Go/Clj/Java tools --- -## 1. Background +## Status -The engineering constitution (`swarmforge/constitution/articles/engineering.prompt`) mandates -procuring CRAP, mutation, and DRY tools from Uncle Bob's repositories on startup. 
-Those repositories (`github.com/unclebob/{crap,dry,mutate}4{go,clj,java}`) only cover -Go, Clojure, and Java. This document defines what to do for the remaining stacks. +| Stack | CRAP | DRY | Mutation | +|-------|------|-----|----------| +| **JS/TS** | `crap4js` v0.1.0 — **done** | `drywall` — **done** | `mutate` Rust binary — **todo** | +| **Python** | `crap4py` — **todo** | `drywall` — **done** | `mutate4py` — **todo** | +| **Rust** | `cargo-crap` — **reuse** | `drywall` — **done** | `mutate4rs` Rust binary — **todo** | --- -## 2. Tool Family Definitions +## 1. What Exists Today -### 2.1 CRAP (Change Risk Anti-Pattern) +### crap4js v0.1.0 +- **Install:** `npm install --save-dev github:gabadi/crap4js#v0.1.0` +- **Source:** `github.com/gabadi/crap4js` +- Branch coverage (BRDA) and `?.` CC exclusion are both implemented +- Distributed via GitHub releases — no npm registry -Measures per-function risk as a function of complexity and test coverage. +### drywall v0.1.0 +- **Install:** `cargo install --git https://github.com/gabadi/drywall --tag v0.1.0` +- **Source:** `github.com/gabadi/drywall` +- Single Rust binary covering JS/TS (OXC), Python (tree-sitter-python), Rust (syn) +- Implements Uncle Bob's AST subtree Jaccard algorithm +- Drop-in CLI compatible with dry4go -**Formula:** `CRAP(m) = CC(m)² × (1 - cov(m)/100)³ + CC(m)` - -Where: -- `CC(m)` = cyclomatic complexity of function `m` (decision points + 1) -- `cov(m)` = percentage of the function's branches (or lines) covered by tests - -**Decision points counted:** `if`, `else if`, ternary (`?:`), `&&`, `||`, `??`, -`for`, `for…in`, `for…of`, `while`, `do…while`, `catch`, each `switch case`. 
- -**Interpretation:** -- Score 1 = perfect (CC=1, 100% coverage) -- Score ≥ 30 = conventionally "crappy" (high complexity AND low coverage) -- A function with CC=10 and 0% coverage scores 110; same function at 100% scores 10 - -### 2.2 DRY (Don't Repeat Yourself) - -Detects structurally duplicated code using AST subtree hashing and Jaccard similarity. -Catches semantic duplication (same logic, different variable names), not just copy-paste. - -**Uncle Bob's algorithm:** -1. Parse source file to AST -2. Walk every node, collecting all subtrees at every nesting depth -3. Normalize each subtree: replace all identifier names and literal values with - `_ID` and `_LIT` (operators, control-flow keywords, and structure are preserved) -4. Serialize and hash each normalized subtree -5. Build inverted index: hash → list of (file, function, subtree) tuples -6. For any two functions sharing at least one hash, compute Jaccard similarity: - `|shared hashes| / |union of hashes|` -7. Report pairs above the similarity threshold (default 0.82) - -**Qualification gates:** functions must have ≥ 4 source lines AND ≥ 20 normalized AST nodes. -Below this threshold the signal-to-noise ratio collapses. - -**Key property:** Two functions that do the same thing with completely different variable -names will match. Two functions with an identical copy-paste block surrounded by different -code will not match at the function level (only the block would match if it were extracted). - -### 2.3 Mutation Testing - -Verifies test suite quality by injecting deliberate faults into source code and checking -whether tests detect them. A mutant that survives (tests still pass) indicates a gap in -test assertions.
- -**Uncle Bob's operator set:** - -| Category | Mutations | -|----------|-----------| -| Arithmetic | `+` ↔ `-`, `*` → `/` | -| Comparison | `>` ↔ `>=`, `<` ↔ `<=` | -| Equality | `==` ↔ `!=` | -| Boolean | `true` ↔ `false` | -| Logical | `&&` ↔ `||` | -| Constant | `0` ↔ `1` (inline, in expressions) | -| Unary | remove `-a` → `a`, remove `!a` → `a` | -| Null | replace return value with `null` / `None` | - -**Key innovations in Uncle Bob's implementation:** -- **Embedded manifest:** function hashes stored in source file footer comments; - enables differential reruns — only functions changed since last run are retested -- **Coverage gating:** only lines covered by at least one test are mutated; - avoids wasting time on dead code -- **Parallel workers:** isolated copies per worker for concurrent mutation +### cargo-crap +- Reuse as-is (pre-1.0 but functional) +- Requires lcov.info from `cargo llvm-cov --lcov` --- -## 3. Uncle Bob Reference Tool Interfaces - -### 3.1 crap4go CLI - -``` -crap4go [--test-command ] [--max-workers ] [path-fragment ...] -``` - -| Flag | Default | Description | -|------|---------|-------------| -| `--test-command` | `go test ./... -coverprofile=target/coverage/coverage.out` | Coverage command; tool appends `-coverprofile` unless `{coverprofile}` placeholder present | -| `--max-workers` | half of logical CPUs | Parallel source-file analysis workers | - -Positional arguments: path fragments — only files whose path contains a fragment are included. -No arguments = all files. +## 2. What to Build -**Behavior:** deletes stale coverage → runs test command → parses coverage + AST → -computes CRAP per function. +### 2.1 crap4py — Python CRAP script (~200 LOC) -**Skips:** `_test.go`, `target/`, `vendor/`, `.git/` +New implementation for Python source. Not a port of crap4go (which analyzes Go source) — +crap4py analyzes Python source using Python's own `ast` module. 
-**stdout:** -``` -CRAP Report -=========== -Function Package CC Cov% CRAP -------------------------------------------------------------------------------------- -Widget.Run widget 12 45.0% 130.2 -simple widget 1 100.0% 1.0 -``` -Sort: descending by CRAP score (worst first). N/A last. +**Inputs:** +- Python source files (walked from a root directory) +- LCOV tracefile (from pytest-cov or coverage.py with `branch = True` in `.coveragerc`) -**Coverage file:** Go coverage profile (`target/coverage/coverage.out`) +**Output:** same column format as crap4go — Function, Module, CC, Cov%, CRAP — sorted worst first -### 3.2 dry4go CLI +**CC decision points in Python AST:** +`If`, `IfExp` (ternary), `BoolOp` (`and`/`or`), `For`, `While`, `ExceptHandler`, each `match case` -``` -dry4go [options] [file-or-directory ...] -``` +**Branch coverage:** reads `BRDA:` records from LCOV; `cov(m) = BRH_in_range / BRF_in_range × 100` -| Flag | Default | Description | -|------|---------|-------------| -| `--threshold` | `0.82` | Minimum Jaccard similarity to report | -| `--min-lines` | `4` | Minimum source lines in a candidate function | -| `--min-nodes` | `20` | Minimum normalized AST nodes | -| `--format` | `text` | `text` or `json` | -| `--json` | — | Alias for `--format json` | +### 2.2 mutate — Rust binary (JS/TS + Rust) -Directories are recursed; `.git`, `vendor`, `target` excluded. +One binary, two language targets. Ports mutate4go's algorithm to JS/TS and Rust. +OXC parser is already used in drywall — this reuses the same investment. 
-**Text output:**
 ```
-DUPLICATE score=0.89
-  internal/billing/invoice.go:12-25
-  internal/billing/receipt.go:30-44
+mutate --lang [flags] path/to/file
 ```
-**JSON output:**
-```json
-{
-  "candidates": [
-    {
-      "score": 0.8909090909090909,
-      "left": {"file": "internal/billing/invoice.go", "start_line": 12, "end_line": 25},
-      "right": {"file": "internal/billing/receipt.go", "start_line": 30, "end_line": 44},
-      "left_nodes": 88,
-      "right_nodes": 91
-    }
-  ]
-}
-```
-
-### 3.3 mutate4go CLI
-
-```
-mutate4go [flags] path/to/file.go
-```
-
-One source file per invocation.
-
 | Flag | Default | Description |
 |------|---------|-------------|
-| `--scan` | false | Count mutation sites vs. manifest only; no tests |
-| `--update-manifest` | false | Rewrite footer manifest without running mutations |
-| `--lines ` | — | Restrict to specific line numbers |
-| `--since-last-run` | false | Differential: test only changed functions |
-| `--mutate-all` | false | Force full mutation ignoring manifest |
+| `--test-command` | (required) | Test command to run |
+| `--since-last-run` | false | Differential: skip functions whose hash matches manifest |
+| `--mutate-all` | false | Force full run, ignore manifest |
 | `--reuse-coverage` | false | Skip coverage regeneration |
-| `--mutation-warning` | 50 | Warn if mutation count exceeds threshold |
-| `--timeout-factor` | — | Multiplier for per-mutation test timeout |
-| `--test-command` | `go test ./...` | Override test command |
+| `--lcov` | — | Path to pre-generated LCOV file |
 | `--max-workers` | 1 | Parallel mutation workers |
+| `--scan` | false | Count mutation sites only, no tests |
 | `--verbose` | false | Log actions to stderr |
-**Behavior:**
-1. Generate coverage → run baseline tests (establish timeout)
-2. For each covered mutation site: apply → run tests → restore → record
-3. Default to differential mode if footer manifest exists
-4. Write updated manifest to end of source file
+### 2.3 mutate4py — Python mutation script

-**Manifest format:** embedded in source file footer comments; contains last-test-date,
-per-function ID, line span, normalized-source hash.
-
----
-
-## 4. Coverage Data Interface — LCOV Format
-
-All CRAP implementations (existing and proposed) consume LCOV tracefile format.
-This is the universal coverage interchange format produced by Jest, Vitest, c8, nyc,
-Istanbul, pytest-cov, coverage.py, cargo-llvm-cov, and cargo-tarpaulin.
-
-### 4.1 Record Types
-
-| Record | Syntax | Meaning |
-|--------|--------|---------|
-| `TN` | `TN:` | Test name (optional; may be empty) |
-| `SF` | `SF:` | Opens a source file section |
-| `FN` | `FN:[,],` | Function declaration (end_line optional) |
-| `FNDA` | `FNDA:,` | Function execution count |
-| `FNF` | `FNF:` | Total functions found |
-| `FNH` | `FNH:` | Functions with count > 0 |
-| `BRDA` | `BRDA:,[e],,` | Branch edge data |
-| `BRF` | `BRF:` | Total branch records |
-| `BRH` | `BRH:` | Branch records with taken > 0 |
-| `DA` | `DA:,[,]` | Line execution count |
-| `LH` | `LH:` | Lines with count > 0 |
-| `LF` | `LF:` | Total lines found |
-| `end_of_record` | `end_of_record` | Closes a source file section |
-
-### 4.2 BRDA Field Detail
+Same algorithm as the Rust binary, for Python source.
+Uses Python's `ast` module. Reads LCOV from pytest-cov / coverage.py.
 ```
-BRDA:,[e],,
+mutate4py [flags] path/to/file.py
 ```
-- `line_number`: 1-based line of the branching statement
-- `e` prefix (LCOV 2.x): exception-handling branch
-- `block`: integer from 0; groups branches of the same conditional
-- `branch`: edge index (0 = false/left path, 1 = true/right path for a simple `if`)
-- `taken`: `-` (never evaluated / dead code) OR integer execution count
-**Critical distinction:** `taken=0` means the branch was evaluated but never taken.
-`taken=-` means the branch was never reached at all.
+Same flags as `mutate`.
-### 4.3 Line vs Branch Coverage in CRAP
+---

-The formula requires `cov(m)` — "how much of function m is tested." There are two interpretations:
+## 3. Key Decisions

-**Line coverage (DA records):**
-- A line with `count > 0` is "covered" even if only one side of its branch was tested
-- Can overstate coverage significantly on functions with compound conditionals
-- This is what **crap4js currently uses** (reads only `DA:` records)
+### Why port mutation instead of reusing StrykerJS / mutmut / cargo-mutants

-**Branch coverage (BRDA records):**
-- Counts each outgoing edge of each conditional independently
-- `cov(m) = BRH_in_function / BRF_in_function × 100`
-- More accurate to Uncle Bob's intent (he measures decision coverage)
-- This is what the **proposed Python implementation should use**
+All three existing tools lack the property that makes mutate4go useful in practice:
+an embedded-in-source manifest.

-### 4.4 Minimal Valid Example
+| Property | mutate4go | StrykerJS | mutmut | cargo-mutants |
+|----------|-----------|-----------|--------|---------------|
+| Manifest stored in source file | Yes | No | No | No |
+| Survives repo clone | Yes | No | No | No |
+| Team-shared automatically | Yes | No | No | No |
+| Zero CI setup for incremental | Yes | No | No | No |
+| Incremental granularity | per-function hash | per-mutant position | per-function hash | line (external diff) |

-```
-TN:example
-SF:/project/src/foo.py
-FN:5,25,compute_score
-FNDA:3,compute_score
-FNF:1
-FNH:1
-BRDA:8,0,0,2
-BRDA:8,0,1,1
-BRDA:15,0,0,-
-BRDA:15,0,1,3
-BRF:4
-BRH:3
-DA:5,1
-DA:6,3
-DA:8,3
-DA:9,2
-DA:15,3
-LH:5
-LF:5
-end_of_record
-```
+The manifest is embedded as comments in the source file footer, committed with the code.
+Any developer who pulls the repo gets differential reruns automatically. StrykerJS's
+incremental JSON and mutmut's SQLite cache are external artifacts that each require
+explicit CI cache configuration and provide nothing to developers working locally.
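
A minimal sketch of how a tool might read such an embedded manifest and decide which functions need a rerun. It assumes the `fn:NAME hash=HEX lines=A-B tested=DATE` comment layout used in this document's manifest examples. The names `ManifestEntry`, `parse_manifest` and `changed_functions` are hypothetical and are not taken from mutate4go.

```python
import re
from dataclasses import dataclass

# Assumed entry layout: "# fn:NAME hash=HEX lines=A-B tested=DATE".
# "//" is also accepted as the comment prefix (JS/TS or Rust sources).
_ENTRY = re.compile(
    r"^(?:#|//)\s*fn:(?P<name>\S+)\s+hash=(?P<digest>[0-9a-f]+)"
    r"\s+lines=(?P<start>\d+)-(?P<end>\d+)\s+tested=(?P<tested>\S+)$"
)


@dataclass(frozen=True)
class ManifestEntry:
    name: str
    digest: str
    start: int
    end: int
    tested: str


def parse_manifest(source: str) -> dict[str, ManifestEntry]:
    """Collect manifest entries from the comment lines of a source file."""
    entries = {}
    for line in source.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            entries[match["name"]] = ManifestEntry(
                name=match["name"],
                digest=match["digest"],
                start=int(match["start"]),
                end=int(match["end"]),
                tested=match["tested"],
            )
    return entries


def changed_functions(current: dict[str, str],
                      manifest: dict[str, ManifestEntry]) -> list[str]:
    """Names whose current hash is absent from, or differs from, the manifest."""
    return sorted(
        name for name, digest in current.items()
        if name not in manifest or manifest[name].digest != digest
    )
```

Only the functions returned by `changed_functions` would be mutated in a differential run; everything else is skipped because its recorded hash still matches.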
----
+### Why crap4py is not a port of crap4go

-## 5. Decision Matrix by Stack
-
-### 5.1 Mutation Testing
-
-| Stack | Decision | Tool | Rationale |
-|-------|----------|------|-----------|
-| JS/TS | **Reuse** | StrykerJS | Mature, AST-based, coverage-gated per-test, mutation switching (single suite run) |
-| Python | **Reuse** | mutmut | Mature, widely adopted, `--use-coverage` flag for coverage gating |
-| Rust | **Reuse** | cargo-mutants | Active, `--in-diff` for PR-scoped runs, most operator categories covered |
-
-**Rationale shared across all stacks:** The bottleneck is always test execution time.
-A Rust reimplementation of the harness cannot make tests run faster — it would only
-be orchestrating the same subprocesses. The existing tools are the right level of abstraction.
-
-**Known gap — cargo-mutants vs Uncle Bob spec:**
-- Inline constant swapping (`0↔1` inside expressions) is not implemented;
-  cargo-mutants replaces entire function bodies with type defaults instead
-- No coverage gating (all code is mutated regardless of coverage)
-- Differential runs work via `--in-diff` / `--git-base` (different mechanism, same outcome)
-
-### 5.2 CRAP
-
-| Stack | Decision | Tool | Rationale |
-|-------|----------|------|-----------|
-| JS/TS | **Reuse + patch** | crap4js | Correct formula, lean (1,732 LOC), two known gaps (see below) |
-| Python | **Write** | ~200-line Python script | Nothing exists; Python's own `ast` module is the right parser |
-| Rust | **Reuse** | cargo-crap | Correct formula, reads lcov.info from llvm-cov/tarpaulin; pre-1.0 but functional |
-
-**Known gaps in crap4js:**
-
-1. **Line coverage instead of branch coverage** — reads `DA:` records only;
-   ignores `BRDA:` records. A function with `if (a && b)` where only the
-   `false` short-circuit path is tested shows as "covered". This understates CRAP
-   on functions with compound conditionals. Fix: extend `coverage.ts` to parse
-   `BRDA:` records and compute `BRH/BRF` per function line range.
-
-2. **Optional chaining `?.` inflates CC** — every `user?.profile?.name` adds +1 to
-   cyclomatic complexity per `?.`. Uncle Bob's spec (written for Go/Java/Clojure) has
-   no such operator. Modern TypeScript code using `?.` for defensive access gets
-   artificially high CC scores on simple property accessor chains.
-   Fix: remove `MemberExpression`/`CallExpression` with `optional=true` from
-   `DECISION_PREDICATES` in `complexity.ts:594-601`.
-
-**Known gap in cargo-crap:**
-- Pre-1.0 (v0.2.x); no historical trending, no per-PR delta, no IDE integration
-- Requires `lcov.info` as intermediate; no native `llvm-cov` JSON support
-
-### 5.3 DRY
-
-| Stack | Decision | Tool | Rationale |
-|-------|----------|------|-----------|
-| JS/TS | **Write new** | Rust binary (see §6) | No AST-subtree-Jaccard tool exists; jscpd v5 is token-sequence only |
-| Python | **Write new** | Rust binary (see §6) | Nothing exists |
-| Rust | **Write new** | Rust binary (see §6) | cargo-dupes is function-level only, 4 stars, pre-production |
-
-**Why not jscpd v5 for JS/TS?**
-jscpd v5 (Rust, 5.1k stars, 150+ languages) is production-ready and catches copy-paste
-duplication effectively. It uses tokenization + Rabin-Karp rolling hash over token
-sequences. However it detects a different class of duplicates than Uncle Bob's algorithm:
-
-| Scenario | jscpd v5 | Uncle Bob DRY |
-|----------|----------|---------------|
-| Copy-paste block (identical variable names) | Detects | Detects |
-| Same logic, different variable names | Misses | **Detects** |
-| Partial block match above threshold | Detects | May miss (function-level Jaccard) |
-
-For a monorepo with multiple services, the Uncle Bob variant catches cases where
-two teams independently implemented the same logic with different names — jscpd would not.
-These are complementary tools; the decision to use Uncle Bob's algorithm is a deliberate
-choice, not a gap in jscpd.
+crap4go analyzes **Go** source. crap4py analyzes **Python** source. They implement the
+same CRAP formula but are completely separate tools using their language's native AST.
+There is no Python port of crap4go to reuse — it does not exist.

----
+### Why one mutate binary covers JS/TS and Rust

-## 6. Proposed New Tool: DRY Rust Binary
+OXC (JS/TS parser) and `syn` (Rust parser) are both Rust crates. Building them in one
+binary reuses the OXC investment already made for drywall and avoids distributing two
+separate binaries.

-### 6.1 Scope
+---

-A single Rust binary implementing Uncle Bob's DRY algorithm for JS/TS, Python, and Rust
-source targets. One binary, one CLI, one output format.
+## 4. Mutation Algorithm

-### 6.2 AST Parsers
+Same for all three ports. Matches mutate4go's implementation.

-| Target | Parser | Rationale |
-|--------|--------|-----------|
-| JS/TypeScript | OXC (`oxc_parser` crate) | 21.6k stars, 3-5x faster than alternatives, spec-compliant, semantic enrichment (scope/binding) |
-| Python | `tree-sitter-python` | Standard; Python's own `ast` module has better fidelity but requires CPython subprocess |
-| Rust | `syn` crate | The canonical Rust AST parser; better semantic fidelity than tree-sitter-rust for Rust targets |
+**Per run:**
+1. Parse source file → walk functions → normalize each (identifiers → `_ID`, literals → `_LIT`) → hash (FNV-1a)
+2. Read embedded manifest from file footer; skip functions whose hash matches
+3. For changed functions: read LCOV → for each covered mutation site → apply operator → run test command → restore
+4. Write updated manifest to file footer

-### 6.3 Algorithm (per target language)
+**Operator set** (Uncle Bob's spec):

-1. Walk all source files; for each function/method/closure:
-   a. Parse to language-native AST
-   b. Collect all subtrees at every nesting depth
-   c. Normalize each subtree: replace identifiers → `_ID`, literals → `_LIT`
-   d. Hash each normalized subtree (FNV-1a or xxHash — fast, collision-resistant)
-   e. Record `(file, fn_name, start_line, end_line, set)` per function
-2. Build inverted index: `hash → Vec`
-3. Generate candidate pairs: any two functions sharing ≥ 1 hash
-4. For each candidate pair: compute Jaccard `|A ∩ B| / |A ∪ B|`
-5. Report pairs with Jaccard ≥ threshold (default 0.82), where both functions
-   pass the qualification gates (≥ 4 lines, ≥ 20 AST nodes)
+| Category | Mutations |
+|----------|-----------|
+| Arithmetic | `+` ↔ `-`, `*` → `/` |
+| Comparison | `>` ↔ `>=`, `<` ↔ `<=` |
+| Equality | `==` ↔ `!=` |
+| Boolean | `true` ↔ `false` |
+| Logical | `&&` ↔ `||` |
+| Constant | `0` ↔ `1` (inline, in expressions) |
+| Unary | remove `-a` → `a`, remove `!a` → `a` |
+| Null | replace return value with `null` / `None` |

-### 6.4 Proposed CLI
+**Manifest format** (identical across all three, language-native comments):
+```python
+# mutate4py-manifest: version=1
+# fn:compute_score hash=a3f9c1d2 lines=5-25 tested=2026-06-21
+# fn:validate_input hash=b7e2a4f1 lines=30-48 tested=2026-06-21
 ```
-dry [options] [path ...]
-```
-
-| Flag | Default | Description |
-|------|---------|-------------|
-| `--threshold` | `0.82` | Minimum Jaccard similarity to report |
-| `--min-lines` | `4` | Minimum source lines to qualify |
-| `--min-nodes` | `20` | Minimum normalized AST nodes to qualify |
-| `--lang` | auto-detect | `js`, `ts`, `py`, `rs` — force language |
-| `--format` | `text` | `text` or `json` |
-| `--exclude` | — | Glob patterns to exclude |
-
-Output format mirrors dry4go exactly (text and JSON) for drop-in compatibility.
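
Step 1 of the per-run algorithm (normalize each function, then hash it) can be sketched in Python with the standard `ast` module. This is an illustration under assumptions, not mutate4go's implementation: the 32-bit FNV-1a variant is assumed because the manifest examples show 8 hex digits, `ast.dump` is used here as the serialization of the normalized tree, and `function_hashes` is a hypothetical helper that keys results by bare function name.

```python
import ast
import copy

FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV-1a constants
FNV_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    digest = FNV_OFFSET_BASIS
    for byte in data:
        digest ^= byte
        digest = (digest * FNV_PRIME) & 0xFFFFFFFF
    return digest


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers with _ID and literals with _LIT."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_ID", ctx=node.ctx), node)

    def visit_arg(self, node):
        self.generic_visit(node)  # also normalize names inside annotations
        node.arg = "_ID"
        return node

    def visit_Constant(self, node):
        return ast.copy_location(ast.Constant(value="_LIT"), node)


def function_hashes(source: str) -> dict[str, str]:
    """Map each function name to the hash of its normalized AST dump."""
    hashes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            normalized = _Normalizer().visit(copy.deepcopy(node))
            normalized.name = "_ID"  # the function's own name is an identifier too
            dumped = ast.dump(normalized).encode("utf-8")
            hashes[node.name] = format(fnv1a_32(dumped), "08x")
    return hashes
```

With this normalization, renaming a variable or changing a literal leaves the hash untouched, while changing an operator (for example `+` to `-`) produces a different hash and would mark the function as changed.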
-
-### 6.5 Distribution
-
-- Static binary (no runtime dependency) via `cargo build --release`
-- Can be compiled to WASM via `wasm-pack` for npm distribution if needed
-- Single binary covers all three target languages
-
-### 6.6 Scaling (validated)
-At the worst-case Python monorepo service (scoring: 491 files, ~684 functions, ~350
-qualifying after gates): ~61,000 pairs, ~6 million hash comparisons — runs in under
-1 second in Python; Rust will be significantly faster.
-For JS monorepos with ~9,430 production files, the qualification gates reduce the
-candidate set substantially; the inverted index further eliminates non-pairs cheaply.
-No MinHash, LSH, or approximation needed at these scales.
+```typescript
+// mutate4js-manifest: version=1
+// fn:computeScore hash=a3f9c1d2 lines=5-25 tested=2026-06-21
+```
 ---
-## 7. Existing Tool Interfaces (for integration)
-
-### 7.1 StrykerJS (JS/TS Mutation)
+## 5. Open Questions

-**Config file:** `stryker.config.mjs`
-
-```js
-export default {
-  testRunner: 'jest', // 'jest' | 'karma' | 'mocha' | 'command'
-  coverageAnalysis: 'perTest', // 'off' | 'all' | 'perTest'
-  thresholds: {
-    high: 80, // green above this
-    low: 60, // yellow between low and high; red below low
-    break: null, // exit 1 if score falls below; null = never fail
-  },
-  mutate: ['src/**/*.ts'],
-  excludedMutations: [], // mutator names or category names
-};
-```
+- Whether dry4go's 0.82 Jaccard threshold needs calibration for JS/Python codebases
+- Whether `syn` or `tree-sitter-rust` is the better choice for mutate4rs normalization
+  (`syn` has higher semantic fidelity; `tree-sitter-rust` is consistent with the other parsers in drywall)
+- Whether cargo-crap's pre-1.0 status is a blocker or acceptable for internal use

-`coverageAnalysis: 'perTest'` is the correct setting — it gates each mutant to only
-the tests that cover its line, equivalent to Uncle Bob's coverage-gated approach.
+---

-**Key mutator categories:** ArithmeticOperator, EqualityOperator, LogicalOperator,
-BooleanLiteral, BlockStatement, ConditionalExpression, UnaryOperator, UpdateOperator,
-OptionalChaining, StringLiteral, ArrayDeclaration.
+## Appendix A — LCOV Format Reference

-### 7.2 mutmut (Python Mutation)
+LCOV tracefile format is produced by Jest, Vitest, c8, nyc, pytest-cov, coverage.py,
+cargo-llvm-cov, and cargo-tarpaulin.

-**Run:**
-```
-mutmut run [--use-coverage] [--paths-to-mutate src/] [--disable-mutation-types ]
-```
+| Record | Syntax | Meaning |
+|--------|--------|---------|
+| `SF` | `SF:` | Opens a source file section |
+| `FN` | `FN:[,],` | Function declaration |
+| `FNDA` | `FNDA:,` | Function execution count |
+| `BRDA` | `BRDA:,,,` | Branch edge (`taken='-'` means unreachable) |
+| `BRF` | `BRF:` | Total branch records |
+| `BRH` | `BRH:` | Branch records with taken > 0 |
+| `DA` | `DA:,` | Line execution count |
+| `end_of_record` | `end_of_record` | Closes a source file section |

-**Mutation types available:** `operator`, `keyword`, `number`, `name`, `string`,
-`fstring`, `argument`, `or_test`, `and_test`, `lambdef`, `expr_stmt`, `decorator`,
-`annassign`
+CRAP uses branch coverage: `cov(m) = BRH_in_function / BRF_in_function × 100`.
+Dead branches (`taken='-'`) are excluded from both numerator and denominator.

-**Cache:** `.mutmut-cache` (SQLite, project root) — delete to reset
+**coverage.py requirement:** add `branch = True` to `.coveragerc` to emit BRDA records.
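
The branch-coverage rule above can be sketched as a small Python function. It counts `BRDA` records whose line falls inside a function's line range and skips dead branches (`taken='-'`). The name `branch_coverage` is hypothetical, and returning `None` for a function with no live branches is an assumption; the text does not say how such functions are scored.

```python
from typing import Optional


def branch_coverage(lcov_text: str, source_file: str,
                    start: int, end: int) -> Optional[float]:
    """Percent of live branches taken within [start, end] of source_file."""
    found = hit = 0
    in_file = False
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            in_file = line[3:] == source_file
        elif line == "end_of_record":
            in_file = False
        elif in_file and line.startswith("BRDA:"):
            fields = line[5:].split(",")
            # First field is the line number, last field is the taken count.
            line_no, taken = int(fields[0]), fields[-1]
            if not (start <= line_no <= end) or taken == "-":
                continue  # outside the function, or a dead branch
            found += 1
            if int(taken) > 0:
                hit += 1
    # Assumption: no live branches means coverage is undefined, not 0 or 100.
    return 100.0 * hit / found if found else None
```

A caller would pass the function's line span from the AST walk (for example 5 to 25) and feed the result into the CRAP calculation.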
-**Coverage gating:** `--use-coverage` flag reads coverage.py data; restricts mutations
-to covered lines (equivalent to Uncle Bob's coverage gating)
+---

-**Differential runs:** `--use-patch-file ` — mutate only lines in the patch
+## Appendix B — Uncle Bob Reference CLIs

-**Results:**
+### crap4go
 ```
-mutmut results # print all results
-mutmut result-ids survived # list surviving mutant IDs
-mutmut show # show diff for a specific mutant
+crap4go [--test-command ] [--max-workers ] [path-fragment ...]
 ```
+Deletes stale coverage → runs test command → parses LCOV + AST → prints CRAP per function, sorted worst first.

-**Exit codes:** 0 = all killed, 1 = survivors exist or fatal error
-
-### 7.3 cargo-mutants (Rust Mutation)
-
-**Key flags:**
+### dry4go
 ```
-cargo mutants [--in-diff ] [--git-base ] [--jobs ]
+dry4go [--threshold 0.82] [--min-lines 4] [--min-nodes 20] [--format text|json] [path ...]
 ```
-- `--in-diff `: mutate only lines present in a unified diff
-- `--git-base `: auto-generate diff from `git diff ..HEAD`
-- `--jobs `: parallel workers
-
-**Operator gaps vs Uncle Bob:** inline constant swapping (`0↔1`) not supported;
-no coverage gating.
-
-### 7.4 cargo-crap (Rust CRAP)
-
-**Workflow:**
+### mutate4go
 ```
-cargo llvm-cov --lcov --output-path lcov.info
-cargo crap --lcov lcov.info
+mutate4go [--since-last-run] [--mutate-all] [--scan] [--test-command ] [--max-workers 1] path/to/file.go
 ```
-
-Also accepts tarpaulin output. Does not natively read `llvm-cov` JSON.
-
----
-
-## 8. Open Questions
-
-1. **Should crap4js be patched or left as-is?**
-   The two gaps (line-vs-branch coverage, `?.` CC inflation) affect score accuracy.
-   Patching is low-effort (~50 LOC changes). Decision: patch or accept the deviation?
-
-2. **Should jscpd v5 be used alongside the proposed DRY binary?**
-   They catch different things (copy-paste vs semantic duplication). Running both adds
-   signal but also adds tooling surface area. Decision: one or both?
-
-3. **Should the Python CRAP script use branch or line coverage?**
-   Branch coverage is more accurate to Uncle Bob's intent but requires lcov to emit
-   `BRDA:` records — pytest-cov does this by default; coverage.py does too with
-   `branch = True` in `.coveragerc`. This is a one-time config addition, low risk.
-
-4. **What is the output format / threshold for the DRY binary?**
-   dry4go's threshold of 0.82 was tuned for Go. JS and Python have different idiom
-   frequencies — the threshold may need calibration on real codebases.
-
-5. **Should cargo-mutants' missing inline constant swapping be compensated?**
-   It can be supplemented by a custom mutant configuration. Is the gap material?
-
-6. **Where do the tools live?**
-   - Fork and patch crap4js in the addi org? Or submit upstream?
-   - Is the DRY Rust binary a new repo in the org? Which team owns it?
-
-7. **CI integration strategy:**
-   - Run on every commit? Every PR? Only on changed files?
-   - What are the thresholds that block merge vs warn only?
-   - Who owns the baseline — per-repo or org-wide?
-
----
-
-## 9. Knowns
-
-- crap4js exists at `/Users/gabadi/workspace/addi/crap4js` — audited, correct formula,
-  production-quality, two patchable gaps
-- Uncle Bob's DRY algorithm is O(n²) over qualifying fragments; at all measured repo
-  scales (worst case: ~350 fragments), brute-force Jaccard runs in under 1 second
-- LCOV is the universal coverage interchange format — all major test runners for
-  JS/TS, Python, and Rust produce it
-- Mutation testing's bottleneck is always test execution time; the harness language
-  is irrelevant to performance
-- Rust's OXC parser (21.6k stars, v1.70.0) is production-grade for JS/TS AST work
-  and WASM-distributable
-- Go tree-sitter bindings require CGO, breaking the zero-dependency binary story
-
-## 10. Unknowns
-
-- Whether the `?.` CC inflation in crap4js is material in practice on the JS monorepo
-  (depends on how heavily optional chaining is used)
-- Whether dry4go's 0.82 Jaccard threshold is appropriate for JS/Python codebases
-- Whether `syn` (Rust AST) or `tree-sitter-rust` is the better choice for DRY
-  analysis of Rust source (syn has higher fidelity but is Rust-only; tree-sitter-rust
-  is consistent with the other language parsers)
-- Actual build time of the proposed DRY Rust binary and whether WASM compilation
-  is needed or if a native binary suffices for all CI environments
-- Whether cargo-crap's pre-1.0 status is a blocker or acceptable for internal use
+Embedded manifest in source file footer. Differential by default when manifest exists.

From 9374b7acc28a41238efadca6ed5d9256fa29a466 Mon Sep 17 00:00:00 2001
From: gabadi
Date: Mon, 22 Jun 2026 23:40:35 -0300
Subject: [PATCH 66/67] fix(handoffd): notify only on new inbox write, not on duplicate delivery (#69)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix(handoff): remove merge_and_process.sh — upstream uses LLM directive

Upstream's git_handoff body calls `merge_and_process ` as a natural language directive to the agent, not a shell command. No script exists upstream. We added merge_and_process.sh by mistake.

Co-Authored-By: Claude Sonnet 4.6

* fix(handoffd): notify only on new inbox write, not on duplicate delivery

Move notify! inside the when-not (fs/exists? target) guard so it fires only when a new inbox file is actually written. Previously notify! could fire on daemon restart if a handoff was already in the inbox (write skipped, notification still sent), causing a spurious /clear + /swarm-persona sequence to interrupt an agent mid-task.
Co-Authored-By: Claude Sonnet 4.6

---------

Co-authored-by: Claude Sonnet 4.6
---
 swarmforge/scripts/handoffd.bb | 6 +++---
 swarmforge/scripts/merge_and_process.sh | 20 --------------------
 2 files changed, 3 insertions(+), 23 deletions(-)
 delete mode 100755 swarmforge/scripts/merge_and_process.sh

diff --git a/swarmforge/scripts/handoffd.bb b/swarmforge/scripts/handoffd.bb
index c18b2de..438c799 100755
--- a/swarmforge/scripts/handoffd.bb
+++ b/swarmforge/scripts/handoffd.bb
@@ -140,9 +140,9 @@
       delivered (add-delivery-headers message recipient)]
   (fs/create-dirs (fs/parent target))
   (when-not (fs/exists? target)
-    (spit (str target) (render-message (:headers delivered) (:body delivered))))
-  (when-not (fs/exists? (fs/path (:worktree-path role-info) ".swarmforge" "agent-running"))
-    (notify! socket (:session role-info))))))
+    (spit (str target) (render-message (:headers delivered) (:body delivered)))
+    (when-not (fs/exists? (fs/path (:worktree-path role-info) ".swarmforge" "agent-running"))
+      (notify! socket (:session role-info)))))))
 (move-with-collision path
   (fs/path (get-in roles [sender-role :worktree-path])
     ".swarmforge" "handoffs" "sent"))
diff --git a/swarmforge/scripts/merge_and_process.sh b/swarmforge/scripts/merge_and_process.sh
deleted file mode 100755
index 4da7043..0000000
--- a/swarmforge/scripts/merge_and_process.sh
+++ /dev/null
@@ -1,20 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-# Merge sender's registered branch into the current worktree.
-# Called by agents when a git_handoff payload arrives.
-# Usage: merge_and_process
-# Never uses --theirs or --ours; on conflict: stops and reports.
-
-SENDER_ROLE="${1?Usage: merge_and_process }"
-CANONICAL_COMMIT="${2?Usage: merge_and_process }"
-
-echo "merge_and_process: merging ${CANONICAL_COMMIT} (from ${SENDER_ROLE})..."
-if ! git merge --no-ff "${CANONICAL_COMMIT}"; then
-  echo "" >&2
-  echo "CONFLICT: merge of ${CANONICAL_COMMIT} into $(git symbolic-ref --short HEAD) failed." >&2
-  echo "Resolve conflicts manually (do NOT use --theirs or --ours), then re-run ready_for_next.sh." >&2
-  exit 1
-fi
-
-echo "merge_and_process: done — $(git rev-parse --short=10 HEAD)"

From b34cc4f647d4aef186be8c478177aed4a7f749b5 Mon Sep 17 00:00:00 2001
From: 2-gabadi
Date: Tue, 23 Jun 2026 04:11:28 -0300
Subject: [PATCH 67/67] feat(setup-swarm): broaden git reset allow-rule to all forms

The integrator/specifier auto-mode classifier blocks `git reset --hard` when the target is not a remote ref (e.g. HEAD or a sha). Broaden the pre-authorized allow-rule from `git reset --hard origin/*` to `git reset --hard*` so unattended in-role resets never prompt.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 swarmforge/skills/setup-swarm/SKILL.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/swarmforge/skills/setup-swarm/SKILL.md b/swarmforge/skills/setup-swarm/SKILL.md
index b0f2238..4c1b460 100644
--- a/swarmforge/skills/setup-swarm/SKILL.md
+++ b/swarmforge/skills/setup-swarm/SKILL.md
@@ -70,7 +70,7 @@ Write minimal allow-rules to `.claude/settings.json` so the integrator and speci
   "permissions": {
     "allow": [
       "Bash(gh pr merge*)",
-      "Bash(git reset --hard origin/*)"
+      "Bash(git reset --hard*)"
     ]
   }
 }
@@ -82,7 +82,7 @@ import json, pathlib
 p = pathlib.Path('.claude/settings.json')
 cfg = json.loads(p.read_text()) if p.exists() else {}
 cfg.setdefault('permissions', {}).setdefault('allow', [])
-for rule in ['Bash(gh pr merge*)', 'Bash(git reset --hard origin/*)']:
+for rule in ['Bash(gh pr merge*)', 'Bash(git reset --hard*)']:
     if rule not in cfg['permissions']['allow']:
         cfg['permissions']['allow'].append(rule)
 p.parent.mkdir(exist_ok=True)