A workflow seed for guiding AI agents through long-running, layered engineering work. Each iteration is one git commit that lands code and state together, so a fresh session can cold-start from the repo alone and pick up exactly where the last loop ended.
This repo is the seed — two instruction files that an agent reads to bootstrap and then operate a workstream. It is not itself a project; copy or reference these files when starting a new one.
Apply when work is:
- Layered + iterative — a primitive refined against a spec, a runtime growing op-by-op, a framework with settling contracts.
- Checkpointed across sessions — context resets between runs; the next session must reconstruct state from git + tracked files.
- Quality-gated, not vibed — every closed task names a verified behavior (Definition of Done with explicit Claims).
Do not apply to throwaway scripts, reactive bug queues without a roadmap, or research prototypes you intend to discard.
refinement-loop/
├── README.md            # this file
└── instructions/
    ├── INIT.md          # one-shot bootstrap walk-through
    └── SEED.md          # the loop protocol reference
INIT.md is a seven-question script the agent runs once when adopting the protocol on a new (or existing) workstream. The questions cover:
| # | Topic | What it decides |
|---|---|---|
| Q0 | Git state precheck | Reuse existing repo, or git init -b main |
| Q1 | Spec maturity | Starting campaign: idea → C0, partial → C1, locked → C2 |
| Q2 | Communication channels | Whether to create INBOX.md + carved-out cross-tree paths |
| Q3 | Agent file name | CLAUDE.md / AGENTS.md / .cursorrules / custom |
| Q4 | Tooling hooks | Build / test / validate / single-file-run commands |
| Q5 | Hierarchy depth | Small (consolidate SPEC into PLAN) vs full-depth (separate) |
| Q6 | First campaign + phase | Seeds the campaign table and first phase header |
After answers are confirmed, INIT emits the artifact spine
(SPEC.md, PLAN.md, TASKS.md, <agent>.md, .gitignore,
archive/), makes the seed commit, verifies the spine reads back
correctly, and deletes itself. Tasks are not populated during
init — that is the first real loop's job.
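As a rough illustration of the spine described above, the sketch below creates the listed files in a target directory. It is only a sketch under stated assumptions: the function name `emit_spine`, the placeholder file contents, and the default agent file name are hypothetical, and INIT.md itself defines what the real files contain.

```python
from pathlib import Path


def emit_spine(root: Path, agent_file: str = "AGENTS.md") -> list[Path]:
    """Create the artifact spine under `root` and return the created paths.

    The file names follow the spine listed above; every file body here is
    an invented placeholder, not the content INIT.md would write.
    """
    files = {
        "SPEC.md": "# SPEC\n",
        "PLAN.md": "# PLAN\n",
        "TASKS.md": "# TASKS\n",  # no tasks yet: the first real loop adds them
        agent_file: "# Loop scratchpad\n",
        ".gitignore": f"{agent_file}\n",  # assumption: the scratchpad is ignored
    }
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name, body in files.items():
        path = root / name
        path.write_text(body, encoding="utf-8")
        created.append(path)
    archive = root / "archive"
    archive.mkdir(exist_ok=True)
    created.append(archive)
    return created
```

Calling `emit_spine(Path("my-workstream"))` would leave five files and an empty `archive/` directory behind; the seed commit and read-back verification are separate steps not shown here.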
SEED.md is the durable rulebook the agent re-reads as needed. It defines:
The hierarchy
Spec / north-star (locked, partial, or absent)
└─ Campaign (a verifiable layer of completeness)
   └─ Phase (themed sequential work with gates)
      └─ Task DAG (per phase)
         └─ DoD with Claims (per task — what is verified)
            └─ Loop (one commit)
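One way to picture the Task DAG level: each task names the tasks it depends on, and a task becomes claimable once all of its dependencies are closed. The sketch below is an assumption about how such a DAG might be modeled; the `Task` fields and the `claimable` helper are hypothetical names, not something SEED.md prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    deps: list[str] = field(default_factory=list)  # ids of prerequisite tasks
    done: bool = False


def claimable(tasks: list[Task]) -> list[str]:
    """Return ids of open tasks whose dependencies are all closed."""
    closed = {t.id for t in tasks if t.done}
    return [
        t.id
        for t in tasks
        if not t.done and all(dep in closed for dep in t.deps)
    ]


phase = [
    Task("T1", done=True),
    Task("T2", deps=["T1"]),
    Task("T3", deps=["T2"]),
    Task("T4"),
]
print(claimable(phase))  # → ['T2', 'T4']
```

`T3` stays unclaimable until `T2` closes, which is the property that lets a fresh session pick the next leaf from tracked state alone.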
Tracked artifacts
| File | Owns | Update cadence |
|---|---|---|
| SPEC.md | North-star outcome + spec source pointer | Rare (scope or amendment) |
| PLAN.md | Campaign list + active phase DAG | Each phase / campaign rotate |
| TASKS.md | Current phase's task DAG | Each loop |
| <agent>.md | Loop scratchpad — gitignored | Live, as needed |
| archive/ | Frozen phases, campaign outcomes, amendments | Rotation events only |
The five-movement loop (every iteration, one commit)
- Cold-start (≤5 min) — read git log, back-fill any (this commit) hash placeholders, re-confirm active phase, find the next claimable leaf, read inboxes if any.
- Identify + plan — pick the leaf, run a spec-coverage check, draft DoD with a one-line Risk:, mark [~].
- Implement — smallest correct change for the DoD; mid-loop discoveries go to Discovered, not into the active task.
- Verify and collapse — walk the Claims literally; each needs a passing assertion. Collapse closed tasks to one-liners.
- Commit — subtree-isolation check (no foreign paths), stage explicit paths only (never -A), follow the commit-message conventions table.
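The subtree-isolation check in the commit movement can be sketched as a pure function over the changed paths. This is one possible shape, not the protocol's own tooling: the helper names `foreign_paths` and `stage_command` are hypothetical, and the paths are assumed to be normalized and repo-relative, as `git diff --name-only` prints them.

```python
from pathlib import PurePosixPath


def foreign_paths(changed: list[str], root: str) -> list[str]:
    """Return changed paths that fall outside the workstream subtree `root`."""
    base = PurePosixPath(root)
    return [p for p in changed if not PurePosixPath(p).is_relative_to(base)]


def stage_command(changed: list[str], root: str) -> list[str]:
    """Build a `git add` argv that names every path explicitly (never -A)."""
    outside = foreign_paths(changed, root)
    if outside:
        raise ValueError(f"foreign paths in commit: {outside}")
    return ["git", "add", "--", *changed]


changed = ["runtime/ops/add.py", "runtime/TASKS.md"]
print(stage_command(changed, "runtime"))
# → ['git', 'add', '--', 'runtime/ops/add.py', 'runtime/TASKS.md']
```

Returning the argv instead of running it keeps the check testable without a repository; a caller could pass the list to `subprocess.run` once the check passes.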
The DoD contract. A task is [x] only when every Claim has a
passing assertion. The DoD template carries Why / Deps / Scope / Risk / Claims / Out of scope / DoD fields. Out of scope is
load-bearing — it pre-empts mid-loop scope creep.
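The closing rule can be expressed as a small predicate. The sketch below assumes a claim is a description plus a flag recording whether its assertion passed; the `Claim` and `DoD` class names are hypothetical, the fields mirror a subset of the template listed above, and treating a task with zero claims as not closable is an added assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    passed: bool = False  # True once a passing assertion backs the claim


@dataclass
class DoD:
    why: str
    scope: str
    risk: str
    deps: list[str] = field(default_factory=list)
    claims: list[Claim] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

    def status(self) -> str:
        """Return "[x]" only when there are claims and every one has passed."""
        if self.claims and all(c.passed for c in self.claims):
            return "[x]"
        return "[~]"


dod = DoD(
    why="example task",
    scope="one module",
    risk="low",
    claims=[Claim("parses empty input", passed=True), Claim("rejects bad header")],
)
print(dod.status())  # → [~]
dod.claims[1].passed = True
print(dod.status())  # → [x]
```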
Granularity — three patterns with explicit gates: split a task before claiming it when verification layers, cross-cutting changes, and a new test shape all appear; batch 2–3 small coupled tasks into one loop; batch a full phase only when risk is clean across every task and the design is fully specified.
Pre-rotation triage. Before rotating a phase or campaign:
re-read the end-state, walk Discovered for thematic fits, walk
inboxes, verify spec coverage. Phases close on outcome, not
calendar.
Spec contradiction surfacing. When in-flight work reveals the
spec is wrong or impractical, stop and write an amendment file
under archive/amendments/. The user either accepts it (commit a
separate amend SPEC change) or rejects it (the file lives under
rejected/). The bar for an amendment rises by campaign stage: low
in C0–C1, high in C2, very high in C3+.
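The amendment flow can be sketched as two file operations. Several details here are assumptions: the helper names, the `<slug>.md` file naming, and the placement of `rejected/` directly under `archive/amendments/` are all hypothetical, since the text above does not pin them down.

```python
from pathlib import Path


def propose_amendment(root: Path, slug: str, body: str) -> Path:
    """Write a proposed amendment under archive/amendments/ and return its path."""
    target = root / "archive" / "amendments" / f"{slug}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body, encoding="utf-8")
    return target


def reject_amendment(path: Path) -> Path:
    """Move a rejected amendment into a rejected/ folder next to it.

    Assumption: rejected/ sits inside archive/amendments/.
    """
    rejected_dir = path.parent / "rejected"
    rejected_dir.mkdir(exist_ok=True)
    return path.rename(rejected_dir / path.name)
```

Acceptance is deliberately not modeled: per the text, it lands as a separate commit that amends SPEC, which is a git operation, not a file move.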
Discovered is a triage queue, not a backlog. Cap ~12; entries are promoted to numbered tasks, dropped with reason, or kept with a date.
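A triage pass over Discovered could be checked mechanically along these lines. This is a sketch under assumptions: the `Entry` structure, the action labels, and the `triage_problems` helper are hypothetical, and the soft "~12" cap is treated as a hard limit of 12 for simplicity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

CAP = 12  # the "~12" cap from the text, hardened to an exact number here


@dataclass
class Entry:
    text: str
    action: str                      # "promote", "drop", or "keep"
    reason: str = ""                 # expected when action == "drop"
    kept_on: Optional[date] = None   # expected when action == "keep"


def triage_problems(entries: list[Entry]) -> list[str]:
    """Return human-readable problems; an empty list means the queue is clean."""
    problems = []
    if len(entries) > CAP:
        problems.append(f"queue has {len(entries)} entries, cap is {CAP}")
    for e in entries:
        if e.action not in {"promote", "drop", "keep"}:
            problems.append(f"{e.text!r}: unknown action {e.action!r}")
        elif e.action == "drop" and not e.reason:
            problems.append(f"{e.text!r}: drop needs a reason")
        elif e.action == "keep" and e.kept_on is None:
            problems.append(f"{e.text!r}: keep needs a date")
    return problems
```

An entry marked `drop` without a reason, or `keep` without a date, is reported, which is what keeps the queue from quietly turning into a backlog.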
Communication channels (optional). When a workstream has
siblings or a parent, three channels coordinate: per-pair
INBOX.md, shared append-only CROSS.md, and a single
<bridge>/ composition site. Cross-tree writes are restricted to
a carved-out path list documented in <agent>.md.
- Copy or reference instructions/SEED.md and instructions/INIT.md into the workstream you want to bootstrap.
- Have the agent read INIT.md and run the seven-question walk-through.
- Once INIT.md self-deletes, the workstream is in compliance with SEED.md and ready for its first real loop.
SEED.md stays alongside the workstream as the durable reference.
INIT.md only exists during bootstrap.