FrkAk · FrkAk · Jun 19, 2026 · Jun 12, 2026 · Jun 12, 2026 · Jun 12, 2026
@@ -119,8 +119,8 @@ In Codex, Cursor, and Antigravity each workflow is a skill invoked by slash comm
 
 | Component | What it does |
 | --- | --- |
-| **`/piyaz:composer` skill** | End-to-end task orchestrator. Picks the highest-value ready task (or one named ref), drives it through research → plan → implement → propagate via three dispatched subagents per task in clean per-phase contexts, loops until queue empty or user stops. Requires `/goal` harness for backlog mode (composer emits it on first turn; user pastes). |
-| **Composer subagents** | `piyaz:composer-researcher` gathers grounded context and refines the task; `piyaz:composer-planner` writes the unabridged implementation plan; `piyaz:composer-implementer` ships the code, opens a PR, and marks the task done. |
+| **`/piyaz:composer` skill** | End-to-end task orchestrator. Picks the highest-value ready task (or one named ref), drives it through research → plan → implement → review → propagate via a per-task workflow that dispatches phase subagents in clean per-phase contexts, merges the PR and continues when the user authorizes it, and loops until the queue is empty or the user stops. Requires `/goal` harness for backlog mode (composer emits it on first turn; user pastes). |
+| **Composer subagents** | `piyaz:composer-researcher` gathers grounded context and refines the task; `piyaz:composer-planner` writes the unabridged implementation plan; `piyaz:composer-implementer` ships the code, opens a PR, and marks the task `in_review`; `piyaz:review` returns the verdict that drives the bounded fix loop. |
 | **`piyaz:decompose-task` agent** | Splits an existing oversize task in an active project into 2 to N children, rewires every dependency edge touching the parent, cancels the parent with rationale citing the children. Composer's oversize handler routes here. |
 | **`piyaz:decompose-feature` agent** | Adds a new feature or capability cluster to an active project. Reuses existing categories and tag vocabulary; creates 5 to 20 tasks plus internal and integration edges. |
 
@@ -169,7 +169,7 @@ Piyaz ships as a Next.js web app plus vendor-native plugins for Claude Code, Cod
 ❯ Priority is urgent, draft ACs are enough, and monorepo detection should ask the user.
 ```
 
-**Drive end-to-end (Claude Code).** Once a project is active and tasks are ready, composer can take over. Pick the next task off the critical path, research it in context, plan it, implement it, open the PR, propagate the result, and loop:
+**Drive end-to-end (Claude Code).** Once a project is active and tasks are ready, composer can take over. It picks the next task off the critical path, researches it in context, plans it, implements it, opens the PR, reviews and fixes until the PR is ready, propagates the result (and merges when you authorize it), and loops:
 
 ```text
 ❯ /piyaz:composer
@@ -181,7 +181,7 @@ Or take one specific task all the way to a PR:
 ❯ /piyaz:composer PYZ-101
 ```
 
-Composer dispatches three subagents per task in clean per-phase contexts (researcher → planner → implementer). The orchestrator stays out of the work itself and only picks tasks, hands off, and propagates.
+Composer runs a per-task workflow that dispatches phase subagents in clean per-phase contexts (researcher → planner → implementer → review), with a bounded fix loop until the PR is ready. The orchestrator stays out of the work itself: it picks tasks, resolves gates, merges when authorized, and propagates.
 
 **Tune in the UI.** Inspect edges, read execution records, and edit descriptions, ACs, tags, or dependencies directly. The agent loop and the UI write to the same store, so edits land by the next tool call.
 

@@ -23,7 +23,8 @@
       "!cloudflare-env.d.ts",
       "!bun.lock",
       "!migrations/**",
-      "!drizzle/**"
+      "!drizzle/**",
+      "!plugins/**/workflows/**"
     ]
   },
   "formatter": {

@@ -88,6 +88,10 @@ export function formatDecisions(decisions: Decision[]): string {
  * - all checked: "All criteria met:" label followed by the checked list
  * - mixed: "Remaining:" section first (primacy for pending work), then "Done:"
  *
+ * Each line carries the criterion's backticked id so agents can target the
+ * documented by-id rewrite (`acceptanceCriteria=[{id, text}]`) without
+ * appending duplicates.
+ *
  * @param criteria - Array of acceptance criteria.
  * @returns Formatted checklist string, possibly grouped by checked state.
  */
@@ -97,8 +101,9 @@ export function formatCriteria(criteria: AcceptanceCriterion[]): string {
   const remaining = criteria.filter((c) => !c.checked);
   const done = criteria.filter((c) => c.checked);
   const renderRemaining = () =>
-    remaining.map((c) => `- [ ] ${c.text}`).join("\n");
-  const renderDone = () => done.map((c) => `- [x] ${c.text}`).join("\n");
+    remaining.map((c) => `- [ ] \`${c.id}\` ${c.text}`).join("\n");
+  const renderDone = () =>
+    done.map((c) => `- [x] \`${c.id}\` ${c.text}`).join("\n");
 
   if (done.length === 0) return renderRemaining();
   if (remaining.length === 0) return `All criteria met:\n${renderDone()}`;
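
Reviewer sketch (not part of the patch): the doc comment above says the backticked id lets agents target the by-id rewrite instead of appending duplicates. A minimal round trip might look like the following. The `AcceptanceCriterion` shape, the id value `ac_2`, and the `parseCriterionLine` helper are assumptions made for illustration; only the line format comes from the hunk.

```typescript
// Shape assumed for illustration; the real AcceptanceCriterion may carry more fields.
interface AcceptanceCriterion {
  id: string;
  text: string;
  checked: boolean;
}

// Same line format as renderRemaining / renderDone in the hunk above.
function renderCriterionLine(c: AcceptanceCriterion): string {
  const box = c.checked ? "[x]" : "[ ]";
  return `- ${box} \`${c.id}\` ${c.text}`;
}

// Hypothetical helper: recovers id, text, and checked state from one rendered line.
// Returns null when the line does not match the checklist format.
function parseCriterionLine(line: string): AcceptanceCriterion | null {
  const m = /^- \[( |x)\] `([^`]+)` (.*)$/.exec(line);
  if (!m) return null;
  return { checked: m[1] === "x", id: m[2], text: m[3] };
}

// An agent reads the rendered line, then builds a by-id rewrite payload
// in the `acceptanceCriteria=[{id, text}]` shape the doc comment mentions.
const rendered = renderCriterionLine({
  id: "ac_2", // hypothetical id
  text: "CLI prints usage on --help",
  checked: false,
});
const parsed = parseCriterionLine(rendered);
const rewritePayload = parsed
  ? [{ id: parsed.id, text: "CLI prints usage on --help and -h" }]
  : [];
```

Before this change the rendered line carried no id, so an agent had to match on criterion text, and a reworded criterion could land as a new entry instead of replacing the old one.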

@@ -686,7 +686,7 @@ export function createMcpServer(ctx: AuthContext): McpServer {
     {
       name: "piyaz",
       title: "Piyaz",
-      version: "0.1.0",
+      version: "0.1.1",
       websiteUrl: "https://www.piyaz.ai",
       icons: [
         {

@@ -1,5 +1,5 @@
 {
   "name": "piyaz",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Persistent context network for coding projects. Tracks tasks, dependencies, and decisions across sessions."
 }
@@ -0,0 +1,144 @@
+# Reviewer rules (composer Phase 4 extract)
+
+Slim extract of the canonical piyaz references for the review agent.
+Mirrors: `skills/piyaz/references/conventions.md` §1,
+`skills/piyaz/references/lifecycle.md` §2.2, §2.3, §2.4, §3, and
+`skills/piyaz/references/artifacts.md` §1 (`executionRecord`,
+`decisions`), §6. Headings carry their canonical file and section number
+so citations like `lifecycle §2.2` resolve unambiguously. When editing a
+mirrored section, edit BOTH files.
+
+The reviewer verifies the Completion Protocol was honored; it does not
+execute it. §2.2 and §2.3 below are what the implementer was required to
+do; §3 is what the orchestrator runs after your verdict, fed by your
+downstream-impact list.
+
+---
+
+## conventions §1 — The Iron Law of grounding
+
+```
+Never write what you cannot cite or do not know.
+```
+
+Applies wherever an agent generates `executionRecord`, `decisions`, `description`, or `files`. For the reviewer it applies to the verdict: every finding cites a real file path and line, every AC evaluation cites the diff or the executionRecord. When uncertain, write less. A short, true verdict is more valuable than a rich, fabricated one.
+
+---
+
+## lifecycle §2.2 — Populate the required fields
+
+`executionRecord`, `decisions`, `files`, `acceptanceCriteria`, plus `prUrl` when a PR was opened (backend upserts a `task_links` row with `kind='pull_request'` so the review subagent and detail UI can resolve the PR). The MCP server returns `_hints` if any are missing.
+
+For pure spec-review / docs / decision-only / Piyaz-only refinement tasks that touched no repo files, `files=[]` is the correct positive answer to "what changed in the repo?", not the absence of an answer.
+
+## lifecycle §2.3 — Open a PR if the work changed code (what the implementer owed)
+
+If `files` is non-empty AND the work was a real code change (not research, not decision-only, not Piyaz-only refinement), the implementer must have opened a PR:
+
+- PR body follows the repo's PR template when one exists (`.github/PULL_REQUEST_TEMPLATE.md` and variants), the canonical concise default otherwise.
+- The `taskRef` appears in `[BRACKETS]` (e.g. `[MYMR-83]`) exactly once, for the ONE primary task the PR builds. Bracket form triggers Piyaz PR-status tracking. Related tasks are referenced as plain links, no brackets.
+- Summary maps from `executionRecord` (2 to 3 sentences); test plan maps from checked `acceptanceCriteria`; notes-for-reviewer maps from `decisions`.
+- Sections are concise; empty optional sections beat fabricated content.
+
+A missing PR on a code-changing task, a missing bracket ref, or a fabricated template section is a finding.
+
+## lifecycle §2.4 — Skip the PR for these task types
+
+A missing PR is legitimate (not a finding) for:
+
+- Research / investigation tasks (no code change).
+- Decision-only tasks.
+- Pure-Piyaz refinement tasks (no repo changes).
+- Tasks the user explicitly said "no PR" on.
+- Data and BA work without a code repo (dashboard tweaks, workbooks, metric sign-offs, ad-hoc SQL attached to a ticket). The deliverable lives outside git; the artifact link or path belongs in `executionRecord` and `files`. When the data work IS in a git repo (a dbt project, a versioned SQL or notebook repo), the standard PR rules apply.
+
+---
+
+## lifecycle §3 — Propagate after every change (Iron Law)
+
+```
+A change that does not propagate did not happen.
+```
+
+The graph is Piyaz's value. Skip once and it lies: ready tasks that aren't ready, blockers pointing at shipped work, every future session picking the wrong next step.
+
+After any status change or significant refinement:
+
+1. `piyaz_query type='edges'` on the changed task. Current relationships.
+2. `piyaz_analyze type='downstream'`. Who depends on this task.
+3. For each downstream task, evaluate:
+   - Do edge notes need updating to reflect new decisions?
+   - Are there NEW relationships revealed by this change?
+   - Are there STALE relationships that no longer hold?
+   - Do downstream descriptions need updating based on the decisions made?
+4. Create, update, or remove edges as needed.
+
+The reviewer does not execute propagation. Your downstream-impact list names the edges that will need attention; the orchestrator (or the human) executes the rewires.
+
+---
+
+## artifacts §1 — Task artifact quality
+
+### `executionRecord` (only on `in_review`, `done`, and `cancelled`)
+
+The implementer writes this field at the `in_review` transition; you verify it against the diff.
+
+- **Length:** 3 to 5 sentences.
+- **Distinct from `description`:** description = scope + role; executionRecord = HOW it was built (or WHY it was abandoned).
+- **Include:** function names, file paths, endpoints, data formats.
+- **Exclude:** debugging stories, false starts, filler.
+- **For `cancelled`:** rationale (why abandoned), approaches tried, decisions learned. Same shape as a done record, just for non-shipping outcomes.
+- **Draft tasks must NOT carry an `executionRecord`.** That field implies the task shipped.
+
+### `decisions`
+
+One-liner per decision. Format: **CHOICE + WHY**.
+
+```
+GOOD (web): "Chose Redis for refresh tokens. Need fast revocation lookups."
+GOOD (sim): "Use std::vector for the Queue backing storage. Cheap front() lookup, fast tail insert; spec is silent on container choice."
+
+BAD: "Used Drizzle"
+BAD: "We picked Redis because it's good"
+BAD: "Decided to do it that way"
+```
+
+Never invent. An implementer `decisions` entry that is not grounded in the diff, the plan, or the conversation is a finding.
+
+---
+
+## artifacts §6 — Markdown formatting and tone
+
+Applies to everything you write into the verdict.
+
+### Structure
+
+- Bullet lists (`-`) for 3 or more items. Never run-on prose.
+- Backticks for code references: file paths, function names, endpoints, variables, package names.
+- Paragraph breaks between distinct topics.
+
+### Tone: never sound like AI
+
+**Do not use:**
+
+- Em dashes (the `—` character). Use periods, commas, parentheses, or colons.
+- Hedging openers: "I think", "perhaps", "seems to", "might be", "arguably".
+- Enthusiasm: "Great question", "Awesome", "Exciting", "Love this".
+- Throat-clearing: "Let me dive into", "I hope this helps", "Here's the thing", "To be honest".
+- Marketing words: "comprehensive", "robust", "powerful", "leverage", "utilize", "ensure", "facilitate", "seamless", "game-changer", "best-in-class".
+- Adverb-heavy openers: "Importantly", "Crucially", "Notably", "Essentially", "Basically".
+- Empty filler: "It's worth noting that", "It should be mentioned", "As a matter of fact".
+- Performative summaries at the end: "I hope this helps!", "Let me know if you need anything else!"
+
+**Do:**
+
+- Subject, verb, object.
+- Active voice.
+- Concrete over abstract. "Adds 50ms p99" beats "improves performance".
+- Specific over vague. "Stripe webhook handler" beats "payment integration".
+- Cut adverbs.
+- One idea per sentence.
+
+### Length
+
+Concision over padding. No filler, no repetition. The rule is "no fluff", not "no length".
@@ -150,7 +150,7 @@ You handle most Piyaz interactions inline. The four agents are escalations for h
 | Decompose a project: large, multi-domain, or sensitive | Dispatch **`piyaz:decompose`** for the gated 4-phase pipeline |
 | Split a single existing oversize task into children within an active project ("split this task", "decompose HGT-17", composer's oversize handler) | Dispatch **`piyaz:decompose-task`** for the gated split + edge-rewiring + parent-cancel pipeline |
 | Add a new feature or capability cluster to an active project ("add a feature for X", "decompose this idea into tasks", "extend the project with Y") | Dispatch **`piyaz:decompose-feature`** for the gated feature-addition pipeline |
-| Drive tasks end-to-end through research + plan + implement + review + propagate ("ship the backlog", "run the next task", "compose through my queue", "loop through piyaz tasks", a named task ref to take all the way to a PR) | Suggest user invoke **`/piyaz:composer`** (backlog mode) or **`/piyaz:composer <taskRef>`** (single-task mode). Composer is a slash-command skill that orchestrates four dispatched subagents per task in clean per-phase contexts; the user has to type the slash command (and paste the `/goal` harness composer emits on first turn) for it to start. |
+| Drive tasks end-to-end through research + plan + implement + review + propagate ("ship the backlog", "run the next task", "compose through my queue", "loop through piyaz tasks", a named task ref to take all the way to a PR) | Suggest user invoke **`/piyaz:composer`** (backlog mode), **`/piyaz:composer <taskRef>`** (single-task mode), or **`/piyaz:composer rework <taskRef\|pr-url>`** (route GitHub review feedback back through the fix loop). Composer is a slash-command skill that orchestrates four dispatched subagents per task in clean per-phase contexts. The user has to type the slash command for it to start; composer then runs continuously and stops on structural conditions (queue drained, failure budget, user stop). |
 | Review an `in_review` task or a PR by URL ("review LNS-12", "review this PR", "review `<PR URL>`", "what does the review subagent think of LNS-12") | Dispatch **`piyaz:review`** for a five-lens structured verdict (`approve` / `request-changes` / `block`). The verdict is advisory; HOTL still owns the `in_review → done` transition on GitHub. |
 | Status, next task, mark done, plan a draft, refine, dispatch, create or delete task | Handle inline. **Do not** dispatch `piyaz:manage` for these; they are day-to-day. |
 | Strategic review, rebalance the graph, audit dependencies, prune orphans, connect missing edges, audit blockers, consolidate categories or tags, graph-health check, "is this project on track?" | Dispatch **`piyaz:manage`** for deep CTO mode |
@@ -189,7 +189,7 @@ Lead with slim tools.
    - `piyaz_analyze type='plannable'`. Drafts ready to plan.
    - Pick one on the critical path. **§ Plan a draft task**.
 
-**For end-to-end automation across the queue:** suggest `/piyaz:composer` (backlog mode). Composer picks the highest-value ready task each iteration, drives it through research + plan + implement + propagate via dispatched subagents in clean per-phase contexts, then loops until the queue is empty or the user stops. The user paces it via `/goal` (composer emits the harness on first turn; user pastes it). Use this when the user wants the queue shipped without picking each task manually; use the inline picker above when the user wants per-task agency.
+**For end-to-end automation across the queue:** suggest `/piyaz:composer` (backlog mode). Composer picks the highest-value ready task each iteration, drives it through research + plan + implement + review + propagate via dispatched subagents in clean per-phase contexts, then loops until the queue is empty or the user stops. It runs continuously without per-task check-ins, gates only on genuine decisions (oversize tasks, proposed rewrites, open questions), and runs a bounded review→fix loop per task. When HOTL requests changes on a composer PR instead of merging, `/piyaz:composer rework <taskRef|pr-url>` routes that feedback back through the fix loop. Use composer when the user wants the queue shipped without picking each task manually; use the inline picker above when the user wants per-task agency.
 
 ### Refine a task
 

@@ -4,6 +4,8 @@ Quality bar for everything an agent writes into Piyaz: titles, descriptions, acc
 
 Agents read this file when about to create, refine, or audit an artifact. The Iron Law of grounding (`conventions.md` §1) applies at every step.
 
+> Sections of this file are mirrored by the composer phase extracts in the claude-code plugin (`plugins/claude-code/skills/composer/references/`); when you edit a mirrored section, update those extracts and bump the pin in their `sources.json`.
+
 ## Contents
 
 - §1 Task artifact quality: title, description, acceptanceCriteria, executionRecord, decisions, files
@@ -140,7 +142,7 @@ BAD:
 
 Single-AC tasks are rejected. Tasks with vague ACs ("works correctly", "is complete", "performs well") are rejected.
 
-### `executionRecord` (only on `done` and `cancelled`)
+### `executionRecord` (only on `in_review`, `done`, and `cancelled`)
 
 - **Length:** 3 to 5 sentences.
 - **Distinct from `description`:** description = scope + role; executionRecord = HOW it was built (or WHY it was abandoned).
@@ -186,7 +188,7 @@ Never invent. If a decision is not grounded in conversation, code, or the artifa
 
 ## 2. Tag dimensions and first-class fields
 
-Every task, in every status, must carry tags across the three tag dimensions below. Reuse existing tags from `piyaz_query type='overview'` before coining new ones.
+Every task, in every status, must carry tags across the three tag dimensions below. Reuse existing tags from `piyaz_query type='meta'` before coining new ones.
 
 | Dimension | Count | Vocabulary |
 |---|---|---|
@@ -409,29 +411,6 @@ The text you write into Piyaz is read by other engineers. It must read like an e
 - Cut adverbs.
 - One idea per sentence.
 
-### Em-dash replacements
-
-```
-BAD  (web):     "Custom auth — months of work — is off the table."
-GOOD:           "Custom auth is off the table. Months of work, easy to leak data."
-
-BAD  (web):     "The API uses Bearer tokens — validated against the users table."
-GOOD:           "The API validates Bearer tokens against the users table."
-
-BAD  (sim):     "Rejected — see line 42 of the spec."
-GOOD:           "Rejected. See line 42 of the spec."
-
-BAD  (agentic): "The agent loop dispatches tools — validated against the
-                 registry — then streams the model output."
-GOOD:           "The agent loop validates each tool against the registry
-                 before dispatching, then streams the model output."
-
-BAD  (firmware):"BMP280 returns 0xFF — the i2c clock-stretch fix is not
-                 backported."
-GOOD:           "BMP280 returns 0xFF. The i2c clock-stretch fix is not
-                 backported."
-```
-
 ### Length
 
 Concision over padding. No filler, no AI throat-clearing, no repetition. But do not sacrifice clarity for brevity. If a task genuinely needs 6 to 8 sentences in its description because the architecture has multiple components, the bug has a complex cause, or the research question is multi-part, write them. The rule is "no fluff", not "no length". A 6-sentence description that helps a reader is better than a 2-sentence one that loses them.