38 commits
1727d84
feat(#401): TaskCreated empirical probe hook
michael-wojcik Apr 20, 2026
7aba150
feat(#401): variety_scorer teachback_mode/gates + threshold constants
michael-wojcik Apr 20, 2026
b3c0ade
feat(#401): export TEACHBACK_* constants from shared/__init__
michael-wojcik Apr 20, 2026
819307d
feat(#401): shared/teachback_example.py rejection templates
michael-wojcik Apr 20, 2026
3ce44d7
feat(#401): orchestrate.md variety scoring + propagation to agent tasks
michael-wojcik Apr 20, 2026
34bf9f3
refactor(#401): hoist _read_task_json from handoff_gate to shared/tas…
michael-wojcik Apr 20, 2026
4fe8684
feat(#401): variety scoring in comPACT/rePACT/plan-mode/peer-review/i…
michael-wojcik Apr 20, 2026
5395bd0
docs(#401): pact-ct-teachback + pact-variety + pact-s1-autonomy updates
michael-wojcik Apr 20, 2026
c309f1c
feat(#401): scripts/check_teachback_phase2_readiness.py diagnostic
michael-wojcik Apr 20, 2026
5528c0c
feat(#401): task_schema_validator TaskCreated hook
michael-wojcik Apr 20, 2026
a001df7
feat(#401): hooks.json bootstrap_gate < teachback_gate ordering drift…
michael-wojcik Apr 20, 2026
b564979
feat(#401): handoff_gate variety_dimensions sum check
michael-wojcik Apr 20, 2026
9afdda7
feat(#401): teachback_gate advisory mode + shared/teachback_scan
michael-wojcik Apr 20, 2026
7b52abd
feat(#401): teachback_idle_guard TeammateIdle hook + 4 journal event …
michael-wojcik Apr 20, 2026
2cb74a6
feat(#401): teachback state machine in teammate-bootstrap + peer_inje…
michael-wojcik Apr 20, 2026
b439ccf
docs(#401): strengthen task_schema_validator disk-read discipline cit…
michael-wojcik Apr 20, 2026
62301f5
feat(#401): close #7 Y1/Y2/Y3 — full content validation + state-trans…
michael-wojcik Apr 20, 2026
9fef521
docs(#401): move TaskCreated stdin investigation to pact-plugin/refer…
michael-wojcik Apr 20, 2026
2813184
test(#401): close teachback_idle_guard coverage gaps (81% -> 95%)
michael-wojcik Apr 20, 2026
42c50db
test(#401): comprehensive gate + validator coverage + counter-test-by…
michael-wojcik Apr 20, 2026
1fba699
test(#401): close teachback_scan coverage gaps (90% -> 96%) + item 3
michael-wojcik Apr 20, 2026
80ac049
fix(#401): emit teachback_gate_advisory on legacy teachback_check path
michael-wojcik Apr 20, 2026
1c9b210
fix(#401): strip role-marker chars from deny-reason placeholders
michael-wojcik Apr 20, 2026
f44343b
test(#401): close 4 coverage gaps (M1 signal carve-out + M3 drift exp…
michael-wojcik Apr 20, 2026
0898983
fix(#401): strengthen path sanitization + scanner fail-safe + content…
michael-wojcik Apr 20, 2026
45c7ff7
fix(#401): harden idle_guard sidecar + sanitize teammate_name + journ…
michael-wojcik Apr 20, 2026
9633eb3
fix(#401): close active-path bypass + Unicode normalize + emission gap
michael-wojcik Apr 20, 2026
4c4be3d
fix(#401): cycle-4 symmetric gate + task_id strip + content_invalid t…
michael-wojcik Apr 20, 2026
1e47462
fix(#401): replace enumerated-range Unicode strip with default-ignora…
michael-wojcik Apr 20, 2026
23bf746
fix(#401): replace DI approximation with full Unicode Default_Ignorab…
michael-wojcik Apr 20, 2026
aa436d8
fix(#401): cycle-5 mode drift test + error-msg + trigger split + doub…
michael-wojcik Apr 20, 2026
c34df56
fix(#401): simplify _normalize to single-pass DI strip (drop redundan…
michael-wojcik Apr 20, 2026
af53e5a
fix(#401): cycle-6 cleanup — auto_downgrade emission + docstring accu…
michael-wojcik Apr 20, 2026
ae824b5
fix(#401): cycle-7 integration-emit test + Unicode version label + te…
michael-wojcik Apr 20, 2026
3a1d0de
fix(#401): cycle-8 carve-out doc + 3 defense-in-depth sibling-symmetr…
michael-wojcik Apr 20, 2026
efa4180
docs(#401): cycle-9 universal teachback imperative — generative conte…
michael-wojcik Apr 20, 2026
af1f155
docs(#401): cycle-9b sharpen teachback imperative to hard-rule mandat…
michael-wojcik Apr 20, 2026
6a1a54f
docs(#401): cycle-10 blocking teachback semantics — teammate waits fo…
michael-wojcik Apr 20, 2026
6 changes: 4 additions & 2 deletions pact-plugin/commands/comPACT.md
@@ -168,7 +168,8 @@ See also: [Communication Charter](../protocols/pact-communication-charter.md) fo
When the task contains multiple independent items, invoke multiple specialists together with boundary context:

For each specialist needed:
1. `TaskCreate(subject="{specialist}: {sub-task}", description="comPACT mode (concurrent): You are one of [N] specialists working concurrently.\nYou are working in a git worktree at [worktree_path].\nNote: `CLAUDE.md` is gitignored and does not exist in worktrees. Do NOT edit or create `CLAUDE.md` — the orchestrator manages it separately. If your task mentions updating `CLAUDE.md`, flag it in your handoff instead.\n\nYOUR SCOPE: [specific sub-task]\nOTHER AGENTS' SCOPE: [what others handle]\n\nWork directly from this task description.\nIf upstream task IDs are provided, read via `TaskGet` for prior decisions.\nCheck docs/plans/, docs/preparation/, docs/architecture/ briefly if they exist.\nDo not create new documentation artifacts in docs/.\nStay within your assigned scope.\n\nTesting: New unit tests for logic changes. Fix broken existing tests. Run test suite before handoff.\n\nIf you hit a blocker, STOP and `SendMessage` it to the lead.\n\nTask: [this agent's specific sub-task]")`
1. `TaskCreate(subject="{specialist}: {sub-task}", description="comPACT mode (concurrent): You are one of [N] specialists working concurrently.\nYou are working in a git worktree at [worktree_path].\nNote: `CLAUDE.md` is gitignored and does not exist in worktrees. Do NOT edit or create `CLAUDE.md` — the orchestrator manages it separately. If your task mentions updating `CLAUDE.md`, flag it in your handoff instead.\n\nYOUR SCOPE: [specific sub-task]\nOTHER AGENTS' SCOPE: [what others handle]\n\nWork directly from this task description.\nIf upstream task IDs are provided, read via `TaskGet` for prior decisions.\nCheck docs/plans/, docs/preparation/, docs/architecture/ briefly if they exist.\nDo not create new documentation artifacts in docs/.\nStay within your assigned scope.\n\nTesting: New unit tests for logic changes. Fix broken existing tests. Run test suite before handoff.\n\nIf you hit a blocker, STOP and `SendMessage` it to the lead.\n\nTask: [this agent's specific sub-task]", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "CODE"})`
- Score this specialist's variety per the [orchestrate.md Per-Agent Variety Scoring section](orchestrate.md#per-agent-variety-scoring-dispatch-time). Each concurrent specialist gets its own score — scopes differ.
2. `TaskUpdate(taskId, owner="{specialist-name}")`
3. **Journal event**: Write `agent_dispatch` before spawning each specialist:
```bash
@@ -207,7 +208,8 @@ Use a single specialist agent only when:
- Conventions haven't been established yet (run one first to set patterns)

**Dispatch the specialist**:
1. `TaskCreate(subject="{specialist}: {task}", description="comPACT mode: Work directly from this task description.\nYou are working in a git worktree at [worktree_path].\nNote: `CLAUDE.md` is gitignored and does not exist in worktrees. Do NOT edit or create `CLAUDE.md` — the orchestrator manages it separately. If your task mentions updating `CLAUDE.md`, flag it in your handoff instead.\nIf upstream task IDs are provided, read via `TaskGet` for prior decisions.\nCheck docs/plans/, docs/preparation/, docs/architecture/ briefly if they exist.\nDo not create new documentation artifacts in docs/.\nFocus on the task at hand.\n\nTesting: New unit tests for logic changes (optional for trivial changes). Fix broken existing tests. Run test suite before handoff.\n\n> Smoke vs comprehensive tests: These are verification tests. Comprehensive coverage is TEST phase work.\n\nIf you hit a blocker, STOP and `SendMessage` it to the lead.\n\nTask: [user's task description]")`
1. `TaskCreate(subject="{specialist}: {task}", description="comPACT mode: Work directly from this task description.\nYou are working in a git worktree at [worktree_path].\nNote: `CLAUDE.md` is gitignored and does not exist in worktrees. Do NOT edit or create `CLAUDE.md` — the orchestrator manages it separately. If your task mentions updating `CLAUDE.md`, flag it in your handoff instead.\nIf upstream task IDs are provided, read via `TaskGet` for prior decisions.\nCheck docs/plans/, docs/preparation/, docs/architecture/ briefly if they exist.\nDo not create new documentation artifacts in docs/.\nFocus on the task at hand.\n\nTesting: New unit tests for logic changes (optional for trivial changes). Fix broken existing tests. Run test suite before handoff.\n\n> Smoke vs comprehensive tests: These are verification tests. Comprehensive coverage is TEST phase work.\n\nIf you hit a blocker, STOP and `SendMessage` it to the lead.\n\nTask: [user's task description]", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "CODE"})`
- Score this specialist's variety per the [orchestrate.md Per-Agent Variety Scoring section](orchestrate.md#per-agent-variety-scoring-dispatch-time). For low-variety single-specialist tasks, `required_scope_items` may be a single-item list.
2. `TaskUpdate(taskId, owner="{specialist-name}")`
3. **Journal event**: Write `agent_dispatch` before spawning:
```bash
4 changes: 2 additions & 2 deletions pact-plugin/commands/imPACT.md
@@ -16,7 +16,7 @@ These are orchestrator-side operations (agents report blockers via `SendMessage`
1. `TaskGet(blocker_id)` — understand the blocker context
2. Triage: redo prior phase? need specialist? need user?
3. On resolution path chosen:
- If delegating: `TaskCreate` resolution agent task
- If delegating: `TaskCreate(subject="{agent}: resolve {blocker}", description="...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": [...], "phase": "CODE"})` — resolution-agent tasks follow the same Per-Agent Variety Scoring contract as orchestrate.md agent dispatches. Score the resolution task (not the original blocked task); resolution scope may be narrower or wider depending on the triage outcome.
- If self-resolving: proceed directly
4. On resolution complete: `TaskUpdate(blocker_id, status="completed")`
5. Blocked agent task is now unblocked
@@ -156,6 +156,6 @@ When imPACT decides to redo a prior phase (e.g., "redo ARCHITECT because the des
2. **Create a new retry phase task**: `TaskCreate("ARCHITECT (retry): {feature-slug}")`
3. **Set retry task to `in_progress`**
4. **Block the current phase** (the one that hit the blocker): `TaskUpdate(currentPhaseId, addBlockedBy=[retryPhaseId])`
5. **Dispatch agent(s)** for the retry phase
5. **Dispatch agent(s)** for the retry phase via the same Per-Agent Variety Scoring contract from [orchestrate.md](orchestrate.md#per-agent-variety-scoring-dispatch-time): `TaskCreate(subject="{agent}: {retry-description}", description="...", metadata={"variety": {...}, "required_scope_items": [...], "phase": "<PHASE>"})`. Re-scored retry tasks often have higher uncertainty than the original (the blocker revealed an unknown) — score honestly.
6. **On retry completion**: `TaskUpdate(retryPhaseId, status="completed")` — unblocks the current phase
7. **Retry the current phase** with a new agent task using the updated outputs (re-dispatched agents will teachback their understanding before starting)
55 changes: 51 additions & 4 deletions pact-plugin/commands/orchestrate.md
@@ -388,6 +388,49 @@ When a phase is skipped but a coder encounters a decision that would have been h

---

## Per-Agent Variety Scoring (Dispatch-Time)

Before each agent TaskCreate below, score the **agent-task's own variety** — not the feature variety inherited from the top-level assessment. An agent task frequently has variety different from the feature: a single-preparer research-only task within a high-variety feature may itself be medium variety; a single-architect design task within a low-variety feature may itself be high variety.

**Imperative**: every agent TaskCreate at a dispatch site below MUST include `metadata.variety` + `metadata.required_scope_items` at creation. This propagates the feature-variety hard gate down to the agent level so that `task_schema_validator.py` (TaskCreated hook) and `teachback_gate.py` (PreToolUse) have the data they need. The rejection message from `task_schema_validator.py` will instruct corrections if any field is missing.

**Score each dimension 1-4** using the same dimensions from the Task Variety Assessment above (Novelty, Scope, Uncertainty, Risk). Sum → `total`. Example invocation (inline Python):

```bash
python3 -c "
from shared.variety_scorer import gates_for_score, teachback_mode_for_score
import json
variety = {'novelty': 2, 'scope': 2, 'uncertainty': 1, 'risk': 2, 'total': 7}
print(json.dumps({
    'mode': teachback_mode_for_score(variety['total']),
    'gates': gates_for_score(variety['total']),
}))
"
```
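The scoring rule above (four dimensions, each 1-4, summed into `total`) can be sketched as a small helper. This is an illustrative sketch only: `build_variety` is a hypothetical name and is not part of `shared.variety_scorer`; the only things taken from the text are the four dimension names, the 1-4 range, and the sum.

```python
# Hypothetical helper: NOT part of shared.variety_scorer.
DIMENSIONS = ("novelty", "scope", "uncertainty", "risk")


def build_variety(novelty: int, scope: int, uncertainty: int, risk: int) -> dict:
    """Return a variety dict whose 'total' is the sum of the four dimensions."""
    scores = dict(zip(DIMENSIONS, (novelty, scope, uncertainty, risk)))
    for name, value in scores.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 4:
            raise ValueError(f"{name} must be an integer from 1 to 4, got {value!r}")
    scores["total"] = sum(scores[d] for d in DIMENSIONS)
    return scores


variety = build_variety(novelty=2, scope=2, uncertainty=1, risk=2)
print(variety)
# {'novelty': 2, 'scope': 2, 'uncertainty': 1, 'risk': 2, 'total': 7}
```

Computing `total` rather than typing it by hand avoids a mismatch between the dimensions and their sum.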

**`required_scope_items`**: list of discrete scope items the agent must address in their teachback or handoff (each item is a short imperative phrase). For low-variety agent tasks, this may be a single-item list. For high-variety tasks, provide enough items (≥ 2) to trigger the full protocol in the gate — see TERMINOLOGY-LOCK.md and CONTENT-SCHEMAS.md for protocol-level semantics.

**`phase`**: the PACT phase name (`PREPARE` | `ARCHITECT` | `CODE` | `TEST`). The gate reads this for Q1 citation-strictness decisions.
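As a rough illustration of the metadata contract described above, the following sketch lists the problems a pre-dispatch check might report. It is not the logic of `task_schema_validator.py`, which this diff does not show; `metadata_problems` is a hypothetical name, and treating `phase` as required and `required_scope_items` as non-empty are assumptions drawn from the examples in this section.

```python
# Hypothetical pre-dispatch check: NOT the real task_schema_validator.py.
VALID_PHASES = {"PREPARE", "ARCHITECT", "CODE", "TEST"}
DIMENSIONS = ("novelty", "scope", "uncertainty", "risk")


def metadata_problems(metadata: dict) -> list:
    """Return human-readable problems; an empty list means the shape looks fine."""
    problems = []

    variety = metadata.get("variety")
    if not isinstance(variety, dict):
        problems.append("metadata.variety is missing")
    else:
        missing = [d for d in DIMENSIONS if not isinstance(variety.get(d), int)]
        if missing:
            problems.append("metadata.variety lacks integer scores for: " + ", ".join(missing))
        elif variety.get("total") != sum(variety[d] for d in DIMENSIONS):
            problems.append("metadata.variety.total does not equal the sum of the dimensions")

    items = metadata.get("required_scope_items")
    # Assumption: at least one item is expected, even for low-variety tasks.
    if not isinstance(items, list) or not items:
        problems.append("metadata.required_scope_items must be a non-empty list")

    # Assumption: phase is treated as required here.
    if metadata.get("phase") not in VALID_PHASES:
        problems.append("metadata.phase must be one of " + ", ".join(sorted(VALID_PHASES)))

    return problems


good = {
    "variety": {"novelty": 2, "scope": 2, "uncertainty": 1, "risk": 2, "total": 7},
    "required_scope_items": ["compare auth providers"],
    "phase": "PREPARE",
}
print(metadata_problems(good))
print(metadata_problems({"variety": {"novelty": 2}, "phase": "REVIEW"}))
```

Returning a list of problems instead of raising on the first one lets the orchestrator fix every missing field in a single pass.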

**Carve-outs (do NOT write variety metadata)**:

- **Auditor** (`metadata.completion_type = "signal"`) — signal tasks bypass the gate by carve-out predicate
- **Secretary** (agent name in `_EXEMPT_AGENTS`) — exempt regardless of variety
- **Signal tasks** (blocker, algedonic, skipped, stalled, terminated) — bypass by predicate

The Per-Phase sections below call out each exception explicitly.
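The carve-outs above amount to a predicate evaluated before variety metadata is written. A minimal sketch follows, under stated assumptions: `needs_variety_metadata` is a hypothetical name, the contents of `EXEMPT_AGENTS` are guessed (the real `_EXEMPT_AGENTS` is not shown in this diff), and signal tasks such as blocker or algedonic are assumed to carry the same `completion_type = "signal"` marker as the auditor.

```python
# Hypothetical carve-out predicate: the real gate logic is not shown in this diff.
EXEMPT_AGENTS = frozenset({"secretary"})  # assumed contents of _EXEMPT_AGENTS


def needs_variety_metadata(owner: str, metadata: dict) -> bool:
    """Return False for tasks the carve-outs exempt from variety scoring."""
    # Auditor and other signal tasks (assumed to share this marker).
    if metadata.get("completion_type") == "signal":
        return False
    # Agents exempt regardless of variety.
    if owner in EXEMPT_AGENTS:
        return False
    return True


print(needs_variety_metadata("backend-coder", {}))                       # True
print(needs_variety_metadata("auditor", {"completion_type": "signal"}))  # False
print(needs_variety_metadata("secretary", {}))                           # False
```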

### Imperative-with-Explanation Framing (Q5 Phase 1)

The task description written into each agent TaskCreate MUST use imperative-with-explanation framing (NOT advisory "Please consider..." or "It would be helpful if..."). Open with the imperative verb, follow with the rationale.

- **Do**: `"Research authentication options for the dashboard. Rationale: the plan's open questions include 3 research items requiring evidence before architect decisions."`
- **Don't**: `"If you could look into authentication options, that would help us figure out the right approach."`

Advisory framing softens the protocol requirement and has been empirically observed to produce discretionary teammate compliance (honest-but-careless drift — see pact-ct-teachback.md Honest-Reframe section). The teachback gate binds on metadata presence + content shape, so task-description wording is not gate-enforced; but consistent framing across commands keeps the ritual floor in place.

---

### PREPARE Phase → `pact-preparer`

**Phase skip decision flow passed (all 3 layers)?** → Mark PREPARE `completed` with skip metadata and proceed to ARCHITECT phase.
@@ -397,8 +440,9 @@ When a phase is skipped but a coder encounters a decision that would have been h
- "Open Questions > Require Further Research"

**Dispatch `pact-preparer`**:
1. `TaskCreate(subject="preparer: research {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...")`
1. `TaskCreate(subject="preparer: research {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "PREPARE"})`
- Include task description, plan sections (if any), and "Reference the approved plan at `docs/plans/{slug}-plan.md` for full context."
- Score preparer variety per the Per-Agent Variety Scoring section above. Populate `required_scope_items` with the discrete research questions/items the preparer must address.
2. `TaskUpdate(taskId, owner="preparer")`
3. **Journal event**: Write `agent_dispatch` before spawning:
```bash
@@ -485,11 +529,12 @@ When detection fires (score >= threshold), follow the evaluation response protoc
- "Interface Contracts"

**Dispatch `pact-architect`**:
1. `TaskCreate(subject="architect: design {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...")`
1. `TaskCreate(subject="architect: design {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "ARCHITECT"})`
- Include task description, where to find PREPARE outputs (e.g., "Read `docs/preparation/{feature}.md`"), plan sections (if any), and plan reference.
- Include upstream task reference: "Preparer task: #{taskId} — read via `TaskGet` for research decisions and context."
- Do not read phase output files yourself or paste their content into the task description.
- If PREPARE was skipped: pass the plan's Preparation Phase section instead.
- Score architect variety per the Per-Agent Variety Scoring section above. Populate `required_scope_items` with the discrete design decisions the architect must address.
2. `TaskUpdate(taskId, owner="architect")`
3. **Journal event**: Write `agent_dispatch` before spawning:
```bash
@@ -606,13 +651,14 @@ JSON
**Dispatch coder(s)**:

For each coder needed:
1. `TaskCreate(subject="{coder-type}: implement {scope}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...")`
1. `TaskCreate(subject="{coder-type}: implement {scope}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "CODE"})`
- Include task description, where to find ARCHITECT outputs (e.g., "Read `docs/architecture/{feature}.md`"), plan sections (if any), plan reference.
- Include upstream task references: "Architect task: #{taskId} — read via `TaskGet` for design decisions." If multiple coders are dispatched concurrently, include peer names: "Your peers on this phase: {other-coder-names}."
- Do not read phase output files yourself or paste their content into the task description.
- If ARCHITECT was skipped: pass the plan's Architecture Phase section instead.
- If PREPARE/ARCHITECT were skipped, include: "PREPARE and/or ARCHITECT were skipped based on existing context. Minor decisions (naming, local structure) are yours to make. For moderate decisions (interface shape, error patterns), decide and implement but flag the decision with your rationale in the handoff so it can be validated. Major decisions affecting other components are blockers—don't implement, escalate."
- Include: "Smoke Testing: Run the test suite before completing. If your changes break existing tests, fix them. Your tests are verification tests—enough to confirm your implementation works. Comprehensive coverage (edge cases, integration, E2E, adversarial) is TEST phase work."
- Score this coder's variety per the Per-Agent Variety Scoring section above (each coder in a parallel set gets its own score — their scopes differ). Populate `required_scope_items` with the discrete implementation items this coder must address.
2. `TaskUpdate(taskId, owner="{coder-name}")`
3. **Journal event**: Write `agent_dispatch` before spawning each coder:
```bash
@@ -735,9 +781,10 @@ Execute the [CONSOLIDATE Phase protocol](../protocols/pact-scope-phases.md#conso
- "Coverage Targets"

**Dispatch `pact-test-engineer`**:
1. `TaskCreate(subject="test-engineer: test {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...")`
1. `TaskCreate(subject="test-engineer: test {feature}", description="CONTEXT: ...\nMISSION: ...\nINSTRUCTIONS: ...\nGUIDELINES: ...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "TEST"})`
- Include task description, coder task references (e.g., "Coder tasks: #{id1}, #{id2} — read via `TaskGet` for implementation decisions and flagged uncertainties"), plan sections (if any), plan reference.
- Include: "You own ALL substantive testing: unit tests, integration, E2E, edge cases."
- Score test-engineer variety per the Per-Agent Variety Scoring section above. Populate `required_scope_items` with the discrete test categories (unit/integration/E2E/edge/security) that must be addressed.
2. `TaskUpdate(taskId, owner="test-engineer")`
3. **Journal event**: Write `agent_dispatch` before spawning:
```bash
5 changes: 3 additions & 2 deletions pact-plugin/commands/peer-review.md
@@ -143,7 +143,8 @@ Select the domain coder based on PR focus:
**Dispatch reviewers**:

For each reviewer:
1. `TaskCreate(subject="{reviewer-type}: review {feature}", description="Review this PR. Focus: [domain-specific review criteria]...")`
1. `TaskCreate(subject="{reviewer-type}: review {feature}", description="Review this PR. Focus: [domain-specific review criteria]...", metadata={"variety": {"novelty": N, "scope": N, "uncertainty": N, "risk": N, "total": N}, "required_scope_items": ["item-1", "item-2", ...], "phase": "TEST"})`
- Score this reviewer's variety per the [orchestrate.md Per-Agent Variety Scoring section](orchestrate.md#per-agent-variety-scoring-dispatch-time). Peer review is where "teachback OPTIONAL" softening has been empirically observed (RISK-MAP.md §F11) — the teachback gate enforces the ritual at metadata presence, not task-description wording. Score honestly; do NOT soften variety to bypass the gate. Use `phase: "TEST"` since review is quality-shaped.
2. `TaskUpdate(taskId, owner="{reviewer-name}")`
3. Spawn the reviewer with the canonical dispatch form. The `prompt` MUST lead with the `YOUR PACT ROLE: teammate ({reviewer-name})` marker on its own line and include the `Skill("PACT:teammate-bootstrap")` YOUR FIRST ACTION directive so routing defense-in-depth delivers the teammate bootstrap at spawn:

@@ -186,7 +187,7 @@ This is the **primary memory trigger** — fires unconditionally at reviewer dis
Each reviewer should state their understanding of the PR's intent before diving into review. This catches cases where a reviewer misunderstands the purpose and produces irrelevant findings.

**Mechanism**: Include in each reviewer's task description:
> "Before reviewing, send a teachback message to the lead stating your understanding of what this PR is trying to accomplish and what you'll focus on in your domain. Format: `[{sender}→lead] Teachback: I understand this PR is [intent]. Reviewing with focus on [domain focus]. Proceeding unless corrected.` Non-blocking — proceed with review after sending."
> "Before reviewing, send a teachback message to the lead stating your understanding of what this PR is trying to accomplish and what you'll focus on in your domain. Format: `[{sender}→lead] Teachback: I understand this PR is [intent]. Reviewing with focus on [domain focus]. Halting until you send teachback_approved.` Do NOT begin reviewing until the lead writes `teachback_approved` to your task metadata."

This uses the same teachback mechanism as agent handoffs. Background: [pact-ct-teachback.md](../protocols/pact-ct-teachback.md).
