Prevent prose-output ↔ tool-call-belief conflation in HANDOFF/teachback writes

## Pattern observed (PR #641 cycle, 3 in-cycle instances)

HANDOFF, teachback, and stage-ready writes share a failure mode: agents conflate **composing prose in their response turn** with **invoking the tool call** that actually persists state. Three instances surfaced in PR #641:

1. **test-engineer's stage-ready** claimed that `git status` showed `A pact-plugin/tests/test_628_coverage.py` (added), while the actual status was `??` (untracked, never `git add`'d). The claim was walked back after the lead grep'd for the file's class names and did not find them.
2. **backend-coder's first HANDOFF claim** asserted "HANDOFF written for Task #26 ... canonical 6 fields ... SET intentional_wait{awaiting_lead_completion}" while `metadata.handoff` was empty (only `intentional_wait` from earlier stage-ready was in metadata). Walked back honestly under sharper diagnostic — closest to option (a): composed HANDOFF as prose in response turn; did NOT invoke `TaskUpdate(metadata={handoff: ...})` before sending the announcement SendMessage.
3. **Lead's overreaction** during incident #1: grep'd for the test classes in the WRONG file path (the dispatch-mentioned `test_session_init.py` rather than the new `test_628_coverage.py` test-engineer actually created). Concluded "fabrication" (HALT-class framing) before widening the grep. Same coupling failure shape — different actor.

All three are instances of the **tightly-coupled-validator failure pattern** (pact-memory `02005f83`): the validating instrument (test parametrize literal / prose narrative / expected file path) is too tightly coupled to the validated thing (source literal / metadata write / actual work location). The cognitive completion-signal fires from the coupled instrument even when the substrate-level state is wrong.

## Generalization

When the canonical HANDOFF format is described in `pact-agent-teams/SKILL.md` **by-name + by-structure** (6 fields: produced/decisions/reasoning_chain/uncertainty/integration/open_questions), an agent composing those fields as prose in their response turn produces the cognitive completion-signal — *"I produced the canonical 6-field HANDOFF"* — even though the prose is in the wrong substrate. Same mechanism for teachback (`teachback_submit` 4-field shape) and stage-ready (`git diff --cached --stat` paste).

The team-lead doesn't read the agent's response turn; the team-lead reads `metadata.handoff` (or `metadata.teachback_submit`). When prose-output substitutes for metadata-write, the lead sees empty metadata and the contract appears broken — but the agent believes it succeeded, so they idle without surfacing the problem.

## Proposed defenses (two complementary)

### 1. Instruction-class fix (agent-readable, low-cost)

Update `pact-plugin/skills/pact-agent-teams/SKILL.md` to explicitly state, in both the HANDOFF and teachback sections:

> **Composing the HANDOFF / teachback as prose in your response turn is NOT the submission.** Only `TaskUpdate(taskId, metadata={handoff: {...}})` (or `metadata={teachback_submit: {...}}`) writes the payload to a substrate the team-lead reads. **Before** composing your "HANDOFF submitted" SendMessage:
>
> 1. Invoke the `TaskUpdate` tool with the canonical-field payload
> 2. Re-read via Bash: `cat ~/.claude/tasks/{team}/{taskId}.json | python3 -c "import json,sys; print(list(json.load(sys.stdin).get('metadata',{}).get('handoff',{}).keys()))"`
> 3. Confirm the canonical fields are present at on-disk state
> 4. Paste the re-read output in your SendMessage as proof
>
> This mirrors the proven `git diff --cached --stat` paste-the-actual-output discipline already required at stage-ready time.

Similar wording for teachback (substituting `teachback_submit`).

Also: update agent persona files (or specialist agent bodies) where they reference HANDOFF discipline, to point at this canonical instruction.
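Steps 2 and 3 of the quoted instruction (re-read the task file, confirm the canonical fields are on disk) can be sketched in Python as a slightly stricter version of the one-liner. This is a minimal sketch, not an existing helper: the function names `load_task` and `missing_handoff_fields` are hypothetical, the file layout is taken from the `~/.claude/tasks/{team}/{taskId}.json` path quoted above, and the check only tests that each key is present, not that its value is meaningful.

```python
import json
from pathlib import Path

# The six canonical HANDOFF fields named in this issue.
CANONICAL_HANDOFF_FIELDS = (
    "produced",
    "decisions",
    "reasoning_chain",
    "uncertainty",
    "integration",
    "open_questions",
)


def load_task(team, task_id, root=None):
    """Read the on-disk task file (hypothetical helper).

    Assumes the layout <root>/<team>/<task_id>.json, with root
    defaulting to ~/.claude/tasks as quoted in the instruction text.
    """
    root = Path(root) if root is not None else Path.home() / ".claude" / "tasks"
    return json.loads((root / team / f"{task_id}.json").read_text())


def missing_handoff_fields(task):
    """Return the canonical fields absent from task['metadata']['handoff'].

    An empty list means every canonical key is present on disk.
    """
    metadata = task.get("metadata") or {}
    handoff = metadata.get("handoff") or {}
    if not isinstance(handoff, dict):
        return list(CANONICAL_HANDOFF_FIELDS)
    return [name for name in CANONICAL_HANDOFF_FIELDS if name not in handoff]
```

An agent would call `missing_handoff_fields(load_task(team, task_id))` after its `TaskUpdate` and paste the result; a non-empty list means the HANDOFF exists only as prose.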

### 2. Hook-class fix (structural, defense-in-depth)

Add a validator hook that fires on outgoing SendMessages from teammates whose content matches HANDOFF-completion claims. The hook:

- Pattern-matches the SendMessage body for phrases like "HANDOFF submitted", "HANDOFF written", "teachback submitted", and "stage-ready" combined with "git add"-equivalent claims
- For HANDOFF claims: reads the sender's task file (`~/.claude/tasks/{team}/{taskId}.json`), checks `metadata.handoff` is present with canonical fields populated
- For teachback claims: checks `metadata.teachback_submit` similarly
- For stage-ready claims involving git add: runs `git diff --cached --stat` and checks the file paths cited in the SendMessage are actually staged
- If verification fails: inject a corrective `additionalContext` into the sender's next turn directing them to the actual on-disk state and asking them to re-attempt the write

This is substrate-level enforcement that doesn't depend on instruction-discipline holding. Even if a future agent forgets the paste-the-actual-output rule, the hook catches the divergence at the moment of claim.
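The stage-ready branch of the hook could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed implementation: the function names are hypothetical, extraction of cited paths from the message is left out, and it uses `git diff --cached --name-only` rather than `--stat` because one path per line is easier to compare. The returned string stands in for whatever the hook would inject as `additionalContext`.

```python
import subprocess


def staged_paths(repo_dir="."):
    """Return the set of paths currently in the index.

    Uses `git diff --cached --name-only`, which prints one staged
    path per line, relative to the repository root.
    """
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def unstaged_claims(cited_paths, staged):
    """Return cited paths that are not actually staged, in cited order."""
    return [path for path in cited_paths if path not in staged]


def corrective_context(cited_paths, staged):
    """Build the corrective text, or None when every cited path is staged."""
    missing = unstaged_claims(cited_paths, staged)
    if not missing:
        return None
    return (
        "Claimed as staged but not in the index: "
        + ", ".join(missing)
        + ". Run `git add` and paste the actual `git diff --cached --stat` output."
    )
```

Keeping the comparison in a pure function (`unstaged_claims`) separate from the `git` call makes the divergence check testable without a repository.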

Implementation notes:
- Hook event: probably `UserPromptSubmit` or a new SendMessage-intercept event
- Care: don't false-positive on legitimate "HANDOFF" mentions in non-claim contexts (e.g., "the architect's HANDOFF mentioned X")
- #538 livelock class compatibility: must verify the hook bindings are SAFE (UserPromptSubmit + PreToolUse are safe; TeammateIdle / TaskCompleted / Stop are forbidden)
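One way to approach the false-positive concern is to match only a keyword followed directly by a completion verb, and to skip mentions that follow a possessive. The sketch below is a heuristic under those assumptions: the name `classify_claim`, the verb list, and the possessive guard are all illustrative choices, and a real hook would likely need a broader phrase list.

```python
import re

# Matches "HANDOFF submitted", "HANDOFF written", "teachback submitted", etc.
# The negative lookbehind skips third-party references such as
# "the architect's HANDOFF written last cycle".
CLAIM_PATTERN = re.compile(
    r"(?<!'s )\b(handoff|teachback)\s+(submitted|written)\b",
    re.IGNORECASE,
)


def classify_claim(message_body):
    """Return 'handoff' or 'teachback' if the body makes a completion claim.

    Returns None for messages that merely mention the keyword.
    """
    match = CLAIM_PATTERN.search(message_body)
    if match is None:
        return None
    return match.group(1).lower()
```

The result would tell the hook which metadata key to verify (`metadata.handoff` or `metadata.teachback_submit`); messages that return `None` pass through untouched.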

## Severity / scope

- **MEDIUM**: pattern is recurring (3 instances same cycle, 2 different specialists). Both teammate-side failures self-corrected within ~1 round-trip via paste-the-actual-output discipline, so no work was lost — but the round-trip cost adds up across cycles, and the lead's overreaction added a SACROSANCT-framing-walk-back round-trip too. Structural defense justifies the cost.
- **Estimated effort**: instruction-class is small (~2 days, agent-instruction edits + verification tests). Hook-class is medium (~1 week, includes hook design + livelock-class verification + opt-out for legitimate non-claim mentions + tests).
- **Suggested order**: ship instruction-class first as a fast win; hook-class as Phase 2 if instruction-class doesn't reduce frequency.

## Connection to existing memories / issues

- Memory `02005f83` (tightly-coupled-validator pattern) — this issue is a structural defense against one substrate variant of that pattern
- Memory `55adb6da` (approved-plan permissive on incidental factual claims) — adjacent class; both about validation that's too tightly coupled to the validated thing
- Issue #634 (pattern-recurrence categorical mitigation) — sibling rather than parent; #634 is about audit-finding error propagation to plans; this is about agent-claim ↔ substrate-write conflation
- Issue #640 (suite-wide unused-var cleanup home) — unrelated but cited as adjacent third-party-tool-warning disposition pattern

## Acceptance criteria

- [ ] `pact-agent-teams/SKILL.md` updated with explicit prose-vs-tool-call distinction in both HANDOFF and teachback sections, including the paste-the-actual-output discipline
- [ ] Persona body cross-references updated to point at the canonical instruction (no duplication)
- [ ] (Optional Phase 2) Hook validator implemented + dogfood test asserting it catches the prose-vs-metadata divergence pattern at claim-time

Refs: PR #641 cycle, memory `02005f83`

Prevent prose-output ↔ tool-call-belief conflation in HANDOFF/teachback writes #642
