Pattern observed (PR #641 cycle, 3 in-cycle instances)
This issue tracks the HANDOFF / teachback / stage-ready failure mode in which agents conflate composing prose in their response turn with invoking the tool call that actually persists state. Three instances surfaced in PR #641:
test-engineer's stage-ready message claimed git status A pact-plugin/tests/test_628_coverage.py (added), while the actual status was ?? (untracked, never git add'd). The claim was walked back after the lead grep'd for the file's class names and did not find them.
backend-coder's first HANDOFF claim asserted "HANDOFF written for Task #26 ... canonical 6 fields ... SET intentional_wait{awaiting_lead_completion}" while metadata.handoff was empty (only intentional_wait from the earlier stage-ready was in metadata). The agent walked the claim back honestly under a sharper diagnostic, landing closest to option (a): it composed the HANDOFF as prose in its response turn and did NOT invoke TaskUpdate(metadata={handoff: ...}) before sending the announcement SendMessage.
Lead's overreaction during incident #1: grep'd for the test classes in the WRONG file path (the dispatch-mentioned test_session_init.py rather than the new test_628_coverage.py that test-engineer actually created), then concluded "fabrication" (HALT-class framing) before widening the grep. Same coupling failure shape, different actor.
All three are instances of the tightly-coupled-validator failure pattern (pact-memory 02005f83): the validating instrument (test parametrize literal / prose narrative / expected file path) is too tightly coupled to the validated thing (source literal / metadata write / actual work location). The cognitive completion-signal fires from the coupled instrument even when the substrate-level state is wrong.
Generalization
When the canonical HANDOFF format is described in pact-agent-teams/SKILL.md by name and by structure (6 fields: produced/decisions/reasoning_chain/uncertainty/integration/open_questions), an agent composing those fields as prose in their response turn produces the cognitive completion-signal ("I produced the canonical 6-field HANDOFF") even though the prose is in the wrong substrate. The same mechanism applies to teachback (teachback_submit 4-field shape) and stage-ready (git diff --cached --stat paste).
The team-lead doesn't read the agent's response turn; the team-lead reads metadata.handoff (or metadata.teachback_submit). When prose-output substitutes for metadata-write, the lead sees empty metadata and the contract appears broken — but the agent believes it succeeded, so they idle without surfacing the problem.
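To make the substrate distinction concrete, the lead-side view can be sketched in a few lines of Python. This is only an illustration: the six field names come from the text above, while the dict layout and the helper name missing_handoff_fields are assumptions, not the plugin's actual code.

```python
# Hypothetical sketch: the six canonical HANDOFF field names come from the
# issue text; the dict layout and function name are illustrative assumptions.
CANONICAL_HANDOFF_FIELDS = (
    "produced",
    "decisions",
    "reasoning_chain",
    "uncertainty",
    "integration",
    "open_questions",
)


def missing_handoff_fields(metadata: dict) -> list:
    """Return the canonical fields absent from metadata['handoff'].

    Only key presence is checked: the issue does not say whether an empty
    value (for example, no open questions) counts as populated.
    """
    handoff = metadata.get("handoff")
    if not isinstance(handoff, dict):
        return list(CANONICAL_HANDOFF_FIELDS)
    return [name for name in CANONICAL_HANDOFF_FIELDS if name not in handoff]


# An agent that only composed prose leaves metadata exactly as the earlier
# stage-ready step left it, so every canonical field is reported missing.
prose_only = {"intentional_wait": "awaiting_lead_completion"}
print(missing_handoff_fields(prose_only))
```

However complete the prose in the response turn is, it never enters the metadata dict, so a check of this kind cannot see it.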
Proposed defenses (two complementary)
1. Instruction-class fix (agent-readable, low-cost)
Update pact-plugin/skills/pact-agent-teams/SKILL.md to explicitly state, in both the HANDOFF and teachback sections:
Composing the HANDOFF / teachback as prose in your response turn is NOT the submission. Only TaskUpdate(taskId, metadata={handoff: {...}}) (or metadata={teachback_submit: {...}}) writes the payload to a substrate the team-lead reads. Before composing your "HANDOFF submitted" SendMessage:
Invoke the TaskUpdate tool with the canonical-field payload
Re-read the task and confirm the canonical fields are present in the on-disk state
Paste the re-read output in your SendMessage as proof
This mirrors the proven git diff --cached --stat paste-the-actual-output discipline already required at stage-ready time.
Similar wording for teachback (substituting teachback_submit).
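The write, re-read, paste sequence quoted above could be backed by a small helper along these lines. It is a sketch under stated assumptions: the task file is assumed to be JSON with a top-level "metadata" object, and reread_proof is a hypothetical name. An agent might equally re-read through a task tool rather than the file system.

```python
import json
from pathlib import Path


def reread_proof(task_file: Path, key: str, required: tuple) -> str:
    """Re-read the task file from disk and return text to paste as proof.

    Assumption: the file is JSON with a top-level "metadata" object.
    Raises ValueError when the payload is absent or incomplete, so the
    "submitted" announcement cannot be composed from memory alone.
    """
    state = json.loads(task_file.read_text(encoding="utf-8"))
    payload = (state.get("metadata") or {}).get(key)
    if not isinstance(payload, dict):
        raise ValueError(f"metadata.{key} is not on disk; invoke TaskUpdate first")
    missing = [name for name in required if name not in payload]
    if missing:
        raise ValueError(f"metadata.{key} is missing fields: {missing}")
    # Serialize what was actually read, not what the agent believes it wrote.
    return json.dumps({key: payload}, indent=2, sort_keys=True)
```

The same helper covers teachback by passing "teachback_submit" as the key together with its own field list.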
Also: update agent persona files (or specialist agent bodies) where they reference HANDOFF discipline, to point at this canonical instruction.
2. Hook-class fix (structural, defense-in-depth)
Add a validator hook that fires on outgoing SendMessages from teammates whose content matches HANDOFF-completion claims. The hook:
Pattern-matches against the SendMessage body for phrases like "HANDOFF submitted", "HANDOFF written", "teachback submitted", "stage-ready" + "git add"-equivalent claims
For HANDOFF claims: reads the sender's task file (~/.claude/tasks/{team}/{taskId}.json), checks metadata.handoff is present with canonical fields populated
For teachback claims: checks metadata.teachback_submit similarly
For stage-ready claims involving git add: runs git diff --cached --stat and checks the file paths cited in the SendMessage are actually staged
If verification fails: inject a corrective additionalContext into the sender's next turn directing them to the actual on-disk state and asking them to re-attempt the write
This is substrate-level enforcement that doesn't depend on instruction-discipline holding. Even if a future agent forgets the paste-the-actual-output rule, the hook catches the divergence at the moment of claim.
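As a rough illustration of the checks listed above, the core of such a hook might look like the following Python sketch. Every name here is hypothetical, the claim phrases are limited to the examples given in this issue, and the real hook event and payload shape are still undecided, so the functions take plain strings and dicts instead of a hook payload.

```python
import re
from typing import Optional

# Hypothetical mapping from the claimed artifact to the metadata key that
# should back it (key names taken from the issue text).
CLAIM_TO_METADATA_KEY = {
    "handoff": "handoff",
    "teachback": "teachback_submit",
}
CLAIM_RE = re.compile(
    r"\b(handoff|teachback)\s+(?:submitted|written)\b", re.IGNORECASE
)


def corrective_context(message_body: str, task_metadata: dict) -> Optional[str]:
    """Return corrective text if the message claims a write that metadata lacks."""
    match = CLAIM_RE.search(message_body)
    if match is None:
        return None  # no completion claim, nothing to verify
    key = CLAIM_TO_METADATA_KEY[match.group(1).lower()]
    if task_metadata.get(key):
        return None  # the claim is backed by a non-empty payload
    return (
        f"Your message claims a {match.group(1)} submission, but metadata.{key} "
        "is empty on disk. Invoke TaskUpdate with the canonical fields, "
        "re-read the task, and paste the re-read output."
    )


def unstaged_citations(cited_paths: list, staged_paths: list) -> list:
    """Return cited paths that do not appear among the staged paths."""
    staged = set(staged_paths)
    return [path for path in cited_paths if path not in staged]
```

For the stage-ready branch, staged_paths could be collected from git diff --cached --name-only, which is easier to parse than the --stat output. That substitution is an assumption of this sketch, not something the issue prescribes.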
Implementation notes:
Hook event: probably UserPromptSubmit or a new SendMessage-intercept event
Care: don't false-positive on legitimate "HANDOFF" mentions in non-claim contexts (e.g., "the architect's HANDOFF mentioned X")
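One possible way to reduce false positives is to separate first-person completion claims from references to someone else's artifact. The heuristic below is an assumption made for illustration (a possessive directly before the keyword is treated as a mention) and would need tuning against real messages.

```python
import re

CLAIM_PHRASE = re.compile(
    r"\b(?:handoff|teachback)\s+(?:submitted|written)\b", re.IGNORECASE
)
# Heuristic assumption: "<someone>'s HANDOFF ..." refers to another party's
# artifact and is therefore a mention, not a completion claim.
THIRD_PARTY = re.compile(r"\b\w+'s\s+(?:handoff|teachback)\b", re.IGNORECASE)


def looks_like_claim(sentence: str) -> bool:
    """Return True only for sentences that read as a completion claim."""
    if THIRD_PARTY.search(sentence):
        return False
    return CLAIM_PHRASE.search(sentence) is not None


print(looks_like_claim("HANDOFF submitted for my task"))
print(looks_like_claim("the architect's HANDOFF mentioned X"))
```

A stricter variant could require the claim phrase near the start of the message, at the cost of missing claims buried in longer status updates.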
Severity / scope
MEDIUM: the pattern is recurring (3 instances in the same cycle, 2 different specialists). Both teammate-side failures self-corrected within ~1 round-trip via paste-the-actual-output discipline, so no work was lost. However, the round-trip cost adds up across cycles, and the lead's overreaction added a SACROSANCT-framing walk-back round-trip too. Structural defense justifies the cost.
Estimated effort: instruction-class is small (~2 days, agent-instruction edits + verification tests). Hook-class is medium (~1 week, includes hook design + livelock-class verification + opt-out for legitimate non-claim mentions + tests).
Suggested order: ship instruction-class first as a fast win; hook-class as Phase 2 if instruction-class doesn't reduce frequency.
Connection to existing memories / issues
Memory 02005f83 (tightly-coupled-validator pattern) — this issue is a structural defense against one substrate variant of that pattern
Memory 55adb6da (approved-plan permissive on incidental factual claims) — adjacent class; both about validation that's too tightly coupled to the validated thing
Acceptance criteria
pact-agent-teams/SKILL.md updated with explicit prose-vs-tool-call distinction in both HANDOFF and teachback sections, including the paste-the-actual-output discipline
Persona body cross-references updated to point at the canonical instruction (no duplication)
(Optional Phase 2) Hook validator implemented + dogfood test asserting it catches the prose-vs-metadata divergence pattern at claim-time
Related issue: #634 (pattern-recurrence categorical mitigation) is a sibling rather than a parent; #634 is about audit-finding error propagation to plans, whereas this issue is about agent-claim ↔ substrate-write conflation
Refs: PR #641 cycle, memory 02005f83