You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Either bug alone breaks merges. Together they make the merge guard a no-op security theater: when it does block, the failure is misleading; when it appears to work, it's because operators learned workarounds.
This umbrella issue bundles Option B — fix both #665 and #676 in a single PR — plus the surrounding gaps that fixing the immediate bugs does NOT address.
Scope
Phase 1 — Pair the two bug fixes (the merge guard works end-to-end) ✅ DONE (PR #697)
merge_guard_pre _GH_PR_NUMBER_RE matches wrong digit on heredoc bodies and stderr redirects #665: fix merge_guard_pre._GH_PR_NUMBER_RE greedy-quantifier so it matches the FIRST PR-number-shaped digit sequence after gh pr (or use a stricter parse that locates the subcommand and reads its first positional arg). Add tests covering: heredoc bodies with version numbers, dates, test counts; trailing 2>&1; cross-flag positions.
task_lifecycle_gate.py:321 — keep the defensive tool_response or tool_output or {} fallback (no harm), add comment explaining WHY + extract to shared extract_tool_response() helper threaded into all 4 PostToolUse hooks (delivered in PR fix(merge-guard): pair #665 + #676 fixes; bump to 4.1.8 (#677 PR #1) #697 cycle-2 commit e6ae33ee).
After Phase 1: the merge guard works as designed. AskUserQuestion + "Yes, merge" + gh pr merge flow succeeds without manual intervention. ✅ verified in v4.1.8 ship.
Phase 3 — Captured-fixture hardening for ALL hooks ⏳ NOT STARTED
The root cause of #676 is that test_merge_guard.py's 13+ fixtures use tool_output — a fictional shape — and the code matches the fixtures. Synthetic fixtures don't catch platform-shape drift. Only fixtures captured from real Claude Code stdin during actual hook execution will detect when the platform schema changes underneath us.
Each hook gets a captured-from-production fixture in pact-plugin/tests/fixtures/{hook_name}_stdin.json representing the actual stdin shape Claude Code sends. The #612 shim runs on a separate scratch branch, captures during a fresh-startup PACT session, then is removed.
Each hook's test suite gets at least one parametrized "fixture-load + dispatch" test that exercises the hook against the captured fixture.
CI guards: a parametrized integration test over every PostToolUse hook that fails if a hook reads a stdin field that's not in the captured fixture's shape.
This is the only durable defense against the next silent platform-schema drift. Needs separate plan-mode + scratch branches per hook; cannot run inside the same session as the fix.
Phase 4 — Hook contract documentation ⏳ NOT STARTED — tracked as #704
The convention "PostToolUse hooks read tool_response" lives implicitly in wake_lifecycle_emitter.py's docstring. The 4 docstring-drift cases (auditor_reminder.py etc.) all carried the same incorrect template line — proving the convention propagates by template-copying without verification. PR #697 swept the immediate drift but the convention itself still needs canonical documentation.
See #704 for full Phase 4 scope (includes additions surfaced in PR #697: shared extract_tool_response helper pattern, SECURITY/PLANNING_SCAN_PATH_EXCLUDES bootstrap-self-block pattern, strict cross-op match enforcement, deny-compound-destructive pattern, eval+heredoc pre-strip detection).
The merge_guard_post bug and the lead-owns-commits violation pattern (#675) and the persona §2 /clear factual error (closed in PR #671) are all instances of the same meta-pattern: rules documented in prose with no mechanical enforcement, where the runtime cost of the violation is invisible to operators until something else surfaces it.
Inventory current PACT rules that are prose-only (delivered as audit comment 2026-05-07 — categorized 30+ rules into A/B/C/D enforcement-state buckets)
For each, propose either a test, a hook, or an explicit acceptance test (delivered in audit comment)
File any net-new mechanical-enforcement issues that fall out of the inventory (C1 + C2 follow-ups recommended; C3 folded into C2)
⏳ PR Validate skill auto-activation with real user tasks #3: Phase 3 (captured fixtures + integration tests). Larger; needs separate scratch branches per hook for capture, then a consolidation PR with all fixtures. This is the biggest scope and deserves its own architecture-phase planning.
✅ Phase 5 delivered as audit comment 2026-05-07. C1 + C2 follow-ups recommended in that comment.
Acceptance for this umbrella
Phase 1 PR merged; merge guard demonstrably works end-to-end (verified by running gh pr merge after AskUserQuestion + "Yes" without manual token-touching).
Close this umbrella when Phases 3 + 4 land OR explicitly deferred-with-justification.
Out of scope
Other plugins or external code that may have copied PACT's hook templates. We control PACT; downstream consumers are their own concern.
Cross-platform Claude Code behavior verification (e.g., does the same bug exist on Linux/Windows? Probably yes since the field name is platform-side, but verifying is out of scope for the immediate fix).
This umbrella was filed during the post-merge investigation following PR #671 (#664 hook-driven bootstrap marker write). The merge-guard bug was discovered when the merge for #671 was blocked even after a textually-perfect AskUserQuestion + affirmative answer. Manual token write salvaged the merge; investigation traced the post-hook field-name mismatch and surfaced the broader docstring drift + the meta-pattern of prose-rules-without-mechanical-enforcement that has been visible in the merge guard, in lead-owns-commits (#675), in the SendMessage-then-TaskUpdate ordering invariant (#674), and in the persona §2 /clear factual error (closed in PR #671).
Why this exists
The merge-guard mechanism has two distinct bugs that prevent it from working end-to-end as designed:
merge_guard_pre._GH_PR_NUMBER_REgreedy-quantifier picks the wrong PR number from heredoc-style--bodyarguments (also broken by trailing2>&1). Produces"Authorization token exists but does not match this operation"even with a valid token.merge_guard_post.pyreads stdin fieldtool_output; Claude Code sendstool_response. Authorization tokens have never been written automatically by the post-hook. Every observably-clean PACT merge has gone around the guard (web UI, manualtouch, or operator typinggh pr mergeoutside the hook surface).Either bug alone breaks merges. Together they make the merge guard a no-op security theater: when it does block, the failure is misleading; when it appears to work, it's because operators learned workarounds.
This umbrella issue bundles Option B — fix both #665 and #676 in a single PR — plus the surrounding gaps that fixing the immediate bugs does NOT address.
Scope
Phase 1 — Pair the two bug fixes (the merge guard works end-to-end) ✅ DONE (PR #697)
tool_output→tool_responseinmerge_guard_post.py. Update the test fixtures intest_merge_guard.py. Update the docstring header.merge_guard_pre._GH_PR_NUMBER_REgreedy-quantifier so it matches the FIRST PR-number-shaped digit sequence aftergh pr(or use a stricter parse that locates the subcommand and reads its first positional arg). Add tests covering: heredoc bodies with version numbers, dates, test counts; trailing2>&1; cross-flag positions.auditor_reminder.py,file_size_check.py,track_files.py,teachback_check.py) where"tool_output"is in the docstring template but the code only readstool_input+tool_name.tool_response or tool_output or {}fallback (no harm), add comment explaining WHY + extract to sharedextract_tool_response()helper threaded into all 4 PostToolUse hooks (delivered in PR fix(merge-guard): pair #665 + #676 fixes; bump to 4.1.8 (#677 PR #1) #697 cycle-2 commite6ae33ee).After Phase 1: the merge guard works as designed. AskUserQuestion + "Yes, merge" +
gh pr mergeflow succeeds without manual intervention. ✅ verified in v4.1.8 ship.Phase 2 — Behavioral-change communication ✅ DONE (PR #697 + v4.1.8 release)
gh pr mergedirectly" will hit unexpected denials. (v4.1.8 release notes shipped: https://github.com/Synaptic-Labs-AI/PACT-Plugin/releases/tag/v4.1.8)Phase 3 — Captured-fixture hardening for ALL hooks ⏳ NOT STARTED
The root cause of #676 is that
test_merge_guard.py's 13+ fixtures usetool_output— a fictional shape — and the code matches the fixtures. Synthetic fixtures don't catch platform-shape drift. Only fixtures captured from real Claude Code stdin during actual hook execution will detect when the platform schema changes underneath us.#612logging-shim pattern (already proven inbootstrap_marker_writer's captured-fixture follow-up Capture real UserPromptSubmit stdin fixture per #664 §8.7 #672) to every PostToolUse hook inpact-plugin/hooks/:pact-plugin/tests/fixtures/{hook_name}_stdin.jsonrepresenting the actual stdin shape Claude Code sends. The#612shim runs on a separate scratch branch, captures during a fresh-startup PACT session, then is removed.This is the only durable defense against the next silent platform-schema drift. Needs separate plan-mode + scratch branches per hook; cannot run inside the same session as the fix.
Phase 4 — Hook contract documentation ⏳ NOT STARTED — tracked as #704
The convention "PostToolUse hooks read
tool_response" lives implicitly inwake_lifecycle_emitter.py's docstring. The 4 docstring-drift cases (auditor_reminder.pyetc.) all carried the same incorrect template line — proving the convention propagates by template-copying without verification. PR #697 swept the immediate drift but the convention itself still needs canonical documentation.See #704 for full Phase 4 scope (includes additions surfaced in PR #697: shared
extract_tool_responsehelper pattern, SECURITY/PLANNING_SCAN_PATH_EXCLUDES bootstrap-self-block pattern, strict cross-op match enforcement, deny-compound-destructive pattern, eval+heredoc pre-strip detection).Phase 5 — Governance / meta-pattern ✅ DONE (audit comment 2026-05-07)
The merge_guard_post bug and the lead-owns-commits violation pattern (#675) and the persona §2 /clear factual error (closed in PR #671) are all instances of the same meta-pattern: rules documented in prose with no mechanical enforcement, where the runtime cost of the violation is invisible to operators until something else surfaces it.
Cross-references
Closed by PR #697 (Phase 1):
73410afb+ cycle-1 expansion04e63dd1.b74ec540+ sibling docstring sweepd771af27.Closed by PR #697 (cycle-2 + cycle-3 security remediation):
43640e45(symmetric WRITE+READ + strict-match).e1ba7ea0.e1ba7ea0(was IMPROVED-but-RESIDUAL post-cycle-1, now FULLY CLOSED).e1ba7ea0.Open follow-ups from PR #697 review:
Original cross-references (pre-PR #697):
bootstrap_marker_writer(Phase 3 sets the precedent).task_lifecycle_gate._has_paired_sendmessage. Phase 5 candidate (covered in audit comment).Suggested PR sequencing
SHARED_CONVENTIONS.md). NEXT — tracked as Phase 4: SHARED_CONVENTIONS.md hook contract documentation (deferred from #677 umbrella) #704. Zero code change; pure doc.Acceptance for this umbrella
gh pr mergeafter AskUserQuestion + "Yes" without manual token-touching).SHARED_CONVENTIONS.mdwritten and pinned. Tracked as Phase 4: SHARED_CONVENTIONS.md hook contract documentation (deferred from #677 umbrella) #704.Out of scope
Background
This umbrella was filed during the post-merge investigation following PR #671 (#664 hook-driven bootstrap marker write). The merge-guard bug was discovered when the merge for #671 was blocked even after a textually-perfect AskUserQuestion + affirmative answer. Manual token write salvaged the merge; investigation traced the post-hook field-name mismatch and surfaced the broader docstring drift + the meta-pattern of prose-rules-without-mechanical-enforcement that has been visible in the merge guard, in lead-owns-commits (#675), in the SendMessage-then-TaskUpdate ordering invariant (#674), and in the persona §2 /clear factual error (closed in PR #671).