Summary
During PR #790's wrap-up forensics, a 19-hour-old expired/cross-session merge_guard authorization token was discovered to have survived undisturbed through an entire session containing dozens of Bash tool calls. Direct invocation of merge_guard_pre.py (the PreToolUse Bash matcher hook) against the same forensic state immediately cleaned up the stale token — proving the hook works correctly when invoked. The empirical conclusion: Claude Code did not invoke the merge_guard_pre.py PreToolUse hook on the gh pr merge Bash tool call during PR #790's merge, despite the hook being source-correct, registered in hooks.json, and proven-functional via direct testing.
This issue tracks the platform-level investigation. Token-lifecycle hardening (which would have caught the empirical bypass via a different code path) is filed separately in #797.
Empirical evidence chain
| Check |
Result |
| Hook source correctness (md5) |
✅ be5262082e9ff88be97a06a5eb3468a2 — identical between ~/.claude/plugins/cache/pact-plugin/PACT/4.2.9/hooks/merge_guard_pre.py and worktree source |
Hook registration in v4.2.9 hooks.json |
✅ PreToolUse matcher=Bash group at line 68-80; md5 19f97c623d981a69a0919a9775b066cd matches worktree |
| v4.2.8 (prior plugin version) also registered the hook |
✅ — same registration shape — eliminates plugin-version-cache theories |
| Hook fires reliably on Bash calls after forensic instrumentation |
✅ 5+ invocations logged across 5+ Bash calls during forensic |
| Hook correctly denies the EXACT compound-merge command shape used at 16:31:45 EDT |
✅ Direct test: gh pr merge 99999 --repo ... --squash --subject "..." 2>&1 | tail -5 → "Compound destructive command rejected" |
| Token-validation logic operates correctly under freshness/session-scope/slot-claim checks |
✅ Single manual invocation cleaned up the 19-hour-old stale token immediately |
| 19-hour-old stale token persisted UNDISTURBED through dozens of pre-forensic Bash calls in session |
✅ Direct proof find_valid_token was never invoked during the session pre-forensic — i.e., merge_guard_pre.py was never run |
Why this is a Claude-Code-platform issue (not PACT-Plugin)
The hook implementation is correct. The hook registration is correct. The hook's functional verification reproduces correctly when invoked directly. The bypass is at the invocation layer: Claude Code did not call the hook at the moment the dangerous Bash tool was used.
This is not a PACT-Plugin code defect. It is either:
- A Claude Code plugin-load lifecycle issue (the matcher-based PreToolUse hook group failed to register at session-start, even though the matcher-less PreToolUse hook
bootstrap_gate.py clearly DID register)
- A Claude Code internal bash-tool dispatch issue (some bash invocations bypass PreToolUse hooks for reasons internal to the harness)
- Some other harness-side gap
Hypotheses (unfalsified — investigation needed)
H-A: Lazy/deferred matcher-based hook registration at session start.
Plugin v4.2.9 was extracted at May 18 14:06:38 EDT; session started 7 minutes later at 14:13:56 EDT. Claude Code may have completed bootstrap_gate.py registration (no-matcher PreToolUse) but not yet completed the matcher=Bash group registration when the session was instantiated. This would leave git_commit_check.py AND merge_guard_pre.py both un-invoked for some window after session start.
H-B: Plugin re-scan triggered later in session. The plugin directory mtime is May 18 16:41:04 EDT — approximately 10 minutes AFTER the bypass at 16:31:45. Something touched the plugin directory after the merge. This is consistent with the empirical observation that the hook began firing reliably ONLY after I modified the hook file at ~16:47 EDT. A plugin-scan event at 16:41 may have completed the matcher-based registration that wasn't completed at session start.
H-C: First-Bash-call registration race. A race condition during the first Bash tool call may leave the matcher=Bash hook group unregistered, persisting for the remainder of the session until a re-scan event corrects it.
H-D: Internal Claude Code bash-tool routing. Some Bash invocations may bypass PreToolUse hooks for reasons internal to Claude Code's tool dispatch. No supporting evidence; rejected as too speculative without further data.
Combined plausibility: H-A + H-B + H-C share a common shape — Claude Code's plugin-load lifecycle has a window where matcher-based hooks are not yet active, and a later scan event corrects it. The exact mechanism requires platform-team confirmation.
Out of scope for this issue
Proposed investigation steps (Layer 0)
-
Add operator-visible diagnostic logging to merge_guard_pre.py (env-var-gated). When PACT_MERGE_GUARD_DEBUG=1 is set, the hook writes a line to ~/.claude/.merge_guard_invocation_log on every entry (timestamp, PID, tool_name, command-snippet, decision). This makes future hook-non-invocation events forensically visible without manual instrumentation.
-
Reproduce in a controlled fresh session:
- Extract plugin v4.2.9 (or current version)
- Start Claude Code session WITHIN N seconds of plugin extract (varying N: 5s, 30s, 60s, 5 min)
- Invoke a dangerous bash command (e.g.,
git push --force origin nonexistent-branch)
- Observe whether merge_guard_pre fires (via debug log) and whether the harness blocks the command
- Compare against a control session where plugin was extracted hours before session start
-
Inspect Claude Code's plugin-load lifecycle:
- Use
claude-code-guide agent (per pinned discipline) to query Claude Code platform documentation for plugin-load semantics
- Specifically: when are hooks loaded? Is loading deferred / lazy / event-triggered? Are matcher-based hooks loaded differently from no-matcher hooks?
-
File upstream with Claude Code platform team if reproducer succeeds:
- Hook-load-race condition would be a real platform bug worth surfacing
- Defense-in-depth implications: PACT-Plugin's merge_guard is one of several plugin-installed PreToolUse hooks Claude Code users rely on for safety
-
Document the platform-trust boundary:
- PACT-Plugin's merge_guard system assumes Claude Code invokes registered hooks on every matching tool call. If that assumption can be violated by plugin-load timing, the threat model needs to be updated to call out the platform dependency explicitly.
Acceptance criteria
Empirical record (full reproduction state)
Plugin install (v4.2.9): 2026-05-18 14:06:38 EDT
Session start: 2026-05-18 14:13:56 EDT (7 min post-install)
Bash calls pre-merge: ~30+ (git status, git log, git add, git commit ×7,
git push ×3, gh pr view, gh pr create, gh issue create ×7, etc.)
PR #790 merge: 2026-05-18 16:31:45 EDT (gh pr merge 790 --squash ...
2>&1 | tail -5 + multi-line follow-up commands — compound
+ dangerous-pattern matches; should have been DENIED)
Plugin dir mtime: 2026-05-18 16:41:04 EDT (~10 min POST-merge — something
touched the plugin dir)
Forensic instrumentation: ~2026-05-18 16:47 EDT
Hook fires reliably: from forensic-instrumentation onward
Token survived undisturbed:
Cross-refs
Pattern note
This issue joins #791, #792, #793, #794, #795, #796, #797, #798 as instances of the meta-pattern surfaced during PR #790 work: load-bearing assumptions need both code-level correctness AND platform-level reliability. PACT-Plugin can deliver correct hook code, but invocation depends on Claude Code's platform behavior. When the platform-level assumption is violated, even correct hook code provides no safety. The discipline this PR's wrap-up surfaces: identify which load-bearing assumptions DEPEND on platform behavior, and document them as platform dependencies in the threat model rather than treating them as self-contained code invariants.
Summary
During PR #790's wrap-up forensics, a 19-hour-old expired/cross-session merge_guard authorization token was discovered to have survived undisturbed through an entire session containing dozens of Bash tool calls. Direct invocation of
merge_guard_pre.py(the PreToolUse Bash matcher hook) against the same forensic state immediately cleaned up the stale token — proving the hook works correctly when invoked. The empirical conclusion: Claude Code did not invoke themerge_guard_pre.pyPreToolUse hook on thegh pr mergeBash tool call during PR #790's merge, despite the hook being source-correct, registered inhooks.json, and proven-functional via direct testing.This issue tracks the platform-level investigation. Token-lifecycle hardening (which would have caught the empirical bypass via a different code path) is filed separately in #797.
Empirical evidence chain
be5262082e9ff88be97a06a5eb3468a2— identical between~/.claude/plugins/cache/pact-plugin/PACT/4.2.9/hooks/merge_guard_pre.pyand worktree sourcehooks.jsonPreToolUsematcher=Bashgroup at line 68-80; md519f97c623d981a69a0919a9775b066cdmatches worktreegh pr merge 99999 --repo ... --squash --subject "..." 2>&1 | tail -5→ "Compound destructive command rejected"find_valid_tokenwas never invoked during the session pre-forensic — i.e.,merge_guard_pre.pywas never runWhy this is a Claude-Code-platform issue (not PACT-Plugin)
The hook implementation is correct. The hook registration is correct. The hook's functional verification reproduces correctly when invoked directly. The bypass is at the invocation layer: Claude Code did not call the hook at the moment the dangerous Bash tool was used.
This is not a PACT-Plugin code defect. It is either:
bootstrap_gate.pyclearly DID register)Hypotheses (unfalsified — investigation needed)
H-A: Lazy/deferred matcher-based hook registration at session start.
Plugin v4.2.9 was extracted at
May 18 14:06:38 EDT; session started 7 minutes later at14:13:56 EDT. Claude Code may have completedbootstrap_gate.pyregistration (no-matcher PreToolUse) but not yet completed the matcher=Bashgroup registration when the session was instantiated. This would leavegit_commit_check.pyANDmerge_guard_pre.pyboth un-invoked for some window after session start.H-B: Plugin re-scan triggered later in session. The plugin directory mtime is
May 18 16:41:04 EDT— approximately 10 minutes AFTER the bypass at 16:31:45. Something touched the plugin directory after the merge. This is consistent with the empirical observation that the hook began firing reliably ONLY after I modified the hook file at ~16:47 EDT. A plugin-scan event at 16:41 may have completed the matcher-based registration that wasn't completed at session start.H-C: First-Bash-call registration race. A race condition during the first Bash tool call may leave the matcher=
Bashhook group unregistered, persisting for the remainder of the session until a re-scan event corrects it.H-D: Internal Claude Code bash-tool routing. Some Bash invocations may bypass PreToolUse hooks for reasons internal to Claude Code's tool dispatch. No supporting evidence; rejected as too speculative without further data.
Combined plausibility: H-A + H-B + H-C share a common shape — Claude Code's plugin-load lifecycle has a window where matcher-based hooks are not yet active, and a later scan event corrects it. The exact mechanism requires platform-team confirmation.
Out of scope for this issue
Proposed investigation steps (Layer 0)
Add operator-visible diagnostic logging to
merge_guard_pre.py(env-var-gated). WhenPACT_MERGE_GUARD_DEBUG=1is set, the hook writes a line to~/.claude/.merge_guard_invocation_logon every entry (timestamp, PID, tool_name, command-snippet, decision). This makes future hook-non-invocation events forensically visible without manual instrumentation.Reproduce in a controlled fresh session:
git push --force origin nonexistent-branch)Inspect Claude Code's plugin-load lifecycle:
claude-code-guideagent (per pinned discipline) to query Claude Code platform documentation for plugin-load semanticsFile upstream with Claude Code platform team if reproducer succeeds:
Document the platform-trust boundary:
Acceptance criteria
PACT_MERGE_GUARD_DEBUG=1env var; documented in READMEEmpirical record (full reproduction state)
Token survived undisturbed:
2026-05-17 21:39:54 EDT(session ad575016, PR fix(#611, #778): in-process teammate wake-arm via agent_id field-presence discriminator #783 merge)2026-05-18 16:43 EDT(my forensic), 19 hours laterCross-refs
pact-plugin/hooks/merge_guard_pre.py— the hook that didn't firepact-plugin/.claude-plugin/hooks.json(worktree) /~/.claude/plugins/cache/pact-plugin/PACT/4.2.9/hooks/hooks.json— correct registration; md5 matchPattern note
This issue joins #791, #792, #793, #794, #795, #796, #797, #798 as instances of the meta-pattern surfaced during PR #790 work: load-bearing assumptions need both code-level correctness AND platform-level reliability. PACT-Plugin can deliver correct hook code, but invocation depends on Claude Code's platform behavior. When the platform-level assumption is violated, even correct hook code provides no safety. The discipline this PR's wrap-up surfaces: identify which load-bearing assumptions DEPEND on platform behavior, and document them as platform dependencies in the threat model rather than treating them as self-contained code invariants.