Skip to content

Restore session-startup ritual (#628)#641

Merged
michael-wojcik merged 34 commits into
mainfrom
restore-startup-ritual-628
May 6, 2026
Merged

Restore session-startup ritual (#628)#641
michael-wojcik merged 34 commits into
mainfrom
restore-startup-ritual-628

Conversation

@michael-wojcik
Copy link
Copy Markdown
Collaborator

Summary

PR #621 v4.0.0 cleanly migrated the orchestrator persona to --agent flag delivery, but the migration deleted session-start ritual surfaces without compensation. This PR restores those surfaces via 14 atomic commits — a scaled-down /PACT:bootstrap command + reinstated hooks (bootstrap_gate.py, bootstrap_prompt_gate.py, peer_inject.py) + adapted session_init.py directives + persona body §2 Session-Start Ritual + F2 mid-session pin-memory directive + claude_md_manager.py C6 restoration.

Closes #628.

Architectural separation (per audit-architect Q2 split)

  1. Per-session ritual lives in /PACT:bootstrap command file (TeamCreate-or-reuse, secretary spawn, paused-state, marker-write). Invoked once per session.
  2. Per-turn governance + identity lives in --agent-delivered persona body (S5 policy, algedonic, completion authority, etc.). Loads at every turn via --agent flag.
  3. Per-tool gate lives in bootstrap_gate.py PreToolUse hook. Blocks Edit/Write/Agent/NotebookEdit until marker is set.

The persona body's new §2 is the directive surface (per-turn governance reader); commands/bootstrap.md is the mechanics surface (per-session ritual reader). Two-surface separation matches the two-audience design.

Adaptations (semantic-equivalence over byte-equivalence)

  • DROP FIRST-ACTION clause from peer_inject prelude (teammate frontmatter absorbs role-establishment context per audit-architect Q5).
  • ADD charter cross-ref to peer_inject prelude (Q5 ADDENDUM — closes F9 charter-omission gap as a single-restoration two-finding-closure).
  • Update _TEACHBACK_REMINDER in peer_inject to reference pact-teachback skill + metadata.teachback_submit (the canonical replacement for the deleted /PACT:teammate-bootstrap skill).
  • SUNSET-BEFORE-v4.x.y comment on strip_orphan_kernel_block (renamed from remove_stale_kernel_block, mirroring strip_orphan_* naming for sibling project-CLAUDE.md cleanup) — clarifies it's a one-version-window migration helper.
  • 5-branch source × team_exists dispatch restored in session_init.py with a slim 3-line bootstrap-pointer directive (full mechanics in commands/bootstrap.md).

Test plan

  • Suite floor: 7124 → 7145 passing tests (+21 net new), 10 skipped, 0 failures.
  • All 32 plan §Test Phase scenarios mapped to passing tests.
  • 3 high-risk TDD specs verified complete: peer_inject._sanitize_agent_name (8/8 methods, marker-spoof injection defense); bootstrap_gate.is_marker_set (7/7 methods, gate-bypass defense).
  • Auditor YELLOW signals addressed: cross-file consistency on _TEACHBACK_REMINDER (peer_inject ↔ pact-teachback skill), encoding-tolerant marker-write semantics test.
  • Test pollution detected and resolved (test_628_coverage.py module-isolation pattern).
  • Post-merge smoke verification in fresh session (gate activation, marker write, ritual flow end-to-end). hooks.json edits don't take effect in the running session — same dogfood pattern as PR fix: extend _extract_task_id to probe nested tool_response.task.id (#620) #638.
  • Edit GitHub release v4.0.0 description with split "Known regressions" / "Phase 2 enhancements" subsections (heredoc captured in dispatch artifacts; runs after merge).

Cross-references

…3-§13

Restores per-session ritual surface deleted in PR #621. New §2 covers
TeamCreate-or-reuse, secretary spawn, paused-state check, with slim
pointer to commands/bootstrap.md for full mechanics. Old §2-§12
renumbered to §3-§13. Invocation directive: "YOUR FIRST ACTION
(BEFORE ANY OTHER TOOL CALL): invoke Skill('PACT:bootstrap') to
execute the session-start ritual."

Refs: #628
Atomic prerequisite for the session-startup ritual restoration. Defines
the module-level constant BOOTSTRAP_MARKER_NAME='bootstrap-complete'
used by the bootstrap_gate / bootstrap_prompt_gate / session_init hooks
(restored in subsequent commits) to identify the session-scoped marker
file whose presence signals that Skill('PACT:bootstrap') has run.

build_session_path was retained through PR #621 and is already exported;
no change needed for that half of the plan's described 're-exports'.

Refs: #628
Restores the two-hook bootstrap gate deleted in PR #621:

- bootstrap_gate.py (PreToolUse, no matcher): blocks Edit/Write/Agent/
  NotebookEdit when the bootstrap marker is absent in a lead session.
  Bash, Read, Glob, Grep, Web*, and MCP tools always allowed; Bash
  is allowed because the marker-write itself is a Bash command.
  Fail-open on any exception.

- bootstrap_prompt_gate.py (UserPromptSubmit, no matcher): injects
  the 'invoke Skill(PACT:bootstrap)' additionalContext directive
  while the marker is absent. Zero-token fast path once marker is
  present. Fail-open on any exception.

Source files only — not yet wired in hooks.json (Commit 8). Imports
BOOTSTRAP_MARKER_NAME from shared (Commit 1's atomic prerequisite).

Both bindings are in the SAFE livelock class per #538 (UserPromptSubmit
+ PreToolUse are not in the TeammateIdle/TaskCompleted/Stop forbidden
class).

Refs: #628
Adds a new '#### Pin to CLAUDE.md mid-session' sub-section in §13
Memory Management. Four trigger bullets (SACROSANCT clarification,
load-bearing architectural decision, durable user correction, subtle
invariant) followed by the directive to invoke /PACT:pin-memory at
the moment of insight rather than deferring to wrap-up.

Distinct from the post-review pin-memory invocation in §13 PR Review
Workflow — that trigger fires after review synthesis; this trigger
fires mid-session.

Refs: #628
Adapted for command-marker via Commit 5's bootstrap.md preserving
the touch literal — no source adaptation needed at this layer.

Coverage:
- test_bootstrap_gate.py: 52 tests covering PreToolUse no-matcher
  refusal/allow paths, marker-presence cache, allow-list (Bash, Read,
  Glob, Grep, Web*), MCP exemption, lead-session enforcement,
  fail-open on stdin parse / OSError / lock timeout. Plus
  _sanitize_agent_name security parametrization (newline, ')',
  control-char rejection).
- test_bootstrap_prompt_gate.py: 21 tests covering UserPromptSubmit
  inject-vs-suppress paths, marker-presence fast path, fail-open.

Transient-red disclosure: 1 of 73 tests
(TestMarkerNameConsistency.test_bootstrap_md_references_same_marker)
asserts the touch literal in pact-plugin/commands/bootstrap.md, which
doesn't land until Commit 5. Acts as a forcing function for Commit 5.
72/73 pass through Commits 3+4; full 73/73 once Commit 5 lands.

Refs: #628
…RST-ACTION)

Restores the SubagentStart hook deleted in PR #621 (C4), with two
adaptations from pre-#621 source per the plan and one surgical fix
for a dead reference.

Adaptations:
- DROP FIRST-ACTION clause from prelude template — teammate frontmatter
  now absorbs role-establishment context (per audit-architect Q5
  ADDENDUM rationale).
- ADD 1-line charter cross-ref to prelude template — points new
  teammates at protocols/pact-communication-charter.md before they
  send teammate messages. Closes F9 charter-omission gap as a
  single-restoration two-finding-closure.

Surgical fix (beyond byte-equal):
- Update _TEACHBACK_REMINDER: pre-deletion pointed at /PACT:teammate-bootstrap
  (also deleted in #621). Now points at the canonical pact-teachback
  skill and metadata.teachback_submit contract. Restoring byte-equal
  would ship a broken reference.

Test coverage (160 tests across 3 files):
- test_peer_inject.py: 73 tests, 5 adapted for new contract
  (charter cross-ref assertion replacing FIRST-ACTION assertion;
  pact-orchestrator excluded from SubagentStart agent-types per
  --agent flag delivery model).
- test_error_output.py: +TestPeerInjectSuppressOutput class (2 tests).
- test_spawn_overhead_benchmark.py: +2 peer_inject methods asserting
  prelude byte-cost reduction (FIRST-ACTION drop > Q5 ADDENDUM add).

Security: _sanitize_agent_name marker-spoof injection defense
preserved verbatim (newline / ')' / control-char rejection).

Refs: #628
NEW pact-plugin/commands/bootstrap.md (63 lines, scaled-down per Q2
strict-ritual scope). Owns the per-session ritual mechanics:
TeamCreate-or-reuse, secretary spawn, paused-state surface, plugin
banner, bootstrap-marker write. The persona body's §2 Session-Start
Ritual is the invocation contract; this command file is the mechanics
surface.

Excluded by design (live in --agent-delivered persona body): MISSION /
MOTTO / governance framing, S5 POLICY, Algedonic, Completion Authority,
SACROSANCT Fail-Safe, FINAL MANDATE — all owned by the persona delivered
via the --agent flag, not by this ritual command.

The marker-write literal 'touch "<path>/bootstrap-complete"' is the
load-bearing string asserted by Commit 3's TestMarkerNameConsistency
cross-file consistency test, which now passes (transient-red resolved).
HTML coupling comment pins marker name to shared.BOOTSTRAP_MARKER_NAME.

plugin.json: adds ./commands/bootstrap.md to the commands array.
teammate-bootstrap.md stays deleted per #628 — teammate frontmatter
absorbs role-establishment context.

Refs: #628
…r branch

Restores the session_init.py bootstrap-directive surface deleted in
PR #621 (C7) so SessionStart inline-emits the 'invoke Skill(PACT:bootstrap)'
directive on every session.

session_init.py restorations (~172 lines):
- 5-branch source × team_exists dispatch (compact / clear / resume /
  startup-no-team / anomalous fallbacks); compact-branch checkpoint
  preserved nested inside the compact path
- Bootstrap-pointer adaptation: directive prose points at /PACT:bootstrap
  command (slim pointer); full mechanics live in commands/bootstrap.md
  (Commit 5)
- Marker-clear branch on source='clear' (user explicit reset signal):
  unlinks bootstrap-complete via build_session_path + BOOTSTRAP_MARKER_NAME.
  Narrowed to 'clear' only (NOT compact) per #414 narrower-guard rationale
- _build_safety_net_context() helper for main()'s except path: emits
  PACT ROLE marker + minimal bootstrap directive even on partial failure
  (paired with systemMessage error reporting)
- _substitutions, _team_reuse, _team_create directive strings restored

test_session_init.py: +1481 lines, 11 reinstated test classes covering
all 5 dispatch branches + marker-clear + safety-net + parametrized
team_name injection-defense (newline / ')' / null-byte / control-char).

Downstream test fixes (anti-pattern tests that assert v4.0.0 deletion
behavior, which #628 reverses):
- test_plugin_json_orchestrator.py: drops bootstrap.md from REMOVED_COMMANDS
  (still asserts teammate-bootstrap.md is dropped); adds positive assertion
  that bootstrap.md IS registered
- test_agents_structure.py: split FOSSIL_SKILL_INVOCATIONS into all-agents
  set (teammate-bootstrap; permanently removed) and orchestrator-only set
  (bootstrap; orchestrator may carry per persona §2)
- test_skills_structure.py: §6 → §7 reference (renumber drift from
  Commit 6); 1-line fix bundled here for pipeline coherence

Hard ordering: lands BEFORE Commit 8 (hooks.json wiring) so the
bootstrap-pointer directive is in place when the gate activates,
preventing the deadlock window during PR review.

Pyright: BOOTSTRAP_MARKER_NAME and build_session_path are now accessed
(used in marker-clear branch); transient mid-write 'not accessed'
warnings resolve. PIN_*/_estimate_tokens warnings are pre-existing
upstream cruft, deferred to suite-wide cleanup home (out of #628 scope).

Refs: #628
…oks.json

Activates the three hooks restored in earlier commits by adding their
hooks.json entries. The plan's deadlock-window prevention is honored:
this commit lands AFTER Commit 7 (3f759fa) which restored session_init's
bootstrap-pointer directive, so the orchestrator already receives the
'invoke Skill(PACT:bootstrap)' directive on SessionStart when the gate
activates.

Three hooks.json entries added:

1. UserPromptSubmit (NEW array, no matcher): bootstrap_prompt_gate.py.
   Injects the bootstrap directive on every prompt while the marker is
   absent; zero-token fast path once the marker exists.

2. PreToolUse (NEW no-matcher entry, first in array): bootstrap_gate.py.
   Blocks Edit/Write/Agent/NotebookEdit when the marker is absent;
   Bash/Read/Glob/Grep/Web*/MCP always allowed (Bash specifically because
   the marker-write itself is a Bash command). Order matters — the
   no-matcher entry runs before the matcher-scoped entries below it.

3. SubagentStart (NEW array): peer_inject.py with 12-stem matcher
   covering all spawnable PACT teammates (pact-orchestrator excluded;
   that one is delivered via --agent flag, not SubagentStart).

Both bootstrap bindings are in the SAFE livelock class per #538
(UserPromptSubmit + PreToolUse not in the TaskCompleted/TeammateIdle/Stop
forbidden class).

test_hooks_json.py: +1 restored test (test_subagent_start_covers_all_agent_types,
adapted to exclude pact-orchestrator) + new TestBootstrapGateInvariants class
(3 tests pinning the wiring contract structurally).

Smoke: full plugin suite 7110/7110 passes (7 skipped, baseline).

Refs: #628
…rip_orphan_routing_markers

Restores C6 + C8 + every-session-symmetry surfaces deleted in PR #621
to recover orphan-marker cleanup for the v3.x → v4.0 migration class.

claude_md_manager.py changes:

- NEW strip_orphan_kernel_block() (renamed from remove_stale_kernel_block,
  with SUNSET-BEFORE-v4.x.y comment marking it as a one-version-window
  migration helper). Operates on ~/.claude/CLAUDE.md. Strips PACT_START
  / PACT_END kernel block. file_lock + symlink-guard-inside-lock
  (TOCTOU defense). Handles orphan single-marker and reversed-pair
  (END-before-START) by returning a warning string for systemMessage
  routing rather than partial-strip.

- _strip_legacy_lines every-session restoration: extended
  strip_orphan_routing_markers to call _strip_legacy_lines after the
  marker-strip passes. Pre-C6 had this every-session inside
  update_pact_routing; restored to fire every SessionStart on the
  project CLAUDE.md (audit-backend symmetry argument).

shared/__init__.py: re-export strip_orphan_kernel_block + __all__ entry.

session_init.py: imports strip_orphan_kernel_block + wires it on
SessionStart so user-CLAUDE.md is cleaned alongside the project file.

Tests:
- test_claude_md_manager.py: 5 stale-doc refs scrubbed (3 references
  to remove_stale_kernel_block → strip_orphan_kernel_block; 2 references
  to update_pact_routing → _build_migrated_content / strip_orphan_routing_markers).
  NEW _StripOrphanBlockTestBase shared mixin + TestStripOrphanKernelBlock
  class (7 tests covering proper-pair / no-markers / orphan-START-only
  / orphan-END-only / reversed-pair / symlink / read-OSError).
- test_session_init.py: NEW TestStripOrphanRoutingMarkers class (7 tests)
  parallel to the kernel-block class via the same shared mixin shape,
  plus an every-session legacy-lines pin test.

Skipped (out of #628 scope): the content_sans_routing local-variable
cosmetic rename in _build_migrated_content. Zero behavioral impact;
would churn tests that grep for the variable name. Filed mentally as
a separate refinement if needed.

Smoke: full plugin suite 7124/7124 passes (7 skipped, baseline).
+14 net tests over Commit 8 baseline.

Refs: #628
Root README.md:
- Hooks table: row for bootstrap_gate.py (UserPromptSubmit + PreToolUse,
  inject session-start ritual directive on first turn)
- Migration table: row for v4.0 → v4.1 session-start ritual restoration

pact-plugin/README.md:
- L7 banner: chained v4.0 + v3.0 callout per plan Q2 — v4.0 breaking
  change (--agent flag persona delivery) chained with v3.0+ Agent Teams
  requirement; both embedded section links preserved.
- Install section: --agent flag wiring step pointing at the three
  convenience patterns (per-project .claude/settings.json, the pact shell
  wrapper, or manual --agent invocation).
- What's New: replaced v3.0+ list with v4.0+ bullets (--agent delivery,
  lazy-load protocols, restored session-start ritual v4.1, Communication
  Charter).
- Reference: 5 entries for Communication Charter, Completion Authority,
  State Recovery, S5 Policy, S4 Tension protocols.

Stays v4.0-specific minimum per plan §DevOps; broader pact-plugin/README.md
refresh deferred to pre-existing #593 (cross-reference at PR creation).

Refs: #628
4-file version dance:
- pact-plugin/.claude-plugin/plugin.json: "version": "4.1.0"
- .claude-plugin/marketplace.json: "version": "4.1.0"
- pact-plugin/README.md L3: > **Version**: 4.1.0
- README.md L508: directory-tree example 4.0.1/ → 4.1.0/

Minor bump justification (per pin: minor for new commands/agents/protocols):
- NEW commands/bootstrap.md (scaled-down ritual command)
- NEW persona §2 Session-Start Ritual + F2 mid-session pin-memory directive
- NEW bootstrap_gate + bootstrap_prompt_gate hooks (UserPromptSubmit + PreToolUse wiring)
- RESTORED peer_inject SubagentStart hook with charter cross-ref ADDENDUM
- RESTORED claude_md_manager strip_orphan_kernel_block (SUNSET-BEFORE-v4.x.y)
- RESTORED session_init bootstrap-pointer directive + 5-branch dispatch + marker-clear branch

Refs: #628
24 new tests in pact-plugin/tests/test_628_coverage.py covering 8
identified gaps in the 32-scenario plan §Test Phase coverage:

- TestBootstrapCommandStructure (4): file existence, frontmatter,
  ritual-only content, governance-fossil exclusion. Plus
  EXPECTED_COMMANDS entry in test_commands_structure.py for the
  structural test that pins plugin.json registration.
- TestBootstrapMarkerClearAcrossSources (2 + parametrized): clear-source
  unlinks; resume/startup/compact preserve.
- TestOrchestratorPersonaInvariants (3): bootstrap-skill invocation
  reference, §2 Session-Start Ritual presence, F2 mid-session
  pin-memory directive presence.
- TestLeadSideHaltFanOutByteEqualAtTwoSites (3): drift-pin asserting
  byte-equal HALT idiom in orchestrator §11 + algedonic protocol.
- TestStripOrphanRoutingMarkersLockTimeout (1): fail-open under lock
  timeout.
- TestTeachbackReminderCrossFileConsistency (3, YELLOW-1 resolution):
  Path-(a) cross-file consistency — _TEACHBACK_REMINDER text in
  peer_inject + pact-teachback skill body must both reference
  metadata.teachback_submit. Drift-detection on either side.
- TestMarkerNameConsistencyEncodingTolerant (1 + parametrized,
  YELLOW-2 resolution): cross-file consistency on marker-write
  semantics, not byte-equal syntax. Parametrized over 5 encodings;
  asserts at least one matches.

Suite floor preserved: 7124 baseline → 7145 (+21 net) with no
regressions. Test pollution fixed via module-isolation conftest pattern.

Refs: #628
Suppresses two reportMissingImports warnings on the
`from peer_inject import _TEACHBACK_REMINDER` lines (L461, L483) used
by TestTeachbackReminderCrossFileConsistency. The imports work at
runtime via the same sys.path bootstrap pattern as test_peer_inject.py;
this just makes Pyright static analysis match runtime resolution.

Refs: #628
… BLOCKING)

The bootstrap_gate's stated invariant — 'agent dispatch is blocked
until /PACT:bootstrap completes' — was broken because _BLOCKED_TOOLS
contained the literal 'Agent' which never matches the platform's
actual agent-dispatch tool name 'Task'.

Cross-evidence (in this same PR):
- pact-plugin/hooks/hooks.json L66 (PreToolUse): matcher='Task' for
  the team_guard.py hook that runs on agent dispatch
- pact-plugin/hooks/hooks.json L187 (PostToolUse): matcher='Task' for
  the auditor_reminder.py hook that runs on agent completion
- These matchers WORK in production, confirming Task is the canonical
  platform tool name

The test suite asserted the wrong invariant in lockstep with the
source — tests passed because they sent the literal 'Agent' to the
gate, but the platform never sends 'Agent'. test_bootstrap_gate.py
parametrize lists, the exact-members assertion, and prose docstrings
are all updated.

Surgical changes:
- bootstrap_gate.py: _BLOCKED_TOOLS literal Agent→Task; _DENY_REASON
  wording 'agent spawning (Agent)' → 'agent dispatch (Task)';
  docstring + clarifying paragraph cite the cross-evidence.
- test_bootstrap_gate.py: parametrize lists at L141, L153, L573;
  test_blocked_set_exact_members at L487-490 + docstring expansion.
- Note: TaskList/TaskGet/TaskUpdate (PACT plugin task-system tools
  used internally) are non-hookable and DISTINCT from the agent-
  dispatch 'Task' tool that IS blocked — this distinction is now
  documented in the bootstrap_gate.py docstring.

Origin: pre-#621 source had 'Agent' (predates PR #621's deletion);
restoration in 41cf80d carried the bug forward byte-equal. Caught
by adversarial security review on PR #641.

Empirical confirmation: tool_name='Task' with marker absent now
returns deny + exit_code=2 (was suppressOutput + exit_code=0
pre-fix).

Suite: 52/52 test_bootstrap_gate.py + 7145/0/10 full plugin suite.

Refs: #628
…Arch-M2)

Persona §2 line 49 said 'SESSION_START block in CLAUDE.md' but line 53
said 'Current Session block.' Both refer to the same heading; using two
names in adjacent sentences was internally inconsistent and confusing.

Standardized on 'Current Session block' (matches line 53, bootstrap.md
line 13, and the heading text rendered in CLAUDE.md). Removes a
1-instance reader-confusion source surfaced by architect-review's
self-review of Commit 6 (PR #641 cycle).

Refs: #628 #641
…T explicit (B1 + A1 bundle)

Bundles two fixes that landed in overlapping commits during PR #641
review remediation Cycle-2:

B1 (backend-coder-review finding) — peer_inject.main() fail-open
contract gap. Outer try/except wraps the build path so AttributeError
on non-dict stdin (e.g., '123', null, [1,2,3]) returns suppressOutput
+ rc=0 instead of propagating. Sibling docstring contract restored.

A1 (test-engineer-review finding) — sanitizer phantom-green for
Unicode line terminators. Counter-test-by-revert P1=0 RED today.
Two changes:
- Regex sharpened from glyph-form [\x00-\x1f\x7f<U+2028><U+2029>]
  to escape-form [\x00-\x1f\x7f\u0085\u2028\u2029]. Adds NEL
  (U+0085) which was MISSING from glyph-form (NEL falls in the gap
  between \x7f and the line-separator glyphs). Defends all 3
  line-terminators that LLM tokenizers and Python str.splitlines
  recognize.
- TestSanitizeAgentName parametrize: 3 new rows for U+0085, U+2028,
  U+2029 with explicit codepoint labels and adversarial e2e injection
  test. Cardinality: regex narrowed-to-ASCII would now produce ≥3
  RED.

Test changes:
- Inverted test_main_propagates_exception_from_get_peer_context to
  test_main_suppresses_exception_from_get_peer_context (was pinning
  the broken contract; now pins the fixed contract).
- NEW test_main_suppresses_non_dict_json_payloads parametrized over 7
  non-dict JSON shapes.

Bundle rationale: B1 source + B1 regression test + A1 parametrize
rows + A1 regex-sharpen all landed in working-tree concurrently
during cross-specialist remediation cycle. Index-serialization
discipline was attempted via SendMessage coordination but two
specialists working on adjacent file regions of test_peer_inject.py
+ peer_inject.py crossed staging boundaries. The pre-bundle state
was committed at da5afaa0 with 5 tests failing transiently mid-edit;
soft-reset + fresh re-stage produces 84/84 PASS post-bundle.

Empirical verifications:
- B1: stdin='123' → suppressOutput + rc=0 (was AttributeError + rc=1)
- A1: TestSanitizeAgentName::test_strips_unicode_line_terminators[\x85]
  / [\u2028] / [\u2029] all PASS

Refs: #628 #641
Caught by: backend-coder-review (B1 Task #22), test-engineer-review
(A1 Task #20)
Backend authors today use the f-string-touch encoding in
commands/bootstrap.md. The encoding-tolerant test
TestMarkerNameConsistencyEncodingTolerant covered 5 encodings;
test-engineer-review's A2 finding flagged 4 missing legitimate
alternatives that loud-fail today via the hard oracle but burden
future authors with extending the parametrize list manually.

Adds 4 parametrize rows to test_at_least_one_marker_write_encoding_matches:
- printf-create: `printf "" > "<path>/bootstrap-complete"`
- no-op redirect-create: `: > "<path>/bootstrap-complete"`
- echo-no-output redirect-create: `echo -n > "<path>/bootstrap-complete"`
- bare redirect-create: `>"<path>/bootstrap-complete"`

Mirrored into test_any_supported_encoding_matches rollup oracle so
the loud-fail accepts the new encodings as valid marker-write
semantics. Improved diagnostic message.

Counter-test-by-revert: simulated bootstrap.md rewrite to
`printf "" > ...` produces 4 PASS + 7 SKIP (oracle accepts new
encoding). Safe-substitution behavior preserved; no silent regression.

Floor: 7146 → 7150 expected (4 new PASS + 4 new SKIP rows).

Caught by: test-engineer-review (A2 Task #20).

Refs: #628 #641
…Arch-M1, Arch-F14)

Pre-extract: marker check was inlined inside _check_tool_allowed at
L80-82 as Path(session_dir) / BOOTSTRAP_MARKER_NAME + marker_path.exists().
The plan's §High-Risk-TDD-Specs called this 'bootstrap_gate.is_marker_set'
as if it were a real symbol; commands/bootstrap.md:60 referenced
the same name. Architect-review's Arch-M1 finding flagged the doc-vs-code
mismatch; Arch-F14 was the future-track twin asking to extract.

Resolved both by extracting is_marker_set(session_dir: str | None) -> bool
as a public top-level helper. Logic mirrors the pre-extraction inline
check 1:1. Public visibility (no underscore prefix) per Q1 acceptance:
plan uses bare name; tests target as a real callable; discoverability
matters for sibling-hook reuse and future refactors.

_check_tool_allowed now delegates the marker check via is_marker_set(session_dir).
Docstring documents fail-open behavior on OSError + flags S2 follow-up
for symlink hardening (next commit in this cycle).

Test changes:
- NEW TestIsMarkerSet class with 4 direct unit tests: present-marker,
  absent-marker, None-session, empty-session. All return-value
  assertions verify the helper's contract surface.
- 14 existing TestCheckToolAllowed tests still PASS (delegation
  through helper preserves call-site behavior).

Suite: 52 → 56 in test_bootstrap_gate.py; full plugin 7150 PASS.

Caught by: architect-review (Arch-M1 Task #18, Arch-F14 same).

Refs: #628 #641
Pre-commit: the persona-§2 vs commands/bootstrap.md split was
README-discipline-only — no test caught drift if a future contributor
added mechanics prose to persona §2 OR governance directives to
bootstrap.md.

NEW pact-plugin/tests/test_three_surface_split_enforcement.py (239
lines, 12 tests in 2 classes) applies the
convention_must_be_enforced_not_just_documented memory pattern at
MEDIUM strictness:

TestPersonaSection2ContainsNoMechanics (4 parametrized rows):
  - _extract_persona_section_2 bounds §2 by next `---` or next `## ` H2.
  - Forbidden patterns: \bmkdir\b, \btouch\b, /bootstrap-complete\b,
    fenced shell code-block. Persona §2 = directive surface ONLY.

TestBootstrapCommandExcludesGovernanceDirectives (8 tests):
  - 7 governance-keyword parametrize rows: MISSION:, MOTTO:, FINAL
    MANDATE, SACROSANCT, S5 POLICY, Algedonic, Completion Authority.
  - 1 mandatory-imperative pattern test: walks H2 sections; for each
    non-Step section, asserts no `**You MUST` (regex). Step-N
    sections legitimately use mandatory voice for ritual mechanics
    like "**Substitute <path>...**"; non-Step sections must not.
    bootstrap.md = mechanics surface ONLY.

Counter-test-by-revert (3 classes verified):
  - Inject `mkdir` into persona §2 → test_persona_section_2_excludes_
    mechanics_pattern[mkdir command] FAIL (1 RED)
  - Inject `## SACROSANCT non-negotiable` into bootstrap.md →
    test_bootstrap_md_excludes_governance_keyword[SACROSANCT] FAIL
    (1 RED)
  - Inject `## Overview\n**You MUST do X.**` into bootstrap.md →
    test_bootstrap_md_no_mandatory_imperative_outside_step_sections
    FAIL (1 RED)

All 3 violation classes loud-fail. Floor: 12 new PASS rows;
expected full-suite 7162 post-commit.

Caught by: architect-review (Arch-F4 Task #18).

Refs: #628 #641
…ID_CONTROL_CHARS_RE (S3+S9)

Two related sanitization fixes that share the same defense axis:

S3 (security-engineer-review): the slug derived from CLAUDE_PROJECT_DIR's
basename flows into a shell-quoted command body in commands/bootstrap.md
('mkdir -p "<path>" && touch "<path>/bootstrap-complete"'). A project-dir
basename containing shell metacharacters ($, backtick, ;, &&, |) would
shell-inject without producer-side sanitization. Added _UNSAFE_SLUG_CHARS_RE
in shared/pact_context.py that collapses any character outside [A-Za-z0-9_-]
into '_' inside _build_session_path before the slug reaches any downstream
consumer. Sanitize-substitute (NOT reject) so sessions with unusual project-dir
names still proceed; existing sessions on disk are untouched.

S9 (security-engineer-review): asymmetric session_id sanitization across
hook entry points. session_init.py defined its own _SESSION_ID_CONTROL_CHARS_RE
at L445 while shared/session_state.py defined _RENDER_STRIP_RE at L82 — same
strip set, two definitions, asymmetric-defense risk per security-engineer
memory patterns_symmetric_sanitization.md. Lifted to shared SSOT:
SESSION_ID_CONTROL_CHARS_RE in shared/session_state.py is now the canonical
strip regex; session_init.py imports it via the existing shared facade;
_RENDER_STRIP_RE preserved as a module-private alias for back-compat with
existing call sites.

Note on S6 (PACT_SESSION_DIR= info disclosure in bootstrap_prompt_gate):
NOT addressed in this commit. S6 is amplifier-only on S3 once S3's
producer-side sanitization holds — the path interpolated into PACT_SESSION_DIR=
is now sanitized at the producer, so the disclosure carries no shell-active
characters. The 'omit-the-line-on-shell-active-chars' fallback path that
backend's stage-ready described is structurally redundant once the producer
guarantees safe slugs. Reframed as defense-in-depth follow-up rather than
gap-closure. Will not file as separate issue unless review surfaces.

Tests: 8 new TestSanitizationProducerSide tests in test_pact_context.py
covering adversarial slug inputs (proj$(rm -rf ~)bar, control chars, empty)
+ symmetric defense across hook entry points. Suite: 7175 → 7177 (+2
parametrized).

Caught by: security-engineer-review (S3 + S9 Task #24).

Refs: #628 #641
Pre-fix: file_lock context manager fail-opens silently on the 5-second
timeout — every read-mutate-write protected operation (strip_orphan_kernel_block,
strip_orphan_routing_markers, _strip_legacy_lines, update_session_info,
update_pact_routing, etc.) silently skips when the lock can't be
acquired. An attacker (or buggy concurrent process) holding the lock
past the timeout causes silent deferral of cleanup operations
indefinitely with no signal to user or diagnostics.

Post-fix: BlockingIOError-deadline branch emits a stderr warning
before raising TimeoutError. Caller fail-open semantics unchanged
(callers still catch TimeoutError + skip the protected operation),
but the silent-skip event is now visible in Claude Code's debug logs.

Stderr message: 'PACT file_lock timeout: failed to acquire lock on
{lock_path} within {_LOCK_TIMEOUT_SECONDS}s; falling open'. Inline
comment documents the fail-open vs visibility trade-off and the
chosen design point (lowest-risk: surface event without behavior change).

This is the lowest-cost option among the 3 design candidates from
security-engineer-review (stderr warning vs lengthen timeout vs
fail-closed); the broader design discussion is deferred to a follow-up
issue if review surfaces deeper concerns.

Test: NEW test_timeout_emits_stderr_warning in TestFileLockContextManager.
Forces the timeout path by monkeypatching _LOCK_TIMEOUT_SECONDS=0 +
fcntl.flock to raise BlockingIOError on LOCK_EX only (LOCK_UN passes
through so the unlock in finally doesn't shadow the TimeoutError).
Asserts captured.err contains 'PACT file_lock timeout', the
lock-target path, and 'falling open'.

Suite: 7177 → 7178.

Caught by: security-engineer-review (S8 Task #24).

Refs: #628 #641
3-reviewer convergence finding from Round-2 blind review:
architect-review-r2 (R2-M1) + backend-coder-review-r2 (R2-B1) +
security-engineer-review-r2 (partial R2-S-A2) all independently
flagged that bootstrap_prompt_gate.py:62 used Path.exists() (which
follows symlinks, no ancestor check) while sibling bootstrap_gate.py
uses the hardened is_marker_set helper (lstat + S_ISREG + ancestor-
symlink defense via Path.resolve containment).

Defense-asymmetry reproducer: pre-planted symlink at session_dir/
bootstrap-complete → /etc/hostname → tool gate (is_marker_set) DENIES
Edit/Write/Task; prompt gate (Path.exists()) observes True and
suppresses bootstrap directive injection. Orchestrator gets blocked
tools with NO bootstrap directive explaining why → confused state.

Fix: import is_marker_set from bootstrap_gate, replace L62 inline
check. Single safe-marker-check contract across both enforcement
points. Closes the highest-conviction Round-2 finding.

Refs: #628 #641
…R2-T2 + R2-T3)

Two test-tightening fixes from Round-2 test-engineer-review-r2:

R2-T2 (case-insensitive governance keyword check): TestBootstrapCommandExcludesGovernanceDirectives's keyword check used case-sensitive substring (`keyword not in bootstrap_text`). Per user-decision: full case-insensitive. Future contributor lowercasing a section heading (`## sacrosanct non-negotiables`) or mixing case (`Mission:`) would bypass case-sensitive guard. Updated to lowercase BOTH sides before comparison; assertion message clarifies '(case-insensitive)'.

R2-T3 (MUST regex variant): test_bootstrap_md_no_mandatory_imperative_outside_step_sections originally matched only uppercase `**You MUST`. Updated regex to `**You (MUST|must|Must)` triple-form to catch all-caps mandate, lowercase imperative, and leading-cap stylized forms. Word-boundary `\b` prevents false-firing on substring matches like 'Mustard'.

Validation: grep current bootstrap.md for all 7 lowercase governance keyword variants — empirical absence confirmed; no false-positives at status-quo. Existing 12 tests still pass.

Refs: #628 #641
Round-2 test-engineer-review-r2 finding: TestMarkerNameConsistencyEncodingTolerant 9-encoding hard oracle missed legitimate `tee FILE </dev/null` and `dd of=FILE if=/dev/null` argument-form encodings. Hard oracle still loud-fails on rewrite (forcing function intact), but reduces future-author maintenance burden.

Adds 2 parametrize rows + mirrors them into test_any_supported_encoding_matches rollup oracle. Counter-test: simulate bootstrap.md rewrite to `tee` form → oracle accepts gracefully.

Refs: #628 #641
…mpotency (R2-NIT-2)

Round-2 architect-review-r2 finding: §2 'When to re-invoke' sub-section L60 read 'Re-invoke Skill("PACT:bootstrap") only when:' — implied a hard precondition. But session_init.py:681 narrows marker-clear to source=clear (per #414); the marker SURVIVES compaction. Post-compaction re-invocation is idempotent advisory rather than required.

Wording change:
- Old: 'The ritual is per-session, not per-turn. Re-invoke Skill("PACT:bootstrap") only when:'
- New: 'The ritual is per-session and idempotent — the marker survives compaction. Re-invoke Skill("PACT:bootstrap") for team-existence re-verification when:'

Removes the hard-precondition implication; preserves the same downstream sub-bullets (post-compaction + bootstrap_gate refusal). Closes the wording-vs-behavior drift surfaced by Round-2.

Refs: #628 #641
…Error test (R2-S-r2-1; R2-B1 followup)

Round-2 security-engineer-review-r2 finding: pact_context.init() session_id sanitizer used SESSION_ID_CONTROL_CHARS_RE strip-only (control-chars-only), admitting shell metacharacters (", ;, |, $, backtick, &, \\, parens, spaces). Slug uses substitute-from-allowlist (_UNSAFE_SLUG_CHARS_RE); session_id should follow same shape per memory patterns_symmetric_sanitization.md.

Producer-side fix: replace SESSION_ID_CONTROL_CHARS_RE.sub("", ...) with _UNSAFE_SLUG_CHARS_RE.sub("_", ...) in pact_context.init(). Reuses the existing slug regex (single SSOT — no two-source-of-truth drift). Comment block expanded to explain Cycle-2 → Cycle-3 widening (control-chars-only → full allowlist).

R2-B1 followup: test_bootstrap_prompt_gate.py::test_oserror_in_marker_check was patching the OLD code path (Path.exists()) which R2-B1's c760ff4/5b12f805 replaced with is_marker_set (lstat-based). Inverted to test_oserror_in_marker_check_treats_marker_absent — pins the new contract: OSError in is_marker_set → returns False → bootstrap directive injected (was suppressOutput pre-fix). Sibling-symmetric with the Cycle-2 S2 inversion already applied to bootstrap_gate's TestFailOpen.

Tests:
- test_pact_context.py: inverted test_session_id_with_control_chars_substituted_at_init; NEW test_session_id_shell_metacharacters_substituted parametrized on 8 adversarial inputs (", $, backtick, ;, |, &&, NL, NUL).
- test_bootstrap_prompt_gate.py: inverted OSError test to mock os.lstat instead of Path.exists; pins the post-R2-B1 fail-open-via-helper-OSError contract.

Empirical verification: pact_context.init({"session_id": 'evil$(rm -rf ~)id'}) → context_path's session_id component is 'evil_rm_-rf__id' (substituted, no shell metacharacters). 140/140 tests pass on the 3 affected files.

Refs: #628 #641
… comment (R2-NIT-4)

Round-2 architect-review-r2 finding: commands/bootstrap.md:62-63 marker-name SSOT coupling comment was functional but didn't cite the broader MEMORY pattern. A future contributor maintaining the file might not know there's a 'convention-must-be-enforced-not-just-documented' pattern + structural enforcement test backing the comment.

Extends the existing HTML coupling comment with:
- Pattern reference: convention-must-be-enforced-not-just-documented
- Cross-reference to test_three_surface_split_enforcement.py as the structural enforcement surface

Net effect: future readers see the coupling-comment + its enforcement pattern in-place, reducing the chance of silent drift if the marker-name moves.

Refs: #628 #641
…SOURCES (R2-B3)

Round-2 backend-coder-review-r2 finding: session_init.py L1108-1118 anomalous-source else-branch lumped legitimate-but-anomalous sources (compact/clear/no-team) and unrecognized values under same WARNING text. Auditability concern.

Fix: split via _KNOWN_SOURCES = {"startup", "resume", "compact", "clear"} allowlist:
- Known source without team → INFORMATIONAL note 'Session source X without team — creating fresh team. Run TaskList to check current state.' Recovery-hint preserved (per Q1); WARNING tone removed since this is a legitimate scenario (user ran /clear, no prior session).
- Unknown source → stderr WARNING 'session_init: unknown source value: X' + additionalContext WARNING preserved. Distinguishes adversarial-or-malformed-input from legitimate state-reset.

Note on source normalization: session_init's preceding _VALID_SOURCES check normalizes any non-allowlist raw_source to literal 'unknown' before reaching the dispatch. So _KNOWN_SOURCES test always routes raw-malformed input through the unknown-source branch with literal token 'unknown' in messages — auditability win without leaking raw stdin into additionalContext.

Tests:
- INVERTED 3 tests (test_resume/compact/clear_no_team_anomalous → _informational suffix). New assertions check WARNING absent + 'creating fresh team' present.
- NEW test_unknown_source_no_team_warns_and_emits_stderr parametrized on 4 raw inputs (weird, manual_invoke, unknown-xyz, compact1) covering the normalization-class spectrum.

Suite: 7186 → 7190 (+4 net new), 16 skipped, 0 failed.

Refs: #628 #641
…ecretary spawn (R2-NIT-1)

Round-2 architect-review-r2 finding: commands/bootstrap.md Step 2 (secretary spawn) didn't explicitly note that the spawn is per-session, not per-turn. Persona §2 covers the distinction in the directive surface, but the mechanics surface should reinforce it for symmetry.

Adds standalone sentence at end of Step 2:
'Spawn the secretary once per session — reuse the existing instance on subsequent re-invocations of this command rather than spawning a duplicate.'

Standalone-paragraph form (not parenthetical inline) per architect's Q1 default — matches symmetry with persona §2 NIT-2 wording, gives readers a hard-stop signal rather than aside.

Refs: #628 #641
… _extract_prev_session_dir (R2-B4)

Round-2 backend-coder-review-r2 finding: _extract_prev_session_dir read project CLAUDE.md without acquiring file_lock, while all writers (update_session_info, update_pact_routing, migrate_to_managed_structure) use the lock. Asymmetric read-side gap: a writer mid-write could produce a torn read on the reader.

Fix: wrap the read_text() call in 'with file_lock(claude_md):'. Re-entrancy verified safe — the read at step 5a runs AFTER strip_orphan_routing_markers releases its lock at step 3c, BEFORE update_session_info acquires its own at step 5b. No nesting.

Fail-open contract preserved: file_lock raises TimeoutError on 5s timeout (with stderr warning per S8), and _extract_prev_session_dir's existing exception handling returns None (graceful degradation to new-session path).

Tests: NEW TestExtractPrevSessionDirReadSideLock class with 2 deterministic-monkeypatch tests:
- test_acquires_file_lock_before_reading: monkeypatch file_lock + Path.read_text to record entry/exit/read events; assert order ['lock_enter', 'read', 'lock_exit'].
- test_timeout_returns_none_fail_open: monkeypatch file_lock to raise TimeoutError; assert returns None.

Deterministic-monkeypatch over threading-based concurrent-write to avoid timing fragility (per dispatch Q recommendation).

Suite: 7190 → 7192 (+2 net new). 16 skipped, 0 failed.

Refs: #628 #641
…r (R2-NIT-3 + R2-M2 + cleanup)

Bundle three small chores from Round-2 review:

R2-NIT-3 (architect-review-r2): bootstrap_gate.py:108 is_marker_set docstring softer phrasing. Old: 'any OSError on stat (treated as marker-absent → fail-closed for the gate-bypass class; gate stays armed)'. New: 'any OSError on stat (treated as marker-absent so the gate stays armed)'. Semantics correct in both wordings; new wording is calmer and doesn't conflate fail-closed/fail-open terminology.

R2-M2 (architect-review-r2): claude_md_manager.py:282 SUNSET BEFORE v4.x.y → SUNSET BEFORE v5.0.0. v5.0.0 = next major after v4.x stabilizes; matches typical migration-helper sunset cadence. strip_orphan_kernel_block is a v3.x → v4.x migration helper; concrete sunset gives future contributors a deletion target.

Cleanup (Pyright reportRedeclaration): test_session_init.py L4459 _make_banner_plugin_root duplicate function definition deleted. Pre-existing latent issue (introduced 3f759fa Cycle-1 Commit 7 when 11 reinstated test classes were bulk-appended; included near-superset of L2764+ block with overlap on this helper). No behavior change today (Python last-defined-wins on duplicate function names with byte-identical bodies); only Pyright reportRedeclaration warning silenced. Verified: grep -nE '^def _make_banner_plugin_root' returns only L2856 (renumbered post-dedupe) post-fix.

Suite: 7192 passed / 16 skipped / 0 failed (was 7192 — net 0 because the dead orphan class skeleton had 0 tests). Pyright cleaner.

Refs: #628 #641
Cycle-4 remediation per pinned rule: committed plugin agent-consumed
docs describe current-state instructions only.

- bootstrap_prompt_gate.py: drop "agent spawning (Agent)" wording, use
  "agent dispatch (Task)" matching bootstrap_gate canonical name; strip
  3-reviewer-convergence prefix and past-conditional Path.exists()
  sentence.
- session_init.py: drop reviewer-ref prefix on file_lock comment; drop
  reviewer-ref prefix and Pre-fix/Post-fix narrative from KNOWN_SOURCES
  branch comment, replace with concise current-state description; bump
  obsolete-marker comment from "v4.x.y" to "v5.0.0".
- shared/pact_context.py: drop reviewer-ref prefix and Cycle-2-history
  paragraph from session_id sanitizer comment; keep current-state
  symmetric-defense intent and sanitize-substitute rationale.
- tests/test_claude_md_manager.py: bump SUNSET-BEFORE marker docstrings
  from v4.x.y to v5.0.0.
- tests/test_session_init.py: rename "compact1" parametrize input to
  "compact_typo" for clarity (no behavior change).
… mid-clause from §2 re-invoke wording (R3-Arch-NIT-1)

The mid-clause narrowed the trigger to a single use case ("team-existence
re-verification"), but the bullets that follow describe the broader set of
re-invoke conditions (post-resume, gate refusal, etc.). The clause is now
dropped so the lead-in matches the bullet list it introduces.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Restore session_init inline directives + peer_inject hook (5 functions) deleted in C4/C7 of #621

1 participant