Context
Follow-up to issue #401 / PR #477. PR #477 ships Phase 1 (advisory mode) per the canonical plan's advisory-first discipline. This issue tracks Phase 2 (blocking mode), which was deliberately held from PR #477 pending empirical observation data.
See the canonical plan: docs/plans/teachback-gate-plan.md (local-only; also summarized in PR #477 body).
Scope
Flip teachback_gate.py from advisory to blocking mode. Single constant change + hook output semantics migration:
- pact-plugin/hooks/teachback_gate.py: _TEACHBACK_MODE = "advisory" → "blocking"
- systemMessage (exit 0, advisory) → hookSpecificOutput/permissionDecision=deny (exit 2, blocking)
- Test assertions migrate from TestAdvisoryWarningMode expectations to TestBlockingMode expectations
- hooks.json registration confirmed (already matcherless PreToolUse; no migration needed)
Once flipped: when a teammate at variety ≥ 7 attempts Edit/Write without valid teachback content, the hook blocks the tool call with a concrete rejection message pointing at the specific validation failure.
Gate criteria (from canonical plan §F10)
Phase 2 flip must satisfy:
Zero false-positive blocks observed over 2 consecutive full workflows at variety ≥ 7, measured via teachback_gate_advisory journal events with would_have_blocked=true.
A false positive is a well-formed teachback that the advisory rules flagged anyway, meaning the peripheral content-rules are too strict. Zero false positives over 2 workflows indicates the rules are calibrated correctly for the honest-reframe posture (ritual enforcement for honest-but-careless output).
Pre-existing diagnostic
pact-plugin/scripts/check_teachback_phase2_readiness.py (shipped in PR #477 commit c309f1c) reads session journals and computes the F10 criterion. Use this to determine when Phase 2 is ready to flip:
python3 pact-plugin/scripts/check_teachback_phase2_readiness.py
# → pass/fail + observation counts per workflow
Dependent follow-ups
Per peer-review synthesis on PR #477, these tuning questions become actionable after Phase 1 observation data surfaces:
- _emit_state_transition_if_changed journal scans; relevant only if Phase 2 deny rate is sustained high
Blocked until
- check_teachback_phase2_readiness.py reports pass with zero false-positive blocks
Implementation outline
- teachback_gate_advisory events to tune peripheral rules if needed (Post-Phase-1: tune template-density threshold (50% → 25%) based on observation data #479 deferral)
- TestAdvisoryWarningMode → TestBlockingMode
- Lock in _TEACHBACK_MODE removal from public-flag surface once Phase 2 is verified stable
Precedent
This mirrors PR #407 (bootstrap gate enforcement) → PR #415 + issue #414 multi-phase shipping: advisory-first then hardening-follow-up is a PACT-wide convention for enforcement-code rollouts.
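As an illustration of the output-semantics migration listed under Scope, the sketch below shows roughly how one constant could select between the advisory and blocking response shapes. It is not the contents of teachback_gate.py. The helper name build_hook_response, the message wording, and the hookEventName and permissionDecisionReason fields are assumptions; only _TEACHBACK_MODE, systemMessage, hookSpecificOutput, permissionDecision=deny, and the exit codes come from this issue.

```python
import json

# The issue proposes flipping this constant to "blocking" in Phase 2.
_TEACHBACK_MODE = "advisory"


def build_hook_response(failure, mode=_TEACHBACK_MODE):
    """Return (payload, exit_code) for a failed teachback validation.

    `failure` is a short description of the specific validation failure.
    Hypothetical helper; the real hook may structure this differently.
    """
    if mode == "blocking":
        payload = {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",  # assumed field
                "permissionDecision": "deny",
                "permissionDecisionReason": f"Teachback rejected: {failure}",  # assumed field
            }
        }
        return payload, 2  # exit 2, blocking (per the Scope list)
    # Advisory mode: warn but let the tool call proceed.
    return {"systemMessage": f"Teachback advisory: {failure}"}, 0


if __name__ == "__main__":
    payload, code = build_hook_response("missing restatement of task scope")
    print(json.dumps(payload))
    raise SystemExit(code)
```

Returning the payload and exit code from a pure function keeps the mode switch easy to assert on, which is the kind of check the TestAdvisoryWarningMode → TestBlockingMode migration would need. How the hook runtime treats a JSON payload together with a nonzero exit code is not specified in this issue and should be verified against the runtime's documentation.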