fix(hooks): deleting or breaking DONE.md mid-session no longer bypasses the gate #1
Open
alexandertomana wants to merge 1 commit into
…es the gate

The stop hook had two fail-open holes, each one command away from disarming the gate:

- rm DONE.md → findDonefile() comes up empty → silent no-op allow. The no_done_edits guard's "deleted" finding was unreachable through the hook: it needs the donefile to load before any guard can run.
- a DONE.md that no longer parses → loadConfig() throws → warn and allow, even when the breakage happened mid-session.

Both paths now bounce the stop when the session baseline proves the donefile existed (and what it hashed to) at session start:

- missing donefile + orphaned .donegate/baseline.json → block with restore instructions
- unparseable donefile whose hash no longer matches the baseline → block with the parse error in the report

The no-trap guarantees hold: both paths share the per-session bounce budget (the default 3, since the donefile that would configure gate.max_bounces is exactly what's missing or unreadable), cursor ctrl-c/aborted turns are still never gated, repos that never opted in are still silent no-ops, and a donefile that was already broken before the session started still fails open.

Also: docs/threat-model.md, an honest map of what the gate catches, what it deliberately doesn't (semantic cheats, command indirection, attacks on donegate's own state), and why CI + branch protection is the actual boundary.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The problem
The stop hook had two fail-open holes, each one command away from disarming the gate mid-session:

- `rm DONE.md` was a total bypass. The hook short-circuits with exit 0 when `findDonefile()` comes up empty, before any guard can run. The `no_done_edits` guard's documented "or deleted" finding was unreachable through the hook path, because the hook needs the donefile to load before guards exist. One `rm` and the agent stops freely.
- Corrupting DONE.md was also a bypass. `loadConfig()` throws → warn on stderr → allow. Reasonable for a human's config typo; exploitable as a one-edit off switch for the exact file the agent is incentivized to break.

Both contradict the README's stated guarantee (`no_done_edits` ... "DONE.md modified or deleted mid-session → fail") and the FAQ ("Won't the agent just edit DONE.md?").

The fix
The session baseline (`.donegate/baseline.json`) already records `donefile_path` + `donefile_sha`; it just wasn't consulted on these paths. Now:

- … `.donegate/baseline.json` whose recorded donefile has vanished, and bounces the stop with restore instructions (`{"decision":"block", ...}` / `followup_message`).

Verified end-to-end through the built CLI:
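Separate from that CLI verification, the decision rule described in this section can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the PR's code: `decideStop`, `StopInput`, `currentSha`, and the `run_guards` action are hypothetical names, and the hook's real inputs are not shown in this description. Only `donefile_path`, `donefile_sha`, and the allow/block outcomes come from the text.

```typescript
// Illustrative sketch only. Names other than donefile_path / donefile_sha
// are hypothetical and do not come from the donegate source.
interface Baseline {
  donefile_path: string; // path recorded at session start
  donefile_sha: string;  // hash of the donefile at session start
}

interface StopInput {
  aborted: boolean;          // cursor ctrl-c / aborted turn
  baseline: Baseline | null; // null when .donegate/baseline.json is absent
  currentSha: string | null; // null when the donefile is missing on disk
  parseError: string | null; // non-null when the donefile exists but fails to parse
}

type Decision =
  | { action: "allow"; why: string }
  | { action: "block"; why: string }
  | { action: "run_guards" };

function decideStop(input: StopInput): Decision {
  const { aborted, baseline, currentSha, parseError } = input;

  // Aborted turns are never gated, even if the donefile is gone.
  if (aborted) {
    return { action: "allow", why: "aborted turns are never gated" };
  }

  // Donefile missing: only block when a baseline proves it existed.
  if (currentSha === null) {
    if (baseline === null) {
      return { action: "allow", why: "repo never opted in" };
    }
    return {
      action: "block",
      why: `donefile missing mid-session; restore ${baseline.donefile_path}`,
    };
  }

  // Donefile unparseable: block only if it changed since session start.
  if (parseError !== null) {
    if (baseline !== null && baseline.donefile_sha !== currentSha) {
      return { action: "block", why: `donefile broken mid-session: ${parseError}` };
    }
    // Assumption: with no baseline, or an unchanged hash, the old fail-open applies.
    return { action: "allow", why: "donefile was already broken at session start" };
  }

  // Donefile loads normally: the regular guards take over.
  return { action: "run_guards" };
}
```

In this sketch the baseline is the only evidence used to tell "never opted in" or "already broken" apart from mid-session tampering, which is why both fail-open cases survive unchanged.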
What deliberately did NOT change
- … `gate.max_bounces` is exactly what's missing/unreadable), then give up loudly. Cursor aborted/ctrl-c turns are never gated. Repos that never opted in (no donefile, no baseline) remain silent no-ops.
- … `.donegate/` too when removing donegate for real.

Also in this PR
- `docs/threat-model.md`, an honest map: what the gate catches outright, what it deliberately can't (semantic cheats like weakened assertions, command indirection through package.json, attacks on donegate's own state from inside the sandbox), and why `donegate install ci` + branch protection is the actual security boundary.
- … `bounceOrGiveUp()` helper instead of triplicating the protocol JSON + bounce-state logic.

Tests
Three new cases in `test/hooks.test.ts` (79 total, all passing): deletion bounces ×3 then gives up then recovers on restore; corruption bounces then recovers on repair; cursor aborted turns stay ungated even with the donefile gone.

`node dist/cli.js check` (the repo's own gate) is clean, all six guards green.

🤖 Generated with Claude Code
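The "bounces ×3 then gives up" behaviour covered by the deletion test can be sketched as a small pure function. This is a hedged illustration, not the PR's `bounceOrGiveUp()` helper, whose signature is not shown here: `bounceOrGiveUpSketch`, `BounceState`, and the `gaveUp` field are hypothetical names. Only the default budget of 3 and the `"decision": "block"` shape come from the description.

```typescript
// Illustrative sketch only. The real bounceOrGiveUp() signature is not shown
// in the PR description, so everything below is an assumed shape.
const DEFAULT_MAX_BOUNCES = 3; // default used when gate.max_bounces cannot be read

interface BounceState {
  bounces: number; // bounces already spent in this session
}

type Outcome =
  | { decision: "block"; reason: string }
  | { decision: "allow"; gaveUp: true };

function bounceOrGiveUpSketch(
  state: BounceState,
  reason: string,
  maxBounces: number = DEFAULT_MAX_BOUNCES,
): { outcome: Outcome; state: BounceState } {
  if (state.bounces < maxBounces) {
    // Budget remains: block the stop and spend one bounce.
    return {
      outcome: { decision: "block", reason },
      state: { bounces: state.bounces + 1 },
    };
  }
  // Budget exhausted: stop blocking so the session is never trapped.
  return { outcome: { decision: "allow", gaveUp: true }, state };
}

// Usage: four consecutive stop attempts while the donefile stays missing.
let session: BounceState = { bounces: 0 };
const decisions: string[] = [];
for (let attempt = 0; attempt < 4; attempt++) {
  const result = bounceOrGiveUpSketch(session, "DONE.md missing; restore it");
  session = result.state;
  decisions.push(result.outcome.decision);
}
// decisions is ["block", "block", "block", "allow"]
```

Returning the new state instead of mutating it keeps the sketch easy to test; how the real hook persists or resets the budget after the donefile is restored is not modelled here.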