fix(hooks): deleting or breaking DONE.md mid-session no longer bypasses the gate by alexandertomana · Pull Request #1 · intrepideai/donegate

alexandertomana · 2026-06-10T11:36:23Z

The problem

The stop hook had two fail-open holes, each one command away from disarming the gate mid-session:

rm DONE.md was a total bypass. The hook short-circuits with exit 0 when findDonefile() comes up empty — before any guard can run. The no_done_edits guard's documented "or deleted" finding was unreachable through the hook path, because the hook needs the donefile to load before guards exist. One rm and the agent stops freely.
Corrupting DONE.md was also a bypass. loadConfig() throws → warn on stderr → allow. Reasonable for a human's config typo; exploitable as a one-edit off switch for the exact file the agent is incentivized to break.

Both contradict the README's stated guarantee (no_done_edits ... "DONE.md modified or deleted mid-session → fail") and the FAQ ("Won't the agent just edit DONE.md?").

The fix

The session baseline (.donegate/baseline.json) already records donefile_path + donefile_sha — it just wasn't consulted on these paths. Now:

Missing donefile + orphaned baseline → the hook walks up looking for a .donegate/baseline.json whose recorded donefile has vanished, and bounces the stop with restore instructions ({"decision":"block", ...} / followup_message).
Unparseable donefile whose hash no longer matches the baseline → same treatment, with the parse error in the report.
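
The two new rules, together with the unchanged fail-open cases, can be read as a small decision table. The sketch below is an illustration, not the hook's actual code: the `DonefileState` and `Verdict` types and the `decide` function are hypothetical names, and only `donefile_path` and `donefile_sha` come from the baseline described above. It also assumes the caller has already located the baseline and hashed the current file.

```typescript
// Hypothetical types; only donefile_path and donefile_sha come from the PR text.
interface Baseline {
  donefile_path: string;
  donefile_sha: string;
}

// What the hook found on disk (assumed to be computed by the caller).
type DonefileState =
  | { status: "missing" }
  | { status: "present"; sha: string; parseError: string | null };

type Verdict =
  | { kind: "allow"; note: string }
  | { kind: "block"; reason: string }
  | { kind: "run_guards" };

function decide(donefile: DonefileState, baseline: Baseline | null): Verdict {
  if (donefile.status === "missing") {
    // No donefile and no baseline: the repo never opted in, stay silent.
    if (baseline === null) return { kind: "allow", note: "repo never opted in" };
    // A baseline without its donefile means the file vanished mid-session.
    return { kind: "block", reason: `${baseline.donefile_path} was deleted mid-session` };
  }
  // A donefile that parses goes on to the normal guard run.
  if (donefile.parseError === null) return { kind: "run_guards" };
  // Unparseable: block only when the baseline shows the content changed.
  const changedSinceBaseline =
    baseline !== null && donefile.sha !== baseline.donefile_sha;
  return changedSinceBaseline
    ? { kind: "block", reason: `donefile broke mid-session: ${donefile.parseError}` }
    : { kind: "allow", note: `pre-existing breakage: ${donefile.parseError}` };
}

const baseline: Baseline = { donefile_path: "DONE.md", donefile_sha: "abc123" };
console.log(decide({ status: "missing" }, baseline).kind); // block
console.log(decide({ status: "missing" }, null).kind); // allow
```

Taking the file state as plain data keeps the sketch free of filesystem access, so each row of the table can be checked in isolation.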

Verified end-to-end through the built CLI:

$ donegate baseline --quiet && rm DONE.md
$ echo '{"session_id":"s1", ...}' | donegate hook claude
{"decision":"block","reason":"donegate: NOT DONE — DONE.md was deleted mid-session (attempt 1/3). ..."}

What deliberately did NOT change

No-trap guarantees hold. Both new paths share the per-session bounce budget (the default 3 — the donefile that would configure gate.max_bounces is exactly what's missing/unreadable), then give up loudly. Cursor aborted/ctrl-c turns are never gated. Repos that never opted in (no donefile, no baseline) remain silent no-ops.
Pre-existing breakage still fails open. A donefile that was already broken when the session started (no baseline, or hash unchanged) warns and allows, as before — a config typo must never trap an agent that didn't cause it.
The give-up paths tell humans the legitimate off switch: delete .donegate/ too when removing donegate for real.

Also in this PR

docs/threat-model.md — an honest map: what the gate catches outright, what it deliberately can't (semantic cheats like weakened assertions, command indirection through package.json, attacks on donegate's own state from inside the sandbox), and why donegate install ci + branch protection is the actual security boundary.
README: guards framed as a "ratchet, not a sandbox" with a pointer to the threat model; FAQ updated to cover delete/corrupt; hooks docs explain the new behavior.
Refactor: the three blocking paths share one bounceOrGiveUp() helper instead of triplicating the protocol JSON + bounce-state logic.
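
As a rough illustration of the shared bounce budget, here is a minimal sketch of what a helper like `bounceOrGiveUp()` could look like. Only the helper's name, the default of 3, and the `{"decision":"block", ...}` shape come from this PR; the signature, the `BounceState` record, the give-up result shape, and the message wording are assumptions made for the example.

```typescript
// Hypothetical per-session record of how many stops have been bounced.
interface BounceState {
  attempts: number;
}

// The PR states the default budget is 3; the constant name is illustrative.
const DEFAULT_MAX_BOUNCES = 3;

type HookResult =
  | { decision: "block"; reason: string }
  | { decision: "allow"; warning: string }; // assumed shape for the give-up path

function bounceOrGiveUp(
  state: BounceState,
  finding: string,
  maxBounces: number = DEFAULT_MAX_BOUNCES,
): HookResult {
  // Budget spent: give up loudly rather than trap the agent forever.
  if (state.attempts >= maxBounces) {
    return {
      decision: "allow",
      warning: `donegate: giving up after ${maxBounces} bounces (${finding})`,
    };
  }
  // Otherwise consume one bounce and block the stop.
  state.attempts += 1;
  return {
    decision: "block",
    reason: `donegate: NOT DONE - ${finding} (attempt ${state.attempts}/${maxBounces})`,
  };
}

const session: BounceState = { attempts: 0 };
for (let i = 0; i < 4; i++) {
  console.log(bounceOrGiveUp(session, "DONE.md was deleted mid-session").decision);
}
// prints: block, block, block, allow
```

Because every blocking path would draw on the same `BounceState`, a deleted donefile and a failing guard cannot each claim their own three attempts within one session.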

Tests

Three new cases in test/hooks.test.ts (79 total, all passing): deletion bounces three times, gives up, then recovers on restore; corruption bounces, then recovers on repair; Cursor aborted turns stay ungated even with the donefile gone. node dist/cli.js check (the repo's own gate) is clean, with all six guards green.

🤖 Generated with Claude Code

…es the gate

The stop hook had two fail-open holes, each one command away from disarming the gate:

- rm DONE.md → findDonefile() comes up empty → silent no-op allow. The no_done_edits guard's "deleted" finding was unreachable through the hook: it needs the donefile to load before any guard can run.
- a DONE.md that no longer parses → loadConfig() throws → warn and allow, even when the breakage happened mid-session.

Both paths now bounce the stop when the session baseline proves the donefile existed (and what it hashed to) at session start:

- missing donefile + orphaned .donegate/baseline.json → block with restore instructions
- unparseable donefile whose hash no longer matches the baseline → block with the parse error in the report

The no-trap guarantees hold: both paths share the per-session bounce budget (the default 3, since the donefile that would configure gate.max_bounces is exactly what's missing or unreadable), Cursor ctrl-c/aborted turns are still never gated, repos that never opted in are still silent no-ops, and a donefile that was already broken before the session started still fails open.

Also: docs/threat-model.md, an honest map of what the gate catches, what it deliberately doesn't (semantic cheats, command indirection, attacks on donegate's own state), and why CI + branch protection is the actual boundary.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

alexandertomana mentioned this pull request Jun 10, 2026

feat: loop-age features — guards.protect, SubagentStop adapter, check --against, progress-aware bounces #2

Open

alexandertomana wants to merge 1 commit into main from fix/donefile-tamper-bypass
