diff --git a/CHANGELOG.md b/CHANGELOG.md index a38ec92..61668e0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,20 @@ # Changelog +## Unreleased + +- **The donefile can no longer be deleted or broken out of the way.** The stop + hook used to treat a missing DONE.md as "not my repo" and an unparseable one + as a config typo — both fail-open, both one `rm` or one bad edit away from + disarming the gate mid-session. Now, when the session baseline records a + donefile that has since vanished or stopped matching, the stop is bounced + with restore instructions instead (bounded by the default bounce budget). + Repos that never opted in are still silent no-ops, and a donefile that was + already broken before the session started still fails open — a config typo + must never trap an agent that didn't cause it. +- `docs/threat-model.md` — an honest map of what the gate catches, what it + deliberately doesn't (semantic cheats, command indirection, attacks on + donegate's own state), and why CI + branch protection is the actual boundary. + ## 0.1.0 Initial release. diff --git a/README.md b/README.md index bb5a37b..d23f256 100644 --- a/README.md +++ b/README.md @@ -174,6 +174,13 @@ these patterns (lint configs, pattern scanners) go in `guards.exclude`, and a deliberate DONE.md edit is blessed with `donegate baseline`. Renames are followed, so moving a test file is never "deleting" it. +Guards are a **ratchet, not a sandbox**: they make the cheap, common shortcuts +loud and expensive, with receipts. An agent with shell access can still find +quieter moves — weakening assertions, redefining what `npm test` means in +package.json, re-blessing the baseline itself. What the gate catches, what it +deliberately doesn't, and why CI is the copy of the gate an agent can't touch: +[docs/threat-model.md](docs/threat-model.md). + ## Works with | | command | mechanism | @@ -187,9 +194,11 @@ followed, so moving a test file is never "deleting" it. Hooks are installed **project-level by default** (commit them — the whole team is gated) or `--global` for every repo you touch, and they carry explicit generous timeouts so a long test suite is never cut off by an agent's 60-second -hook default. Repos without a DONE.md are silently ignored, an agent can never -be trapped (see *bounce protection* in [docs/hooks.md](docs/hooks.md)), and -ctrl-c always means ctrl-c. +hook default. Repos without a DONE.md are silently ignored — though deleting +or breaking DONE.md *mid-session* doesn't make a repo one of those; the +baseline remembers, and the stop gets bounced. An agent can never be trapped +(see *bounce protection* in [docs/hooks.md](docs/hooks.md)), and ctrl-c always +means ctrl-c. ## Receipts, not vibes @@ -222,8 +231,10 @@ Code 3 is the one to alert a human on. ## FAQ (the short version) -**Won't the agent just edit DONE.md?** That trips `no_done_edits`. The -definition of done is not the agent's to edit. +**Won't the agent just edit DONE.md?** That trips `no_done_edits`. Deleting it +or breaking its yaml doesn't work either — the session baseline remembers it +existed, and the stop hook bounces until it's restored. The definition of done +is not the agent's to edit. **Won't it delete the failing test?** That trips `no_deleted_tests` — file deletions *and* per-file test-count drops. diff --git a/docs/hooks.md b/docs/hooks.md index 289dd0b..05bd7da 100644 --- a/docs/hooks.md +++ b/docs/hooks.md @@ -89,3 +89,26 @@ increments a per-session bounce counter (`.donegate/state.json`, pruned after stop through with a loud warning — but it keeps verifying, so the receipt always tells the truth. Sessions that recover reset their counter on the first green run. + +## When the gate itself is the target + +A stop hook that can be disarmed by deleting its config file isn't much of a +gate, so the donefile gets special handling: + +- **Deleted mid-session** — repos without a DONE.md are normally silent + no-ops, but if `.donegate/baseline.json` records a donefile that has since + vanished, the stop is bounced with restore instructions instead. Removing + donegate for real is still easy — delete `.donegate/` too — it's just not + something an agent can do as a shortcut without it showing. +- **Broken mid-session** — a donefile that no longer parses *and* no longer + matches the baseline hash bounces the stop, with the parse error in the + report. One that was already broken when the session started (or has no + baseline at all) warns and allows, unchanged: a pre-existing config typo + must never trap an agent that didn't cause it. + +Both paths use the default bounce budget (3) — the donefile that would +normally configure `gate.max_bounces` is exactly the thing that's missing or +unreadable — and both keep every no-trap guarantee: bounded bounces, ctrl-c +respected, never-opted-in repos untouched. The wider map of what an agent +could still do, and why CI is the backstop, is in +[threat-model.md](threat-model.md). diff --git a/docs/threat-model.md b/docs/threat-model.md new file mode 100644 index 0000000..5ec6930 --- /dev/null +++ b/docs/threat-model.md @@ -0,0 +1,83 @@ +# Threat model + +donegate gates an agent **from inside the agent's own sandbox**. The hook +config, the baseline, the bounce state — even donegate's installed code — all +live in files the agent could touch. That makes the design question precise: +which cheats does the gate make *impossible*, which does it make *loud and +expensive*, and which does it merely leave evidence of? This page is the +honest map. + +## What the gate catches outright + +Deterministic, diff-based, with `file:line` receipts: + +| move | what happens | +|---|---| +| stop while checks fail | bounced, failure output fed back into context | +| `.skip` / `.only` / `xfail` / `t.Skip()` / `#[ignore]` added | `no_new_skips` → exit 3 | +| test file deleted, or a per-file test count dropping | `no_deleted_tests` → exit 3 | +| `eslint-disable` / `@ts-ignore` / `# noqa` / `//nolint` added | `no_disabled_lint` → exit 3 | +| DONE.md edited mid-session | `no_done_edits` → exit 3 | +| DONE.md **deleted** mid-session | stop bounced — the baseline remembers it existed | +| DONE.md **broken** mid-session (no longer parses) | stop bounced — its hash no longer matches the baseline | + +Exit 3 — *"checks pass but the bar was lowered to get there"* — is +deliberately a distinct, alarmable signal. + +## What the gate cannot catch (and won't pretend to) + +These are out of reach for any deterministic, no-LLM gate, and you should know +that rather than guess it: + +- **Semantic cheats inside the tests.** Weakening an assertion + (`expect(total).toBe(total)`), gutting a test body while keeping its name, + deleting a hard test and adding a trivial one in the same file (the per-file + count stays flat), updating snapshots to bake a bug in, or writing both the + bug and the test that blesses it. Catching these means judging *meaning*; a + diff and a regex can't, and donegate refuses to guess. +- **Command indirection.** DONE.md is hash-guarded, but `run: npm test` + resolves through package.json — and package.json, jest.config, pytest.ini, + tsconfig, and the linter config are all agent-editable. `"test": "exit 0"` + is green. Mitigations: prefer direct invocations in your checks + (`npx vitest run` rather than `npm test`) and treat tooling-config diffs as + review-worthy, because they are. +- **Attacks on donegate's own state.** An agent with shell access can run + `donegate baseline` to bless its own donefile edit, hand-edit + `.donegate/state.json` to burn the bounce budget, or patch + `node_modules/donegate` itself. Local hooks raise the cost of cheating and + leave a trail; they are not a security boundary. + +## Where the boundary actually is + +The copy of the gate an agent can't touch is the one that runs **outside its +sandbox**: + +1. **`donegate install ci`** — the same DONE.md, the same guards, evaluated + against the merge-base on a runner the agent never sees. A deleted DONE.md + is a config error there (exit 2), not a no-op. +2. **Branch protection** — make the donegate job a required check, so "merge + anyway" isn't on the table. +3. **Receipts in review** — the CI install posts the receipt (guard findings + included) as a PR comment, so a lowered bar is visible to the humans who + own the bar. + +Local hooks are the fast loop: catch it while the agent still has the context +to fix it. CI is the slow loop: catch whatever survived. The combination is +the design — neither alone is. + +## Design choices worth knowing + +- **Bounded blocking, always.** Every blocking path has a bounce budget + (`gate.max_bounces`, default 3 — the default also applies when the donefile + is missing or unreadable, since its config is too). After that the stop is + allowed with a loud red note, and the receipt stays honest. A gate that can + hold a session hostage is worse than a gate that can be outlasted. +- **Fail open on infrastructure, fail closed on work.** A donefile that was + already broken *before* the session (no baseline, or hash unchanged since + it) warns and allows — donegate won't trap an agent in a problem it didn't + cause. A donefile that broke *during* the session is treated as work, not + infrastructure: bounced. +- **Receipts are evidence, not proof.** They're plain JSON written by whatever + ran the gate, not cryptographically signed artifacts. The local ones are for + fast feedback and honest retrospectives; the ones CI writes and posts are + the ones to trust. diff --git a/src/donefile.ts b/src/donefile.ts index fe28340..c8db980 100644 --- a/src/donefile.ts +++ b/src/donefile.ts @@ -40,7 +40,10 @@ const DEFAULT_GUARDS: GuardsConfig = { exclude: [], }; -const DEFAULT_GATE: GateConfig = { max_bounces: 3 }; +/** Bounce budget used when there is no (readable) donefile to say otherwise. */ +export const DEFAULT_MAX_BOUNCES = 3; + +const DEFAULT_GATE: GateConfig = { max_bounces: DEFAULT_MAX_BOUNCES }; const CANDIDATES = ['DONE.md', 'done.yml', 'done.yaml', path.join('.donegate', 'done.yml')]; diff --git a/src/hooks.ts b/src/hooks.ts index b36679b..4a0c404 100644 --- a/src/hooks.ts +++ b/src/hooks.ts @@ -1,9 +1,9 @@ import fs from 'node:fs'; import path from 'node:path'; -import type { CheckRunSummary, DoneConfig, GuardFinding, Receipt } from './types.js'; -import { findDonefile, loadConfig } from './donefile.js'; +import type { Baseline, CheckRunSummary, DoneConfig, GuardFinding, Receipt } from './types.js'; +import { DEFAULT_MAX_BOUNCES, findDonefile, loadConfig } from './donefile.js'; import { verify } from './check.js'; -import { DONEGATE_DIR, baselinePath, writeBaseline } from './baseline.js'; +import { DONEGATE_DIR, baselinePath, loadBaseline, sha256, writeBaseline } from './baseline.js'; import { ms } from './ui.js'; export type HookAgent = 'claude' | 'codex' | 'cursor'; @@ -140,39 +140,144 @@ export interface HookOutcome { exitCode: number; } +/** + * Walk upward looking for a `.donegate/baseline.json` whose recorded donefile + * no longer exists. findDonefile() already searched these directories and came + * up empty, so a hit means the donefile was deleted (or renamed away) after + * the baseline was taken — mid-session, by definition. + */ +function findOrphanedBaseline(cwd: string): { root: string; baseline: Baseline } | null { + let dir = path.resolve(cwd); + while (true) { + const baseline = loadBaseline(dir); + if (baseline) { + if ( + typeof baseline.donefile_path === 'string' && + baseline.donefile_path !== '' && + !fs.existsSync(path.join(dir, baseline.donefile_path)) + ) { + return { root: dir, baseline }; + } + // The nearest baseline governs; if its donefile is accounted for (or it + // doesn't record one), there is nothing orphaned here. + return null; + } + const parent = path.dirname(dir); + if (parent === dir) return null; + dir = parent; + } +} + +/** Block the stop (incrementing the session's bounce count), or give up loudly once the budget is spent. */ +function bounceOrGiveUp(options: { + agent: HookAgent; + root: string; + sessionId: string; + maxBounces: number; + reason: (attempt: number) => string; + giveUp: (bounces: number) => string; +}): HookOutcome { + const state = loadState(options.root); + const bounces = state.sessions[options.sessionId]?.bounces ?? 0; + + if (bounces >= options.maxBounces) { + return { stdout: null, stderr: options.giveUp(bounces), exitCode: 0 }; + } + + const attempt = bounces + 1; + state.sessions[options.sessionId] = { bounces: attempt, updated_at: new Date().toISOString() }; + saveState(options.root, state); + const reason = options.reason(attempt); + + if (options.agent === 'cursor') { + return { stdout: JSON.stringify({ followup_message: reason }), stderr: null, exitCode: 0 }; + } + // claude + codex share the decision/block contract. + return { stdout: JSON.stringify({ decision: 'block', reason }), stderr: null, exitCode: 0 }; +} + export async function runStopHook(agent: HookAgent, rawStdin: string): Promise { const payload = parsePayload(rawStdin); const cwd = resolveCwd(payload); - // No DONE.md → never interfere. A globally-installed hook must be a no-op - // in repos that haven't opted in. - const found = findDonefile(cwd); - if (!found) return { stdout: null, stderr: null, exitCode: 0 }; - // Cursor tells us how the turn ended; don't harass a user who hit ctrl-c. if (agent === 'cursor' && payload.status && payload.status !== 'completed') { return { stdout: null, stderr: null, exitCode: 0 }; } + const sessionId = payload.session_id ?? payload.conversation_id ?? 'default'; + + // No DONE.md → never interfere. A globally-installed hook must be a no-op in + // repos that haven't opted in. The exception is a session baseline whose + // donefile has vanished: deleting DONE.md mid-session is not an off switch. + const found = findDonefile(cwd); + if (!found) { + const orphan = findOrphanedBaseline(cwd); + if (!orphan) return { stdout: null, stderr: null, exitCode: 0 }; + const name = orphan.baseline.donefile_path; + return bounceOrGiveUp({ + agent, + root: orphan.root, + sessionId, + // The donefile (and its gate.max_bounces with it) is gone — use the default. + maxBounces: DEFAULT_MAX_BOUNCES, + reason: (attempt) => + [ + `donegate: NOT DONE — ${name} was deleted mid-session (attempt ${attempt}/${DEFAULT_MAX_BOUNCES}).`, + '', + `The session baseline (${path.join(DONEGATE_DIR, 'baseline.json')}) records that ${name} existed when this session started, and deleting the donefile does not turn the gate off. Restore it (e.g. \`git checkout -- ${name}\`), fix what is actually failing, then finish again. If a human is deliberately removing donegate from this repo, they should delete the ${DONEGATE_DIR}/ directory as well — that switch is not the agent's to flip.`, + ].join('\n'), + giveUp: (bounces) => + `donegate: ✗ ${name} is still missing after ${bounces} bounce${bounces > 1 ? 's' : ''} — giving up and allowing the stop. Restore ${name}, or delete ${DONEGATE_DIR}/ to remove the gate for real.`, + }); + } + let config: DoneConfig; try { config = loadConfig(cwd); } catch (err) { + const msg = err instanceof Error ? err.message : String(err); + // If the donefile stopped parsing *and* stopped matching the session + // baseline, the breakage happened mid-session — treat it like the + // tampering it almost certainly is, not like a config typo to fail open on. + const baseline = loadBaseline(found.root); + if (baseline && typeof baseline.donefile_sha === 'string') { + let changed: boolean; + try { + changed = sha256(fs.readFileSync(found.sourcePath)) !== baseline.donefile_sha; + } catch { + changed = true; // readable when the baseline was taken, unreadable now + } + if (changed) { + const name = path.relative(found.root, found.sourcePath).split(path.sep).join('/') || 'DONE.md'; + return bounceOrGiveUp({ + agent, + root: found.root, + sessionId, + // The config is unreadable, so its gate.max_bounces is too — use the default. + maxBounces: DEFAULT_MAX_BOUNCES, + reason: (attempt) => + [ + `donegate: NOT DONE — ${name} was modified mid-session and no longer parses (attempt ${attempt}/${DEFAULT_MAX_BOUNCES}): ${msg}`, + '', + `The definition of done is not the agent's to edit. Restore the donefile (e.g. \`git checkout -- ${name}\`), fix the real failures, then finish again. A deliberate donefile edit must be repaired and blessed by a human with \`npx donegate baseline\`.`, + ].join('\n'), + giveUp: (bounces) => + `donegate: ✗ ${name} still does not parse after ${bounces} bounce${bounces > 1 ? 's' : ''} — giving up and allowing the stop: ${msg}`, + }); + } + } // A broken DONE.md should surface to the user, not silently allow stops — // but blocking the agent forever on a config typo is worse. Warn and allow. - const msg = err instanceof Error ? err.message : String(err); return { stdout: null, stderr: `donegate: ${msg} — allowing stop`, exitCode: 0 }; } - const sessionId = payload.session_id ?? payload.conversation_id ?? 'default'; - const state = loadState(config.root); - const bounces = state.sessions[sessionId]?.bounces ?? 0; - // Always verify — the receipt should reflect reality even when we've stopped // blocking. We give up on bouncing, never on checking. const summary = await verify({ cwd, config, via: agent }); if (summary.exitCode === 0) { + const state = loadState(config.root); if (state.sessions[sessionId]) { delete state.sessions[sessionId]; saveState(config.root, state); @@ -184,34 +289,15 @@ export async function runStopHook(agent: HookAgent, rawStdin: string): Promise= config.gate.max_bounces) { - return { - stdout: null, - stderr: `donegate: ✗ still NOT DONE after ${bounces} bounce${bounces > 1 ? 's' : ''} — giving up and allowing the stop. The receipt is red: ${path.join(DONEGATE_DIR, 'receipts', 'latest.json')}`, - exitCode: 0, - }; - } - - const bounce = bounces + 1; - state.sessions[sessionId] = { bounces: bounce, updated_at: new Date().toISOString() }; - saveState(config.root, state); - - const reason = buildReason(summary, bounce, config.gate.max_bounces); - - if (agent === 'cursor') { - return { - stdout: JSON.stringify({ followup_message: reason }), - stderr: null, - exitCode: 0, - }; - } - - // claude + codex share the decision/block contract. - return { - stdout: JSON.stringify({ decision: 'block', reason }), - stderr: null, - exitCode: 0, - }; + return bounceOrGiveUp({ + agent, + root: config.root, + sessionId, + maxBounces: config.gate.max_bounces, + reason: (attempt) => buildReason(summary, attempt, config.gate.max_bounces), + giveUp: (bounces) => + `donegate: ✗ still NOT DONE after ${bounces} bounce${bounces > 1 ? 's' : ''} — giving up and allowing the stop. The receipt is red: ${path.join(DONEGATE_DIR, 'receipts', 'latest.json')}`, + }); } /** SessionStart hook: record a fresh tamper baseline unless one already exists. */ diff --git a/test/hooks.test.ts b/test/hooks.test.ts index 0d3e7db..c0c8927 100644 --- a/test/hooks.test.ts +++ b/test/hooks.test.ts @@ -1,7 +1,7 @@ import { test } from 'node:test'; import assert from 'node:assert/strict'; import { runBaselineHook, runStopHook } from '../src/hooks.js'; -import { BASIC_DONEFILE, cleanup, gitCommitAll, gitInit, tmpdir, write } from './helpers.js'; +import { BASIC_DONEFILE, cleanup, gitCommitAll, gitInit, rm, tmpdir, write } from './helpers.js'; const FAILING_DONEFILE = `# DoD \`\`\`yaml @@ -151,3 +151,79 @@ test('baseline hook records once with --if-missing', async () => { cleanup(root); } }); + +test('deleting the donefile mid-session bounces instead of bypassing the gate', async () => { + const root = await setup(BASIC_DONEFILE); + try { + await runBaselineHook({ ifMissing: false, quiet: true, cwd: root }); + rm(root, 'DONE.md'); + + for (const attempt of [1, 2, 3]) { + const outcome = await runStopHook('claude', payload(root)); + assert.ok(outcome.stdout, 'expected a block'); + const response = JSON.parse(outcome.stdout) as { decision: string; reason: string }; + assert.equal(response.decision, 'block'); + assert.match(response.reason, /DONE\.md was deleted mid-session/); + assert.match(response.reason, new RegExp(`attempt ${attempt}/3`)); + } + + // bounce budget spent → allow, loudly, with the way out spelled out + const final = await runStopHook('claude', payload(root)); + assert.equal(final.stdout, null); + assert.match(final.stderr ?? '', /giving up/); + assert.equal(final.exitCode, 0); + + // restoring the donefile brings the gate back to green and resets the session + write(root, 'DONE.md', BASIC_DONEFILE); + const restored = await runStopHook('claude', payload(root)); + assert.equal(restored.stdout, null); + assert.match(restored.stderr ?? '', /✓ DONE/); + } finally { + cleanup(root); + } +}); + +test('breaking the donefile mid-session bounces instead of failing open', async () => { + const root = await setup(BASIC_DONEFILE); + try { + await runBaselineHook({ ifMissing: false, quiet: true, cwd: root }); + write(root, 'DONE.md', '# the yaml block went missing\n'); + + const outcome = await runStopHook('claude', payload(root)); + assert.ok(outcome.stdout, 'expected a block'); + const response = JSON.parse(outcome.stdout) as { decision: string; reason: string }; + assert.equal(response.decision, 'block'); + assert.match(response.reason, /modified mid-session and no longer parses/); + + // repairing the donefile brings the gate back to green + write(root, 'DONE.md', BASIC_DONEFILE); + const repaired = await runStopHook('claude', payload(root)); + assert.equal(repaired.stdout, null); + assert.match(repaired.stderr ?? '', /✓ DONE/); + } finally { + cleanup(root); + } +}); + +test('cursor: aborted turns are not gated even when the donefile is gone', async () => { + const root = await setup(BASIC_DONEFILE); + try { + await runBaselineHook({ ifMissing: false, quiet: true, cwd: root }); + rm(root, 'DONE.md'); + + const aborted = await runStopHook( + 'cursor', + JSON.stringify({ conversation_id: 'c', workspace_roots: [root], status: 'aborted' }), + ); + assert.equal(aborted.stdout, null); + + const completed = await runStopHook( + 'cursor', + JSON.stringify({ conversation_id: 'c', workspace_roots: [root], status: 'completed' }), + ); + const response = JSON.parse(completed.stdout!) as { followup_message: string }; + assert.match(response.followup_message, /deleted mid-session/); + } finally { + cleanup(root); + } +});