feat: loop-age features — guards.protect, SubagentStop adapter, check --against, progress-aware bounces #2

Open
alexandertomana wants to merge 1 commit into fix/donefile-tamper-bypass from feat/agent-loops

@alexandertomana

Stacked on #1 (will auto-retarget to main when #1 merges). Only the last commit is new here.

Coding-agent workflows are becoming loops that fan out — orchestrators, subagents, worktrees, workflow scripts. donegate gated exactly one point of that topology: the terminal Stop. This PR gives it the other seats, plus closes the biggest practical guard bypass.

1. guards.protect — pin the files the checks mean

DONE.md is hash-guarded, but a check like run: npm test resolves through package.json, so rewriting the script to "test": "exit 0" went green with no finding. Now:

guards:
  protect:
    - package.json
    - eslint.config.js

Protected files are hashed into the baseline like the donefile itself. Changed, deleted, or newly shadowed (a new file matching the globs can override config resolution) → no_protected_edits → exit 3. With no baseline it falls back to the git diff against the comparison ref, so it works in CI. An empty protect adds zero receipt noise — existing repos see no new guard row.
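
The changed / deleted / shadowed classification can be sketched as a comparison of two path-to-hash snapshots. This is a minimal illustration under stated assumptions, not donegate's implementation: ProtectSnapshot, diffProtected, and the placeholder hash strings are hypothetical, and the shadowing case assumes a protect glob such as eslint.config.* that a new file can match.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not donegate's API.
// A snapshot maps each file matched by guards.protect to its content hash.
type ProtectSnapshot = Record<string, string>;

interface ProtectFinding {
  path: string;
  kind: "changed" | "deleted" | "shadowed";
}

function hasPath(snapshot: ProtectSnapshot, path: string): boolean {
  return Object.prototype.hasOwnProperty.call(snapshot, path);
}

// Compare the baseline snapshot with the files matching the globs now.
function diffProtected(
  baseline: ProtectSnapshot,
  current: ProtectSnapshot
): ProtectFinding[] {
  const findings: ProtectFinding[] = [];
  for (const path of Object.keys(baseline)) {
    if (!hasPath(current, path)) {
      findings.push({ path, kind: "deleted" });
    } else if (current[path] !== baseline[path]) {
      findings.push({ path, kind: "changed" });
    }
  }
  for (const path of Object.keys(current)) {
    // A file that matches the globs but was absent from the baseline could
    // override config resolution, so it is reported as well.
    if (!hasPath(baseline, path)) {
      findings.push({ path, kind: "shadowed" });
    }
  }
  return findings;
}

// Placeholder hashes; a real run would hash file contents.
const baselineSnapshot: ProtectSnapshot = {
  "package.json": "aaa",
  "eslint.config.js": "bbb",
};
const currentSnapshot: ProtectSnapshot = {
  "package.json": "ccc", // scripts rewritten
  "eslint.config.js": "bbb", // untouched
  "eslint.config.mjs": "ddd", // new file matching the assumed glob
};

const protectFindings = diffProtected(baselineSnapshot, currentSnapshot);
// Any finding trips no_protected_edits, which this PR maps to exit 3.
const protectExitCode = protectFindings.length > 0 ? 3 : 0;
console.log(protectFindings, protectExitCode);
```

Comparing snapshots rather than diffs is what lets the deleted and shadowed cases surface alongside plain edits; the git-diff fallback for CI is not modeled here.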

2. SubagentStop adapter — gate the fan-out per node

donegate install claude now also wires SubagentStop to donegate hook claude --subagent: a guards-only scan (no checks, just git diffs + regexes, ~100ms) at every subagent boundary. A subagent that skipped tests, deleted tests, or touched a protected file is bounced while it still has the context to undo it, instead of the tampering surfacing at the terminal stop after its output was absorbed. Read-only subagents change nothing and pass for the cost of one diff.

Subagent bounces use their own ledger (<session>:subagent) so a noisy fan-out can't burn the terminal gate's bounce budget. uninstall removes the hook; codex/cursor are untouched (no equivalent event).
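
The ledger isolation can be sketched as keying bounce counts by session, with a :subagent suffix for subagent stops. Only the <session>:subagent key shape comes from this PR; BounceLedger, ledgerKey, and recordBounce are hypothetical names, and storing counts in a plain object is an assumption.

```typescript
// Hypothetical sketch; the storage shape is an assumption.
type BounceLedger = Record<string, number>;

// Terminal stops use the session id; subagent stops use "<session>:subagent".
function ledgerKey(sessionId: string, subagent: boolean): string {
  return subagent ? sessionId + ":subagent" : sessionId;
}

// Increment and return the bounce count for the matching ledger entry.
function recordBounce(
  ledger: BounceLedger,
  sessionId: string,
  subagent: boolean
): number {
  const key = ledgerKey(sessionId, subagent);
  const next = (ledger[key] || 0) + 1;
  ledger[key] = next;
  return next;
}

const ledger: BounceLedger = {};
// Five subagent bounces from a noisy fan-out...
for (let i = 0; i < 5; i++) {
  recordBounce(ledger, "s1", true);
}
// ...leave the terminal gate's count untouched until it bounces itself.
const terminalCount = recordBounce(ledger, "s1", false);
console.log(ledger, terminalCount);
```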

3. donegate check --against <ref> — judge mode

Fan-out patterns end with verification, and the verifier should be deterministic. --against pins the comparison to an explicit ref — each worktree judged against its fork point, CI pinned to the PR base — ignoring the session baseline (judge mode judges a diff, not a session). That also makes it the antidote to a re-blessed baseline: the E2E smoke shows a committed-and-re-blessed .skip passing the session gate but exiting 3 under --against <fork>. A nonexistent ref is exit 2, never a silent pass. Receipts record baseline.kind: "explicit".
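
The exit-code contract stated above can be sketched as a small pure function. This is an illustration only: JudgeInput, judge, and the ref field on the receipt are hypothetical, ref resolution is assumed to happen in the caller, and failing checks are left out because this description does not state their exit code.

```typescript
// Hypothetical sketch of the verdict mapping; not donegate's actual code.
interface JudgeInput {
  ref: string;
  refExists: boolean; // whether the ref resolved (assumed checked by the caller)
  guardFindings: number; // guards tripped in the diff against the ref
}

interface JudgeResult {
  exitCode: 0 | 2 | 3;
  // Only baseline.kind is stated in the PR; the ref field is an assumption.
  receipt?: { baseline: { kind: "explicit"; ref: string } };
}

function judge(input: JudgeInput): JudgeResult {
  if (!input.refExists) {
    // A nonexistent ref is a config error, never a silent pass.
    return { exitCode: 2 };
  }
  const receipt = { baseline: { kind: "explicit" as const, ref: input.ref } };
  return { exitCode: input.guardFindings > 0 ? 3 : 0, receipt };
}

console.log(judge({ ref: "no-such-ref", refExists: false, guardFindings: 0 }));
console.log(judge({ ref: "fork-point", refExists: true, guardFindings: 1 }));
console.log(judge({ ref: "fork-point", refExists: true, guardFindings: 0 }));
```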

4. Progress-aware bounce budget

A fixed max_bounces: 3 cuts off an agent steadily fixing a 5-item failure list — the opposite of loop-until-done. The budget now counts consecutive bounces without new progress: a stop attempt with strictly fewer failing checks + tripped guards than the session's best refreshes the budget and tells the agent so. Best-ever (not better-than-last) is the bar, so oscillating between failure sets can't farm refreshes — total bounces stay bounded, wedged sessions still exit with a red receipt, and the anti-hostage guarantee is intact.
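
The budget rule can be sketched as a small state machine. This is a minimal model under assumptions, not donegate's implementation: BounceState, onStopAttempt, and the "allow" / "release" decision names are hypothetical (only "block" appears in this PR), the first failing attempt is assumed to count as a new best, and a refreshing bounce is assumed to consume the first slot of the new budget.

```typescript
// Hypothetical sketch; state shape and names are assumptions, not donegate's API.
interface BounceState {
  best: number; // fewest failures (failing checks + tripped guards) seen so far
  used: number; // consecutive bounces since the last new best
}

type StopDecision = "allow" | "block" | "release";

interface StopOutcome {
  decision: StopDecision;
  refreshed: boolean;
  state: BounceState;
}

const initialBounceState: BounceState = { best: Infinity, used: 0 };

function onStopAttempt(
  state: BounceState,
  failures: number,
  maxBounces: number
): StopOutcome {
  if (failures === 0) {
    return { decision: "allow", refreshed: false, state };
  }
  // Only a result strictly better than the session's best counts as progress.
  const refreshed = failures < state.best;
  const next: BounceState = {
    best: refreshed ? failures : state.best,
    used: (refreshed ? 0 : state.used) + 1,
  };
  // Budget exhausted: let the agent stop, with a red receipt.
  const decision: StopDecision = next.used > maxBounces ? "release" : "block";
  return { decision, refreshed, state: next };
}

// Replay a sequence of stop attempts and collect the decisions.
function replay(failureCounts: number[], maxBounces: number): StopDecision[] {
  let state = initialBounceState;
  const decisions: StopDecision[] = [];
  for (const failures of failureCounts) {
    const outcome = onStopAttempt(state, failures, maxBounces);
    decisions.push(outcome.decision);
    state = outcome.state;
  }
  return decisions;
}

// Steady progress on a 5-item list is never cut off with max_bounces: 3.
console.log(replay([5, 4, 3, 2, 1, 0], 3));
// Oscillating between 5 and 4 failures earns no refresh after the first 4.
console.log(replay([5, 4, 5, 4, 5], 3));
```

Because best only ever decreases and each refresh needs a strictly lower count, the number of refreshes is capped by the initial failure count, which is why total bounces stay bounded in this model.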

Docs

  • New docs/agent-loops.md: the three seats (terminal gate / per-node scan / judge mode), worktree behavior, bounded loop-until-done.
  • spec.md (schema + semantics), hooks.md (SubagentStop, bounce semantics), threat-model.md (command-indirection entry updated — it's now closable), README (guard table, FAQ, fan-out pointer).

Verification

  • 88/88 tests (9 new: protect baseline/CI modes, subagent guards-only + ledger isolation, judge-past-blessed-baseline + bad-ref rejection, progress-refresh/stall/recover sequence, install/uninstall wiring, donefile parsing).
  • Dogfood gate green (node dist/cli.js check), all guards clean, no new receipt noise.
  • E2E through the built CLI: package.json rewrite → exit 3; .skip at a subagent boundary → decision:block; blessed-baseline blind (exit 0) but --against catches it (exit 3); bad ref → exit 2.

🤖 Generated with Claude Code

…st, progress-aware bounces

Four features that put donegate inside agentic fan-out workflows instead
of only at the session's terminal stop:

- guards.protect + no_protected_edits: pin the files the checks *mean*
  (package.json scripts, lint/test configs). Hashed into the baseline
  like the donefile; changed, deleted, or newly shadowing files are
  findings. Falls back to the git diff when there is no baseline, so it
  works in CI. Closes the '"test": "exit 0"' indirection hole.

- SubagentStop adapter (Claude Code): donegate install claude now wires
  `donegate hook claude --subagent` — a guards-only tamper scan at every
  subagent boundary. No checks run, so fan-outs are gated per node at
  git-diff cost; findings bounce the subagent while it still has the
  context to undo them. Subagent bounces keep their own ledger so a
  noisy fan-out can't burn the terminal gate's budget.

- donegate check --against <ref>: judge mode. Evaluates checks + guards
  against an explicit ref, ignoring the session baseline — grade each
  worktree against its fork point from a workflow script, pin CI to the
  PR base, or re-derive a verdict past a re-blessed baseline. Receipts
  record kind "explicit"; a nonexistent ref is a config error (exit 2),
  never a silent pass.

- Progress-aware bounce budget: gate.max_bounces now counts consecutive
  bounces without new progress. A stop attempt with strictly fewer
  failing checks + tripped guards than the session's best refreshes the
  budget (and says so in the reason). Best-ever is the bar, so
  oscillating failure sets can't farm refreshes and total bounces stay
  bounded — loop-until-done semantics without the hostage situation.

Docs: new docs/agent-loops.md (terminal gate / per-node scan / judge
mode, worktree behavior), spec + hooks + threat-model + README updated.
An empty guards.protect adds no receipt noise for existing repos.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>