docs(readme): Claude Code Routines comparison section (#27)

stackbilt-admin · Codebeast · claude · web-flow · commit 678372a5a4e1 · 2026-04-16T07:17:48.000-05:00
* chore: add canonical SECURITY.md Adds the standardized Stackbilt-dev security reporting template to this repository. The template is the canonical per-repo security file rolled out across the entire Stackbilt-dev organization as part of the outbound disclosure policy (Stackbilt-dev/docs#15). Key points: - Primary reporting channel: admin@stackbilt.dev - GitHub Security Advisory link scoped to this repo - Response target matrix (critical 24h ack / 7d fix, high 48h / 14d) - Full policy link at https://docs.stackbilt.dev/security/ - Explicit "do not open public GH issues for vulns" rule This replaces the implicit policy that existed via the Stackbilt-dev organization profile with an explicit per-repo file, so the GitHub security tab surfaces it and external researchers have a clear reporting path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(taskrunner): ratchet mode — measure-before-after validation Closes #16. Adds an opt-in guard that captures a baseline snapshot of typecheck + test state on main BEFORE the task branch is created, re-runs the same checks on the branch AFTER the task commits, and automatically reverts the branch (delete locally, skip push/PR, mark failed) when any check transitioned pass→fail. Opt-in paths - Per-task: `"ratchet": true` in the task JSON - Category default: `refactor` and `bugfix` tasks ratchet automatically - Environment: `CC_RATCHET=1` force-enables for every task Never ratcheted - `docs`, `tests`, `research`, `deploy` categories (no regression surface or outcomes aren't code-level) Decision rule Only pass→fail transitions revert. fail→fail (unchanged broken surface) and skip→fail (first-time check on a pre-existing breakage) are both `keep`. fail→pass is `keep`. The goal is to gate regressions, not punish tasks for inheriting broken state. Snapshot surface - `npm run typecheck` exit code → pass/fail/skip - `npm test` exit code → pass/fail/skip - Each check is independent and degrades to `skip` when the repo has no corresponding script in `package.json`. Zero new dependencies. Integration points - Baseline captured right after `git pull --ff-only`, before the task branch is checked out (so we measure true main state). - Post-validation runs after commits but BEFORE push, so a regressed branch never reaches origin and never opens a PR. - Ratchet state is local to each execute_task() call — initialized up front so operator-authority tasks (which skip branch creation) don't trip unbound-variable errors under set -u. Applied symmetrically to taskrunner.sh and plugin/taskrunner.sh. Smoke-tested ratchet_decision() against 5 transition cases: - skip→skip: keep ✓ - pass→pass: keep ✓ - pass→fail: revert (rc=1) ✓ - fail→fail: keep (no regression) ✓ - skip→fail: keep (first-time surface) ✓ Env knobs - CC_RATCHET=1|0 force-enable/disable, overrides task fields - CC_RATCHET_TIMEOUT=<seconds> per-check timeout (default: 180) - CC_DISABLE_RATCHET=1 legacy alias for CC_RATCHET=0 Closes #16 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(changelog): 1.6.0 ratchet mode entry Should've been in the prior commit but Edit bailed on an unread file. Squash candidate on merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use .git/info/exclude instead of .gitignore for worktree protection (#25) The worktree-protection pattern (C:* glob for Windows-path pollution, added in #6) was being appended to .gitignore and staged, causing every auto-generated PR to include unsolicited .gitignore modifications. Move the exclusion to .git/info/exclude, which provides identical git ignore behavior but is local to the repository and never committed. Closes #25 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(readme): add Claude Code Routines comparison section Anthropic shipped Claude Code Routines (research preview) in April 2026 — saved Claude Code configurations that run on Anthropic's cloud on a schedule, via API trigger, or on GitHub repository events. Routines and cc-taskrunner solve overlapping problems differently. New users evaluating the taskrunner deserve to know the alternative exists and when each substrate is the right fit. New "cc-taskrunner vs. Claude Code Routines" section between "Why This Exists" and "Quick Start" includes: - 12-row capability comparison table (where it runs, cost model, trigger types, cadence floor, local FS access, runs-while-laptop-closed, queue management, branch isolation, safety hooks, blast radius, GitHub event triggers, setup overhead) - Explicit "when cc-taskrunner is right" decision rubric (queue management, local FS access, sub-hour cadence, blast-radius enforcement, hook-level safety) - Explicit "when Claude Code Routines are right" decision rubric (single repeatable task, GitHub-event-driven, runs while laptop off, MCP-only mutations) - Honest disclosure that Stackbilt itself runs the taskrunner in paused mode and uses Routines for several scheduled workloads — complementary not competitive Framing throughout: pick the substrate that fits the work, neither obsoletes the other in a real ecosystem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Codebeast <codebeast@stackbilt.dev> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,29 @@ All notable changes to cc-taskrunner will be documented in this file.
 
 Format follows [Keep a Changelog](https://keepachangelog.com/).
 
+## [Unreleased]
+
+### Added
+- **README: Claude Code Routines vs cc-taskrunner comparison section.** Anthropic shipped Claude Code Routines (research preview) in April 2026 — saved Claude Code configurations that run on Anthropic's cloud on a schedule, via API trigger, or on GitHub events. Added a new section between "Why This Exists" and "Quick Start" with: a 12-row capability comparison table, explicit "when cc-taskrunner is right" / "when Routines are right" decision rubrics, and an honest disclosure that Stackbilt itself currently runs the taskrunner in paused mode and uses Routines for several scheduled workloads. Framing: complementary, not competitive — pick the substrate that fits the work.
+
+## [1.6.0] — 2026-04-11
+
+### Added
+- **Ratchet mode — measure-before-after validation for autonomous improvements** (#16). The runner now captures a baseline snapshot of `npm run typecheck` + `npm test` pass/fail on `main` before creating the task branch, re-runs the same checks on the branch after the task commits, and automatically reverts the task (delete branch, skip push/PR, mark failed) when a check transitioned `pass → fail`. Gates regressions from reaching origin.
+
+  **Opt-in paths:**
+  - `"ratchet": true` in the task JSON (explicit per-task)
+  - Category defaults: `refactor` and `bugfix` ratchet automatically
+  - `CC_RATCHET=1` environment override (force-enable every task)
+
+  **Never ratcheted:** `docs`, `tests`, `research`, `deploy` — no regression surface or outcomes aren't code-level.
+
+  **Decision rule:** only `pass → fail` transitions revert. `fail → fail` (unchanged broken surface) and `skip → fail` (first-time check on pre-existing breakage) are both `keep`. `fail → pass` is `keep` (improvement).
+
+  **Env knobs:** `CC_RATCHET=1|0`, `CC_RATCHET_TIMEOUT=<seconds>` (default 180), `CC_DISABLE_RATCHET=1` (legacy alias).
+
+  Applied symmetrically to `taskrunner.sh` and `plugin/taskrunner.sh`. Pure bash + python3 — zero new dependencies. Degrades to no-op when the repo has no `typecheck` or `test` script. Only runs on branch-isolated tasks (operator-authority tasks skip ratchet entirely).
+
 ## [1.5.0] — 2026-04-09
 
 ### Added
diff --git a/README.md b/README.md
@@ -37,6 +37,47 @@ Claude Code is powerful in interactive sessions. But there's no built-in way to:
 
 cc-taskrunner fills that gap. It's the execution layer between "Claude can write code" and "Claude can ship code safely."
 
+## cc-taskrunner vs. Claude Code Routines
+
+In April 2026 Anthropic shipped [Claude Code Routines](https://code.claude.com/docs/en/routines) (research preview) — saved Claude Code configurations that run on Anthropic's cloud infrastructure on a schedule, via API trigger, or on GitHub events. **Routines and cc-taskrunner solve overlapping problems differently.** Both have a place; pick the substrate that fits the work.
+
+|  | cc-taskrunner | Claude Code Routines |
+|---|---|---|
+| **Where it runs** | Your machine (or any box with bash + `claude` CLI) | Anthropic-managed cloud |
+| **Cost model** | Your local resources + your Claude subscription | Subscription quota only |
+| **Trigger** | Manual / loop mode (1-minute polling) | Schedule (1h cron min), API endpoint, or GitHub event |
+| **Cadence floor** | Sub-minute possible | 1 hour minimum on schedule triggers |
+| **Local filesystem access** | ✅ Full — operate on any directory | ❌ Cloned-repo only, fresh clone per fire |
+| **Runs while laptop is closed** | ❌ Needs your machine running | ✅ Cloud-managed |
+| **Queue management** | ✅ JSON file, dependencies, FIFO | ❌ One prompt per routine; multiple triggers per routine |
+| **Branch isolation** | ✅ `auto/{task-id}` per task | ✅ `claude/*`-prefixed branches enforced by default |
+| **Pre-flight safety hooks** | ✅ Bash hooks block destructive ops | ⚠️ Permission-mode-less by design (autonomous) |
+| **Blast radius gate** | ✅ via `charter blast` integration | Not built-in |
+| **GitHub event triggers** | ❌ Not designed for it | ✅ `pull_request` and `release` events |
+| **Setup overhead** | bash + python3 + `gh` CLI + clone | claude.ai account with web/Pro/Max plan |
+
+### When cc-taskrunner is the right substrate
+
+- You want to queue a backlog of tasks and run them unattended — taskrunner is built for this; routines are not (one prompt per routine)
+- You need work to happen against your **local filesystem** (paths outside any GitHub repo, machine-specific tooling, in-progress work in your worktree)
+- You need **sub-hour cadence** or want to run a continuous polling loop
+- You want to enforce blast-radius limits via [`@stackbilt/cli`](https://github.com/Stackbilt-dev/charter)'s `charter blast` before any change touches code
+- You want **bash-hook safety enforcement** that blocks destructive operations at the OS level rather than relying on prompt discipline alone
+
+### When Claude Code Routines are the right substrate
+
+- The work is a single repeatable task that fires on a schedule, on a GitHub event, or on demand via API call
+- You want it to run while your machine is off (overnight, weekends, while traveling)
+- You want **GitHub-event-driven** automation (PR review on every `pull_request.opened`, port-on-merge between SDKs, etc.)
+- The work needs to write back via MCP connectors (Slack, Linear, custom MCP servers) without local credentials
+- You don't need queue management — one prompt + one schedule + one trigger is enough
+
+### Honest disclosure
+
+Stackbilt (the project that originated cc-taskrunner) currently runs taskrunner in **paused** mode and uses Routines for several scheduled workloads. That's not because the taskrunner is broken — it's because the workloads in question (autonomous heartbeat triage, weekly cross-repo pattern scans) fit the routine substrate better. Routines and the taskrunner are **complementary** in a real ecosystem; we don't claim one obsoletes the other.
+
+If you're starting fresh and your work fits the schedule/event/API-trigger model, try Routines first — there's nothing to install. If you need queue management, sub-hour polling, local filesystem access, or hook-level safety enforcement, taskrunner remains the right tool.
+
 ## Quick Start
 
 ```bash
diff --git a/plugin/taskrunner.sh b/plugin/taskrunner.sh
@@ -143,6 +143,77 @@ print("\n".join(lines))
 '
 }
 
+# ─── Ratchet mode (#16) ─────────────────────────────────────
+# Measure-before-after validation. See taskrunner.sh for the full doc —
+# this is the parallel plugin copy. Keep them in sync.
+
+ratchet_enabled_for_task() {
+  local task_json="$1"
+  if [[ "${CC_DISABLE_RATCHET:-0}" = "1" ]]; then return 1; fi
+  if [[ "${CC_RATCHET:-}" = "0" ]]; then return 1; fi
+  if [[ "${CC_RATCHET:-}" = "1" ]]; then return 0; fi
+  local explicit category
+  explicit=$(echo "$task_json" | python3 -c 'import json,sys; v=json.load(sys.stdin).get("ratchet"); print("" if v is None else str(v).lower())' 2>/dev/null)
+  category=$(echo "$task_json" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("category", ""))' 2>/dev/null)
+  if [[ "$explicit" = "true" ]]; then return 0; fi
+  if [[ "$explicit" = "false" ]]; then return 1; fi
+  case "$category" in
+    refactor|bugfix) return 0 ;;
+    docs|tests|research|deploy) return 1 ;;
+    *) return 1 ;;
+  esac
+}
+
+ratchet_snapshot() {
+  local repo_path="$1" label="$2"
+  local timeout_secs="${CC_RATCHET_TIMEOUT:-180}"
+  local tc_status="skip" test_status="skip"
+  if [[ -f "${repo_path}/package.json" ]] && command -v python3 >/dev/null 2>&1; then
+    local has_typecheck has_test
+    has_typecheck=$(python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); print("1" if "typecheck" in d.get("scripts", {}) else "0")' "${repo_path}/package.json" 2>/dev/null || echo 0)
+    has_test=$(python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); print("1" if "test" in d.get("scripts", {}) else "0")' "${repo_path}/package.json" 2>/dev/null || echo 0)
+    if [[ "$has_typecheck" = "1" ]]; then
+      if ( cd "$repo_path" && timeout "$timeout_secs" npm run typecheck >/dev/null 2>&1 ); then
+        tc_status="pass"
+      else
+        tc_status="fail"
+      fi
+    fi
+    if [[ "$has_test" = "1" ]]; then
+      if ( cd "$repo_path" && timeout "$timeout_secs" npm test >/dev/null 2>&1 ); then
+        test_status="pass"
+      else
+        test_status="fail"
+      fi
+    fi
+  fi
+  printf '{"label":"%s","typecheck":"%s","test":"%s"}' "$label" "$tc_status" "$test_status"
+}
+
+ratchet_decision() {
+  local baseline="$1" post="$2"
+  local bt pt bx px
+  bt=$(echo "$baseline" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("typecheck","skip"))' 2>/dev/null || echo skip)
+  pt=$(echo "$post"     | python3 -c 'import json,sys; print(json.load(sys.stdin).get("typecheck","skip"))' 2>/dev/null || echo skip)
+  bx=$(echo "$baseline" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("test","skip"))'      2>/dev/null || echo skip)
+  px=$(echo "$post"     | python3 -c 'import json,sys; print(json.load(sys.stdin).get("test","skip"))'      2>/dev/null || echo skip)
+  local reverts=()
+  if [[ "$bt" = "pass" && "$pt" = "fail" ]]; then
+    reverts+=("typecheck regression (pass → fail)")
+  fi
+  if [[ "$bx" = "pass" && "$px" = "fail" ]]; then
+    reverts+=("test regression (pass → fail)")
+  fi
+  if [[ ${#reverts[@]} -gt 0 ]]; then
+    local reason
+    reason=$(IFS=', '; echo "${reverts[*]}")
+    echo "revert: $reason"
+    return 1
+  fi
+  echo "keep: baseline=tc:${bt},test:${bx} post=tc:${pt},test:${px}"
+  return 0
+}
+
 # ─── Queue management ───────────────────────────────────────
 
 init_queue() {
@@ -375,6 +446,10 @@ execute_task() {
   local branch=""
   local use_branch=false
   local stashed=false
+  # Ratchet state — initialized up front so the post-validation block
+  # can reference them on paths that skip branch creation entirely.
+  local RATCHET_ENABLED=false
+  local RATCHET_BASELINE=""
   cd "$repo_path"
 
   # Non-operator tasks get their own branch
@@ -407,6 +482,14 @@ execute_task() {
     git checkout main 2>/dev/null || git checkout master 2>/dev/null
     git pull --ff-only 2>/dev/null || true
 
+    # ── Ratchet baseline capture (#16) ────────────────────────
+    if ratchet_enabled_for_task "$task_json"; then
+      RATCHET_ENABLED=true
+      log "│  Ratchet: capturing baseline on main…"
+      RATCHET_BASELINE="$(ratchet_snapshot "$repo_path" baseline)"
+      log "│  Ratchet baseline: ${RATCHET_BASELINE}"
+    fi
+
     # Create or reset task branch
     if git rev-parse --verify "$branch" >/dev/null 2>&1; then
       git checkout "$branch"
@@ -543,6 +626,36 @@ Task: ${title}" 2>/dev/null || true
       commit_count=$((commit_count + 1))
     fi
 
+    # ── Ratchet post-validation (#16) ─────────────────────────
+    local ratchet_verdict=""
+    if [[ "$RATCHET_ENABLED" = "true" && "$commit_count" -gt 0 ]]; then
+      log "│  Ratchet: validating branch…"
+      local ratchet_post
+      ratchet_post="$(ratchet_snapshot "$repo_path" post)"
+      log "│  Ratchet post:     ${ratchet_post}"
+      ratchet_verdict="$(ratchet_decision "$RATCHET_BASELINE" "$ratchet_post")"
+      local ratchet_rc=$?
+      log "│  Ratchet verdict:  ${ratchet_verdict}"
+
+      if [[ $ratchet_rc -ne 0 ]]; then
+        log "│  Ratchet REVERT — dropping branch and skipping PR"
+        git checkout main 2>/dev/null || git checkout master 2>/dev/null
+        git branch -D "$branch" 2>/dev/null || true
+        branch=""
+        commit_count=0
+        if [[ "$stashed" == "true" ]]; then
+          git stash pop 2>/dev/null && log "│  Restored stashed changes" || true
+          stashed=false
+        fi
+        result_text="[ratchet_revert] ${ratchet_verdict}
+
+${result_text}"
+        update_task_status "$task_id" "failed" "ratchet revert: ${ratchet_verdict}"
+        log "└─ Task reverted by ratchet gate"
+        return 1
+      fi
+    fi
+
     # Push and create PR if there are commits
     if [[ "$commit_count" -gt 0 ]]; then
       log "│  Pushing ${commit_count} commit(s) to ${branch}..."
diff --git a/taskrunner.sh b/taskrunner.sh