Commit 678372a

stackbilt-admin, Codebeast, and claude authored
docs(readme): Claude Code Routines comparison section (#27)
* chore: add canonical SECURITY.md

Adds the standardized Stackbilt-dev security reporting template to this repository. The template is the canonical per-repo security file rolled out across the entire Stackbilt-dev organization as part of the outbound disclosure policy (Stackbilt-dev/docs#15).

Key points:
- Primary reporting channel: admin@stackbilt.dev
- GitHub Security Advisory link scoped to this repo
- Response target matrix (critical 24h ack / 7d fix, high 48h / 14d)
- Full policy link at https://docs.stackbilt.dev/security/
- Explicit "do not open public GH issues for vulns" rule

This replaces the implicit policy that existed via the Stackbilt-dev organization profile with an explicit per-repo file, so the GitHub security tab surfaces it and external researchers have a clear reporting path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(taskrunner): ratchet mode — measure-before-after validation

Closes #16. Adds an opt-in guard that captures a baseline snapshot of typecheck + test state on main BEFORE the task branch is created, re-runs the same checks on the branch AFTER the task commits, and automatically reverts the branch (delete locally, skip push/PR, mark failed) when any check transitioned pass→fail.

Opt-in paths
- Per-task: `"ratchet": true` in the task JSON
- Category default: `refactor` and `bugfix` tasks ratchet automatically
- Environment: `CC_RATCHET=1` force-enables for every task

Never ratcheted
- `docs`, `tests`, `research`, `deploy` categories (no regression surface or outcomes aren't code-level)

Decision rule
Only pass→fail transitions revert. fail→fail (unchanged broken surface) and skip→fail (first-time check on a pre-existing breakage) are both `keep`. fail→pass is `keep`. The goal is to gate regressions, not punish tasks for inheriting broken state.

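The decision rule above can be sketched as a small pure function. This is a minimal illustration, not the committed `ratchet_decision()`: the names `decide_check` and `decide_task` are hypothetical, and the statuses are passed as plain strings rather than parsed from snapshot JSON.

```bash
# Hypothetical sketch; decide_check and decide_task are illustrative names.

# Verdict for one check: only a pass -> fail transition reverts.
decide_check() {
  if [ "$1" = "pass" ] && [ "$2" = "fail" ]; then
    echo "revert"
  else
    echo "keep"
  fi
}

# Verdict for a task: revert (non-zero exit) if either check regressed.
# Args: typecheck-before typecheck-after test-before test-after
decide_task() {
  if [ "$(decide_check "$1" "$2")" = "revert" ] || [ "$(decide_check "$3" "$4")" = "revert" ]; then
    echo "revert"
    return 1
  fi
  echo "keep"
}

# Walk the transitions named in the decision rule.
for pair in "pass:pass" "pass:fail" "fail:fail" "skip:fail" "fail:pass"; do
  before="${pair%%:*}"
  after="${pair##*:}"
  printf '%s -> %s: %s\n' "$before" "$after" "$(decide_check "$before" "$after")"
done
```

Only the `pass -> fail` line prints `revert`; every other transition prints `keep`, which matches the intent of gating regressions without penalizing inherited breakage.
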
Snapshot surface
- `npm run typecheck` exit code → pass/fail/skip
- `npm test` exit code → pass/fail/skip
- Each check is independent and degrades to `skip` when the repo has no corresponding script in `package.json`. Zero new dependencies.

Integration points
- Baseline captured right after `git pull --ff-only`, before the task branch is checked out (so we measure true main state).
- Post-validation runs after commits but BEFORE push, so a regressed branch never reaches origin and never opens a PR.
- Ratchet state is local to each execute_task() call — initialized up front so operator-authority tasks (which skip branch creation) don't trip unbound-variable errors under set -u.

Applied symmetrically to taskrunner.sh and plugin/taskrunner.sh.

Smoke-tested ratchet_decision() against 5 transition cases:
- skip→skip: keep ✓
- pass→pass: keep ✓
- pass→fail: revert (rc=1) ✓
- fail→fail: keep (no regression) ✓
- skip→fail: keep (first-time surface) ✓

Env knobs
- CC_RATCHET=1|0 force-enable/disable, overrides task fields
- CC_RATCHET_TIMEOUT=<seconds> per-check timeout (default: 180)
- CC_DISABLE_RATCHET=1 legacy alias for CC_RATCHET=0

Closes #16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(changelog): 1.6.0 ratchet mode entry

Should've been in the prior commit but Edit bailed on an unread file. Squash candidate on merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use .git/info/exclude instead of .gitignore for worktree protection (#25)

The worktree-protection pattern (C:* glob for Windows-path pollution, added in #6) was being appended to .gitignore and staged, causing every auto-generated PR to include unsolicited .gitignore modifications.

Move the exclusion to .git/info/exclude, which provides identical git ignore behavior but is local to the repository and never committed.

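A minimal sketch of the local-exclude approach described above, assuming an ordinary clone where `.git` is a directory. The helper name `add_local_exclude` is hypothetical and is not taken from the commit; in a linked worktree the path would instead need to be resolved with `git rev-parse --git-path info/exclude`.

```bash
# Hypothetical sketch; add_local_exclude is an illustrative name.

# Append a pattern to .git/info/exclude unless it is already listed,
# so repeated runs do not grow the file.
# Args: repo-path pattern
add_local_exclude() {
  local repo="$1"
  local pattern="$2"
  local exclude_file="$repo/.git/info/exclude"
  mkdir -p "$repo/.git/info"
  touch "$exclude_file"
  # -F: fixed string, -x: whole-line match, -q: quiet
  if ! grep -Fxq -- "$pattern" "$exclude_file"; then
    printf '%s\n' "$pattern" >> "$exclude_file"
  fi
}

# Example (commented out so the sketch has no side effects when sourced):
# add_local_exclude . 'C:*'
```

Because the file lives under `.git/`, nothing here is ever staged, which is the property the fix relies on.
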
Closes #25

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(readme): add Claude Code Routines comparison section

Anthropic shipped Claude Code Routines (research preview) in April 2026 — saved Claude Code configurations that run on Anthropic's cloud on a schedule, via API trigger, or on GitHub repository events.

Routines and cc-taskrunner solve overlapping problems differently. New users evaluating the taskrunner deserve to know the alternative exists and when each substrate is the right fit.

New "cc-taskrunner vs. Claude Code Routines" section between "Why This Exists" and "Quick Start" includes:
- 12-row capability comparison table (where it runs, cost model, trigger types, cadence floor, local FS access, runs-while-laptop-closed, queue management, branch isolation, safety hooks, blast radius, GitHub event triggers, setup overhead)
- Explicit "when cc-taskrunner is right" decision rubric (queue management, local FS access, sub-hour cadence, blast-radius enforcement, hook-level safety)
- Explicit "when Claude Code Routines are right" decision rubric (single repeatable task, GitHub-event-driven, runs while laptop off, MCP-only mutations)
- Honest disclosure that Stackbilt itself runs the taskrunner in paused mode and uses Routines for several scheduled workloads — complementary not competitive

Framing throughout: pick the substrate that fits the work, neither obsoletes the other in a real ecosystem.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Codebeast <codebeast@stackbilt.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e99a650 commit 678372a

4 files changed, +358 -6 lines changed


CHANGELOG.md

Lines changed: 23 additions & 0 deletions
@@ -4,6 +4,29 @@ All notable changes to cc-taskrunner will be documented in this file.
 
 Format follows [Keep a Changelog](https://keepachangelog.com/).
 
+## [Unreleased]
+
+### Added
+- **README: Claude Code Routines vs cc-taskrunner comparison section.** Anthropic shipped Claude Code Routines (research preview) in April 2026 — saved Claude Code configurations that run on Anthropic's cloud on a schedule, via API trigger, or on GitHub events. Added a new section between "Why This Exists" and "Quick Start" with: a 12-row capability comparison table, explicit "when cc-taskrunner is right" / "when Routines are right" decision rubrics, and an honest disclosure that Stackbilt itself currently runs the taskrunner in paused mode and uses Routines for several scheduled workloads. Framing: complementary, not competitive — pick the substrate that fits the work.
+
+## [1.6.0] — 2026-04-11
+
+### Added
+- **Ratchet mode — measure-before-after validation for autonomous improvements** (#16). The runner now captures a baseline snapshot of `npm run typecheck` + `npm test` pass/fail on `main` before creating the task branch, re-runs the same checks on the branch after the task commits, and automatically reverts the task (delete branch, skip push/PR, mark failed) when a check transitioned `pass → fail`. Gates regressions from reaching origin.
+
+**Opt-in paths:**
+- `"ratchet": true` in the task JSON (explicit per-task)
+- Category defaults: `refactor` and `bugfix` ratchet automatically
+- `CC_RATCHET=1` environment override (force-enable every task)
+
+**Never ratcheted:** `docs`, `tests`, `research`, `deploy` — no regression surface or outcomes aren't code-level.
+
+**Decision rule:** only `pass → fail` transitions revert. `fail → fail` (unchanged broken surface) and `skip → fail` (first-time check on pre-existing breakage) are both `keep`. `fail → pass` is `keep` (improvement).
+
+**Env knobs:** `CC_RATCHET=1|0`, `CC_RATCHET_TIMEOUT=<seconds>` (default 180), `CC_DISABLE_RATCHET=1` (legacy alias).
+
+Applied symmetrically to `taskrunner.sh` and `plugin/taskrunner.sh`. Pure bash + python3 — zero new dependencies. Degrades to no-op when the repo has no `typecheck` or `test` script. Only runs on branch-isolated tasks (operator-authority tasks skip ratchet entirely).
+
 ## [1.5.0] — 2026-04-09
 
 ### Added
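The opt-in paths and env knobs in the 1.6.0 entry interact in a fixed order. The sketch below is a simplified illustration that follows the order used by `ratchet_enabled_for_task` in this commit (disable alias, then `CC_RATCHET`, then the per-task field, then the category default). The function name `ratchet_on` and its plain-string arguments are assumptions made for brevity; the committed code parses the task JSON with python3 instead.

```bash
# Hypothetical sketch; ratchet_on takes plain strings instead of task JSON.
# Args: explicit ("true", "false", or "" when the task has no ratchet field)
#       category (e.g. refactor, bugfix, docs)
# Reads CC_DISABLE_RATCHET and CC_RATCHET from the environment.
ratchet_on() {
  local explicit="$1"
  local category="$2"

  # 1. Legacy kill switch wins over everything else.
  if [ "${CC_DISABLE_RATCHET:-0}" = "1" ]; then
    return 1
  fi

  # 2. CC_RATCHET overrides task fields in either direction.
  case "${CC_RATCHET:-}" in
    0) return 1 ;;
    1) return 0 ;;
  esac

  # 3. Explicit per-task field.
  case "$explicit" in
    true) return 0 ;;
    false) return 1 ;;
  esac

  # 4. Category default: only refactor and bugfix ratchet automatically.
  case "$category" in
    refactor|bugfix) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, with neither variable set, `ratchet_on "" refactor` succeeds and `ratchet_on "" docs` fails, while `ratchet_on true docs` succeeds because the explicit field outranks the category default.
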

README.md

Lines changed: 41 additions & 0 deletions
@@ -37,6 +37,47 @@ Claude Code is powerful in interactive sessions. But there's no built-in way to:
 
 cc-taskrunner fills that gap. It's the execution layer between "Claude can write code" and "Claude can ship code safely."
 
+## cc-taskrunner vs. Claude Code Routines
+
+In April 2026 Anthropic shipped [Claude Code Routines](https://code.claude.com/docs/en/routines) (research preview) — saved Claude Code configurations that run on Anthropic's cloud infrastructure on a schedule, via API trigger, or on GitHub events. **Routines and cc-taskrunner solve overlapping problems differently.** Both have a place; pick the substrate that fits the work.
+
+| | cc-taskrunner | Claude Code Routines |
+|---|---|---|
+| **Where it runs** | Your machine (or any box with bash + `claude` CLI) | Anthropic-managed cloud |
+| **Cost model** | Your local resources + your Claude subscription | Subscription quota only |
+| **Trigger** | Manual / loop mode (1-minute polling) | Schedule (1h cron min), API endpoint, or GitHub event |
+| **Cadence floor** | Sub-minute possible | 1 hour minimum on schedule triggers |
+| **Local filesystem access** | ✅ Full — operate on any directory | ❌ Cloned-repo only, fresh clone per fire |
+| **Runs while laptop is closed** | ❌ Needs your machine running | ✅ Cloud-managed |
+| **Queue management** | ✅ JSON file, dependencies, FIFO | ❌ One prompt per routine; multiple triggers per routine |
+| **Branch isolation** | `auto/{task-id}` per task | `claude/*`-prefixed branches enforced by default |
+| **Pre-flight safety hooks** | ✅ Bash hooks block destructive ops | ⚠️ Permission-mode-less by design (autonomous) |
+| **Blast radius gate** | ✅ via `charter blast` integration | Not built-in |
+| **GitHub event triggers** | ❌ Not designed for it | `pull_request` and `release` events |
+| **Setup overhead** | bash + python3 + `gh` CLI + clone | claude.ai account with web/Pro/Max plan |
+
+### When cc-taskrunner is the right substrate
+
+- You want to queue a backlog of tasks and run them unattended — taskrunner is built for this; routines are not (one prompt per routine)
+- You need work to happen against your **local filesystem** (paths outside any GitHub repo, machine-specific tooling, in-progress work in your worktree)
+- You need **sub-hour cadence** or want to run a continuous polling loop
+- You want to enforce blast-radius limits via [`@stackbilt/cli`](https://github.com/Stackbilt-dev/charter)'s `charter blast` before any change touches code
+- You want **bash-hook safety enforcement** that blocks destructive operations at the OS level rather than relying on prompt discipline alone
+
+### When Claude Code Routines are the right substrate
+
+- The work is a single repeatable task that fires on a schedule, on a GitHub event, or on demand via API call
+- You want it to run while your machine is off (overnight, weekends, while traveling)
+- You want **GitHub-event-driven** automation (PR review on every `pull_request.opened`, port-on-merge between SDKs, etc.)
+- The work needs to write back via MCP connectors (Slack, Linear, custom MCP servers) without local credentials
+- You don't need queue management — one prompt + one schedule + one trigger is enough
+
+### Honest disclosure
+
+Stackbilt (the project that originated cc-taskrunner) currently runs taskrunner in **paused** mode and uses Routines for several scheduled workloads. That's not because the taskrunner is broken — it's because the workloads in question (autonomous heartbeat triage, weekly cross-repo pattern scans) fit the routine substrate better. Routines and the taskrunner are **complementary** in a real ecosystem; we don't claim one obsoletes the other.
+
+If you're starting fresh and your work fits the schedule/event/API-trigger model, try Routines first — there's nothing to install. If you need queue management, sub-hour polling, local filesystem access, or hook-level safety enforcement, taskrunner remains the right tool.
+
 ## Quick Start
 
 ```bash

plugin/taskrunner.sh

Lines changed: 113 additions & 0 deletions
@@ -143,6 +143,77 @@ print("\n".join(lines))
 '
 }
 
+# ─── Ratchet mode (#16) ─────────────────────────────────────
+# Measure-before-after validation. See taskrunner.sh for the full doc —
+# this is the parallel plugin copy. Keep them in sync.
+
+ratchet_enabled_for_task() {
+  local task_json="$1"
+  if [[ "${CC_DISABLE_RATCHET:-0}" = "1" ]]; then return 1; fi
+  if [[ "${CC_RATCHET:-}" = "0" ]]; then return 1; fi
+  if [[ "${CC_RATCHET:-}" = "1" ]]; then return 0; fi
+  local explicit category
+  explicit=$(echo "$task_json" | python3 -c 'import json,sys; v=json.load(sys.stdin).get("ratchet"); print("" if v is None else str(v).lower())' 2>/dev/null)
+  category=$(echo "$task_json" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("category", ""))' 2>/dev/null)
+  if [[ "$explicit" = "true" ]]; then return 0; fi
+  if [[ "$explicit" = "false" ]]; then return 1; fi
+  case "$category" in
+    refactor|bugfix) return 0 ;;
+    docs|tests|research|deploy) return 1 ;;
+    *) return 1 ;;
+  esac
+}
+
+ratchet_snapshot() {
+  local repo_path="$1" label="$2"
+  local timeout_secs="${CC_RATCHET_TIMEOUT:-180}"
+  local tc_status="skip" test_status="skip"
+  if [[ -f "${repo_path}/package.json" ]] && command -v python3 >/dev/null 2>&1; then
+    local has_typecheck has_test
+    has_typecheck=$(python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); print("1" if "typecheck" in d.get("scripts", {}) else "0")' "${repo_path}/package.json" 2>/dev/null || echo 0)
+    has_test=$(python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); print("1" if "test" in d.get("scripts", {}) else "0")' "${repo_path}/package.json" 2>/dev/null || echo 0)
+    if [[ "$has_typecheck" = "1" ]]; then
+      if ( cd "$repo_path" && timeout "$timeout_secs" npm run typecheck >/dev/null 2>&1 ); then
+        tc_status="pass"
+      else
+        tc_status="fail"
+      fi
+    fi
+    if [[ "$has_test" = "1" ]]; then
+      if ( cd "$repo_path" && timeout "$timeout_secs" npm test >/dev/null 2>&1 ); then
+        test_status="pass"
+      else
+        test_status="fail"
+      fi
+    fi
+  fi
+  printf '{"label":"%s","typecheck":"%s","test":"%s"}' "$label" "$tc_status" "$test_status"
+}
+
+ratchet_decision() {
+  local baseline="$1" post="$2"
+  local bt pt bx px
+  bt=$(echo "$baseline" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("typecheck","skip"))' 2>/dev/null || echo skip)
+  pt=$(echo "$post" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("typecheck","skip"))' 2>/dev/null || echo skip)
+  bx=$(echo "$baseline" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("test","skip"))' 2>/dev/null || echo skip)
+  px=$(echo "$post" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("test","skip"))' 2>/dev/null || echo skip)
+  local reverts=()
+  if [[ "$bt" = "pass" && "$pt" = "fail" ]]; then
+    reverts+=("typecheck regression (pass → fail)")
+  fi
+  if [[ "$bx" = "pass" && "$px" = "fail" ]]; then
+    reverts+=("test regression (pass → fail)")
+  fi
+  if [[ ${#reverts[@]} -gt 0 ]]; then
+    local reason
+    reason=$(IFS=', '; echo "${reverts[*]}")
+    echo "revert: $reason"
+    return 1
+  fi
+  echo "keep: baseline=tc:${bt},test:${bx} post=tc:${pt},test:${px}"
+  return 0
+}
+
 # ─── Queue management ───────────────────────────────────────
 
 init_queue() {
@@ -375,6 +446,10 @@ execute_task() {
   local branch=""
   local use_branch=false
   local stashed=false
+  # Ratchet state — initialized up front so the post-validation block
+  # can reference them on paths that skip branch creation entirely.
+  local RATCHET_ENABLED=false
+  local RATCHET_BASELINE=""
   cd "$repo_path"
 
   # Non-operator tasks get their own branch
@@ -407,6 +482,14 @@ execute_task() {
     git checkout main 2>/dev/null || git checkout master 2>/dev/null
     git pull --ff-only 2>/dev/null || true
 
+    # ── Ratchet baseline capture (#16) ────────────────────────
+    if ratchet_enabled_for_task "$task_json"; then
+      RATCHET_ENABLED=true
+      log "│ Ratchet: capturing baseline on main…"
+      RATCHET_BASELINE="$(ratchet_snapshot "$repo_path" baseline)"
+      log "│ Ratchet baseline: ${RATCHET_BASELINE}"
+    fi
+
     # Create or reset task branch
     if git rev-parse --verify "$branch" >/dev/null 2>&1; then
       git checkout "$branch"
@@ -543,6 +626,36 @@ Task: ${title}" 2>/dev/null || true
     commit_count=$((commit_count + 1))
   fi
 
+  # ── Ratchet post-validation (#16) ─────────────────────────
+  local ratchet_verdict=""
+  if [[ "$RATCHET_ENABLED" = "true" && "$commit_count" -gt 0 ]]; then
+    log "│ Ratchet: validating branch…"
+    local ratchet_post
+    ratchet_post="$(ratchet_snapshot "$repo_path" post)"
+    log "│ Ratchet post: ${ratchet_post}"
+    ratchet_verdict="$(ratchet_decision "$RATCHET_BASELINE" "$ratchet_post")"
+    local ratchet_rc=$?
+    log "│ Ratchet verdict: ${ratchet_verdict}"
+
+    if [[ $ratchet_rc -ne 0 ]]; then
+      log "│ Ratchet REVERT — dropping branch and skipping PR"
+      git checkout main 2>/dev/null || git checkout master 2>/dev/null
+      git branch -D "$branch" 2>/dev/null || true
+      branch=""
+      commit_count=0
+      if [[ "$stashed" == "true" ]]; then
+        git stash pop 2>/dev/null && log "│ Restored stashed changes" || true
+        stashed=false
+      fi
+      result_text="[ratchet_revert] ${ratchet_verdict}
+
+${result_text}"
+      update_task_status "$task_id" "failed" "ratchet revert: ${ratchet_verdict}"
+      log "└─ Task reverted by ratchet gate"
+      return 1
+    fi
+  fi
+
   # Push and create PR if there are commits
   if [[ "$commit_count" -gt 0 ]]; then
     log "│ Pushing ${commit_count} commit(s) to ${branch}..."
