From d0ba9f11945fff1b661d6056466f9660ed0660a0 Mon Sep 17 00:00:00 2001 From: Karl Date: Mon, 1 Jun 2026 21:05:23 -0700 Subject: [PATCH 1/3] feat(devcontainer): add autonomous-agent sandbox with egress lockdown + scoped tokens MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a containerized sandbox so Claude Code agents can run autonomously (--dangerously-skip-permissions) without host risk or credential exfiltration. The container is the boundary; three layers cover its gaps: - .devcontainer/init-firewall.sh — default-drop egress, allowlisting only the Anthropic API and GitHub's published CIDRs (from api.github.com/meta). Go builds need no egress because the repo vendors its deps. Self-verifies (GitHub reachable, example.com blocked) and exits non-zero otherwise. - .devcontainer/claude-settings.json — deny rules (secret-file reads, sudo, Keychain) baked into the container user's settings so repo settings can't relax them. Closes the GitHub exfil channel the firewall can't (GitHub is an allowed host). - scripts/mint-installation-token.sh — host-side; reads the App PEM from the macOS Keychain via process substitution (never on disk), mints a <=1h repo-scoped installation token with minimal permissions. Verified end-to-end. - .devcontainer/{Dockerfile,devcontainer.json,README.md} — image, wiring, and the security model + one-time agent-App setup steps. The agent commits as a separate least-privilege identity, not the actions-gateway-test runner App (which lacks contents/pull_requests and carries administration:write). 
--- .devcontainer/Dockerfile | 41 +++++++++ .devcontainer/README.md | 141 ++++++++++++++++++++++++++++ .devcontainer/claude-settings.json | 19 ++++ .devcontainer/devcontainer.json | 24 +++++ .devcontainer/init-firewall.sh | 124 +++++++++++++++++++++++++ scripts/mint-installation-token.sh | 143 +++++++++++++++++++++++++++++ 6 files changed, 492 insertions(+) create mode 100644 .devcontainer/Dockerfile create mode 100644 .devcontainer/README.md create mode 100644 .devcontainer/claude-settings.json create mode 100644 .devcontainer/devcontainer.json create mode 100755 .devcontainer/init-firewall.sh create mode 100755 scripts/mint-installation-token.sh diff --git a/.devcontainer/Dockerfile b/.devcontainer/Dockerfile new file mode 100644 index 00000000..c121633e --- /dev/null +++ b/.devcontainer/Dockerfile @@ -0,0 +1,41 @@ +# Agent sandbox image: Go toolchain + Claude Code + egress-lockdown tooling. +# Matches the repo's builder base (golang:1.26, see cmd/*/Dockerfile). +FROM golang:1.26 + +# Tools the agent and the firewall need. +RUN apt-get update && apt-get install -y --no-install-recommends \ + ca-certificates git curl jq ripgrep \ + iptables ipset dnsutils sudo gnupg \ + && rm -rf /var/lib/apt/lists/* + +# GitHub CLI (from the official apt repo). +RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \ + | dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \ + && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \ + > /etc/apt/sources.list.d/github-cli.list \ + && apt-get update && apt-get install -y --no-install-recommends gh \ + && rm -rf /var/lib/apt/lists/* + +# Node + Claude Code CLI. +RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \ + && apt-get install -y --no-install-recommends nodejs \ + && npm install -g @anthropic-ai/claude-code \ + && rm -rf /var/lib/apt/lists/* + +# Unprivileged agent user. 
It may run ONLY the firewall script as root (so the +# lockdown can program iptables at start); it has no general sudo. +RUN useradd -m -s /bin/bash agent \ + && echo 'agent ALL=(root) NOPASSWD: /usr/local/bin/init-firewall.sh' \ + > /etc/sudoers.d/agent-firewall \ + && chmod 0440 /etc/sudoers.d/agent-firewall + +COPY init-firewall.sh /usr/local/bin/init-firewall.sh +RUN chmod 0755 /usr/local/bin/init-firewall.sh + +# Container-level deny rules at the user-settings layer (repo settings can't +# relax these). +COPY claude-settings.json /home/agent/.claude/settings.json +RUN chown -R agent:agent /home/agent/.claude + +USER agent +WORKDIR /workspace diff --git a/.devcontainer/README.md b/.devcontainer/README.md new file mode 100644 index 00000000..71140fa7 --- /dev/null +++ b/.devcontainer/README.md @@ -0,0 +1,141 @@ +# Agent sandbox + +A container for running Claude Code autonomously (`--dangerously-skip-permissions` +or default mode with a broad allowlist) without the agent being able to damage +your host or exfiltrate credentials. + +> **Status: draft.** Review and adapt before wiring into a fleet. Nothing here +> is committed or active until you build and run it. + +## Why a container changes the security model + +In full bypass mode (`--dangerously-skip-permissions`), Claude Code consults +**none** of the `allow`/`ask`/`deny` rules. The container becomes the only +boundary. That bounds damage to your *host*, but inside the container the agent +still holds whatever you mounted — push credentials, tokens. So "the container +is the sandbox" is only half true: it protects the laptop, not the repo or the +secrets. + +This setup adds three layers so broad autonomy stays safe: + +| Layer | File | Stops | +|---|---|---| +| Network egress allowlist | `init-firewall.sh` | Exfiltration / C2 to any host except the Anthropic API and GitHub | +| Credential-read deny rules | `claude-settings.json` | The agent reading a key/token and leaking it through an *allowed* channel (e.g. 
a PR body on GitHub) | +| Scoped credentials | _operational, see below_ | Catastrophic git ops surviving even if a token leaks | + +No single layer is sufficient; they cover each other's gaps. + +## Egress allowlist (`init-firewall.sh`) + +Default-drops all outbound traffic, then allows DNS, established return flows, +the Anthropic API, and GitHub's published CIDR ranges (pulled from +`api.github.com/meta`). It self-verifies at the end: GitHub must be reachable +and `example.com` must be blocked, or it exits non-zero. + +Because the repo **vendors** its Go dependencies, `go build`/`go test` run +offline — the allowlist deliberately does *not* include the Go module proxy. If +you drop vendoring, add `proxy.golang.org`/`sum.golang.org` to `ALLOWED_DOMAINS`, +but note their `storage.googleapis.com` backend is CDN-backed and resolves to +shifting IPs; for CDN-heavy egress prefer a filtering HTTP CONNECT proxy +(tinyproxy/squid with a hostname allowlist, `HTTPS_PROXY=...`) over IP rules. + +Runs once at container start via `devcontainer.json`'s `postStartCommand`. +Requires `NET_ADMIN`/`NET_RAW` (granted in `runArgs`). + +## Credential-read deny rules (`claude-settings.json`) + +Baked into the agent user's `~/.claude/settings.json`, so repo-level settings +can't relax them (`deny` always wins). Blocks reads of `*.pem`, `*.key`, SSH +keys, `.env`, `secrets/`, `/run/secrets`, plus `sudo` and Keychain access. + +**Known gap:** denying the `Read` *tool* doesn't stop `git` from using a +credential file (different process), which is what you want — but it also +doesn't stop the agent from `cat`/`head`-ing that file via Bash, and it can't +hide an env-var token from `printenv`. Don't rely on secrecy. The robust fix is +to never put readable long-lived secrets in the container at all — see below. + +## Scoped credentials (operational — do this, don't skip it) + +The token in the container is the real risk surface. 
Make a leak survivable: + +### One-time: create the agent's GitHub App + +Do **not** reuse the `actions-gateway-test` App — that one is the runner +control plane (`actions:write`, `administration:write`, …) and has no +`contents`/`pull_requests`. The agent needs its own least-privilege identity. + +1. github.com → org `actions-gateway` → Settings → Developer settings → GitHub + Apps → **New GitHub App**. +2. **Repository permissions:** Contents → *Read and write*; Pull requests → + *Read and write*; Metadata → *Read* (auto). Add Workflows → *Read and write* + **only** if agents edit `.github/workflows/`. Leave everything else *No access*. +3. **Webhook:** uncheck *Active* (not needed). **Where can this be installed:** + *Only on this account*. +4. Create it → note the **Client ID** (`Iv23…`) → **Generate a private key** + (downloads a `.pem`) → **Install** the App on the `github-actions-gateway` + repo only. +5. Store the key in Keychain under a *distinct account* (hex-encoded, matching + how the mint script reads it), then delete the download: + ```bash + security add-generic-password -U -a actions-gateway-agent \ + -s github-app-private-key \ + -w "$(xxd -p < ~/Downloads/agent-app.*.private-key.pem | tr -d '\n')" + rm -P ~/Downloads/agent-app.*.private-key.pem + ``` + +The mint script then targets this App via env overrides — no code change: + +```bash +GITHUB_APP_CLIENT_ID=Iv23…theNewAppId \ +KEYCHAIN_ACCOUNT=actions-gateway-agent \ + scripts/mint-installation-token.sh +``` + +### Operational rules + +1. **Short-lived, repo-scoped token, not the App PEM.** The script above mints + an *installation* token (expires in ≤1h), scoped to this one repo with only + `contents:write`+`pull_requests:write`. Export it on the host as `AGENT_GH_TOKEN` + (`devcontainer.json` maps it to `GH_TOKEN` in the container). The App private key never leaves the host Keychain. +2. **Protect `main` server-side.** Enable branch protection requiring PR + + green CI and disallowing force-push. 
This is the only reliable guard against + a destructive push — client-side `deny` globs can't reliably tell which + branch a `git push` targets. With protection on, even a fully compromised + agent can't rewrite `main`. +3. **Prefer SSH deploy-key-over-agent-socket** if you want no token on disk or + in env at all: forward an `ssh-agent` socket holding a repo-scoped deploy key + (branch protection, not the key, blocks force-push). The key bytes never enter the container. + +## Build & run + +```bash +# Build the image +docker build -t actions-gateway-agent .devcontainer + +# Run one autonomous agent (token minted fresh on the host; export GITHUB_APP_CLIENT_ID first) +docker run --rm -it \ + --cap-add=NET_ADMIN --cap-add=NET_RAW \ + -e GH_TOKEN="$(./scripts/mint-installation-token.sh)" \ + -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \ + -v "$PWD:/workspace" \ + actions-gateway-agent \ + bash -lc 'sudo /usr/local/bin/init-firewall.sh && claude --dangerously-skip-permissions -p "next"' +``` + +(`mint-installation-token.sh` runs *on the host*: it reads the App PEM from the +Keychain and exchanges it for a short-lived installation token. The PEM +stays on the host; only the token crosses into the container.) + +Or open the folder in an editor/CLI that understands `devcontainer.json`. + +## Residual risks (be honest about these) + +- The agent can still do anything *within* its token's scope: open junk PRs, + push to non-protected branches, burn CI minutes. Scope + branch protection + cap the damage; they don't eliminate it. +- The allowlist trusts GitHub wholesale — anything reachable under GitHub's + CIDRs (gists, any repo the token can touch) is a possible exfil sink. This is + why the token must be narrowly scoped. +- `api.github.com/meta` ranges are fetched at start; if GitHub changes ranges + mid-run, new IPs aren't picked up until the next container start. 
diff --git a/.devcontainer/claude-settings.json b/.devcontainer/claude-settings.json new file mode 100644 index 00000000..19fe3346 --- /dev/null +++ b/.devcontainer/claude-settings.json @@ -0,0 +1,19 @@ +{ + "//": "Container-only deny rules. Baked into the agent user's ~/.claude/settings.json so repo-level settings cannot relax them. deny > ask > allow in Claude Code, so these win regardless of permission mode. These complement (do not replace) the network firewall: the firewall stops exfiltration over arbitrary hosts, but GitHub is an *allowed* host, so a secret read into a PR body would still escape — denying reads of credential material closes that channel.", + "permissions": { + "deny": [ + "Read(**/*.pem)", + "Read(**/*.key)", + "Read(**/id_rsa*)", + "Read(**/id_ed25519*)", + "Read(**/.ssh/**)", + "Read(**/.env)", + "Read(**/.env.*)", + "Read(**/secrets/**)", + "Read(/run/secrets/**)", + "Read(/etc/agent/**)", + "Bash(security find-generic-password:*)", + "Bash(sudo:*)" + ] + } +} diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json new file mode 100644 index 00000000..87aace58 --- /dev/null +++ b/.devcontainer/devcontainer.json @@ -0,0 +1,24 @@ +{ + "name": "actions-gateway-agent", + "build": { "dockerfile": "Dockerfile" }, + + // NET_ADMIN/NET_RAW let init-firewall.sh program iptables + ipset. + "runArgs": ["--cap-add=NET_ADMIN", "--cap-add=NET_RAW"], + + "workspaceFolder": "/workspace", + + // Lock down egress BEFORE the agent does anything. postStartCommand runs as + // the container user, which is allowed to sudo only this one script. + "postStartCommand": "sudo /usr/local/bin/init-firewall.sh", + + "remoteUser": "agent", + + // Credentials are injected from the host environment at run time. Prefer a + // SHORT-LIVED, repo-scoped token here, not a long-lived PAT or the App PEM. + // See README.md "Scoped credentials" — env vars are readable by the agent + // process, so the token's blast radius is the real control, not secrecy. 
+ "containerEnv": { + "GH_TOKEN": "${localEnv:AGENT_GH_TOKEN}", + "ANTHROPIC_API_KEY": "${localEnv:ANTHROPIC_API_KEY}" + } +} diff --git a/.devcontainer/init-firewall.sh b/.devcontainer/init-firewall.sh new file mode 100755 index 00000000..8b2a0ee7 --- /dev/null +++ b/.devcontainer/init-firewall.sh @@ -0,0 +1,124 @@ +#!/usr/bin/env bash +# +# init-firewall.sh — lock down container egress to an allowlist. +# +# Description: +# Runs once at container start (as root, before the agent user starts doing +# work). Default-drops all outbound traffic except DNS, established return +# flows, and a small allowlist: the Anthropic API (so Claude Code works) and +# GitHub (so git/gh work). Everything else — including any attempt to POST a +# leaked secret to an arbitrary host — is dropped at the kernel. +# +# Go builds do NOT need egress: this repo vendors its dependencies, so +# `go build` / `go test` run fully offline. If you ever drop vendoring, add +# the Go module proxy hosts (proxy.golang.org, sum.golang.org, and their +# storage.googleapis.com backend) to ALLOWED_DOMAINS — note that CDN-backed +# hosts resolve to changing IPs, so a filtering HTTP proxy is more robust +# than IP allowlisting for those (see .devcontainer/README.md). +# +# Requires: iptables, ipset, dig (dnsutils), jq, curl. Must run with NET_ADMIN. +# +# Usage: +# sudo ./init-firewall.sh + +# --- Strict mode --- +set -euo pipefail + +# Hostnames the agent is permitted to reach (resolved to IPs below). +readonly ALLOWED_DOMAINS=( + "api.anthropic.com" + "statsig.anthropic.com" +) + +readonly IPSET_NAME="agent-allow" + +log() { printf '[init-firewall] %s\n' "$*" >&2; } + +die() { log "ERROR: $*"; exit 1; } + +# Remove any pre-existing rules so re-runs are idempotent. +flush_existing() { + local table + for table in filter nat mangle; do + iptables -t "$table" -F + iptables -t "$table" -X + done + ipset destroy "$IPSET_NAME" 2>/dev/null || true +} + +# Build the set of allowed destination CIDRs/IPs. 
+build_allowset() { + ipset create "$IPSET_NAME" hash:net + + # GitHub publishes its CIDR ranges via the meta API. Pull web/api/git/packages, + # IPv4 only: the hash:net set is family inet, so IPv6 CIDRs would abort `ipset add`. + local meta + meta="$(curl -fsSL --max-time 20 https://api.github.com/meta)" \ + || die "could not fetch GitHub meta ranges" + + local cidr + while IFS= read -r cidr; do + [[ -n "$cidr" ]] || continue + ipset add "$IPSET_NAME" "$cidr" -exist + done < <(jq -r '(.web + .api + .git + .packages)[] | select(contains(":") | not)' <<<"$meta" | sort -u) + + # Resolve each allowlisted hostname to its current A records. + local domain ip count + for domain in "${ALLOWED_DOMAINS[@]}"; do + count=0 + while IFS= read -r ip; do + [[ -n "$ip" ]] || continue + ipset add "$IPSET_NAME" "$ip" -exist + (( count += 1 )) + done < <(dig +short A "$domain" | grep -E '^[0-9.]+$' || true) + (( count > 0 )) || die "could not resolve allowlisted host: $domain" + log "allowed ${domain} (${count} addrs)" + done +} + +# Apply the default-deny policy with the allowlist carved out. +apply_rules() { + # Loopback is always fine. + iptables -A INPUT -i lo -j ACCEPT + iptables -A OUTPUT -o lo -j ACCEPT + + # Keep established/related flows (return traffic for allowed connections). + iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT + iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT + + # DNS must keep working after lockdown (curl, git, and gh resolve names at run time). + iptables -A OUTPUT -p udp --dport 53 -j ACCEPT + iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT + + # Outbound to the allowlisted IP set only. + iptables -A OUTPUT -m set --match-set "$IPSET_NAME" dst -j ACCEPT + + # Default deny everything else. + iptables -P INPUT DROP + iptables -P FORWARD DROP + iptables -P OUTPUT DROP +} + +# Prove the policy is actually live before handing control to the agent. 
+verify() { + curl -fsSL --max-time 10 -o /dev/null https://api.github.com/zen \ + || die "verification failed: GitHub unreachable through allowlist" + if curl -fsSL --max-time 5 -o /dev/null https://example.com 2>/dev/null; then + die "verification failed: egress to example.com succeeded but must be blocked" + fi + log "egress lockdown verified (GitHub reachable, example.com blocked)" +} + +main() { + local tool + for tool in iptables ipset dig jq curl; do + command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool" + done + flush_existing + build_allowset + apply_rules + verify + log "firewall initialized" +} + +main "$@" diff --git a/scripts/mint-installation-token.sh b/scripts/mint-installation-token.sh new file mode 100755 index 00000000..297fae17 --- /dev/null +++ b/scripts/mint-installation-token.sh @@ -0,0 +1,143 @@ +#!/usr/bin/env bash +# mint-installation-token.sh — mint a short-lived, repo-scoped GitHub App +# installation access token for the agent sandbox container. +# +# Runs on the HOST. Reads the App private key from the macOS Keychain (never +# touching disk), signs an App JWT, and exchanges it for an installation token +# scoped to a single repository with the minimum permissions an autonomous +# coding agent needs. Prints ONLY the token to stdout so it can be captured: +# +# AGENT_GH_TOKEN="$(scripts/mint-installation-token.sh)" +# +# The App private key stays on the host; only this ≤1h token crosses into the +# container. See .devcontainer/README.md "Scoped credentials". +# +# Configuration (env vars; defaults target the actions-gateway-test App): +# GITHUB_APP_CLIENT_ID — App client ID (Iv23…/Iv1.…), used as the JWT `iss`. +# REQUIRED: GitHub deprecated the numeric App ID as +# `iss` in late 2024. Find it on the App settings page. +# GITHUB_OWNER_REPO — "owner/repo" to scope the token to. +# Default: derived from the `origin` git remote. +# KEYCHAIN_ACCOUNT — Keychain account holding the hex-encoded PEM. 
+# Default: actions-gateway-test +# KEYCHAIN_SERVICE — Keychain service name. Default: github-app-private-key +# GITHUB_APP_PRIVATE_KEY — path to a PEM file; overrides Keychain (for CI/Linux). +# TOKEN_PERMISSIONS — JSON object of installation permissions. +# Default: {"contents":"write","pull_requests":"write"} +# Add "workflows":"write" only if the agent must edit +# files under .github/workflows. +# +# Requires: jq, curl, openssl (+ `security` & `xxd` when reading from Keychain). + +set -euo pipefail + +readonly GITHUB_API="https://api.github.com" +readonly DEFAULT_PERMISSIONS='{"contents":"write","pull_requests":"write"}' + +log() { printf '[mint-token] %s\n' "$*" >&2; } +die() { printf '[mint-token] ERROR: %s\n' "$*" >&2; exit 1; } + +# Emit the App private key (PEM) to stdout without writing it to disk. +read_pem() { + local key_path="${GITHUB_APP_PRIVATE_KEY:-}" + if [[ -n "$key_path" ]]; then + [[ -f "$key_path" ]] || die "GITHUB_APP_PRIVATE_KEY set but file not found: $key_path" + cat "$key_path" + return + fi + command -v security >/dev/null 2>&1 || die "macOS 'security' not found; set GITHUB_APP_PRIVATE_KEY to a PEM path instead" + command -v xxd >/dev/null 2>&1 || die "'xxd' not found (needed to decode the hex-encoded Keychain entry)" + security find-generic-password -a "$KEYCHAIN_ACCOUNT" -s "$KEYCHAIN_SERVICE" -w 2>/dev/null \ + | xxd -r -p \ + || die "could not read App key from Keychain (account=$KEYCHAIN_ACCOUNT service=$KEYCHAIN_SERVICE)" +} + +# RFC 7515 base64url: base64, strip padding, swap +/ for -_, drop newlines. +b64url() { base64 | tr -d '=' | tr '+/' '-_' | tr -d '\n'; } + +# Sign "$1" (the JWT signing input) with the App key, emit base64url signature. +sign_rs256() { + local input="$1" + # Process substitution keeps the PEM off disk: openssl reads it via /dev/fd. + printf '%s' "$input" \ + | openssl dgst -sha256 -sign <(read_pem) -binary 2>/dev/null \ + | b64url \ + || die "openssl signing failed — is the Keychain entry a valid RSA PEM?" 
+} + +# curl wrapper: capture body + HTTP code, fail loudly on non-2xx. Body -> stdout. +api() { + local description="$1" method="$2" url="$3"; shift 3 + local response http_code body + response=$(curl -sS -w $'\n%{http_code}' --max-time 30 -X "$method" "$url" \ + -H "Accept: application/vnd.github+json" \ + -H "X-GitHub-Api-Version: 2022-11-28" \ + "$@") || die "$description: curl failed" + http_code="${response##*$'\n'}" + body="${response%$'\n'*}" + if [[ "$http_code" != 2* ]]; then + die "$description failed (HTTP $http_code): $body" + fi + printf '%s' "$body" +} + +main() { + local tool + for tool in jq curl openssl; do + command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool" + done + + : "${KEYCHAIN_ACCOUNT:=actions-gateway-test}" + : "${KEYCHAIN_SERVICE:=github-app-private-key}" + local permissions="${TOKEN_PERMISSIONS:-$DEFAULT_PERMISSIONS}" + jq -e . >/dev/null 2>&1 <<<"$permissions" || die "TOKEN_PERMISSIONS is not valid JSON: $permissions" + + local client_id="${GITHUB_APP_CLIENT_ID:-}" + [[ -n "$client_id" ]] || die "GITHUB_APP_CLIENT_ID is required (App client ID, Iv23…/Iv1.…; numeric App ID no longer works as JWT iss)" + + # Resolve owner/repo from env or the origin remote. + local owner_repo="${GITHUB_OWNER_REPO:-}" + if [[ -z "$owner_repo" ]]; then + local origin + origin="$(git config --get remote.origin.url 2>/dev/null)" \ + || die "GITHUB_OWNER_REPO unset and no origin remote to derive it from" + # Strip protocol/host and the trailing .git: git@github.com:o/r.git | https://github.com/o/r.git + owner_repo="$(printf '%s' "$origin" | awk -F'[:/]' '{print $(NF-1)"/"$NF}')" + owner_repo="${owner_repo%.git}" + fi + [[ "$owner_repo" == */* ]] || die "could not resolve owner/repo (got: '$owner_repo')" + local owner="${owner_repo%%/*}" repo="${owner_repo##*/}" + log "scoping token to ${owner}/${repo} with permissions ${permissions}" + + # --- Mint the App JWT (10-minute lifetime, GitHub's max). 
--- + local now header claims jwt + now="$(date +%s)" + header="$(printf '{"alg":"RS256","typ":"JWT"}' | b64url)" + claims="$(printf '{"iat":%d,"exp":%d,"iss":"%s"}' "$((now - 60))" "$((now + 540))" "$client_id" | b64url)" + jwt="${header}.${claims}.$(sign_rs256 "${header}.${claims}")" + + # --- Discover the installation for this repo. --- + local install_id + install_id="$(api "look up installation" GET "${GITHUB_API}/repos/${owner}/${repo}/installation" \ + -H "Authorization: Bearer ${jwt}" | jq -r '.id')" + [[ "$install_id" =~ ^[0-9]+$ ]] || die "could not resolve installation id for ${owner}/${repo} (is the App installed there?)" + + # --- Exchange the JWT for a scoped installation access token. --- + local body token expires + body="$(jq -nc --arg repo "$repo" --argjson perms "$permissions" \ + '{repositories: [$repo], permissions: $perms}')" + local resp + resp="$(api "exchange JWT for installation token" POST \ + "${GITHUB_API}/app/installations/${install_id}/access_tokens" \ + -H "Authorization: Bearer ${jwt}" \ + -d "$body")" + token="$(jq -r '.token' <<<"$resp")" + expires="$(jq -r '.expires_at' <<<"$resp")" + [[ -n "$token" && "$token" != "null" ]] || die "no token in response: $resp" + log "token minted, expires ${expires}" + + # ONLY the token on stdout. + printf '%s\n' "$token" +} + +main "$@" From c896ba2c0a4ad62c35ace2e6e10d6750433db228 Mon Sep 17 00:00:00 2001 From: Karl Date: Mon, 1 Jun 2026 21:40:33 -0700 Subject: [PATCH 2/3] docs(plan): add agent workflow automation plan Capture the autonomous-agent workflow initiative kicked off by the .devcontainer sandbox: goal, the four levers (approval prompts, parallelism, PR/CI/merge automation, orchestration glue), what's done vs in-flight vs open, and the decisions made (dedicated least-privilege agent App; minimal Anthropic+GitHub egress allowlist). Remaining work tracked as Q62/Q63. 
--- docs/plan/agent-workflow-automation.md | 49 ++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 docs/plan/agent-workflow-automation.md diff --git a/docs/plan/agent-workflow-automation.md b/docs/plan/agent-workflow-automation.md new file mode 100644 index 00000000..b2310555 --- /dev/null +++ b/docs/plan/agent-workflow-automation.md @@ -0,0 +1,49 @@ +# Plan: Autonomous agent workflow automation + +**Goal:** Let many Claude Code agents run the `docs/STATUS.md` backlog in parallel with far fewer approval prompts and far fewer manual human steps (no hand-shepherding each PR through review → CI → merge). + +**Status legend:** ✅ done · ▶ in-flight (open PR) · 🔲 ready · 💤 out of repo scope + +## Why + +Today each backlog item is one manually-started worktree session where a human approves many permission prompts, then babysits the PR: open it, review it, wait for CI, relay failures back to the agent, and merge. The aim is to make a session run start-to-merge with the human only spot-checking outcomes. + +## The four levers + +| # | Lever | Status | Where | +|---|---|---|---| +| 1 | Fewer approval prompts | ▶ | container sandbox (PR #107); allowlist done | +| 2 | More parallel agents | ✅ | already worktree-per-session + headless `claude -p` | +| 3 | No PR/CI/merge babysitting | 🔲 | token minting done (PR #107); auto-merge + CI auto-fix open | +| 4 | Orchestration glue | 💤 | `/next` command + watchdog — personal Claude config | + +## Lever 1 — fewer approval prompts + +- ✅ **Read-only allowlist.** Scanned transcripts; the only genuine gap was `golangci-lint`, added to the gitignored `.claude/settings.local.json`. Most read-only usage (`go test`, `kubectl get`, `gh pr list`, git read-only, `grep`/`ls`/`cat`) is already auto-allowed or already listed. 
+- ▶ **Container sandbox (PR #107).** The real lever for autonomy: run agents with `--dangerously-skip-permissions` inside a container whose blast radius is bounded by three layers — egress firewall, credential-read deny rules, scoped credentials. See [`.devcontainer/README.md`](../../.devcontainer/README.md). Full-bypass mode consults no permission rules, so the container — not the allowlist — is the boundary. + +## Lever 2 — more parallel agents + +Already in place: a git worktree per session. Scaling further needs no repo change — wrap `claude -p "next"` (headless) per worktree, optionally in the PR #107 container, or use cloud agents to keep the laptop free. No open work tracked here. + +## Lever 3 — eliminate PR/CI/merge babysitting + +- ✅ **Scoped token minting (PR #107).** [`scripts/mint-installation-token.sh`](../../scripts/mint-installation-token.sh) mints a ≤1h, single-repo installation token with minimal permissions, reading the App key from Keychain without it ever hitting disk. Verified end-to-end. +- 🔲 **Dedicated agent identity + branch protection (go-live).** The agent must commit as its **own** least-privilege App (`contents:write`+`pull_requests:write`), **not** the `actions-gateway-test` runner App — that App lacks those permissions and carries `administration:write`, which an agent must never hold (confirmed by a 422 during testing). Steps: create the App, store its PEM under Keychain account `actions-gateway-agent`, enable branch protection on `main` (required PR + green CI, no force-push). Branch protection is the only reliable guard against a destructive push — client-side `deny` globs can't gate by target branch. → **Q62** +- 🔲 **Auto-merge + CI auto-fix.** End the relay loop: agent finishes with `gh pr merge --auto --squash` (merges when CI is green); a failed CI run dispatches the [Claude Code GitHub Action](https://github.com/anthropics/claude-code-action) to fix it on the branch; optionally a watchdog reports only genuinely-stuck PRs. 
Touches `.github/workflows/`. → **Q63** + +## Lever 4 — orchestration glue (out of repo scope) + +These live in personal Claude Code config, not this repo, so they are not Queue items: + +- A deterministic `/next` slash command encoding the full loop (fetch+rebase → pick a Queue item not covered by an open PR → branch → implement → test → PR → enable auto-merge), replacing the freeform "next" prompt so every agent runs it identically. +- A `loop`/`schedule` watchdog that surfaces only PRs needing a human. + +## Decisions made + +- **Agent commits via a new dedicated GitHub App**, not the runner App or a PAT — preserves the ephemeral-token flow already built and keeps least privilege. (Decided 2026-06-01.) +- **Egress allowlist is Anthropic API + GitHub only** — viable because the repo vendors its Go deps, so builds/tests run offline. Drop vendoring → must add the module-proxy hosts (prefer a filtering HTTP proxy for those CDNs). + +## Next concrete step + +Q62: create the dedicated agent App and enable branch protection on `main`. Until then PR #107's container can be built and its firewall self-test exercised, but the agent has no identity to push as. From 0d77cb234949a0c28291229c38670654cce4da08 Mon Sep 17 00:00:00 2001 From: Karl Date: Mon, 1 Jun 2026 21:40:33 -0700 Subject: [PATCH 3/3] docs(status): track agent workflow automation (Progress row + Q62/Q63) --- docs/STATUS.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/STATUS.md b/docs/STATUS.md index ff378cfc..592490c7 100644 --- a/docs/STATUS.md +++ b/docs/STATUS.md @@ -38,6 +38,7 @@ Plan-level view. ✅ = all criteria met. 
⚠️ = code shipped, specific pieces | Make UX | `infra` | ✅ | Phase 1 + Phase 2 done — [plan](plan/make.md) | | Docker image speed | `speed` | ✅ | All items done or explicitly closed — [plan](plan/docker-image-speed.md) | | e2e test speed | `speed` `tests` | ✅ | All items done — [plan](plan/e2e-tests-speed.md) | +| Agent workflow automation | `infra` | ⚠️ | Sandbox (egress lockdown + scoped tokens) in PR #107; go-live + auto-merge open as [Q62](#Q62)/[Q63](#Q63) — [plan](plan/agent-workflow-automation.md) | --- @@ -82,6 +83,8 @@ Specific actionable items in priority order. Pick from the top; skip 🚫 items | Q51 | Reconcile documented vs emitted Prometheus metrics | `infra` `docs` `bug` | 🔲 | M | 6 documented metrics never registered in code (headline `pod_creation_latency_seconds` + 5 others). Per-metric decision: implement, re-point, or mark `(planned)`. See [docs-six-layer-audit.md](plan/docs-six-layer-audit.md) Layer 3. | | Q55 | Verify provisioner-test goleak cascade fix held in CI | `tests` `bug` | 🔲 | S | Intermittent ~20-test goleak cascade in `internal/provisioner` fixed by `waitForPodCreated` helper in 59c0714; delete row once CI is clean. If flakes recur, migrate remaining ~18 Eventually-on-Pod sites to the helper. | | Q60 | [Competitive analysis — GAG vs ARC-adjacent runner/queue tooling](design/appendix-d-alternatives-considered.md) | `docs` | 🔲 | M | Competitive analysis vs ARC-adjacent tooling: Kueue, Exostellar (verify the Kueue-under-ARC GPU pattern), KEDA. Expands [appendix-d](design/appendix-d-alternatives-considered.md). Narrow Kueue-vs-admission angle is in [Q59](#Q59). | +| Q62 | [Agent sandbox go-live: dedicated App + branch protection](plan/agent-workflow-automation.md) | `infra` `security` | 🔲 | S | Create a least-privilege agent GitHub App (`contents`+`pull_requests` write) and protect `main`; the `actions-gateway-test` runner App can't be reused (422 — no contents/PR, has `administration:write`). Prerequisite for the PR #107 sandbox. 
| +| Q63 | [Auto-merge + CI auto-fix wiring](plan/agent-workflow-automation.md) | `infra` | 🚫 | M | Blocked by [Q62](#Q62). Agent ends with `gh pr merge --auto`; failed CI dispatches the Claude Code GitHub Action to fix on-branch. Touches `.github/workflows/`. | | Q17 | [Unit/integration test speed improvements](plan/unit-tests-speed.md) | `speed` `tests` | 💤 | M | low priority; pick up when CI latency is the bottleneck | | Q18 | [alerting.md](plan/docs.md) | `docs` | 💤 | M | deferred until a real Prometheus/Alertmanager setup exists | | Q19 | [Proxy features: allowlist, rate-limit, audit log, TLS, per-RG pool, X25519](design/appendix-g-future-enhancements.md) | `security` | 💤 | L | explicit non-commitments; build only when a named trigger fires |