From d0ba9f11945fff1b661d6056466f9660ed0660a0 Mon Sep 17 00:00:00 2001
From: Karl <karlkfi@gmail.com>
Date: Mon, 1 Jun 2026 21:05:23 -0700
Subject: [PATCH 1/3] feat(devcontainer): add autonomous-agent sandbox with
 egress lockdown + scoped tokens
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a containerized sandbox so Claude Code agents can run autonomously
(--dangerously-skip-permissions) without host risk or credential
exfiltration. The container is the boundary; three layers cover its gaps:

- .devcontainer/init-firewall.sh — default-drop egress, allowlisting only
  the Anthropic API and GitHub's published CIDRs (from api.github.com/meta).
  Go builds need no egress because the repo vendors its deps. Self-verifies
  (GitHub reachable, example.com blocked) and exits non-zero otherwise.
- .devcontainer/claude-settings.json — deny rules (secret-file reads, sudo,
  Keychain) baked into the container user's settings so repo settings can't
  relax them. Narrows the GitHub exfil channel that the firewall leaves open
  (GitHub is an allowed host).
- scripts/mint-installation-token.sh — host-side; reads the App PEM from the
  macOS Keychain via process substitution (never on disk), mints a <=1h
  repo-scoped installation token with minimal permissions. Verified
  end-to-end.
- .devcontainer/{Dockerfile,devcontainer.json,README.md} — image, wiring,
  and the security model + one-time agent-App setup steps.

The agent commits as a separate least-privilege identity, not the
actions-gateway-test runner App (which lacks contents/pull_requests and
carries administration:write).
---
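Note on the GitHub ranges: `api.github.com/meta` lists IPv6 ranges next to the
IPv4 ones, and an `ipset` of type `hash:net` holds a single address family
(IPv4 by default). A minimal sketch of a pre-filter, assuming plain dotted-quad
validation is enough; `is_ipv4_cidr` and `filter_ipv4_cidrs` are hypothetical
helpers, not part of this patch, and the sample ranges are illustrative:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only for a dotted-quad IPv4 address or CIDR.
is_ipv4_cidr() {
  local re='^([0-9]{1,3}\.){3}[0-9]{1,3}(/([0-9]|[12][0-9]|3[0-2]))?$'
  [[ "$1" =~ $re ]]
}

# Hypothetical helper: read one range per line on stdin, echo only IPv4 ones.
filter_ipv4_cidrs() {
  local cidr
  while IFS= read -r cidr; do
    if is_ipv4_cidr "$cidr"; then
      printf '%s\n' "$cidr"
    fi
  done
}

# Sample input mixing both families; only the two IPv4 ranges are printed.
printf '%s\n' '192.30.252.0/22' '2a0a:a440::/29' '140.82.112.0/20' | filter_ipv4_cidrs
```

The filter would sit between the `jq` extraction and `ipset add`; it does not
check that each octet is at most 255, which `ipset` itself would reject.
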
 .devcontainer/Dockerfile           |  41 +++++++++
 .devcontainer/README.md            | 141 ++++++++++++++++++++++++++++
 .devcontainer/claude-settings.json |  19 ++++
 .devcontainer/devcontainer.json    |  24 +++++
 .devcontainer/init-firewall.sh     | 124 +++++++++++++++++++++++++
 scripts/mint-installation-token.sh | 143 +++++++++++++++++++++++++++++
 6 files changed, 492 insertions(+)
 create mode 100644 .devcontainer/Dockerfile
 create mode 100644 .devcontainer/README.md
 create mode 100644 .devcontainer/claude-settings.json
 create mode 100644 .devcontainer/devcontainer.json
 create mode 100755 .devcontainer/init-firewall.sh
 create mode 100755 scripts/mint-installation-token.sh
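
Note on the App JWT that `scripts/mint-installation-token.sh` signs: the sketch
below reproduces only the unsigned part (header and claims) so the timing
window is easy to inspect. `jwt_claims` is a hypothetical helper and
`Iv23-placeholder` is a made-up client ID; the real script additionally signs
the input with RS256.

```bash
#!/usr/bin/env bash
set -euo pipefail

# base64url (RFC 7515): standard base64, padding and newlines stripped,
# with + and / swapped for - and _.
b64url() { base64 | tr -d '=\n' | tr '+/' '-_'; }

# Hypothetical helper: claims JSON for a given "now" (epoch seconds) and client ID.
# iat is backdated 60 s to tolerate clock drift; exp is 540 s ahead,
# so exp - iat is 600 s.
jwt_claims() {
  local now="$1" client_id="$2"
  printf '{"iat":%d,"exp":%d,"iss":"%s"}' "$((now - 60))" "$((now + 540))" "$client_id"
}

header="$(printf '{"alg":"RS256","typ":"JWT"}' | b64url)"
claims="$(jwt_claims "$(date +%s)" 'Iv23-placeholder' | b64url)"
printf 'signing input: %s.%s\n' "$header" "$claims"
```

The printed string is what gets signed; appending `.` and the base64url
signature yields the JWT sent as the `Authorization: Bearer` value.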

diff --git a/.devcontainer/Dockerfile b/.devcontainer/Dockerfile
new file mode 100644
index 00000000..c121633e
--- /dev/null
+++ b/.devcontainer/Dockerfile
@@ -0,0 +1,41 @@
+# Agent sandbox image: Go toolchain + Claude Code + egress-lockdown tooling.
+# Matches the repo's builder base (golang:1.26, see cmd/*/Dockerfile).
+FROM golang:1.26
+
+# Tools the agent and the firewall need.
+RUN apt-get update && apt-get install -y --no-install-recommends \
+      ca-certificates git curl jq ripgrep \
+      iptables ipset dnsutils sudo gnupg \
+    && rm -rf /var/lib/apt/lists/*
+
+# GitHub CLI (from the official apt repo).
+RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
+      | dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
+    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
+      > /etc/apt/sources.list.d/github-cli.list \
+    && apt-get update && apt-get install -y --no-install-recommends gh \
+    && rm -rf /var/lib/apt/lists/*
+
+# Node + Claude Code CLI.
+RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \
+    && apt-get install -y --no-install-recommends nodejs \
+    && npm install -g @anthropic-ai/claude-code \
+    && rm -rf /var/lib/apt/lists/*
+
+# Unprivileged agent user. It may run ONLY the firewall script as root (so the
+# lockdown can program iptables at start); it has no general sudo.
+RUN useradd -m -s /bin/bash agent \
+    && echo 'agent ALL=(root) NOPASSWD: /usr/local/bin/init-firewall.sh' \
+      > /etc/sudoers.d/agent-firewall \
+    && chmod 0440 /etc/sudoers.d/agent-firewall
+
+COPY init-firewall.sh /usr/local/bin/init-firewall.sh
+RUN chmod 0755 /usr/local/bin/init-firewall.sh
+
+# Container-level deny rules at the user-settings layer (repo settings can't
+# relax these).
+COPY claude-settings.json /home/agent/.claude/settings.json
+RUN chown -R agent:agent /home/agent/.claude
+
+USER agent
+WORKDIR /workspace
diff --git a/.devcontainer/README.md b/.devcontainer/README.md
new file mode 100644
index 00000000..71140fa7
--- /dev/null
+++ b/.devcontainer/README.md
@@ -0,0 +1,141 @@
+# Agent sandbox
+
+A container for running Claude Code autonomously (`--dangerously-skip-permissions`
+or default mode with a broad allowlist) without the agent being able to damage
+your host or exfiltrate credentials.
+
+> **Status: draft.** Review and adapt before wiring into a fleet. Nothing here
+> is active until you build and run it.
+
+## Why a container changes the security model
+
+In full bypass mode (`--dangerously-skip-permissions`), Claude Code consults
+**none** of the `allow`/`ask`/`deny` rules. The container becomes the only
+boundary. That bounds damage to your *host*, but inside the container the agent
+still holds whatever you mounted — push credentials, tokens. So "the container
+is the sandbox" is only half true: it protects the laptop, not the repo or the
+secrets.
+
+This setup adds three layers so broad autonomy stays safe:
+
+| Layer | File | Stops |
+|---|---|---|
+| Network egress allowlist | `init-firewall.sh` | Exfiltration / C2 to any host except the Anthropic API and GitHub |
+| Credential-read deny rules | `claude-settings.json` | The agent reading a key/token and leaking it through an *allowed* channel (e.g. a PR body on GitHub) |
+| Scoped credentials | _operational, see below_ | Catastrophic git operations, even if a token leaks |
+
+No single layer is sufficient; they cover each other's gaps.
+
+## Egress allowlist (`init-firewall.sh`)
+
+Default-drops all outbound traffic, then allows DNS, established return flows,
+the Anthropic API, and GitHub's published CIDR ranges (pulled from
+`api.github.com/meta`). It self-verifies at the end: GitHub must be reachable
+and `example.com` must be blocked, or it exits non-zero.
+
+Because the repo **vendors** its Go dependencies, `go build`/`go test` run
+offline — the allowlist deliberately does *not* include the Go module proxy. If
+you drop vendoring, add `proxy.golang.org`/`sum.golang.org` to `ALLOWED_DOMAINS`,
+but note their `storage.googleapis.com` backend is CDN-backed and resolves to
+shifting IPs; for CDN-heavy egress prefer a filtering HTTP CONNECT proxy
+(tinyproxy/squid with a hostname allowlist, `HTTPS_PROXY=...`) over IP rules.
+
+Runs once at container start via `devcontainer.json`'s `postStartCommand`.
+Requires `NET_ADMIN`/`NET_RAW` (granted in `runArgs`).
+
+## Credential-read deny rules (`claude-settings.json`)
+
+Baked into the agent user's `~/.claude/settings.json`, so repo-level settings
+can't relax them (`deny` always wins). Blocks reads of `*.pem`, `*.key`, SSH
+keys, `.env`, `secrets/`, `/run/secrets`, plus `sudo` and Keychain access.
+
+**Known gap:** these rules bind only the `Read` *tool*. `git` can still use a
+credential file (a separate process, which is what you want), but the agent can
+also `cat`/`head` that file via Bash or dump an env-var token with `printenv`.
+Don't rely on secrecy. The robust fix is to never put readable long-lived
+secrets in the container at all (see below).
+
+## Scoped credentials (operational — do this, don't skip it)
+
+The token in the container is the real risk surface. Make a leak survivable:
+
+### One-time: create the agent's GitHub App
+
+Do **not** reuse the `actions-gateway-test` App — that one is the runner
+control plane (`actions:write`, `administration:write`, …) and has no
+`contents`/`pull_requests`. The agent needs its own least-privilege identity.
+
+1. github.com → org `actions-gateway` → Settings → Developer settings → GitHub
+   Apps → **New GitHub App**.
+2. **Repository permissions:** Contents → *Read and write*; Pull requests →
+   *Read and write*; Metadata → *Read* (auto). Add Workflows → *Read and write*
+   **only** if agents edit `.github/workflows/`. Leave everything else *No access*.
+3. **Webhook:** uncheck *Active* (not needed). **Where can this be installed:**
+   *Only on this account*.
+4. Create it → note the **Client ID** (`Iv23…`) → **Generate a private key**
+   (downloads a `.pem`) → **Install** the App on the `github-actions-gateway`
+   repo only.
+5. Store the key in Keychain under a *distinct account* (hex-encoded, matching
+   how the mint script reads it), then delete the download:
+   ```bash
+   security add-generic-password -U -a actions-gateway-agent \
+     -s github-app-private-key \
+     -w "$(xxd -p < ~/Downloads/agent-app.*.private-key.pem | tr -d '\n')"
+   rm -P ~/Downloads/agent-app.*.private-key.pem
+   ```
+
+The mint script then targets this App via env overrides — no code change:
+
+```bash
+GITHUB_APP_CLIENT_ID=Iv23…theNewAppId \
+KEYCHAIN_ACCOUNT=actions-gateway-agent \
+  scripts/mint-installation-token.sh
+```
+
+### Operational rules
+
+1. **Short-lived, repo-scoped token, not the App PEM.** The script above mints
+   an *installation* token (expires in ≤1h), scoped to this one repo with only
+   `contents:write`+`pull_requests:write`. Export it on the host as `AGENT_GH_TOKEN`
+   (`devcontainer.json` maps it to `GH_TOKEN`). The App key never leaves the host Keychain.
+2. **Protect `main` server-side.** Enable branch protection requiring PR +
+   green CI and disallowing force-push. This is the only reliable guard against
+   a destructive push — client-side `deny` globs can't reliably tell which
+   branch a `git push` targets. With protection on, even a fully compromised
+   agent can't rewrite `main`.
+3. **Prefer SSH deploy-key-over-agent-socket** if you want no token on disk or
+   in env at all: forward an `ssh-agent` socket holding a write deploy key.
+   Branch protection still blocks force-pushes; the key bytes never enter the container.
+
+## Build & run
+
+```bash
+# Build the image
+docker build -t actions-gateway-agent .devcontainer
+
+# Run one autonomous agent (token minted fresh on the host)
+docker run --rm -it \
+  --cap-add=NET_ADMIN --cap-add=NET_RAW \
+  -e GH_TOKEN="$(./scripts/mint-installation-token.sh)" \
+  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
+  -v "$PWD:/workspace" \
+  actions-gateway-agent \
+  bash -lc 'sudo /usr/local/bin/init-firewall.sh && claude --dangerously-skip-permissions -p "next"'
+```
+
+(`scripts/mint-installation-token.sh` runs *on the host*: it reads the App PEM
+and exchanges it for a short-lived installation token. The PEM stays on the
+host; only the token crosses into the container.)
+
+Or open the folder in an editor/CLI that understands `devcontainer.json`.
+
+## Residual risks (be honest about these)
+
+- The agent can still do anything *within* its token's scope: open junk PRs,
+  push to non-protected branches, burn CI minutes. Scope + branch protection
+  cap the damage; they don't eliminate it.
+- The allowlist trusts GitHub wholesale — anything reachable under GitHub's
+  CIDRs (gists, any repo the token can touch) is a possible exfil sink. This is
+  why the token must be narrowly scoped.
+- `api.github.com/meta` ranges are fetched at start; if GitHub changes ranges
+  mid-run, new IPs aren't picked up until the next container start.
diff --git a/.devcontainer/claude-settings.json b/.devcontainer/claude-settings.json
new file mode 100644
index 00000000..19fe3346
--- /dev/null
+++ b/.devcontainer/claude-settings.json
@@ -0,0 +1,19 @@
+{
+  "//": "Container-only deny rules. Baked into the agent user's ~/.claude/settings.json so repo-level settings cannot relax them. deny > ask > allow in Claude Code, so these win over any repo-level allow rule. README.md notes that full bypass mode consults no permission rules, so treat these as defense in depth, not a guarantee. They complement (do not replace) the network firewall: the firewall stops exfiltration over arbitrary hosts, but GitHub is an *allowed* host, so a secret read into a PR body would still escape; denying reads of credential material narrows that channel.",
+  "permissions": {
+    "deny": [
+      "Read(**/*.pem)",
+      "Read(**/*.key)",
+      "Read(**/id_rsa*)",
+      "Read(**/id_ed25519*)",
+      "Read(**/.ssh/**)",
+      "Read(**/.env)",
+      "Read(**/.env.*)",
+      "Read(**/secrets/**)",
+      "Read(/run/secrets/**)",
+      "Read(/etc/agent/**)",
+      "Bash(security find-generic-password:*)",
+      "Bash(sudo:*)"
+    ]
+  }
+}
diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json
new file mode 100644
index 00000000..87aace58
--- /dev/null
+++ b/.devcontainer/devcontainer.json
@@ -0,0 +1,24 @@
+{
+  "name": "actions-gateway-agent",
+  "build": { "dockerfile": "Dockerfile" },
+
+  // NET_ADMIN/NET_RAW let init-firewall.sh program iptables + ipset.
+  "runArgs": ["--cap-add=NET_ADMIN", "--cap-add=NET_RAW"],
+
+  "workspaceFolder": "/workspace",
+
+  // Lock down egress BEFORE the agent does anything. postStartCommand runs as
+  // the container user, which is allowed to sudo only this one script.
+  "postStartCommand": "sudo /usr/local/bin/init-firewall.sh",
+
+  "remoteUser": "agent",
+
+  // Credentials are injected from the host environment at run time. Prefer a
+  // SHORT-LIVED, repo-scoped token here, not a long-lived PAT or the App PEM.
+  // See README.md "Scoped credentials" — env vars are readable by the agent
+  // process, so the token's blast radius is the real control, not secrecy.
+  "containerEnv": {
+    "GH_TOKEN": "${localEnv:AGENT_GH_TOKEN}",
+    "ANTHROPIC_API_KEY": "${localEnv:ANTHROPIC_API_KEY}"
+  }
+}
diff --git a/.devcontainer/init-firewall.sh b/.devcontainer/init-firewall.sh
new file mode 100755
index 00000000..8b2a0ee7
--- /dev/null
+++ b/.devcontainer/init-firewall.sh
@@ -0,0 +1,124 @@
+#!/usr/bin/env bash
+#
+# init-firewall.sh — lock down container egress to an allowlist.
+#
+# Description:
+#   Runs once at container start (as root, before the agent user starts doing
+#   work). Default-drops all outbound traffic except DNS, established return
+#   flows, and a small allowlist: the Anthropic API (so Claude Code works) and
+#   GitHub (so git/gh work). Everything else — including any attempt to POST a
+#   leaked secret to an arbitrary host — is dropped at the kernel.
+#
+#   Go builds do NOT need egress: this repo vendors its dependencies, so
+#   `go build` / `go test` run fully offline. If you ever drop vendoring, add
+#   the Go module proxy hosts (proxy.golang.org, sum.golang.org, and their
+#   storage.googleapis.com backend) to ALLOWED_DOMAINS — note that CDN-backed
+#   hosts resolve to changing IPs, so a filtering HTTP proxy is more robust
+#   than IP allowlisting for those (see .devcontainer/README.md).
+#
+# Requires: iptables, ipset, dig (dnsutils), jq, curl. Must run with NET_ADMIN.
+#
+# Usage:
+#   sudo ./init-firewall.sh
+
+# --- Strict mode ---
+set -euo pipefail
+
+# Hostnames the agent is permitted to reach (resolved to IPs below).
+readonly ALLOWED_DOMAINS=(
+  "api.anthropic.com"
+  "statsig.anthropic.com"
+)
+
+readonly IPSET_NAME="agent-allow"
+
+log() { printf '[init-firewall] %s\n' "$*" >&2; }
+
+die() { log "ERROR: $*"; exit 1; }
+
+# Remove any pre-existing rules. Chain policies are left as-is, so a re-run after lockdown fails closed.
+flush_existing() {
+  local table
+  for table in filter nat mangle; do
+    iptables -t "$table" -F
+    iptables -t "$table" -X
+  done
+  ipset destroy "$IPSET_NAME" 2>/dev/null || true
+}
+
+# Build the set of allowed destination CIDRs/IPs.
+build_allowset() {
+  ipset create "$IPSET_NAME" hash:net
+
+  # GitHub publishes its CIDR ranges via the meta API. Pull web/api/git/packages
+  # (IPv4 only: a hash:net set defaults to family inet and rejects IPv6 ranges).
+  local meta
+  meta="$(curl -fsSL --max-time 20 https://api.github.com/meta)" \
+    || die "could not fetch GitHub meta ranges"
+
+  local cidr
+  while IFS= read -r cidr; do
+    [[ -n "$cidr" ]] || continue
+    ipset add "$IPSET_NAME" "$cidr" -exist
+  done < <(jq -r '(.web + .api + .git + .packages)[] | select(contains(":") | not)' <<<"$meta" | sort -u)
+
+  # Resolve each allowlisted hostname to its current A records.
+  local domain ip count
+  for domain in "${ALLOWED_DOMAINS[@]}"; do
+    count=0
+    while IFS= read -r ip; do
+      [[ -n "$ip" ]] || continue
+      ipset add "$IPSET_NAME" "$ip" -exist
+      (( count += 1 ))
+    done < <(dig +short A "$domain" | grep -E '^[0-9.]+$' || true)
+    (( count > 0 )) || die "could not resolve allowlisted host: $domain"
+    log "allowed ${domain} (${count} addrs)"
+  done
+}
+
+# Apply the default-deny policy with the allowlist carved out.
+apply_rules() {
+  # Loopback is always fine.
+  iptables -A INPUT  -i lo -j ACCEPT
+  iptables -A OUTPUT -o lo -j ACCEPT
+
+  # Keep established/related flows (return traffic for allowed connections).
+  iptables -A INPUT  -m state --state ESTABLISHED,RELATED -j ACCEPT
+  iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
+
+  # DNS must keep working. Port 53 is open to any server, so DNS remains an exfil path.
+  iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
+  iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT
+
+  # Outbound to the allowlisted IP set only.
+  iptables -A OUTPUT -m set --match-set "$IPSET_NAME" dst -j ACCEPT
+
+  # Default deny everything else.
+  iptables -P INPUT   DROP
+  iptables -P FORWARD DROP
+  iptables -P OUTPUT  DROP
+}
+
+# Prove the policy is actually live before handing control to the agent.
+verify() {
+  curl -fsSL --max-time 10 -o /dev/null https://api.github.com/zen \
+    || die "verification failed: GitHub unreachable through allowlist"
+  if curl -fsSL --max-time 5 -o /dev/null https://example.com 2>/dev/null; then
+    die "verification failed: egress to example.com succeeded but must be blocked"
+  fi
+  log "egress lockdown verified (GitHub reachable, example.com blocked)"
+}
+
+main() {
+  local tool
+  for tool in iptables ipset dig jq curl; do
+    command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool"
+  done
+  flush_existing
+  build_allowset
+  apply_rules
+  verify
+  log "firewall initialized"
+}
+
+main "$@"
diff --git a/scripts/mint-installation-token.sh b/scripts/mint-installation-token.sh
new file mode 100755
index 00000000..297fae17
--- /dev/null
+++ b/scripts/mint-installation-token.sh
@@ -0,0 +1,143 @@
+#!/usr/bin/env bash
+# mint-installation-token.sh — mint a short-lived, repo-scoped GitHub App
+# installation access token for the agent sandbox container.
+#
+# Runs on the HOST. Reads the App private key from the macOS Keychain (never
+# touching disk), signs an App JWT, and exchanges it for an installation token
+# scoped to a single repository with the minimum permissions an autonomous
+# coding agent needs. Prints ONLY the token to stdout so it can be captured:
+#
+#   AGENT_GH_TOKEN="$(scripts/mint-installation-token.sh)"
+#
+# The App private key stays on the host; only this ≤1h token crosses into the
+# container. See .devcontainer/README.md "Scoped credentials".
+#
+# Configuration (env vars; defaults target the actions-gateway-test App):
+#   GITHUB_APP_CLIENT_ID   — App client ID (Iv23…/Iv1.…), used as the JWT `iss`.
+#                            REQUIRED: GitHub deprecated the numeric App ID as
+#                            `iss` in late 2024. Find it on the App settings page.
+#   GITHUB_OWNER_REPO      — "owner/repo" to scope the token to.
+#                            Default: derived from the `origin` git remote.
+#   KEYCHAIN_ACCOUNT       — Keychain account holding the hex-encoded PEM.
+#                            Default: actions-gateway-test
+#   KEYCHAIN_SERVICE       — Keychain service name. Default: github-app-private-key
+#   GITHUB_APP_PRIVATE_KEY — path to a PEM file; overrides Keychain (for CI/Linux).
+#   TOKEN_PERMISSIONS      — JSON object of installation permissions.
+#                            Default: {"contents":"write","pull_requests":"write"}
+#                            Add "workflows":"write" only if the agent must edit
+#                            files under .github/workflows.
+#
+# Requires: jq, curl, openssl (+ `security` & `xxd` when reading from Keychain).
+
+set -euo pipefail
+
+readonly GITHUB_API="https://api.github.com"
+readonly DEFAULT_PERMISSIONS='{"contents":"write","pull_requests":"write"}'
+
+log()  { printf '[mint-token] %s\n' "$*" >&2; }
+die()  { printf '[mint-token] ERROR: %s\n' "$*" >&2; exit 1; }
+
+# Emit the App private key (PEM) to stdout without writing it to disk.
+read_pem() {
+  local key_path="${GITHUB_APP_PRIVATE_KEY:-}"
+  if [[ -n "$key_path" ]]; then
+    [[ -f "$key_path" ]] || die "GITHUB_APP_PRIVATE_KEY set but file not found: $key_path"
+    cat "$key_path"
+    return
+  fi
+  command -v security >/dev/null 2>&1 || die "macOS 'security' not found; set GITHUB_APP_PRIVATE_KEY to a PEM path instead"
+  command -v xxd >/dev/null 2>&1 || die "'xxd' not found (needed to decode the hex-encoded Keychain entry)"
+  security find-generic-password -a "$KEYCHAIN_ACCOUNT" -s "$KEYCHAIN_SERVICE" -w 2>/dev/null \
+    | xxd -r -p \
+    || die "could not read App key from Keychain (account=$KEYCHAIN_ACCOUNT service=$KEYCHAIN_SERVICE)"
+}
+
+# RFC 7515 base64url: base64, strip padding, swap +/ for -_, drop newlines.
+b64url() { base64 | tr -d '=' | tr '+/' '-_' | tr -d '\n'; }
+
+# Sign "$1" (the JWT signing input) with the App key, emit base64url signature.
+sign_rs256() {
+  local input="$1"
+  # Process substitution keeps the PEM off disk: openssl reads it via /dev/fd.
+  printf '%s' "$input" \
+    | openssl dgst -sha256 -sign <(read_pem) -binary 2>/dev/null \
+    | b64url \
+    || die "openssl signing failed — is the Keychain entry a valid RSA PEM?"
+}
+
+# curl wrapper: capture body + HTTP code, fail loudly on non-2xx. Body -> stdout.
+api() {
+  local description="$1" method="$2" url="$3"; shift 3
+  local response http_code body
+  response=$(curl -sS -w $'\n%{http_code}' --max-time 30 -X "$method" "$url" \
+    -H "Accept: application/vnd.github+json" \
+    -H "X-GitHub-Api-Version: 2022-11-28" \
+    "$@") || die "$description: curl failed"
+  http_code="${response##*$'\n'}"
+  body="${response%$'\n'*}"
+  if [[ "$http_code" != 2* ]]; then
+    die "$description failed (HTTP $http_code): $body"
+  fi
+  printf '%s' "$body"
+}
+
+main() {
+  local tool
+  for tool in jq curl openssl; do
+    command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool"
+  done
+
+  : "${KEYCHAIN_ACCOUNT:=actions-gateway-test}"
+  : "${KEYCHAIN_SERVICE:=github-app-private-key}"
+  local permissions="${TOKEN_PERMISSIONS:-$DEFAULT_PERMISSIONS}"
+  jq -e . >/dev/null 2>&1 <<<"$permissions" || die "TOKEN_PERMISSIONS is not valid JSON: $permissions"
+
+  local client_id="${GITHUB_APP_CLIENT_ID:-}"
+  [[ -n "$client_id" ]] || die "GITHUB_APP_CLIENT_ID is required (App client ID, Iv23…/Iv1.…; numeric App ID no longer works as JWT iss)"
+
+  # Resolve owner/repo from env or the origin remote.
+  local owner_repo="${GITHUB_OWNER_REPO:-}"
+  if [[ -z "$owner_repo" ]]; then
+    local origin
+    origin="$(git config --get remote.origin.url 2>/dev/null)" \
+      || die "GITHUB_OWNER_REPO unset and no origin remote to derive it from"
+    # Strip protocol/host and the trailing .git: git@github.com:o/r.git | https://github.com/o/r.git
+    owner_repo="$(printf '%s' "$origin" | awk -F'[:/]' '{print $(NF-1)"/"$NF}')"
+    owner_repo="${owner_repo%.git}"
+  fi
+  [[ "$owner_repo" == */* ]] || die "could not resolve owner/repo (got: '$owner_repo')"
+  local owner="${owner_repo%%/*}" repo="${owner_repo##*/}"
+  log "scoping token to ${owner}/${repo} with permissions ${permissions}"
+
+  # --- Mint the App JWT (iat backdated 60 s, exp 9 min ahead; GitHub caps exp at 10 min). ---
+  local now header claims jwt
+  now="$(date +%s)"
+  header="$(printf '{"alg":"RS256","typ":"JWT"}' | b64url)"
+  claims="$(printf '{"iat":%d,"exp":%d,"iss":"%s"}' "$((now - 60))" "$((now + 540))" "$client_id" | b64url)"
+  jwt="${header}.${claims}.$(sign_rs256 "${header}.${claims}")"
+
+  # --- Discover the installation for this repo. ---
+  local install_id
+  install_id="$(api "look up installation" GET "${GITHUB_API}/repos/${owner}/${repo}/installation" \
+    -H "Authorization: Bearer ${jwt}" | jq -r '.id')"
+  [[ "$install_id" =~ ^[0-9]+$ ]] || die "could not resolve installation id for ${owner}/${repo} (is the App installed there?)"
+
+  # --- Exchange the JWT for a scoped installation access token. ---
+  local body token expires
+  body="$(jq -nc --arg repo "$repo" --argjson perms "$permissions" \
+    '{repositories: [$repo], permissions: $perms}')"
+  local resp
+  resp="$(api "exchange JWT for installation token" POST \
+    "${GITHUB_API}/app/installations/${install_id}/access_tokens" \
+    -H "Authorization: Bearer ${jwt}" \
+    -d "$body")"
+  token="$(jq -r '.token' <<<"$resp")"
+  expires="$(jq -r '.expires_at' <<<"$resp")"
+  [[ -n "$token" && "$token" != "null" ]] || die "no token in response: $resp"
+  log "token minted, expires ${expires}"
+
+  # ONLY the token on stdout.
+  printf '%s\n' "$token"
+}
+
+main "$@"

From c896ba2c0a4ad62c35ace2e6e10d6750433db228 Mon Sep 17 00:00:00 2001
From: Karl <karlkfi@gmail.com>
Date: Mon, 1 Jun 2026 21:40:33 -0700
Subject: [PATCH 2/3] docs(plan): add agent workflow automation plan

Capture the autonomous-agent workflow initiative kicked off by the
.devcontainer sandbox: goal, the four levers (approval prompts, parallelism,
PR/CI/merge automation, orchestration glue), what's done vs in-flight vs open,
and the decisions made (dedicated least-privilege agent App; minimal
Anthropic+GitHub egress allowlist). Remaining work tracked as Q62/Q63.
---
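Note on Lever 2: a minimal dry-run sketch of fanning `claude -p "next"` out over
worktrees, assuming the porcelain format of `git worktree list`.
`worktree_paths` and `plan_agent_runs` are hypothetical helpers, and the sample
paths are made up; the sketch only prints the commands it would run.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: extract paths from `git worktree list --porcelain` output,
# where each worktree record starts with a line "worktree <path>".
worktree_paths() {
  sed -n 's/^worktree //p'
}

# Hypothetical dry run: print (do not execute) one headless command per worktree.
plan_agent_runs() {
  local path
  while IFS= read -r path; do
    printf 'cd %q && claude -p "next"\n' "$path"
  done
}

# Sample porcelain output; in a real checkout, pipe in
# `git worktree list --porcelain` instead.
printf 'worktree /tmp/wt-a\nHEAD 0000000\nbranch refs/heads/a\n\nworktree /tmp/wt-b\nHEAD 1111111\ndetached\n' \
  | worktree_paths | plan_agent_runs
```

Printing rather than executing keeps the sketch safe to run anywhere; whether
each command runs on the host or inside the sandbox container is left open.
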
 docs/plan/agent-workflow-automation.md | 49 ++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
 create mode 100644 docs/plan/agent-workflow-automation.md

diff --git a/docs/plan/agent-workflow-automation.md b/docs/plan/agent-workflow-automation.md
new file mode 100644
index 00000000..b2310555
--- /dev/null
+++ b/docs/plan/agent-workflow-automation.md
@@ -0,0 +1,49 @@
+# Plan: Autonomous agent workflow automation
+
+**Goal:** Let many Claude Code agents run the `docs/STATUS.md` backlog in parallel with far fewer approval prompts and far fewer manual human steps (no hand-shepherding each PR through review → CI → merge).
+
+**Status legend:** ✅ done · ▶ in-flight (open PR) · 🔲 ready · 💤 out of repo scope
+
+## Why
+
+Today each backlog item is one manually-started worktree session where a human approves many permission prompts, then babysits the PR: open it, review it, wait for CI, relay failures back to the agent, and merge. The aim is to make a session run start-to-merge with the human only spot-checking outcomes.
+
+## The four levers
+
+| # | Lever | Status | Where |
+|---|---|---|---|
+| 1 | Fewer approval prompts | ▶ | container sandbox (PR #107); allowlist done |
+| 2 | More parallel agents | ✅ | already worktree-per-session + headless `claude -p` |
+| 3 | No PR/CI/merge babysitting | 🔲 | token minting done (PR #107); auto-merge + CI auto-fix open |
+| 4 | Orchestration glue | 💤 | `/next` command + watchdog — personal Claude config |
+
+## Lever 1 — fewer approval prompts
+
+- ✅ **Read-only allowlist.** Scanned transcripts; the only genuine gap was `golangci-lint`, added to the gitignored `.claude/settings.local.json`. Most read-only usage (`go test`, `kubectl get`, `gh pr list`, git read-only, `grep`/`ls`/`cat`) is already auto-allowed or already listed.
+- ▶ **Container sandbox (PR #107).** The real lever for autonomy: run agents with `--dangerously-skip-permissions` inside a container whose blast radius is bounded by three layers — egress firewall, credential-read deny rules, scoped credentials. See [`.devcontainer/README.md`](../../.devcontainer/README.md). Full-bypass mode consults no permission rules, so the container — not the allowlist — is the boundary.
+
+## Lever 2 — more parallel agents
+
+Already in place: a git worktree per session. Scaling further needs no repo change — wrap `claude -p "next"` (headless) per worktree, optionally in the PR #107 container, or use cloud agents to keep the laptop free. No open work tracked here.
+
+## Lever 3 — eliminate PR/CI/merge babysitting
+
+- ✅ **Scoped token minting (PR #107).** [`scripts/mint-installation-token.sh`](../../scripts/mint-installation-token.sh) mints a ≤1h, single-repo installation token with minimal permissions, reading the App key from Keychain without it ever hitting disk. Verified end-to-end.
+- 🔲 **Dedicated agent identity + branch protection (go-live).** The agent must commit as its **own** least-privilege App (`contents:write`+`pull_requests:write`), **not** the `actions-gateway-test` runner App. That App lacks those permissions (a token request for them returned HTTP 422 during testing) and carries `administration:write`, which an agent must never hold. Steps: create the App, store its PEM under Keychain account `actions-gateway-agent`, enable branch protection on `main` (required PR + green CI, no force-push). Branch protection is the only reliable guard against a destructive push, because client-side `deny` globs can't gate by target branch. → **Q62**
+- 🔲 **Auto-merge + CI auto-fix.** End the relay loop: agent finishes with `gh pr merge --auto --squash` (merges when CI is green); a failed CI run dispatches the [Claude Code GitHub Action](https://github.com/anthropics/claude-code-action) to fix it on the branch; optionally a watchdog reports only genuinely-stuck PRs. Touches `.github/workflows/`. → **Q63**
+
+## Lever 4 — orchestration glue (out of repo scope)
+
+These live in personal Claude Code config, not this repo, so they are not Queue items:
+
+- A deterministic `/next` slash command encoding the full loop (fetch+rebase → pick a Queue item not covered by an open PR → branch → implement → test → PR → enable auto-merge), replacing the freeform "next" prompt so every agent runs it identically.
+- A `loop`/`schedule` watchdog that surfaces only PRs needing a human.
+
+## Decisions made
+
+- **Agent commits via a new dedicated GitHub App**, not the runner App or a PAT — preserves the ephemeral-token flow already built and keeps least privilege. (Decided 2026-06-01.)
+- **Egress allowlist is Anthropic API + GitHub only** — viable because the repo vendors its Go deps, so builds/tests run offline. Drop vendoring → must add the module-proxy hosts (prefer a filtering HTTP proxy for those CDNs).
+
+## Next concrete step
+
+Q62: create the dedicated agent App and enable branch protection on `main`. Until then PR #107's container can be built and its firewall self-test exercised, but the agent has no identity to push as.

From 0d77cb234949a0c28291229c38670654cce4da08 Mon Sep 17 00:00:00 2001
From: Karl <karlkfi@gmail.com>
Date: Mon, 1 Jun 2026 21:40:33 -0700
Subject: [PATCH 3/3] docs(status): track agent workflow automation (Progress
 row + Q62/Q63)

---
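Note on the Queue rows: a minimal sketch of how a `/next`-style picker might
list ready items, assuming the column order used by the rows added here (ID,
title, tags, status, size, notes) and no `|` inside a cell. `ready_queue_ids`
is a hypothetical helper and the sample rows are abbreviated.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the IDs of Queue rows whose status cell is 🔲.
# Reads Markdown table rows on stdin; rows without a Q-anchor are ignored.
ready_queue_ids() {
  awk -F'|' '
    /^[|] *<a id="Q[0-9]+">/ {
      id = $2; status = $5
      gsub(/<[^>]*>/, "", id)        # drop the <a id=...></a> anchor
      gsub(/^ +| +$/, "", id)        # trim surrounding spaces
      gsub(/^ +| +$/, "", status)
      if (status == "🔲") print id
    }'
}

ready_queue_ids <<'EOF'
| <a id="Q62"></a>Q62 | Agent sandbox go-live | `infra` `security` | 🔲 | S | sample row |
| <a id="Q63"></a>Q63 | Auto-merge wiring | `infra` | 🚫 | M | sample row |
EOF
```

A real picker would also have to skip items already covered by an open PR,
which this sketch does not attempt.
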
 docs/STATUS.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/STATUS.md b/docs/STATUS.md
index ff378cfc..592490c7 100644
--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@@ -38,6 +38,7 @@ Plan-level view. ✅ = all criteria met. ⚠️ = code shipped, specific pieces
 | Make UX | `infra` | ✅ | Phase 1 + Phase 2 done — [plan](plan/make.md) |
 | Docker image speed | `speed` | ✅ | All items done or explicitly closed — [plan](plan/docker-image-speed.md) |
 | e2e test speed | `speed` `tests` | ✅ | All items done — [plan](plan/e2e-tests-speed.md) |
+| Agent workflow automation | `infra` | ⚠️ | Sandbox (egress lockdown + scoped tokens) in PR #107; go-live + auto-merge open as [Q62](#Q62)/[Q63](#Q63) — [plan](plan/agent-workflow-automation.md) |
 
 ---
 
@@ -82,6 +83,8 @@ Specific actionable items in priority order. Pick from the top; skip 🚫 items
 | <a id="Q51"></a>Q51 | Reconcile documented vs emitted Prometheus metrics | `infra` `docs` `bug` | 🔲 | M | 6 documented metrics never registered in code (headline `pod_creation_latency_seconds` + 5 others). Per-metric decision: implement, re-point, or mark `(planned)`. See [docs-six-layer-audit.md](plan/docs-six-layer-audit.md) Layer 3. |
 | <a id="Q55"></a>Q55 | Verify provisioner-test goleak cascade fix held in CI | `tests` `bug` | 🔲 | S | Intermittent ~20-test goleak cascade in `internal/provisioner` fixed by `waitForPodCreated` helper in 59c0714; delete row once CI is clean. If flakes recur, migrate remaining ~18 Eventually-on-Pod sites to the helper. |
 | <a id="Q60"></a>Q60 | [Competitive analysis — GAG vs ARC-adjacent runner/queue tooling](design/appendix-d-alternatives-considered.md) | `docs` | 🔲 | M | Competitive analysis vs ARC-adjacent tooling: Kueue, Exostellar (verify the Kueue-under-ARC GPU pattern), KEDA. Expands [appendix-d](design/appendix-d-alternatives-considered.md). Narrow Kueue-vs-admission angle is in [Q59](#Q59). |
+| <a id="Q62"></a>Q62 | [Agent sandbox go-live: dedicated App + branch protection](plan/agent-workflow-automation.md) | `infra` `security` | 🔲 | S | Create a least-privilege agent GitHub App (`contents`+`pull_requests` write) and protect `main`. The `actions-gateway-test` runner App can't be reused: token requests return 422 because it has no contents/PR permissions, and it carries `administration:write`. Prerequisite for taking the PR #107 sandbox live. |
+| <a id="Q63"></a>Q63 | [Auto-merge + CI auto-fix wiring](plan/agent-workflow-automation.md) | `infra` | 🚫 | M | Blocked by [Q62](#Q62). Agent ends with `gh pr merge --auto`; failed CI dispatches the Claude Code GitHub Action to fix on-branch. Touches `.github/workflows/`. |
 | <a id="Q17"></a>Q17 | [Unit/integration test speed improvements](plan/unit-tests-speed.md) | `speed` `tests` | 💤 | M | low priority; pick up when CI latency is the bottleneck |
 | <a id="Q18"></a>Q18 | [alerting.md](plan/docs.md) | `docs` | 💤 | M | deferred until a real Prometheus/Alertmanager setup exists |
 | <a id="Q19"></a>Q19 | [Proxy features: allowlist, rate-limit, audit log, TLS, per-RG pool, X25519](design/appendix-g-future-enhancements.md) | `security` | 💤 | L | explicit non-commitments; build only when a named trigger fires |