diff --git a/docs/plans/2026-04-24-kilroy-fixes-from-feedback.md b/docs/plans/2026-04-24-kilroy-fixes-from-feedback.md new file mode 100644 index 00000000..e9b1aef7 --- /dev/null +++ b/docs/plans/2026-04-24-kilroy-fixes-from-feedback.md @@ -0,0 +1,277 @@ +--- +date: 2026-04-24 +status: open +audience: kilroy engine + workflow contributors +--- + +# Kilroy fixes — from `quick-launch` and `pr-review` agent feedback + +Two agent sessions hit real bugs running `quick-launch` and `pr-review` against live work +(the pr-review session reviewed PRs across `danshapiro/freshell` and other repos; the +quick-launch session ran 10+ parallel research agents against `bitbucket/svg-compose`). +This doc captures every actionable fix from both, ranked by impact, with the workflow-level +mitigations that have already shipped called out so the engine work can be scoped cleanly. + +The `pr-review` and `quick-launch` skills + workflow packages now live in +`gf-software-factory/{skills,workflows}` (the kilroy-side copies were removed in the same +pass that produced this doc). Engine-level fixes still belong in this repo. + +--- + +## Blockers — must fix before broader rollout + +These caused user-visible damage or silent success-on-failure in real runs. + +### 1. Agents leak out of the worktree when prompts contain absolute paths + +**What happened.** A user's prompt opened with "You are working on `/Users/matt/bitbucket/svg-compose`." +Ten parallel agents followed the literal path: they `cd`'d to the user's main source tree +and ran `git -C /Users/matt/bitbucket/svg-compose ...` and `Edit` against absolute paths, +clobbering the shared tree while their isolated worktrees sat untouched. Their final commits +on `attractor/run/...` branches contained only `.gitignore` updates. End result: 10 agents +racing on the user's actual repo. + +**Why the docs didn't catch it.** The skill says "the engine auto-detects and creates an +isolated run branch + worktree, so the user's source tree is never touched." That's true +of `cwd`, not of the paths the LLM writes in `Bash`/`Edit` calls. + +**Fix path, in order of impact.** + +1. **Agent system-prompt prepend** (highest leverage, low effort): "You are running inside + an isolated Kilroy worktree at `{path}`. All work must happen relative to `cwd`; do not + `cd` elsewhere or pass `-C` to git with paths outside `cwd`." Owners: tmux agent + harness in `internal/agents/...` (or wherever the system-prompt assembly lives). +2. **Bash-tool path guard** (heaviest, optional): wrap `Bash` so writes outside the worktree + require `--confirm-out-of-tree-writes` or fail loudly. Heavy-handed but appropriate for + fire-and-forget agents. +3. **SKILL.md sharp-edge note** (already shipped in `gf-software-factory/skills/quick-launch/SKILL.md` + — see Step 0 preflight; consider also adding to `pr-review`). + +### 2. Detached runs don't register in the run DB until terminal + +**What happened.** `runs wait --latest --label`, `runs wait `, `status --latest`, and +`runs list` all fail to find an active detached run — they only see completed ones. The +first version of the `pr-review` skill modeled `runs wait --label` on `quick-launch`'s docs; +it broke on the first live run. + +**Workaround in shipped skills.** Both `pr-review` and (partially) `quick-launch` now +capture `logs_root=...` from the launch stdout and poll `final.json`. That's mechanical +and works, but it's a regression from the documented `runs wait` UX. + +**Fix.** Write the run row to the DB at launch with `status=running` so `wait`/`show`/`list` +operate on active runs, not just terminated ones. Owners: `cmd/kilroy/attractor_run.go` +and the run-DB writer in `internal/runs/...`. + +### 3. Reaching any terminal node = `status=success` + +**What happened.** PR #303 v1 hit 4× transient `git fetch` failures in setup. The original +`pr-review` graph followed the unconditional `setup -> done` fallback edge, so the engine +reached the `done` node and reported `status=success` at the run level — with no review +produced. Every graph author hits this trap. + +**Workflow-level mitigation (already shipped).** `gf-software-factory/workflows/pr-review/graph.dot` +now routes `setup -> fail_report` and `collect_diff -> fail_report` instead of `-> done`, +matching `build_test -> fail_report`. `write-fail-report.sh` infers the failed phase from +artifact presence and writes a phase-specific `review-report.md`. + +**Engine fix (still wanted).** The trap is graph-author-shaped, not workflow-shaped. +Either: + +- **(a)** track "any stage failed?" in the engine and surface a distinct terminal state + (e.g. `status=completed_with_failures`) on `final.json`; or +- **(b)** require an explicit `condition=` on every inbound edge to a terminal node, and + fail validation otherwise. + +(b) catches the bug at validate time, before runs happen. (a) is a runtime safety net. +Both are useful; (b) is cheaper. + +### 4. Agents don't inherit MCP servers from the parent Claude Code session + +**What happened.** The user told their parent Claude session to verify work in Chrome +via `mcp__claude-in-chrome__*`. When that conversation launched 10 detached `quick-launch` +agents, those tools didn't exist in the children. The agents reached for `osascript`, +`screencapture`, `chrome --headless`, and `curl` via Bash to compensate — macOS-level +automation the user did not consent to. That's the "wild things happening with Chrome" +the user reported. + +**Fix path.** + +1. **Document prominently** (cheapest): add a sentence to `quick-launch` SKILL.md and + `pr-review` SKILL.md: "Kilroy agents do not inherit the parent Claude Code session's + MCP servers. Only Bash, Edit, Read are available unless you forward MCP via run config." +2. **Optional opt-in forwarding**: a `--forward-mcp ` flag (or `mcp_forward:` in + `run.yaml`) that copies a parent server's config into the child agent. Not every parent + server makes sense to forward; opt-in is the right default. + +### 5. `install-skills.sh` doesn't rebuild the binary + +**What happened.** The user re-ran `install-skills.sh` to relink the workflow symlink at +their checkout. The `kilroy` binary on PATH was 17 days stale (pre-`--tmux`). The first +sweep loop produced no launches because a `grep '^logs_root='` matched nothing on error +output that no longer included that line. + +**Fix.** Either (a) the install script should run `go build ./cmd/kilroy` before linking, +or (b) it should `stat` the binary against `find . -name '*.go' -newer ...` and bail +loudly when stale. (a) is one extra line and handles every case. The kilroy-side +`scripts/install-skills.sh` does not currently build; the new factory-side +`gf-software-factory/scripts/install-kilroy-host.sh` does build. + +--- + +## Important — fix before more graph authors arrive + +### 6. `--help` is missing half the CLI surface + +`kilroy attractor run --help` doesn't document `--tmux`, `--prompt-file`, `--label`, +`--workspace`, `--input`, `--package`, `--detach`. `kilroy attractor runs --help` lists +only `list` and `prune` — no `show`, no `wait`. Every flag the shipped skills use is +load-bearing but undiscoverable. + +**Fix.** Regenerate help text from the actual flag parser in `cmd/kilroy/main.go`. If the +parser is hand-rolled (likely), align the printed usage with the parsed flags as a unit +test — drift here is the recurring class. + +### 7. Progress stream has no terminal event + +`progress.ndjson` emits `stage_attempt_*`, `edge_selected`, `input_materialization_*`, +`tmux_session_*`. Nothing for run completion. The canonical terminal signal today is the +appearance of `final.json`. That invariant isn't documented anywhere I could find. + +**Fix.** Emit a `run_completed` / `run_failed` event as the final line of `progress.ndjson`, +plus document `final.json` as the public completion contract. + +### 8. Worktree cleanup isn't tied to run completion + +After a long working session, `git worktree list` showed 40+ orphaned worktrees from +attractor runs going back weeks — all physically gone from disk, stale git admin refs. +The user pruned manually with `git worktree prune`. + +**Fix.** `git worktree remove` the run's worktree on successful terminal state. Keep it +on failure for forensics. Wire to whatever runs the existing per-run cleanup (or add it). + +### 9. `result.md` contract is brittle under interruption + +When the user `stop --force`'d 10 agents mid-flight, only 1 of 10 had written `result.md`. +Several had done substantive committed work but hadn't reached the "write the summary" +step. The `outputs=` contract surfaces this as missing `result.md` — a generic error that +obscures what actually happened. + +**Fix options.** + +- Auto-flush a partial `result.md` on graceful termination, with a `__stopped_early__: true` + marker and the last assistant text block. +- Or: make `result.md` optional and let `runs show --print result.md` fall back to a + generated one-line summary from the transcript. + +### 10. `pr-review` workflow has no GitHub-review publish step + +The workflow writes `review-report.md` to disk but never posts a GitHub review. Branch +protection requires an approved review, so the workflow's verdict is invisible to the +merge gate. We hit this on the first merge batch — all 5 PRs blocked by `REVIEW_REQUIRED`. + +**Fix.** Add a node `post_review` after `holistic_review` that runs +`gh pr review --approve|--request-changes --body "$(cat review-report.md)"`, gated by +the holistic decision token: `MERGE` → approve; `MERGE-FIX`/`FIX-MERGE` → request-changes +with checklist; `REJECT` → request-changes with rejection comment. Lives in the workflow, +not the engine. Owner: `gf-software-factory/workflows/pr-review/`. + +### 11. `pr-review` setup uses HTTPS with no auth fallback + +PR #303 v1 failed 4× with `remote: Internal Server Error` (HTTP 500) on `git fetch upstream +pull/N/head`. When the workspace is an existing clone, `setup-pr.sh` inherits whatever URL +is on the upstream remote — plain HTTPS, no credential helper. Flaky under GitHub pressure. + +**Fix.** Use `gh pr checkout --repo ` (handles auth via `gh`) or add an +explicit retry-with-SSH-URL fallback. Workflow-level fix; lives in +`gf-software-factory/workflows/pr-review/scripts/setup-pr.sh`. + +### 12. Cross-repo PRs aren't handled + +PR #297 was a fork PR from `Glowforge/freshell`. Nothing in the graph detects "is this +same-repo?" — a follow-up fix agent tried to push and hit HTTP 403. + +**Fix.** `setup-pr.sh` should record `headRepositoryOwner` in `pr-meta.json`, and +`fail_report` should surface a distinct "cross-repo, push blocked" case so downstream +agents/humans know not to try. Workflow-level fix. + +--- + +## Quality of life + +### 13. `KILROY_PREDECESSOR_NODE` env for failure handlers + +Today `write-fail-report.sh` infers the failed phase from filesystem state (`pr-meta.json` +exists? `build-report.json` exists?). That's a workaround for `fail_report` not knowing +its predecessor node. Adding `KILROY_PREDECESSOR_NODE` (and maybe `KILROY_PREDECESSOR_OUTCOME`) +to the env when a node fires off a non-primary edge would let failure handlers be +deterministic instead of probing. + +### 14. `runs show --worktree-path` accessor + +Today you `runs show ` and grep for `worktree_dir`. A scripting-friendly +`runs show --print-worktree-path` would clean up the bash patterns in skills. + +### 15. Concurrency cap + +10 parallel `attractor run`s worked but generated a lot of simultaneous LLM usage. +A `--max-concurrent N` flag (or a config setting) would help users self-pace without +writing their own throttle. + +### 16. Long worktree paths + +`/Users/matt/.local/state/kilroy/attractor/runs/01K.../worktree/` is a mouthful. +A `runs cd ` helper or a documented shell-function template ("here, source this +into your zshrc") would smooth post-mortem inspection. + +### 17. `progress.ndjson` vs `agent_output.jsonl` documentation gap + +Progress shows lifecycle events (start, input-materialization, tmux-session-start, +terminal); the actual tool-call log is in `agent/agent_output.jsonl`. Several minutes +were wasted reading the wrong file when debugging. A short paragraph in +`docs/runs-layout.md` (or wherever) would prevent that. + +### 18. `Write` tool for agents + +Agents have `Edit` + `Bash` heredoc only — no `Write` tool. Functionally fine, but +unconventional vs Claude Code's standard tool set. Worth either (a) adding `Write`, +or (b) noting in skill docs so prompt authors don't expect it. + +--- + +## Things that worked — protect these + +The pr-review session reported the following, with no kilroy-side failures across 8 +parallel runs and 3 sequential batches: + +- **`--prompt-file`** — writing prose to a file beat JSON-escaping multi-paragraph + prompts. Keep it. +- **Labeling + `--latest --label task=X` lookup** — made multi-run orchestration + tractable across 10+ concurrent runs. +- **`runs wait --timeout`** behavior — streams status to stderr, sane exit codes. + (Note: the active-run lookup gap from item #2 is orthogonal.) +- **`stop --force`** across many concurrent runs — reliable, no orphaned tmux sessions. +- **CLI backend via Claude subscription** — no API keys, no rate-limit surprises. +- **`agent_output.jsonl`** — complete transcript saved a post-mortem investigation. +- **Git-worktree-per-run isolation** — the right design when it's actually used (see + item #1 for when it isn't). +- **Auto-detect provider CLIs + auto-build default config when no `run.yaml` is supplied** — + removed a lot of setup friction for the skill author. + +--- + +## Suggested rollout order + +If a single contributor picks this up: + +1. **Item #1** (worktree isolation prompt prepend) and **#5** (install-skills builds binary). + Both are one-line patches with high leverage. Do them first. +2. **Item #2** (DB-write at launch) and **#6** (help-text alignment). These unlock the + documented UX. +3. **Item #7** (`run_completed` event) and **#8** (worktree cleanup). Polish that + compounds across every future skill. +4. **Item #3** (engine-side terminal-state guard). Pick (b) — validator-time. Less + runtime surface area to verify. +5. **Items #10–#12** (workflow-level pr-review fixes). These live in + `gf-software-factory/workflows/pr-review/` now. +6. **Item #4** (MCP forwarding). Nice but optional. +7. **Items #13–#18**. Quality of life as time allows. diff --git a/internal/attractor/agents/tmux_handler.go b/internal/attractor/agents/tmux_handler.go index 30fd4a62..82f8ac59 100644 --- a/internal/attractor/agents/tmux_handler.go +++ b/internal/attractor/agents/tmux_handler.go @@ -60,6 +60,13 @@ func (h *TmuxAgentHandler) Execute(ctx context.Context, exec *engine.Execution, if prompt == "" { prompt = node.Label() } + if wtPreamble := strings.TrimSpace(engine.BuildWorktreeContextPreamble(exec.WorktreeDir)); wtPreamble != "" { + if strings.TrimSpace(prompt) == "" { + prompt = wtPreamble + } else { + prompt = wtPreamble + "\n\n" + strings.TrimSpace(prompt) + } + } // Session name: kilroy-{runID}-{nodeID} (unique per node execution). runID := "" diff --git a/internal/attractor/agents/tmux_handler_test.go b/internal/attractor/agents/tmux_handler_test.go index aa20a01c..ce5f7833 100644 --- a/internal/attractor/agents/tmux_handler_test.go +++ b/internal/attractor/agents/tmux_handler_test.go @@ -122,7 +122,8 @@ func TestTmuxAgentHandler_FakeAgent_SuccessfulExecution(t *testing.T) { t.Fatalf("response = %q, want to contain FAKE_AGENT_OUTPUT_OK", string(resp)) } - // Verify prompt.md was written. + // Verify prompt.md was written — the user prompt survives, and the + // worktree-context preamble is prepended so the agent stays in cwd. promptPath := filepath.Join(logsRoot, "test_node", "prompt.md") prompt, err := os.ReadFile(promptPath) if err != nil { @@ -131,6 +132,12 @@ func TestTmuxAgentHandler_FakeAgent_SuccessfulExecution(t *testing.T) { if !strings.Contains(string(prompt), "do something") { t.Fatalf("prompt = %q, want 'do something'", string(prompt)) } + if !strings.Contains(string(prompt), "WORKTREE CONTEXT") { + t.Fatalf("prompt = %q, want worktree-context preamble", string(prompt)) + } + if !strings.Contains(string(prompt), workDir) { + t.Fatalf("prompt = %q, want worktree path %q", string(prompt), workDir) + } } func TestTmuxAgentHandler_FakeAgent_FailedExecution(t *testing.T) { diff --git a/internal/attractor/engine/handlers.go b/internal/attractor/engine/handlers.go index 3687e711..daad382d 100644 --- a/internal/attractor/engine/handlers.go +++ b/internal/attractor/engine/handlers.go @@ -487,6 +487,24 @@ func BuildManualBoxFanInPromptPreamble(exec *Execution, node *model.Node) string return strings.TrimSpace(b.String()) } +// BuildWorktreeContextPreamble returns a short preamble that pins the agent to +// its isolated worktree. Without it, an agent following a prompt that mentions +// absolute paths outside the worktree will happily `cd` out and clobber the +// user's source tree. Returns "" if worktreeDir is empty. +func BuildWorktreeContextPreamble(worktreeDir string) string { + worktreeDir = strings.TrimSpace(worktreeDir) + if worktreeDir == "" { + return "" + } + var b strings.Builder + b.WriteString("WORKTREE CONTEXT\n") + b.WriteString(fmt.Sprintf("- You are running inside an isolated Kilroy worktree at %s.\n", worktreeDir)) + b.WriteString("- All work must happen relative to cwd. Do not `cd` elsewhere.\n") + b.WriteString("- Do not pass `-C ` to git with a path outside cwd.\n") + b.WriteString("- If the task description mentions absolute paths outside this worktree, treat them as informational only — do not read, write, or run commands against them.\n") + return b.String() +} + func (h *CodergenHandler) Execute(ctx context.Context, exec *Execution, node *model.Node) (runtime.Outcome, error) { stageDir := filepath.Join(exec.LogsRoot, node.ID) stageStatusPath := filepath.Join(stageDir, "status.json") @@ -538,6 +556,15 @@ func (h *CodergenHandler) Execute(ctx context.Context, exec *Execution, node *mo promptText = preamble + "\n\n" + strings.TrimSpace(promptText) } } + if exec != nil { + if wtPreamble := strings.TrimSpace(BuildWorktreeContextPreamble(exec.WorktreeDir)); wtPreamble != "" { + if strings.TrimSpace(promptText) == "" { + promptText = wtPreamble + } else { + promptText = wtPreamble + "\n\n" + strings.TrimSpace(promptText) + } + } + } if env := BuildStageRuntimeEnv(exec, node.ID); len(env) > 0 { if manifestPath := strings.TrimSpace(env[inputsManifestEnvKey]); manifestPath != "" { preamble := strings.TrimSpace(MustRenderInputMaterializationPromptPreamble(manifestPath)) diff --git a/internal/attractor/engine/worktree_preamble_test.go b/internal/attractor/engine/worktree_preamble_test.go new file mode 100644 index 00000000..5b40a604 --- /dev/null +++ b/internal/attractor/engine/worktree_preamble_test.go @@ -0,0 +1,36 @@ +// Tests for BuildWorktreeContextPreamble — the per-run isolation notice that +// pins agents to their worktree so prompts with stray absolute paths don't +// pull them into the user's source tree. +package engine + +import ( + "strings" + "testing" +) + +func TestBuildWorktreeContextPreamble_EmptyWhenNoWorktree(t *testing.T) { + if got := BuildWorktreeContextPreamble(""); got != "" { + t.Fatalf("empty worktreeDir: got %q, want empty", got) + } + if got := BuildWorktreeContextPreamble(" "); got != "" { + t.Fatalf("whitespace worktreeDir: got %q, want empty", got) + } +} + +func TestBuildWorktreeContextPreamble_IncludesPathAndGuidance(t *testing.T) { + worktree := "/tmp/kilroy-run-abc/worktree" + got := BuildWorktreeContextPreamble(worktree) + if !strings.Contains(got, worktree) { + t.Fatalf("preamble missing worktree path %q: %s", worktree, got) + } + for _, want := range []string{ + "WORKTREE CONTEXT", + "isolated Kilroy worktree", + "Do not `cd` elsewhere", + "informational only", + } { + if !strings.Contains(got, want) { + t.Fatalf("preamble missing %q: %s", want, got) + } + } +} diff --git a/scripts/install-skills.sh b/scripts/install-skills.sh index eea44a13..37ad359d 100755 --- a/scripts/install-skills.sh +++ b/scripts/install-skills.sh @@ -12,31 +12,35 @@ # Workflow package root (stable path for --package): # ~/.local/share/kilroy/workflows -> /workflows # -# Claude Code skills + slash command: -# ~/.claude/skills/quick-launch -> /skills/quick-launch +# Claude Code skills: # ~/.claude/skills/using-kilroy -> /skills/using-kilroy -# ~/.claude/commands/kilroy-quick.md -> /skills/quick-launch/commands/kilroy-quick.md # # Codex skills (codex discovers from ~/.agents/skills/, not ~/.codex/skills/): -# ~/.agents/skills/quick-launch -> /skills/quick-launch # ~/.agents/skills/using-kilroy -> /skills/using-kilroy # # Opencode skills (user-level opencode config dir): -# ~/.config/opencode/skills/quick-launch -> /skills/quick-launch # ~/.config/opencode/skills/using-kilroy -> /skills/using-kilroy # -# Codex and Opencode have no user-level slash-command system, so there is no -# `/kilroy-quick` in those agents — invoke by asking the agent to use the -# quick-launch skill by name. +# The `quick-launch` and `pr-review` skills (and their workflow packages) now +# live in the gf-software-factory repo at sibling location +# `gf-software-factory/skills/{quick-launch,pr-review}` and +# `gf-software-factory/workflows/{quick-launch,pr-review}`. Install those via +# `gf-software-factory/scripts/install-kilroy-host.sh`. set -euo pipefail REPO="$(cd "$(dirname "$0")/.." && pwd)" BINARY="$REPO/kilroy" +# Always build before installing so the binary on PATH reflects the current +# checkout. Stale binaries are a silent failure mode for skills that depend on +# newer flags (--tmux, --prompt-file, --label, ...). One extra compile at +# install time is cheap insurance. +echo "building kilroy binary..." +(cd "$REPO" && go build -o "$BINARY" ./cmd/kilroy) + if [ ! -x "$BINARY" ]; then - echo "error: kilroy binary not found or not executable at $BINARY" >&2 - echo "build it first: (cd $REPO && go build -o ./kilroy ./cmd/kilroy)" >&2 + echo "error: kilroy binary missing after build at $BINARY" >&2 exit 1 fi @@ -58,18 +62,14 @@ link "$REPO/workflows" "$HOME/.local/share/kilroy/workflows" echo echo "claude code" -link "$REPO/skills/quick-launch" "$HOME/.claude/skills/quick-launch" -link "$REPO/skills/using-kilroy" "$HOME/.claude/skills/using-kilroy" -link "$REPO/skills/quick-launch/commands/kilroy-quick.md" "$HOME/.claude/commands/kilroy-quick.md" +link "$REPO/skills/using-kilroy" "$HOME/.claude/skills/using-kilroy" echo echo "codex (~/.agents/skills/ is the native discovery path)" -link "$REPO/skills/quick-launch" "$HOME/.agents/skills/quick-launch" link "$REPO/skills/using-kilroy" "$HOME/.agents/skills/using-kilroy" echo echo "opencode (user-level)" -link "$REPO/skills/quick-launch" "$HOME/.config/opencode/skills/quick-launch" link "$REPO/skills/using-kilroy" "$HOME/.config/opencode/skills/using-kilroy" echo @@ -81,7 +81,5 @@ else fi echo -echo "done. invoke with:" -echo " claude: /kilroy-quick " -echo " codex: ask codex to use the quick-launch skill — mention it by name" -echo " opencode: ask opencode to use the quick-launch skill — mention it by name" +echo "done. for the quick-launch and pr-review skills, install them from" +echo "gf-software-factory: gf-software-factory/scripts/install-kilroy-host.sh" diff --git a/skills/quick-launch/SKILL.md b/skills/quick-launch/SKILL.md deleted file mode 100644 index 375f4359..00000000 --- a/skills/quick-launch/SKILL.md +++ /dev/null @@ -1,199 +0,0 @@ ---- -name: quick-launch -description: "Start a one-shot Kilroy agent run from within another agent's conversation. Tags the run for later lookup, backgrounds it in a tmux session, and gives you the exact commands to check status and retrieve result.md when it's done. Use this for delegating research, investigations, diffs against a repo snapshot, or any single-agent task you want to fire-and-forget." ---- - -# Quick-Launch a Kilroy Run - -Delegate a one-shot task to a Kilroy agent without blocking your own conversation. The run happens in its own git worktree, tagged with labels so you can find it again, and writes a single `result.md` you can read back. - -Use this when: -- You want an agent to investigate something in parallel while you keep working. -- The task fits in one prompt + optional context file and one output file. -- You'd rather come back in a few minutes than wait synchronously. - -Don't use this for: -- Multi-stage workflows that need more than "stage context → run agent → done". Author a custom graph instead (see `skills/create-dotfile`). -- Long-running build/test pipelines. Use a real workflow package. -- Tasks where you need live conversational back-and-forth with the agent. - -## Step 0: Preflight checklist - -Before invoking the skill, confirm: - -- `kilroy` is on PATH. (`which kilroy` — if missing, the skill is not installed; tell the user and stop.) -- `~/.local/share/kilroy/workflows/quick-launch/` exists. This is the canonical workflow package path; all invocations below reference it directly. -- Current working directory is a git repo. The engine auto-detects and creates an isolated run branch + worktree, so the user's source tree is never touched. If cwd is not a git repo, the run still works but isolation degrades to plain-directory mode — warn the user. - -## Step 1: Pick the agent - -The quick-launch package ships three graph variants: - -- `graph.dot` — claude (default) -- `graph.codex.dot` — OpenAI codex CLI -- `graph.gemini.dot` — gemini CLI - -The agent is encoded in the graph file. Pick one with `--graph` when launching; otherwise the package loader uses `graph.dot`. - -## Step 2: Prepare the task - -All inputs are written to `.kilroy/INPUT.md` in the worktree at run start, with one `## ` section per input. The agent reads that file to find the task and any context. - -Two ways to pass the task (pick one): - -**Short tasks — inline JSON:** -```bash ---input '{"prompt":"Investigate X and summarize findings"}' -``` - -**Longer or multi-line tasks — write the prompt to a file and use `--prompt-file`:** -```bash ---prompt-file /abs/path/to/request.md -``` -This reads the file contents verbatim into the `prompt` input. No JSON escaping, no quoting nightmares. **Strongly prefer this when the task is more than ~100 words or contains anything that would be painful to inline: multi-paragraph briefings, code blocks, markdown tables, lists with embedded quotes.** The extra step of writing a file is worth it the moment you reach for `\n` escapes. - -Optional extras (pass alongside either form via `--input '{"...":"..."}`): - -- `context_file` — absolute path to a file for background context. The agent reads it from its original location; no copy happens. Use when the context is a document that already exists on disk (briefing, design doc, log file). Example: `--input '{"context_file":"/abs/path/briefing.md"}'` combined with `--prompt-file`. -- `context` — an inline string for background context. Use when the context is short enough to inline but you want to keep it separate from the task description. - -If the task is short and self-contained (a few sentences), skip the context entirely. - -## Step 3: Launch - -From the user's current working directory (which should be a git repo — see Step 0): - -```bash -kilroy attractor run --detach --tmux \ - --package ~/.local/share/kilroy/workflows/quick-launch \ - --label task= \ - --prompt-file -``` - -Or the inline-JSON form for a short prompt: -```bash -kilroy attractor run --detach --tmux \ - --package ~/.local/share/kilroy/workflows/quick-launch \ - --label task= \ - --input '{"prompt":""}' -``` - -That is the whole invocation. No `--config` needed — when no run.yaml is supplied, kilroy auto-builds a default config for cwd and auto-detects installed provider CLIs. No `--no-cxdb` or `--skip-cli-headless-warning` needed either — those are applied automatically when there's no config and stdin isn't interactive. The `workflow=quick-launch` label is added automatically by the package's `workflow.toml`. - -Optional additions: -- `--graph ~/.local/share/kilroy/workflows/quick-launch/graph.codex.dot` to pick codex. Swap `.gemini.dot` for gemini. Omit for claude. -- Additional `--label KEY=VALUE` flags to tag owner, ticket id, run group, etc. The `task=` tag is the minimum — use a specific slug so `runs show --latest --label task=` finds exactly this run later. -- `--workspace ` to run against a different directory than cwd. Use this when the user wants the run to operate against a repo they're not currently in. - -On success the command prints `run_id=` and `logs_root=...` and returns immediately. The run continues in a detached tmux session. Print the `run_id` (or the `--latest --label task=` form) to the user so they can follow up. - -## Step 4: Wait for the run - -The fastest way — block until the run reaches a terminal state, then return: - -```bash -kilroy attractor runs wait --latest --label task= --timeout 10m -``` - -`runs wait` polls the run DB every ~2s (configurable with `--interval`), prints each status transition to stderr, and exits 0 on `success` / 1 on `fail`/`canceled` / 2 on `--timeout` expiry. Use this whenever you want synchronous behavior on top of the detached run. - -If you don't want to block, check status on demand instead: - -```bash -kilroy attractor runs show --latest --label task= -``` - -or list all runs matching a tag: - -```bash -kilroy attractor runs list --label task= -``` - -Show output accepts `--json` for machine-readable detail. Interesting fields for a caller: -- `status` — `running`, `success`, `fail`, `canceled` -- `worktree_dir` — where the agent worked; still on disk with the final git state -- `outputs[].path` — absolute path to each collected output file (e.g. `result.md`) - -## Step 5: Retrieve result.md - -Once status is `success`: - -```bash -kilroy attractor runs show --latest --label task= --print result.md -``` - -(or use a run id / prefix instead of `--latest`). That streams the file straight to stdout — pipe it, redirect it, or read it into your current conversation. - -If the agent wrote additional files you care about, list them: - -```bash -kilroy attractor runs show 01KP646Y --outputs -``` - -…and `--print ` any of them. Only files declared in `outputs=` on the graph or the node are collected into `logs_root/outputs/` — everything else stays in the worktree (`worktree_dir` from `runs show`). - -## Minimal run.yaml (only if you need it) - -You almost never need a run.yaml for quick-launch — the default behavior handles it. Supply one only when you need non-default settings: specific model selection beyond what the graph's stylesheet declares, a remote cxdb, custom runtime policy, or a non-cwd workspace via config instead of `--workspace`. A working minimum: - -```yaml -version: 1 - -repo: - path: /absolute/path/to/some/workspace - -llm: - cli_profile: real - providers: - anthropic: - backend: cli - openai: - backend: cli - google: - backend: cli - -git: - require_clean: false - run_branch_prefix: attractor/run - commit_per_node: true - -runtime_policy: - stall_timeout_ms: 600000 - stall_check_interval_ms: 5000 - max_llm_retries: 2 -``` - -## Inspecting a finished run - -Everything lives in two places: - -- `logs_root` — per-run operational state. `outputs/` holds collected files, `progress.ndjson` has the event stream, `/` dirs hold per-stage artifacts. -- `worktree_dir` — the git worktree the agent worked in. Still present after the run finishes; you can `cd` there and inspect the final state, or `git log` the run branch. - -Both paths are in `runs show` output. - -For a GUI view of the graph, per-node prompts/responses, tool-call log, and output files, open: - -``` -http://localhost:9700/ui/#/run/ -``` - -(Start the server with `kilroy attractor serve` if it isn't running.) - -## Failure handling - -If launch itself fails (bad flags, missing CLI binary, config error), you get an error on stderr and no run is registered. - -If launch succeeds but the run fails mid-flight, `runs show` reports `status=fail` with `failure_reason`. The worktree and logs are still on disk — go look. The per-node `Detail` and `Log` tabs in the UI are the fastest way to see what the agent actually tried. - -Common failures: -- **`provider_prompt_probe fail`** — the provider CLI couldn't complete its probe. Check the provider's own config (`~/.codex/config.toml`, `~/.claude/settings.json`, etc.) for stale or invalid model/reasoning settings. -- **`missing llm.providers.X.backend`** — the graph references a provider that isn't declared in your run.yaml. Add `llm.providers.X.backend: cli` (or `api`). -- **`output contract: missing result.md`** — the agent finished without writing `result.md`. The Detail tab shows why; rerun with a clearer prompt, or drop the `outputs=` attribute on the agent node if the task genuinely doesn't produce a file. -- **`preflight aborted: declined provider CLI headless-risk warning`** — you forgot `--skip-cli-headless-warning` on a detached run. - -## Do not - -- Modify files under `workflows/quick-launch/` to tweak a single run. The package is shared. If you need a different prompt template or a non-trivial prepare step, author a custom graph and point `--graph` at it. -- Relaunch a failed run "just to see if it works now" before reading the failure. The logs are on disk; diagnose first. -- Invent new flags. The invocation template above is the whole interface — every flag in it is load-bearing. If something isn't covered, ask the user rather than guessing. diff --git a/skills/quick-launch/commands/kilroy-quick.md b/skills/quick-launch/commands/kilroy-quick.md deleted file mode 100644 index 578d3d51..00000000 --- a/skills/quick-launch/commands/kilroy-quick.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -description: Delegate a one-shot task to a Kilroy agent (fire-and-forget, tagged, tmux-backgrounded). Task description follows. -disable-model-invocation: true ---- - -Invoke the `quick-launch` skill at `skills/quick-launch/SKILL.md` and follow it exactly as written. - -The task to delegate: $ARGUMENTS - -If the user did not supply a task description, ask them what they want the Kilroy agent to do before launching anything. Do not invent a task. - -Key reminders from the skill (do not skip): - -- Use the invocation template in Step 3 verbatim. Every flag is load-bearing. -- The workflow package lives at `~/.local/share/kilroy/workflows/quick-launch`. -- Tag the run with at least one specific `--label task=` so you can find it later with `--latest --label task=`. -- For any task longer than ~100 words or anything that would need `\n` escaping in JSON, write the prompt to a file first and use `--prompt-file ` instead of `--input '{"prompt":"..."}'`. This is the default, not the exception. -- After launching, print the `run_id` (and the `runs wait --latest --label` command) so the user can follow up. -- If the user asked you to wait for the result, use `kilroy attractor runs wait --latest --label task= --timeout ` — don't poll by hand. diff --git a/workflows/pr-review/graph.dot b/workflows/pr-review/graph.dot deleted file mode 100644 index ff5aa0f2..00000000 --- a/workflows/pr-review/graph.dot +++ /dev/null @@ -1,73 +0,0 @@ -// PR review workflow: setup, build/test, code review, holistic review, report. -// Inputs: pr_repo (owner/repo), pr_number. Output: review-report.md. - -digraph pr_review { - graph [ - inputs="pr_repo,pr_number", - outputs="review-report.md", - model_stylesheet=" - * { llm_model: claude-sonnet-4.6; llm_provider: anthropic; } - " - ] - - start [shape=Mdiamond, label="Start"] - done [shape=Msquare, label="Done"] - - // Tool: checkout PR branch, fetch metadata, merge base - setup [ - shape=parallelogram, - label="Setup PR", - tool_command="sh .kilroy/package/scripts/setup-pr.sh" - ] - - // Tool: detect build system, run build and tests - build_test [ - shape=parallelogram, - label="Build & Test", - tool_command="sh .kilroy/package/scripts/build-test.sh", - timeout="600s" - ] - - // Tool: write a failure report when build/test failed - fail_report [ - shape=parallelogram, - label="Write Failure Report", - tool_command="sh .kilroy/package/scripts/write-fail-report.sh" - ] - - // Tool: collect the PR diff for review agents - collect_diff [ - shape=parallelogram, - label="Collect Diff", - tool_command="sh .kilroy/package/scripts/collect-diff.sh" - ] - - // Agent: review the diff for code quality, bugs, test coverage - code_review [ - shape=box, - label="Code Review", - agent_tool="claude", - prompt="You are reviewing a GitHub PR. Read .ai/pr-data/pr-view.txt, pr-meta.json, pr-diff.patch, changed-files.txt, and build-report.json. Read the actual changed source files for context. Assess correctness, security, error handling, test coverage, naming, and scope. Write findings to .ai/pr-data/code-review-findings.md with sections: Critical Issues, Warnings, Positive Observations, File-by-File Notes. Be specific with file:line citations. Then write {\"status\":\"success\"} to $KILROY_STAGE_STATUS_PATH or $KILROY_STAGE_STATUS_FALLBACK_PATH if that fails.", - outputs=".ai/pr-data/code-review-findings.md" - ] - - // Agent: holistic assessment and final report - holistic_review [ - shape=box, - label="Holistic Review", - agent_tool="claude", - prompt="You are making a final PR assessment. Read .ai/pr-data/code-review-findings.md, pr-view.txt, pr-meta.json, build-report.json, and skim pr-diff.patch. Assess net value, completeness, risk, and make a decision: MERGE, MERGE-FIX, FIX-MERGE, or REJECT. Write review-report.md in the workspace root with: Decision, Rationale, Build & Test Results, Critical Issues, Warnings, Positive Observations, Next Actions. Keep under 50 lines. Bias toward landing contributions. Then write {\"status\":\"success\"} to $KILROY_STAGE_STATUS_PATH or $KILROY_STAGE_STATUS_FALLBACK_PATH if that fails.", - outputs="review-report.md" - ] - - start -> setup - setup -> build_test [condition="outcome=success"] - setup -> done - build_test -> collect_diff [condition="outcome=success"] - build_test -> fail_report - fail_report -> done - collect_diff -> code_review [condition="outcome=success"] - collect_diff -> done - code_review -> holistic_review - holistic_review -> done -} diff --git a/workflows/pr-review/prompts/code-review.md b/workflows/pr-review/prompts/code-review.md deleted file mode 100644 index 3cac012c..00000000 --- a/workflows/pr-review/prompts/code-review.md +++ /dev/null @@ -1,51 +0,0 @@ -You are reviewing a GitHub pull request for code quality, bugs, and test coverage. - -The PR has already been checked out, built, and tested. Build/test results and the -diff are available in `.ai/pr-data/`. - -## Your inputs - -Read these files first: -- `.ai/pr-data/pr-view.txt` — PR description and discussion -- `.ai/pr-data/pr-meta.json` — structured metadata (files, author, branches) -- `.ai/pr-data/pr-diff.patch` — the full diff -- `.ai/pr-data/changed-files.txt` — list of changed files -- `.ai/pr-data/build-report.json` — build/test results -- `.ai/pr-data/merge-result.txt` — merge status - -## What to do - -Read the actual source files that were changed. Read surrounding code for context. -For each changed file, assess: - -1. **Correctness**: Does the code do what the PR claims? Are there logic errors, off-by-one - bugs, race conditions, nil pointer risks? -2. **Security**: SQL injection, XSS, command injection, hardcoded secrets, unsafe deserialization? -3. **Error handling**: Are errors checked? Are failure modes handled gracefully? -4. **Test coverage**: Do tests exist for the changed code? Do they test the right things? - Are edge cases covered? Do tests test real behavior (not mocked behavior)? -5. **Naming and clarity**: Are names descriptive? Is the code readable without comments? -6. **Scope**: Does every change serve the PR's stated purpose? Flag unrelated changes. - -## Output - -Write your findings to `.ai/pr-data/code-review-findings.md`: - -```markdown -## Code Review Findings - -### Critical Issues -[Issues that must be fixed before merge. Empty section if none.] - -### Warnings -[Issues worth noting but not blocking. Empty section if none.] - -### Positive Observations -[Things done well that should be preserved.] - -### File-by-File Notes -[For each changed file with findings, list file:line and what you found.] -``` - -Be specific. Cite file paths and line numbers. Focus on substance, not style nitpicks. -If the code is clean, say so briefly — don't manufacture findings. diff --git a/workflows/pr-review/prompts/holistic-review.md b/workflows/pr-review/prompts/holistic-review.md deleted file mode 100644 index c5d107f4..00000000 --- a/workflows/pr-review/prompts/holistic-review.md +++ /dev/null @@ -1,65 +0,0 @@ -You are making a final assessment of a GitHub pull request. The code review is complete. - -## Your inputs - -Read these files: -- `.ai/pr-data/code-review-findings.md` — detailed code review from the previous step -- `.ai/pr-data/pr-view.txt` — PR description and discussion -- `.ai/pr-data/pr-meta.json` — structured metadata -- `.ai/pr-data/build-report.json` — build/test results -- `.ai/pr-data/pr-diff.patch` — the full diff (skim for overall scope) - -## What to assess - -1. **Net value**: Is this PR a net-positive change? Does it improve the codebase more than - it costs in complexity, risk, or maintenance burden? -2. **Completeness**: Does the PR achieve what it claims? Are there missing pieces? -3. **Risk**: What could go wrong if merged? Migration risks, behavioral changes, edge cases? -4. **Decision**: Should this be merged, merged with fixes, or sent back? - -## Output - -Write `review-report.md` in the workspace root (NOT in .ai/pr-data/): - -```markdown -# PR Review Report - -**PR**: # -**Repo**: <owner/repo> -**Author**: <author> - -## Decision: [MERGE | MERGE-FIX | FIX-MERGE | REJECT] - -- MERGE: Ship it as-is -- MERGE-FIX: Merge now, fix minor issues in a follow-up -- FIX-MERGE: Issues must be fixed before merging -- REJECT: Fundamental problems, needs rethinking - -## Rationale -[3-5 sentences. What does the PR do, is it worth merging, why this decision.] - -## Build & Test Results - -| Check | Status | -|-------|--------| -| Build | pass/fail | -| Tests | pass/fail | - -## Critical Issues -[From code review. Must be fixed. "None" if clean.] - -## Warnings -[Non-blocking concerns. "None" if clean.] - -## Positive Observations -[What the PR does well.] - -## Next Actions -[Specific, ordered list of what should happen. Be directive but not prescriptive.] -``` - -Rules: -- Bias toward landing contributions. Prefer MERGE-FIX over FIX-MERGE for minor issues. -- Keep the report under 50 lines. No padding or filler. -- Every action item must be specific enough for someone unfamiliar with the PR to execute. -- If the code review found nothing concerning, say so briefly. diff --git a/workflows/pr-review/scripts/build-test.sh b/workflows/pr-review/scripts/build-test.sh deleted file mode 100644 index 5b2de932..00000000 --- a/workflows/pr-review/scripts/build-test.sh +++ /dev/null @@ -1,83 +0,0 @@ -#!/bin/sh -# Detect build system and run build + tests. -# Exits 0 on success, non-zero if build or tests fail. - -set -e - -SCRATCH=".ai/pr-data" -mkdir -p "$SCRATCH" - -# Detect build system -BUILD_CMD="" -TEST_CMD="" - -if [ -f "go.mod" ]; then - BUILD_CMD="go build ./..." - TEST_CMD="go test ./..." -elif [ -f "Cargo.toml" ]; then - BUILD_CMD="cargo build" - TEST_CMD="cargo test" -elif [ -f "package.json" ]; then - if [ -f "yarn.lock" ]; then - BUILD_CMD="yarn install && yarn build" - TEST_CMD="yarn test" - else - BUILD_CMD="npm install && npm run build" - TEST_CMD="npm test" - fi -elif [ -f "Makefile" ] || [ -f "makefile" ]; then - BUILD_CMD="make" - TEST_CMD="make test" -elif [ -f "CMakeLists.txt" ]; then - BUILD_CMD="cmake -B build && cmake --build build" - TEST_CMD="cd build && ctest" -fi - -if [ -z "$BUILD_CMD" ]; then - echo "No recognized build system found" - echo '{"build": "skip", "test": "skip", "reason": "no build system detected"}' > "$SCRATCH/build-report.json" - # Not a failure — some repos don't have a build step - exit 0 -fi - -echo "=== Detected build system ===" -echo " Build: $BUILD_CMD" -echo " Test: $TEST_CMD" - -# Run build -echo "" -echo "=== Build ===" -BUILD_STATUS="pass" -if eval "$BUILD_CMD" 2>"$SCRATCH/build-stderr.txt"; then - echo "Build: PASS" -else - BUILD_STATUS="fail" - echo "Build: FAIL" - cat "$SCRATCH/build-stderr.txt" -fi - -# Run tests (only if build passed) -TEST_STATUS="skip" -if [ "$BUILD_STATUS" = "pass" ] && [ -n "$TEST_CMD" ]; then - echo "" - echo "=== Tests ===" - if eval "$TEST_CMD" 2>&1 | tee "$SCRATCH/test-output.txt"; then - TEST_STATUS="pass" - echo "Tests: PASS" - else - TEST_STATUS="fail" - echo "Tests: FAIL" - fi -fi - -# Write structured report -cat > "$SCRATCH/build-report.json" <<EOF -{"build": "$BUILD_STATUS", "test": "$TEST_STATUS", "build_cmd": "$BUILD_CMD", "test_cmd": "$TEST_CMD"} -EOF - -# Exit non-zero if anything failed -if [ "$BUILD_STATUS" = "fail" ] || [ "$TEST_STATUS" = "fail" ]; then - exit 1 -fi - -exit 0 diff --git a/workflows/pr-review/scripts/collect-diff.sh b/workflows/pr-review/scripts/collect-diff.sh deleted file mode 100644 index 75328898..00000000 --- a/workflows/pr-review/scripts/collect-diff.sh +++ /dev/null @@ -1,33 +0,0 @@ -#!/bin/sh -# Collect the PR diff and file list for review agents. -# Reads: KILROY_INPUT_PR_REPO, KILROY_INPUT_PR_NUMBER. Requires: gh. - -set -e - -: "${KILROY_INPUT_PR_REPO:?KILROY_INPUT_PR_REPO must be set (owner/repo)}" -: "${KILROY_INPUT_PR_NUMBER:?KILROY_INPUT_PR_NUMBER must be set}" - -PR_REPO="$KILROY_INPUT_PR_REPO" -PR_NUMBER="$KILROY_INPUT_PR_NUMBER" -SCRATCH=".ai/pr-data" -mkdir -p "$SCRATCH" - -echo "=== Collecting PR diff ===" - -# Get the diff -gh pr diff "$PR_NUMBER" --repo "$PR_REPO" > "$SCRATCH/pr-diff.patch" -DIFF_LINES=$(wc -l < "$SCRATCH/pr-diff.patch" | tr -d ' ') -echo "Diff: ${DIFF_LINES} lines" - -# Get changed file list -gh pr view "$PR_NUMBER" --repo "$PR_REPO" --json files \ - | python3 -c " -import json, sys -data = json.load(sys.stdin) -for f in data.get('files', []): - print(f['path']) -" > "$SCRATCH/changed-files.txt" - -FILE_COUNT=$(wc -l < "$SCRATCH/changed-files.txt" | tr -d ' ') -echo "Changed files: ${FILE_COUNT}" -cat "$SCRATCH/changed-files.txt" diff --git a/workflows/pr-review/scripts/setup-pr.sh b/workflows/pr-review/scripts/setup-pr.sh deleted file mode 100644 index fefae5f1..00000000 --- a/workflows/pr-review/scripts/setup-pr.sh +++ /dev/null @@ -1,80 +0,0 @@ -#!/bin/sh -# Checkout a PR branch, fetch metadata, merge base branch into it. -# Reads: KILROY_INPUT_PR_REPO, KILROY_INPUT_PR_NUMBER. Requires: gh, git. - -set -e - -: "${KILROY_INPUT_PR_REPO:?KILROY_INPUT_PR_REPO must be set (owner/repo)}" -: "${KILROY_INPUT_PR_NUMBER:?KILROY_INPUT_PR_NUMBER must be set}" - -PR_REPO="$KILROY_INPUT_PR_REPO" -PR_NUMBER="$KILROY_INPUT_PR_NUMBER" -SCRATCH=".ai/pr-data" -mkdir -p "$SCRATCH" - -echo "=== Setting up PR #${PR_NUMBER} from ${PR_REPO} ===" - -# Fetch PR metadata via gh -gh pr view "$PR_NUMBER" --repo "$PR_REPO" \ - --json title,body,author,labels,files,baseRefName,headRefName,additions,deletions,changedFiles \ - > "$SCRATCH/pr-meta.json" -gh pr view "$PR_NUMBER" --repo "$PR_REPO" > "$SCRATCH/pr-view.txt" - -# Extract metadata -BASE_BRANCH=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['baseRefName'])") -HEAD_BRANCH=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['headRefName'])") -TITLE=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['title'])") - -# Clone the repo if we're in an empty workspace -if [ ! -e ".git" ]; then - echo "No git repo found — cloning ${PR_REPO}..." - git clone "https://github.com/${PR_REPO}.git" _repo - # Move contents up (clone into subdir, then move to workspace root) - mv _repo/.git . - mv _repo/* . 2>/dev/null || true - mv _repo/.* . 2>/dev/null || true - rmdir _repo 2>/dev/null || true -fi - -# Add upstream remote for the PR's repo -REMOTE_URL="https://github.com/${PR_REPO}.git" -if ! git remote get-url upstream >/dev/null 2>&1; then - git remote add upstream "$REMOTE_URL" -fi - -# Fetch PR ref and base branch -echo "Fetching PR #${PR_NUMBER} head and base (${BASE_BRANCH})..." -git fetch upstream "pull/${PR_NUMBER}/head" --quiet -git fetch upstream "$BASE_BRANCH" --quiet - -# Checkout PR at FETCH_HEAD -RUN_TAG=$(echo "${KILROY_RUN_ID:-$$}" | tail -c 9) -LOCAL_BRANCH="pr-review/${PR_NUMBER}-${RUN_TAG}" -git checkout -b "$LOCAL_BRANCH" FETCH_HEAD --quiet - -PR_SHA=$(git rev-parse HEAD) -echo "PR branch HEAD: $PR_SHA (local: $LOCAL_BRANCH)" - -# Merge base branch into PR branch -echo "Merging upstream/${BASE_BRANCH} into PR branch..." -if git merge "upstream/${BASE_BRANCH}" --no-edit 2>"$SCRATCH/merge-stderr.txt"; then - echo "MERGE_STATUS=clean" > "$SCRATCH/merge-result.txt" - echo "Merge: clean" -else - echo "MERGE_STATUS=conflict" > "$SCRATCH/merge-result.txt" - git diff --name-only --diff-filter=U > "$SCRATCH/conflict-files.txt" 2>/dev/null || true - git merge --abort || true - echo "Merge: CONFLICTS (see $SCRATCH/conflict-files.txt)" -fi - -# Summary -FILES=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['changedFiles'])") -ADDS=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['additions'])") -DELS=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['deletions'])") -AUTHOR=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['author']['login'])") - -echo "" -echo "PR #${PR_NUMBER}: ${TITLE}" -echo " Author: ${AUTHOR}" -echo " Branch: ${HEAD_BRANCH} -> ${BASE_BRANCH}" -echo " Files changed: ${FILES} (+${ADDS} / -${DELS})" diff --git a/workflows/pr-review/scripts/write-fail-report.sh b/workflows/pr-review/scripts/write-fail-report.sh deleted file mode 100644 index 909b3571..00000000 --- a/workflows/pr-review/scripts/write-fail-report.sh +++ /dev/null @@ -1,59 +0,0 @@ -#!/bin/sh -# Generate review-report.md from build/test failure data. -# Called when build_test exits non-zero. - -SCRATCH=".ai/pr-data" - -# Read build report if it exists -BUILD_STATUS="unknown" -TEST_STATUS="unknown" -if [ -f "$SCRATCH/build-report.json" ]; then - BUILD_STATUS=$(python3 -c "import json; print(json.load(open('$SCRATCH/build-report.json'))['build'])" 2>/dev/null || echo "unknown") - TEST_STATUS=$(python3 -c "import json; print(json.load(open('$SCRATCH/build-report.json'))['test'])" 2>/dev/null || echo "unknown") -fi - -# Read PR title -TITLE="(unknown)" -if [ -f "$SCRATCH/pr-meta.json" ]; then - TITLE=$(python3 -c "import json; print(json.load(open('$SCRATCH/pr-meta.json'))['title'])" 2>/dev/null || echo "(unknown)") -fi - -# Build the failure report -cat > review-report.md <<EOF -# PR Review Report - -**PR**: #${KILROY_INPUT_PR_NUMBER} — ${TITLE} -**Repo**: ${KILROY_INPUT_PR_REPO} -**Result**: BUILD/TEST FAILURE — code review skipped - -## Build & Test Results - -| Check | Status | -|---------|--------| -| Build | ${BUILD_STATUS} | -| Tests | ${TEST_STATUS} | - -## Details - -The PR failed build or test checks. Code review was skipped because the code -does not compile or pass tests in its current state. - -EOF - -# Append build stderr if available -if [ -f "$SCRATCH/build-stderr.txt" ] && [ -s "$SCRATCH/build-stderr.txt" ]; then - echo "### Build Errors" >> review-report.md - echo '```' >> review-report.md - head -100 "$SCRATCH/build-stderr.txt" >> review-report.md - echo '```' >> review-report.md -fi - -# Append test output if available -if [ -f "$SCRATCH/test-output.txt" ] && [ -s "$SCRATCH/test-output.txt" ]; then - echo "### Test Output" >> review-report.md - echo '```' >> review-report.md - tail -50 "$SCRATCH/test-output.txt" >> review-report.md - echo '```' >> review-report.md -fi - -echo "Wrote review-report.md (build/test failure)" diff --git a/workflows/pr-review/workflow.toml b/workflows/pr-review/workflow.toml deleted file mode 100644 index 6019470a..00000000 --- a/workflows/pr-review/workflow.toml +++ /dev/null @@ -1,21 +0,0 @@ -# PR review workflow: checkout, build/test, LLM code review, holistic review, report. -# Requires: gh CLI, git. Agent nodes use claude via --tmux. - -name = "pr-review" -description = "Review a GitHub pull request: build, test, code review, holistic assessment." -version = "1.0.0" - -[[inputs]] -name = "pr_repo" -description = "GitHub repository in owner/repo format" -required = true - -[[inputs]] -name = "pr_number" -description = "Pull request number" -required = true - -[defaults] -labels = { workflow = "pr-review" } - -outputs = ["review-report.md"] diff --git a/workflows/quick-launch/graph.codex.dot b/workflows/quick-launch/graph.codex.dot deleted file mode 100644 index 5cd002cc..00000000 --- a/workflows/quick-launch/graph.codex.dot +++ /dev/null @@ -1,27 +0,0 @@ -// Quick-launch variant: same workflow, codex agent. -// See graph.dot for the canonical version and docs. - -digraph quick_launch_codex { - graph [ - inputs="prompt", - outputs="result.md", - model_stylesheet=" - * { llm_provider: openai; llm_model: gpt-5-codex; } - " - ] - - start [shape=Mdiamond, label="Start"] - done [shape=Msquare, label="Done"] - - agent [ - shape=box, - label="Run agent", - agent_tool="codex", - llm_provider="openai", - prompt="Read .kilroy/INPUT.md for the user's request. The '## prompt' section is your assignment. If a '## context_file' section exists, read the file at that path for background context. If a '## context' section exists instead, use it as inline context. Complete the task and write your findings to result.md at the workspace root. Keep result.md self-contained — the caller will read it directly. When finished, write {\"status\":\"success\"} to $KILROY_STAGE_STATUS_PATH or $KILROY_STAGE_STATUS_FALLBACK_PATH if that fails.", - outputs="result.md" - ] - - start -> agent - agent -> done -} diff --git a/workflows/quick-launch/graph.dot b/workflows/quick-launch/graph.dot deleted file mode 100644 index c8d2074b..00000000 --- a/workflows/quick-launch/graph.dot +++ /dev/null @@ -1,40 +0,0 @@ -// Quick-launch: run a single agent against an inline prompt (plus optional -// context file) and collect result.md. -// -// Inputs: prompt (required), context_file or context (optional). -// Output: result.md at the worktree root. -// -// All inputs land in .kilroy/INPUT.md which the engine writes once at run -// start from the structured --input map. The agent reads that file directly; -// no prepare/stage step is needed because --input values are already sectioned -// as "## prompt", "## context_file", etc. -// -// Default agent: claude. For codex/gemini, use --graph graph.codex.dot or -// graph.gemini.dot. - -digraph quick_launch { - graph [ - inputs="prompt", - outputs="result.md", - model_stylesheet=" - * { llm_provider: anthropic; llm_model: claude-sonnet-4.6; } - " - ] - - start [shape=Mdiamond, label="Start"] - done [shape=Msquare, label="Done"] - - // Agent: read .kilroy/INPUT.md (which holds the user's prompt + any - // context_file path or inline context), do the work, write result.md - // to the worktree root. - agent [ - shape=box, - label="Run agent", - agent_tool="claude", - prompt="Read .kilroy/INPUT.md for the user's request. The '## prompt' section is your assignment. If a '## context_file' section exists, read the file at that path for background context. If a '## context' section exists instead, use it as inline context. Complete the task and write your findings to result.md at the workspace root. Keep result.md self-contained — the caller will read it directly. When finished, write {\"status\":\"success\"} to $KILROY_STAGE_STATUS_PATH or $KILROY_STAGE_STATUS_FALLBACK_PATH if that fails.", - outputs="result.md" - ] - - start -> agent - agent -> done -} diff --git a/workflows/quick-launch/graph.gemini.dot b/workflows/quick-launch/graph.gemini.dot deleted file mode 100644 index 12b60e4d..00000000 --- a/workflows/quick-launch/graph.gemini.dot +++ /dev/null @@ -1,27 +0,0 @@ -// Quick-launch variant: same workflow, gemini agent. -// See graph.dot for the canonical version and docs. - -digraph quick_launch_gemini { - graph [ - inputs="prompt", - outputs="result.md", - model_stylesheet=" - * { llm_provider: google; llm_model: gemini-2.5-pro; } - " - ] - - start [shape=Mdiamond, label="Start"] - done [shape=Msquare, label="Done"] - - agent [ - shape=box, - label="Run agent", - agent_tool="gemini", - llm_provider="google", - prompt="Read .kilroy/INPUT.md for the user's request. The '## prompt' section is your assignment. If a '## context_file' section exists, read the file at that path for background context. If a '## context' section exists instead, use it as inline context. Complete the task and write your findings to result.md at the workspace root. Keep result.md self-contained — the caller will read it directly. When finished, write {\"status\":\"success\"} to $KILROY_STAGE_STATUS_PATH or $KILROY_STAGE_STATUS_FALLBACK_PATH if that fails.", - outputs="result.md" - ] - - start -> agent - agent -> done -} diff --git a/workflows/quick-launch/workflow.toml b/workflows/quick-launch/workflow.toml deleted file mode 100644 index 92384a37..00000000 --- a/workflows/quick-launch/workflow.toml +++ /dev/null @@ -1,26 +0,0 @@ -# Quick-launch workflow: stage a task + optional context, then run one agent. -# Intended for investigation/research tasks kicked off from an agentic chat. - -name = "quick-launch" -description = "Single-agent task runner: stage a prompt (and optional context file), run the agent, collect result.md" -version = "0.1.0" - -outputs = ["result.md"] - -[[inputs]] -name = "prompt" -description = "Task description for the agent (short form). Required." -required = true - -[[inputs]] -name = "context_file" -description = "Absolute path to a file to stage as .kilroy/CONTEXT.md for the agent to read. Optional." -required = false - -[[inputs]] -name = "context" -description = "Inline context string to stage as .kilroy/CONTEXT.md. Ignored if context_file is set. Optional." -required = false - -[defaults] -labels = { workflow = "quick-launch" }