feat(devops): add auto-evolution loop (PR review + BMAD pipeline) by houko · Pull Request #94 · librefang/librefang-registry

houko · 2026-05-14T06:14:53Z

Summary

Extends the DevOps Hand to periodically scan configured GitHub repos and (a) review open PRs via the existing code-reviewer sub-agent and (b) triage open issues, dispatching actionable ones to a new implementer sub-agent that runs the Brainstorm → Architect → PRD → Implement pipeline scaled by bmad_strictness and produces a draft PR.

PR review path: pulls diff, asks code-reviewer for a structured verdict, posts a single GitHub review (COMMENT or REQUEST_CHANGES; never auto-APPROVE)
Issue path: label-first triage with single-prompt LLM fallback (bug-fix | feature | needs-info | skip); actionable ones get a draft PR via the BMAD pipeline
New sub-agent: [agents.implementer] with strict guardrails (failing test first for bugs, BMAD.md committed with the change, no merging, no push to protected branches)
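
To make the label-first triage concrete, here is a minimal sketch. It assumes a hypothetical label table (`LABEL_MAP`) and a caller-supplied `llm_classify` callback standing in for the single-prompt fallback; neither name comes from HAND.toml or SKILL.md, and treating unrecognized LLM answers as `skip` is an assumption rather than documented behavior.

```python
from typing import Callable, Iterable

# The four triage outcomes named in the PR description.
CATEGORIES = ("bug-fix", "feature", "needs-info", "skip")

# Hypothetical label -> category table; the real label names are an assumption.
LABEL_MAP = {
    "bug": "bug-fix",
    "enhancement": "feature",
    "question": "needs-info",
    "wontfix": "skip",
}


def triage(labels: Iterable[str], llm_classify: Callable[[], str]) -> str:
    """Classify from labels when possible, otherwise ask the LLM exactly once."""
    for label in labels:
        category = LABEL_MAP.get(label.lower())
        if category is not None:
            return category  # label hit: no LLM call is made
    answer = llm_classify().strip().lower()
    # Assumption: any answer outside the four categories is treated as "skip".
    return answer if answer in CATEGORIES else "skip"


def is_actionable(category: str) -> bool:
    """Only bug-fix and feature issues are dispatched to the implementer."""
    return category in ("bug-fix", "feature")
```

Because the fallback is passed in as a callable, the LLM is only invoked when no label matches, which is the point of going label-first.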

Safety floor (always on)

Draft PRs only — Hand never marks PRs ready-for-review and never merges
Never pushes to main / master / protected branches
Never --force / --no-verify / --amend against a remote branch
Escalates to devops_queue.json on workspace Cargo.toml, migrations, secrets, or >30 changed files
Per-tick token budget capped at 70% so subsequent ticks have headroom
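
The escalation rule above could be approximated as a pure function over the list of changed paths. This is a sketch under stated assumptions: the function name `escalation_reasons` is hypothetical, "workspace Cargo.toml" is taken to mean the repository-root `Cargo.toml`, and the migration and secret checks are simple path heuristics, not the Hand's actual detection logic.

```python
from pathlib import PurePosixPath

MAX_CHANGED_FILES = 30  # threshold stated in the PR description


def escalation_reasons(changed_paths, max_changed_files=MAX_CHANGED_FILES):
    """Return the reasons a change should be queued for a human (empty = none)."""
    reasons = []
    if len(changed_paths) > max_changed_files:
        reasons.append(f"more than {max_changed_files} changed files")
    for raw in changed_paths:
        path = PurePosixPath(raw)
        parts = [p.lower() for p in path.parts]
        name = path.name.lower()
        if raw == "Cargo.toml":
            # Assumption: the workspace manifest lives at the repo root.
            reasons.append("workspace Cargo.toml")
        elif "migrations" in parts[:-1]:
            # Heuristic: any file under a directory named "migrations".
            reasons.append(f"migration: {raw}")
        elif name == ".env" or name.startswith(".env.") or "secrets" in parts:
            # Heuristic: dotenv files or anything on a "secrets" path.
            reasons.append(f"possible secret: {raw}")
    return reasons
```

A non-empty result would mean the change is written to devops_queue.json instead of proceeding to a draft PR.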

Surface area

HAND.toml: +246 lines — new routing aliases, 4 new settings (auto_evolve, evolution_repos, evolution_check_interval, bmad_strictness), Phase 7 — Evolution Loop in main agent prompt, [agents.implementer] block, 3 new dashboard metrics
SKILL.md: +365 lines — Issue Triage Playbook, PR Review Automation, Bug Fix Playbook, BMAD Feature Pipeline, Draft PR Creation
README.md: updated settings table + Auto-Evolution Mode section + required GitHub token scopes
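
As an illustration of how evolution_check_interval might gate the loop, the sketch below decides whether a scan is due given the time of the previous run. It assumes the interval is expressed in seconds and that the last-run timestamp is stored somewhere the Hand can read back; the function name `is_due` and both of those details are assumptions, not taken from HAND.toml.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_due(last_run: Optional[datetime], interval_seconds: int, now: datetime) -> bool:
    """Return True when enough time has passed since the last evolution scan."""
    if last_run is None:
        return True  # never ran before, so the first tick should scan
    return now - last_run >= timedelta(seconds=interval_seconds)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_due(now - timedelta(minutes=5), 3600, now))
```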

Test plan

taplo lint hands/devops/HAND.toml — passed
taplo fmt --check hands/devops/HAND.toml — passed
python scripts/validate.py --type hands — passed
python scripts/validate.py (full registry) — passed
Smoke test in a sandbox librefang daemon with auto_evolve = true and evolution_repos = "<your-test-repo>"; observe that one tick produces a COMMENT review on an open PR
Trigger one bug-fix issue through to draft PR creation end-to-end
Verify the safety floor blocks: try pointing at a protected branch, try a >30-file change, try a path containing .env

Out of scope (intentional)

i18n translations for the 4 new settings — added English only, deferring to language-aware contributors
Cross-Hand event wiring with evolution-pilot Hand — keeping everything inside DevOps Hand for now; see "is this too bloated?" discussion threads if we want to split later
Webhook-driven triggering (currently cron-driven via evolution_check_interval only)

…ipeline)

Extends the DevOps Hand to periodically scan configured GitHub repos and:
- review open PRs via the existing code-reviewer sub-agent, posting a single COMMENT review back to GitHub (never auto-APPROVE)
- triage open issues via labels first, single-prompt LLM fallback
- dispatch actionable issues (bug-fix / feature) to a new implementer sub-agent which runs the BMAD pipeline (Brainstorm -> Architect -> PRD -> Implement) scaled by bmad_strictness and produces a DRAFT PR

Safety floor (always on):
- draft PRs only, never auto-ready, never merge
- never push to main/master/protected branches
- escalates to devops_queue.json when touching workspace Cargo.toml, migrations, secrets, or >30 changed files
- 70% per-turn token budget cap so subsequent ticks have headroom

New settings: auto_evolve, evolution_repos, evolution_check_interval, bmad_strictness.
New sub-agent: agents.implementer.
New SKILL.md sections: Issue Triage Playbook, PR Review Automation, Bug Fix Playbook, BMAD Feature Pipeline, Draft PR Creation.
Three new dashboard metrics: prs_reviewed, issues_processed, draft_prs_opened.

chatgpt-codex-connector · 2026-05-14T06:15:00Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Blocking (5):
- add max_changed_files setting (was referenced in implementer prompt but never defined)
- drop metering_query reference (tool isn't in tools = [...] list); agent self-paces against budget instead
- fix \n\n literal in jq --arg for issue cross-link comment; compose body in shell with printf so newlines survive
- resolve BASE_BRANCH via /repos/owner/repo .default_branch instead of relying on an undefined variable
- complete reviewer-verdict → GitHub review-event mapping (4 cases, not just request_changes); block routes through REQUEST_CHANGES with a blocking-prefix in the body, approve downgrades to COMMENT

Medium (5):
- correct Phase 6 → Phase 7 in the auto-evolution settings comment
- remove schedule_create busy-loop confusion; Phase 7 fires per-turn while the Hand is already frequency = "continuous", with cadence enforced via devops_evolution_cursor memory key
- generalize the forbid-main-worktree wording: discover and honor whatever pre-commit / pre-push / commit-msg hooks the upstream repo configures (was librefang-specific)
- clarify the AI-attribution rule: ban LLM-vendor attribution (Claude, GPT, 🤖, etc.) but allow process attribution (DevOps Hand → implementer) for traceability
- add USER_TYPE = "Bot" short-circuit that was extracted but never applied (bots get a token-cheap skip, not a deep review)

Style (2):
- document the four event_publish event names (devops_evolution_*) in a new SKILL.md table alongside the memory-keys table
- justify implementer's max_history_messages = 100 with a comment (BMAD 4 phases × cargo build/test chains needs headroom)
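
The verdict-to-event mapping described in the last blocking item can be sketched as follows. Only `request_changes`, `block`, and `approve` are named in the commit message; the fourth verdict is assumed here to be `comment`, and the prefix text `[BLOCKING] ` and the function name `review_payload` are hypothetical. Serializing with `json.dumps` also sidesteps the literal `\n\n` problem, since real newlines in the body are escaped correctly in the JSON payload.

```python
import json

BLOCKING_PREFIX = "[BLOCKING] "  # hypothetical prefix text, not from SKILL.md


def review_payload(verdict: str, summary: str) -> str:
    """Build the JSON body for a GitHub pull request review from a reviewer verdict."""
    verdict = verdict.strip().lower()
    if verdict == "block":
        # block has no GitHub equivalent, so it rides on REQUEST_CHANGES with a prefix.
        event, body = "REQUEST_CHANGES", BLOCKING_PREFIX + summary
    elif verdict == "request_changes":
        event, body = "REQUEST_CHANGES", summary
    else:
        # approve is downgraded to COMMENT (never auto-APPROVE); the assumed
        # "comment" verdict and any unknown value also fall through to COMMENT.
        event, body = "COMMENT", summary
    return json.dumps({"event": event, "body": body})
```

The resulting string is in the shape accepted by the GitHub REST endpoint for creating a pull request review, which takes `event` and `body` fields.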

D1 -- show SUMMARY_BODY (and VERDICT) assignment in PR review snippet: add explicit jq -r .summary / .verdict extraction from reviewer_output.json so the agent reading SKILL.md doesn't have to infer where these come from.

D2 -- reword strict-mode wait semantics in both HAND.toml and SKILL.md: 'Stop. Wait...' was misleading because the agent loop has no in-turn pause primitive. Now spells out: end the current turn after queueing, let the continuous tick re-read the queue, resume on approved / skip on pending / abandon on rejected. Explicitly forbids busy-wait and sleep loops.

D3 -- restructure bot / huge-diff short-circuit so agent-tool calls are expressed as numbered agent steps, not as '# memory_store ...' comments inside a bash block. The bash block now only extracts cheap signals; the decision and the tool calls are clearly agent-level.

D4 -- remove the misleading 'exit 0' from the short-circuit bash and add a one-liner noting that exit 0 inside shell_exec only ends one shell session, not the Phase 7 pass; the agent must choose to move on.
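
The strict-mode semantics in D2 amount to a small per-tick decision table with no waiting inside a turn. A minimal sketch, assuming each queue entry carries hypothetical `id` and `status` fields (the actual devops_queue.json schema is not shown in this PR) and that an unrecognized status is conservatively treated like pending:

```python
def next_action(status: str) -> str:
    """Map a queue entry's approval status to what the next tick should do."""
    actions = {"approved": "resume", "pending": "skip", "rejected": "abandon"}
    # Assumption: unknown statuses are skipped rather than acted on.
    return actions.get(status.strip().lower(), "skip")


def plan_tick(queue):
    """Decide once per tick for every entry; no sleeping or polling in-turn."""
    return [(item["id"], next_action(item["status"])) for item in queue]
```

Each tick re-reads the queue and calls `plan_tick` once, so a pending item simply gets looked at again on a later tick.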

houko added 2 commits May 14, 2026 15:24

houko merged commit d215388 into main May 14, 2026
3 checks passed

houko deleted the feat/devops-evolution branch May 14, 2026 06:54

houko mentioned this pull request May 14, 2026

docs(hands): add Auto-Evolution Mode page (companion to registry#94) librefang/librefang#5029

Merged

houko merged 3 commits into main from feat/devops-evolution
