feat(agent): ynh agent run — autonomous agent loop driver by eyelock · Pull Request #154 · eyelock/ynh

eyelock · 2026-05-12T05:41:35Z

Summary

Adds ynh agent run — an autonomous agent loop driver embedded in the ynh binary. The loop spawns a vendor agent subprocess, runs sensors after each turn, synthesises feedback, and enforces budgets until convergence or a halt condition.

Companion PR: eyelock/TermQ#248

Changes

Core loop (internal/agent/)

loop.go — RunLoop: plan phase, act loop, sensor execution, convergence check, stuckness watchdog, interactive approval, budget enforcement, NDJSON trajectory emission
worker.go — WorkerBackend / WorkerSession interfaces; wire-format details stay inside each implementation
budget.go — Budget: turn/token/wall-clock limits with typed exit codes and BudgetType enum
watchdog.go — Watchdog: edit-loop detection + no-progress detection (sensor hash unchanged K turns)
sensor.go — RunSensor wrapping ynh sensors run; SensorHash for watchdog; --sensor-overlay-json pass-through
control.go — ControlReader: JSON control messages for interactive approval/interrupt via stdin
trajectory.go — TrajectoryWriter: typed NDJSON event stream (type, timestamp, synthesized_feedback, budget enum, total_turns/total_tokens)
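The no-progress half of the watchdog (sensor hash unchanged for K turns) can be sketched as below. This is a minimal illustration, not the code in watchdog.go: the `NoProgress` type, its `Observe` method, and the SHA-256 based `hashOutput` helper are hypothetical names, and the PR does not say how SensorHash is actually computed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// NoProgress is a hypothetical stand-in for the no-progress check in Watchdog.
type NoProgress struct {
	K        int    // number of consecutive identical hashes that counts as stuck
	lastHash string // hash seen on the previous turn
	streak   int    // how many turns in a row lastHash has been observed
}

// hashOutput is an assumed stand-in for SensorHash: a SHA-256 of the sensor output.
func hashOutput(output string) string {
	sum := sha256.Sum256([]byte(output))
	return hex.EncodeToString(sum[:])
}

// Observe records one turn's sensor hash and reports whether the same hash
// has now been seen on K consecutive turns.
func (n *NoProgress) Observe(hash string) bool {
	if n.streak > 0 && hash == n.lastHash {
		n.streak++
	} else {
		n.lastHash = hash
		n.streak = 1
	}
	return n.streak >= n.K
}

func main() {
	w := &NoProgress{K: 3}
	outputs := []string{"2 failing", "2 failing", "1 failing", "1 failing", "1 failing"}
	for turn, out := range outputs {
		fmt.Printf("turn %d stuck=%v\n", turn+1, w.Observe(hashOutput(out)))
	}
}
```

In this sketch any change in sensor output resets the streak, so only the fifth turn reports stuck. A real watchdog would presumably map that result to exit code 13.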

Backends

claude.go — ClaudeBackend: long-lived subprocess, stream-json mode
codex.go — CodexBackend: long-lived subprocess, codex exec --json
cursor.go — CursorBackend: per-turn subprocess with --resume <chatId>

CLI (cmd/ynh/)

agent.go — ynh agent run with flags: --harness, --task, --backend, --sandbox, --model, --max-turns, --max-tokens, --max-wall, --convergence-sensor, --worktree, --emit-jsonl, --sensor-overlay, --interactive, --no-plan
sensors.go — --sensor-overlay-json flag; shallow JSON merge applied before execution
cliformat.go — disabled HTML escaping in structured error envelope so <harness-name> renders literally

Exit codes

Code	Meaning
0	Converged
10	Turn cap
11	Token budget
12	Wall-clock limit
13	Stuck
14	Tamper detected
20	Worker error
30	User aborted
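For callers that want to branch on these codes, the table could be mirrored as typed constants along the following lines. The numeric values are taken from the table above; the `ExitCode` type and the constant names are hypothetical and may not match the identifiers in budget.go.

```go
package main

import "fmt"

// ExitCode is a hypothetical typed wrapper around the process exit status.
type ExitCode int

// Values mirror the exit code table; names are illustrative only.
const (
	ExitConverged   ExitCode = 0
	ExitTurnCap     ExitCode = 10
	ExitTokenBudget ExitCode = 11
	ExitWallClock   ExitCode = 12
	ExitStuck       ExitCode = 13
	ExitTamper      ExitCode = 14
	ExitWorkerError ExitCode = 20
	ExitUserAborted ExitCode = 30
)

// String returns the meaning listed in the table, or a fallback for unknown codes.
func (c ExitCode) String() string {
	switch c {
	case ExitConverged:
		return "Converged"
	case ExitTurnCap:
		return "Turn cap"
	case ExitTokenBudget:
		return "Token budget"
	case ExitWallClock:
		return "Wall-clock limit"
	case ExitStuck:
		return "Stuck"
	case ExitTamper:
		return "Tamper detected"
	case ExitWorkerError:
		return "Worker error"
	case ExitUserAborted:
		return "User aborted"
	default:
		return fmt.Sprintf("Unknown exit code %d", int(c))
	}
}

// IsBudget reports whether the code is one of the three budget limits (10-12).
func (c ExitCode) IsBudget() bool {
	return c >= ExitTurnCap && c <= ExitWallClock
}

func main() {
	for _, c := range []ExitCode{ExitConverged, ExitTokenBudget, ExitStuck} {
		fmt.Printf("%d %s budget=%v\n", int(c), c, c.IsBudget())
	}
}
```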

Testing

make check passes — 0 lint issues, all tests green, both binaries build
make e2e passes — full E2E suite green
Unit tests for all new packages: budget, watchdog, trajectory, control, sensor, claude, codex, cursor, loop integration, cliformat

🤖 Generated with Claude Code

Implements the Phase 1 loop driver as `ynh agent run`, embedding it in the ynh binary per the agent-loop plan. The loop driver is the missing orchestration layer that sits above ynh's sensor execution: it spawns a vendor agent subprocess, runs sensors between turns, synthesises feedback, and enforces budgets and stuckness limits until all sensors converge.

Key design decisions:

- WorkerBackend interface isolates all wire-format details inside each backend; the loop driver never sees stream-json specifics. Claude Code is the only v1 backend; the interface is ready for Codex (Phase 4) with ~200 incremental lines.
- Sensor execution shells out to `ynh sensors run` (already shipped in v0.3.1) so loop-driver policy (pass/fail thresholds) stays separate from ynh's mechanical execution.
- NDJSON trajectory writer emits one event per line to a JSONL file or stdout; TermQ's Inspector drives off this stream.
- Stdin control protocol (approve_plan, reject_plan, interrupt, approve_turn, replace_feedback) allows TermQ and CI to steer the loop without polling.
- Budget (turns/tokens/wall-clock) and stuckness watchdog (edit-loop + no-progress) are enforced in-process with typed exit codes (10-30) for CI integration.
- srt sandbox support via --sandbox srt|none.
- Plan/Act phase split: first turn writes plan.md, awaits approval, then enters the act loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add CodexBackend (codex exec --json) and CursorBackend (per-turn subprocess with --resume) so all three vendor CLIs are supported
- Fix trajectory wire format to match TermQ consumer expectations: Event.Kind serialises as "type" (not "kind"), Event.Timestamp as "timestamp" (not "time")
- BudgetExceededData gains a typed Budget field ("turns"/"tokens"/"wall_clock")
- SessionEndData gains TotalTurns and TotalTokens on all exit paths
- TurnApprovalData field renamed to SynthesizedFeedback (JSON: synthesized_feedback)
- Budget.Exceeded() returns BudgetType as a third value; loop driver threads it through to the trajectory event

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Cover NDJSON parsing, content accumulation, usage tracking, EOF handling, unknown-event skipping, Send wire format, and cursor session state (pending queue, firstTurn flag, Close no-op). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…tching Loop driver accepts --sensor-overlay <json> (e.g. '{"build":{"source": {"command":"make fast"}}}') and passes each sensor's overlay to ynh sensors run via the new --sensor-overlay-json flag. ynh performs a shallow JSON field-merge over the base harness declaration before executing the sensor, keeping all execution logic inside ynh. TermQ uses this to let users tweak sensor declarations per-session in the Inspector without modifying the installed harness. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
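A shallow field-merge of this kind can be sketched as follows. The `ShallowMerge` function is a hypothetical helper, and the `timeout` and `category` fields in the example declaration are invented for illustration; only the `source.command` shape comes from the overlay example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ShallowMerge overlays top-level keys of overlay onto base.
// Nested objects are replaced wholesale, not merged recursively.
func ShallowMerge(base, overlay []byte) ([]byte, error) {
	var b, o map[string]json.RawMessage
	if err := json.Unmarshal(base, &b); err != nil {
		return nil, fmt.Errorf("parse base declaration: %w", err)
	}
	if err := json.Unmarshal(overlay, &o); err != nil {
		return nil, fmt.Errorf("parse overlay: %w", err)
	}
	if b == nil {
		// A JSON null base decodes to a nil map; start from an empty object.
		b = map[string]json.RawMessage{}
	}
	for key, value := range o {
		b[key] = value
	}
	return json.Marshal(b)
}

func main() {
	// Hypothetical base declaration for a "build" sensor.
	base := []byte(`{"source":{"command":"make build","timeout":"5m"},"category":"build"}`)
	// The per-sensor part of the overlay from the example above.
	overlay := []byte(`{"source":{"command":"make fast"}}`)

	merged, err := ShallowMerge(base, overlay)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged))
}
```

Because the merge is shallow, the overlay's `source` object replaces the base `source` entirely, so the invented `timeout` field does not survive. An overlay that changes one nested field therefore has to restate any sibling fields it wants to keep.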

Go's json.Marshal escapes < and > as \u003c and \u003e by default. Switch to json.NewEncoder with SetEscapeHTML(false) so usage strings like <harness-name> render literally in terminals and CI logs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Cover unstructured vs structured mode, JSON field values, no-HTML-escape behaviour (verifies SetEscapeHTML(false) is effective), and trailing newline. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Promotes release/v0.4.0 to main for stable release.

## What's in 0.4.0

**Features**
- `ynh agent run`: autonomous agent loop driver (#154)
- `--instructions` flag for per-invocation context injection on `ynh run` (#152)
- Focus, profile, hook, and MCP editing commands: `ynh focus/profile/hook/mcp add/remove/update` (#160)

**Fixes**
- Schema-3 pointer-form local installs: read/write symmetry (#158)
- `ynh ls`: derive load id from entry namespace, not hard-coded `local/` (#159)
- Include/delegate edits route to source dir for local installs (#149)
- Dead-code cleanup in harness loader (#150)

**Docs**
- New `docs/focus.md` reference page
- Tutorial chain realigned to current sidebar ordering
- `--focus` flag and `YNH_FOCUS` env var documented across `ynh run` and `ynd preview/diff/export` (#161)

**Tests**
- Unit coverage push: `internal/harness` 48.5% → 84.5%; agent and vendor cheap wins (#162)
- E2E coverage for ynd-side focus handling, `ynd fmt` edge cases, focus clear-profile round trip (#163)

**CI/release**
- `-trimpath` added to goreleaser build flags (#156)

David Collie and others added 6 commits May 11, 2026 17:31

test(cli): add tests for structured error envelope format

b89da63

eyelock changed the title ~~feat(agent): ynh agent run — autonomous loop driver (Phase 1)~~ feat(agent): ynh agent run — autonomous agent loop driver May 12, 2026

eyelock marked this pull request as ready for review May 12, 2026 05:45

eyelock merged commit bee32ac into develop May 12, 2026
6 checks passed

eyelock deleted the feat/yna-loop-agent branch May 12, 2026 05:45

eyelock mentioned this pull request May 13, 2026

release: v0.4.0 #164

Merged
