feat(agent): ynh agent run — autonomous agent loop driver#154
Merged
Implements the Phase 1 loop driver as `ynh agent run`, embedding it in the ynh binary per the agent-loop plan. The loop driver is the missing orchestration layer that sits above ynh's sensor execution: it spawns a vendor agent subprocess, runs sensors between turns, synthesises feedback, and enforces budgets and stuckness limits until all sensors converge.

Key design decisions:
- WorkerBackend interface isolates all wire-format details inside each backend; the loop driver never sees stream-json specifics. Claude Code is the only v1 backend; the interface is ready for Codex (Phase 4) with ~200 incremental lines.
- Sensor execution shells out to `ynh sensors run` (already shipped in v0.3.1) so loop-driver policy (pass/fail thresholds) stays separate from ynh's mechanical execution.
- NDJSON trajectory writer emits one event per line to a JSONL file or stdout; TermQ's Inspector drives off this stream.
- Stdin control protocol (approve_plan, reject_plan, interrupt, approve_turn, replace_feedback) allows TermQ and CI to steer the loop without polling.
- Budget (turns/tokens/wall-clock) and stuckness watchdog (edit-loop + no-progress) are enforced in-process with typed exit codes (10-30) for CI integration.
- srt sandbox support via --sandbox srt|none.
- Plan/Act phase split: first turn writes plan.md, awaits approval, then enters the act loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add CodexBackend (codex exec --json) and CursorBackend (per-turn
subprocess with --resume) so all three vendor CLIs are supported
- Fix trajectory wire format to match TermQ consumer expectations:
Event.Kind serialises as "type" (not "kind"), Event.Timestamp as
"timestamp" (not "time")
- BudgetExceededData gains a typed Budget field ("turns"/"tokens"/"wall_clock")
- SessionEndData gains TotalTurns and TotalTokens on all exit paths
- TurnApprovalData field renamed to SynthesizedFeedback (JSON: synthesized_feedback)
- Budget.Exceeded() returns BudgetType as a third value; loop driver
threads it through to the trajectory event
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover NDJSON parsing, content accumulation, usage tracking, EOF handling, unknown-event skipping, Send wire format, and cursor session state (pending queue, firstTurn flag, Close no-op).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tching
Loop driver accepts --sensor-overlay <json> (e.g. '{"build":{"source":
{"command":"make fast"}}}') and passes each sensor's overlay to
ynh sensors run via the new --sensor-overlay-json flag. ynh performs a
shallow JSON field-merge over the base harness declaration before
executing the sensor, keeping all execution logic inside ynh.
TermQ uses this to let users tweak sensor declarations per-session in
the Inspector without modifying the installed harness.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
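To make "shallow" concrete, the sketch below assumes it means a top-level merge: each key present in the overlay replaces the base value wholesale, with no recursion into nested objects. The `shallowMerge` helper and the `category` and `timeout` fields in the sample declaration are hypothetical; only the `source.command` overlay shape comes from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shallowMerge returns base with every top-level key found in overlay
// replaced by the overlay's value. Nested objects are not merged.
func shallowMerge(base, overlay []byte) ([]byte, error) {
	var b, o map[string]json.RawMessage
	if err := json.Unmarshal(base, &b); err != nil {
		return nil, fmt.Errorf("base: %w", err)
	}
	if err := json.Unmarshal(overlay, &o); err != nil {
		return nil, fmt.Errorf("overlay: %w", err)
	}
	if b == nil { // base was JSON null
		b = map[string]json.RawMessage{}
	}
	for k, v := range o {
		b[k] = v
	}
	return json.Marshal(b) // map keys are emitted in sorted order
}

func main() {
	// Hypothetical base declaration for a "build" sensor.
	base := []byte(`{"category":"build","source":{"command":"make all","timeout":"5m"}}`)
	overlay := []byte(`{"source":{"command":"make fast"}}`)

	merged, err := shallowMerge(base, overlay)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged))
	// prints {"category":"build","source":{"command":"make fast"}}
}
```

Under this reading the whole `source` object is replaced, so the hypothetical `timeout` field is dropped. An overlay therefore has to restate any sibling fields it wants to keep.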
Go's json.Marshal escapes < and > as \u003c and \u003e by default. Switch to json.NewEncoder with SetEscapeHTML(false) so usage strings like <harness-name> render literally in terminals and CI logs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover unstructured vs structured mode, JSON field values, no-HTML-escape behaviour (verifies SetEscapeHTML(false) is effective), and trailing newline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
eyelock added a commit that referenced this pull request on May 13, 2026
Promotes release/v0.4.0 to main for stable release.

## What's in 0.4.0

**Features**
- `ynh agent run`: autonomous agent loop driver (#154)
- `--instructions` flag for per-invocation context injection on `ynh run` (#152)
- Focus, profile, hook, and MCP editing commands: `ynh focus/profile/hook/mcp add/remove/update` (#160)

**Fixes**
- Schema-3 pointer-form local installs: read/write symmetry (#158)
- `ynh ls`: derive load id from entry namespace, not hard-coded `local/` (#159)
- Include/delegate edits route to source dir for local installs (#149)
- Dead-code cleanup in harness loader (#150)

**Docs**
- New `docs/focus.md` reference page
- Tutorial chain realigned to current sidebar ordering
- `--focus` flag and `YNH_FOCUS` env var documented across `ynh run` and `ynd preview/diff/export` (#161)

**Tests**
- Unit coverage push: `internal/harness` 48.5% → 84.5%; agent and vendor cheap wins (#162)
- E2E coverage for ynd-side focus handling, `ynd fmt` edge cases, focus clear-profile round trip (#163)

**CI/release**
- `-trimpath` added to goreleaser build flags (#156)
Summary
Adds `ynh agent run`, an autonomous agent loop driver embedded in the ynh binary. The loop spawns a vendor agent subprocess, runs sensors after each turn, synthesises feedback, and enforces budgets until convergence or a halt condition.

Companion PR: eyelock/TermQ#248
Changes
Core loop (`internal/agent/`)
- `loop.go` - `RunLoop`: plan phase, act loop, sensor execution, convergence check, stuckness watchdog, interactive approval, budget enforcement, NDJSON trajectory emission
- `worker.go` - `WorkerBackend`/`WorkerSession` interfaces; wire-format details stay inside each implementation
- `budget.go` - `Budget`: turn/token/wall-clock limits with typed exit codes and `BudgetType` enum
- `watchdog.go` - `Watchdog`: edit-loop detection + no-progress detection (sensor hash unchanged for K turns)
- `sensor.go` - `RunSensor` wrapping `ynh sensors run`; `SensorHash` for watchdog; `--sensor-overlay-json` pass-through
- `control.go` - `ControlReader`: JSON control messages for interactive approval/interrupt via stdin
- `trajectory.go` - `TrajectoryWriter`: typed NDJSON event stream (`type`, `timestamp`, `synthesized_feedback`, `budget` enum, `total_turns`/`total_tokens`)

Backends
- `claude.go` - `ClaudeBackend`: long-lived subprocess, stream-json mode
- `codex.go` - `CodexBackend`: long-lived subprocess, `codex exec --json`
- `cursor.go` - `CursorBackend`: per-turn subprocess with `--resume <chatId>`

CLI (`cmd/ynh/`)
- `agent.go` - `ynh agent run` with flags: `--harness`, `--task`, `--backend`, `--sandbox`, `--model`, `--max-turns`, `--max-tokens`, `--max-wall`, `--convergence-sensor`, `--worktree`, `--emit-jsonl`, `--sensor-overlay`, `--interactive`, `--no-plan`
- `sensors.go` - `--sensor-overlay-json` flag; shallow JSON merge applied before execution
- `cliformat.go` - disabled HTML escaping in structured error envelope so `<harness-name>` renders literally

Exit codes
Testing
- `make check` passes: 0 lint issues, all tests green, both binaries build
- `make e2e` passes: full E2E suite green

🤖 Generated with Claude Code