feat(agent): agent loop — ynh agent run integration, fleet UI, and Inspector #248
Draft
eyelock wants to merge 30 commits into
First slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Pure data model; no behavior changes. Adds AgentBackend, AgentMode, AgentInteractionMode, AgentStatus, AgentBudget, and AgentConfig as Codable/Sendable value types in TermQCore. Extends TerminalCard with one optional `agentConfig` field — non-agent cards stay nil; pre-agent JSON decodes cleanly. Tests cover defaults, Codable round-trip, stable raw-value wire format, and legacy-JSON backward compatibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Pure UI scaffolding; feature-flagged off by default behind `feature.agentTab`. Adds an AgentSessionsSidebarTab placeholder view (header + empty state) and refactors SidebarView's tab picker around a `visibleTabs` computed property so optional tabs gated by feature flags compose without touching the picker each time. Live data wiring lands in a later slice. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
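The `visibleTabs` idea described above can be sketched without any SwiftUI. This is a minimal illustration only: the tab names, `FeatureFlags`, and the free function are hypothetical stand-ins, not the actual `SidebarView` code.

```swift
// Hypothetical tab set; the real sidebar's cases may differ.
enum SidebarTab: String, CaseIterable {
    case harnesses, marketplace, agentSessions
}

// Stand-in for the `feature.agentTab` flag (off by default).
struct FeatureFlags {
    var agentTab = false
}

// The picker iterates over this list, so adding another optional tab
// only means adding one more case to the switch.
func visibleTabs(flags: FeatureFlags) -> [SidebarTab] {
    SidebarTab.allCases.filter { tab in
        switch tab {
        case .agentSessions: return flags.agentTab
        default: return true
        }
    }
}
```

With the flag off, the Agent Sessions tab is simply absent from the list the picker renders.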
Third slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). UI-only. The Agent Sessions tab now reads from BoardViewModel and renders one row per card whose agentConfig is set. Each row shows title, qualified harness name, and a colour-coded status pill reflecting AgentStatus. Empty state preserved when no agent cards exist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Adds BoardViewModel.createAgentCard which creates a TerminalCard with agentConfig populated from a harness identifier. Surfaces a "Launch as Agent" context-menu item on harness sidebar rows; gated on the feature.agentTab flag (the closure is nil when the agent tab is off, so the menu item disappears). After creation, the sidebar switches to the Agent Sessions tab so the new card is immediately visible. The new BoardViewModel method takes primitives (harnessId, title, description) rather than the Harness struct to avoid a Column type collision between TermQCore.Column (class) and TermQShared.Column (struct), which would surface as ambiguous-symbol errors elsewhere in the file if BoardViewModel imported TermQShared. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). UI-only. Wraps each agent-session row in a plain-style Button so clicking selects the card via the existing BoardViewModel.selectCard plumbing. Selected rows highlight with an accent-coloured background and elevated harness-subtitle contrast, matching the visual idiom of other selectable list rows in TermQ. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sixth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). UI-only. Selecting a card whose agentConfig is set now routes to a new AgentInspectorView in the main pane instead of attempting to attach a terminal. The skeleton surfaces session config (backend, mode, interaction, formatted budget, short session id) and a trajectory placeholder. Run / Stop buttons are present but disabled with a "Loop driver not yet wired" tooltip; later slices fill in the live per-turn sensor + feedback wire as the loop driver lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seventh slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Surfaces the feature.agentTab flag through Settings → Tools instead of requiring `defaults write`. Mirrors the Strings.Settings.Ynh / ynhSection pattern: a short description, single toggle bound to feature.agentTab, help tooltip explaining what the tab adds. English-only strings landed; localization fan-out to the ~30 other locales is a follow-up via /localization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eighth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Fans the four `settings.agent.*` keys added in the previous slice out to all 39 non-English locales via the localizer agent. Each file gets a "// MARK: - Settings — Agent Loop" block immediately before the existing YNH section, with native translations matching the tone of nearby strings. Items flagged for native review by the localizer agent: - zh-HK: 测證器 (sensors) — confirm preferred term vs. zh-Hans/Hant - he: niqqud on "Experimental" may be inconsistent with file style - ko: kept "Agent" as English token to match surrounding YNH strings Zero NEEDS TRANSLATION markers per project policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ninth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). AgentLoopProcess is an actor that spawns a long-running NDJSON-emitting subprocess (the future ynh-agent binary, or a test stub today) and streams parsed TrajectoryEvent values via AsyncStream. Supports send(line:) for stdin feedback and stop() for SIGTERM termination. Also adds the TrajectoryEvent value type to TermQCore: type and timestamp parsed from the line; the original JSON preserved in payloadJSON for downstream typed decoding (typed event schemas land in the next slice). Lifecycle synchronisation point worth noting: stdout drain and process termination race naturally. Initial implementation finished the stream from the termination handler and lost trailing events. Fixed via a termination continuation so readStdoutLines awaits termination after EOF — guaranteeing status == .exited by the time consumers see the stream end. Tests cover parseLine on its own plus subprocess streaming, invalid lines dropped, double-start, send-before-start. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
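A rough sketch of the line parsing described above, assuming the timestamp arrives as a JSON string under `"timestamp"`. The struct here is a simplified stand-in for the real `TrajectoryEvent`, and the free function `parseLine` is illustrative rather than the actual signature.

```swift
import Foundation

// Simplified stand-in: type and timestamp are extracted, the original
// line is preserved for later typed decoding.
struct TrajectoryEvent: Equatable {
    let type: String
    let timestamp: String   // assumption: kept as the raw string
    let payloadJSON: String
}

// Returns nil for blank lines, non-JSON lines, and objects missing
// either required key, so invalid lines are dropped by the caller.
func parseLine(_ line: String) -> TrajectoryEvent? {
    let trimmed = line.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty,
          let data = trimmed.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          let dict = object as? [String: Any],
          let type = dict["type"] as? String,
          let timestamp = dict["timestamp"] as? String
    else { return nil }
    return TrajectoryEvent(type: type, timestamp: timestamp, payloadJSON: trimmed)
}
```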
Tenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Defines the wire-format contract between TermQ and the future ynh-agent loop driver binary. TrajectoryEventPayload is an enum with typed cases for the eight events whose shapes are confident at this stage — session_start, plan, turn_start, sensor_result, stuck_detected, budget_exceeded, converged, session_end — plus a forward-compatible .other(type:json:) fallback so loop-driver upgrades can introduce new event types without breaking TermQ. TrajectoryEvent.decoded() dispatches on the type discriminator and gracefully degrades to .other when known types have missing or malformed required fields. Wire format snake_case keys match the YNH structured-output convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
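The dispatch-with-fallback behaviour might look roughly like the sketch below. Only two of the eight typed cases are shown, and the field names (`turn`, `sensor_name`, `passed`) are assumptions made for illustration; the message above does not spell out the actual schemas.

```swift
import Foundation

// Reduced payload enum: two typed cases plus the forward-compatible fallback.
enum TrajectoryEventPayload: Equatable {
    case turnStart(turn: Int)
    case sensorResult(name: String, passed: Bool)
    case other(type: String, json: String)
}

// Hypothetical wire shapes; snake_case keys are mapped by the decoder.
struct TurnStartWire: Decodable { let turn: Int }
struct SensorResultWire: Decodable {
    let sensorName: String
    let passed: Bool
}

// Dispatches on the type discriminator. A known type with missing or
// malformed fields degrades to .other instead of throwing.
func decodePayload(type: String, json: String) -> TrajectoryEventPayload {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    let data = Data(json.utf8)
    switch type {
    case "turn_start":
        if let wire = try? decoder.decode(TurnStartWire.self, from: data) {
            return .turnStart(turn: wire.turn)
        }
    case "sensor_result":
        if let wire = try? decoder.decode(SensorResultWire.self, from: data) {
            return .sensorResult(name: wire.sensorName, passed: wire.passed)
        }
    default:
        break
    }
    return .other(type: type, json: json)
}
```

Keeping the raw JSON in `.other` means an unknown event can still be rendered or persisted verbatim.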
Eleventh slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Adds AgentSessionController (@MainActor ObservableObject, one per agent card) that wraps AgentLoopProcess and exposes @Published events and status arrays for SwiftUI binding. AgentSessionRegistry vends one controller per card.id so the Inspector and any other surface looking at the same card observe the same stream. Inspector now reads agent.loopDriverCommand from UserDefaults; Run spawns `/bin/sh -c "<cmd>"` so the same key works for both production (path to ynh-agent + args) and development (a one-line stub script). Run / Stop buttons are gated on canRun + isRunning. The trajectory section renders incoming events live with timestamp + type + a typed per-variant summary line built from TrajectoryEvent.decoded(). Tests cover the controller's stream-into-events lifecycle, event clearing on re-run, reset(), and registry per-id semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twelfth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).
The controller now mirrors the loop driver's terminal-state events into
the card's agentConfig.status so the sidebar status pill reflects the
session in real time. Mapping:
  start()                                                        → .running
  event "converged"                                              → .converged
  event "stuck_detected"                                         → .stuck
  event "budget_exceeded"                                        → .errored
  stream ended, exit 0, no terminal event seen                   → .converged (inferred)
  stream ended, exit != 0, no terminal event seen                → .errored (inferred)
  stream ended after a terminal event already flipped the status → unchanged (no-downgrade guard)
Card lookup goes through an injectable closure (default uses
BoardViewModel.shared.card(for:)) so the seven new tests can drive a
synthetic TerminalCard without touching the singleton.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
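The mapping in this commit can be read as two small pure functions. The sketch below is illustrative only: the enum lists just the four statuses named in the mapping, and the function names are hypothetical rather than the controller's real API.

```swift
// Only the statuses named in the mapping; the real AgentStatus has more.
enum AgentStatus: String {
    case running, converged, stuck, errored
}

// Terminal-state events flip the status; any other event leaves it alone.
func status(afterEvent type: String, current: AgentStatus) -> AgentStatus {
    switch type {
    case "converged": return .converged
    case "stuck_detected": return .stuck
    case "budget_exceeded": return .errored
    default: return current
    }
}

// Called when the stream ends. A status already set by a terminal event
// is never overwritten (the no-downgrade guard); otherwise the outcome
// is inferred from the exit code.
func status(afterStreamEndWithExitCode exitCode: Int32, current: AgentStatus) -> AgentStatus {
    switch current {
    case .converged, .stuck, .errored:
        return current
    case .running:
        return exitCode == 0 ? .converged : .errored
    }
}
```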
Thirteenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). Adds TrajectoryWriter — opens an append-mode FileHandle on <appSupport>/TermQ[-Debug]/agent-sessions/<sessionId>/trajectory.jsonl and writes each event's payloadJSON one line at a time. Failures are silent; the in-memory event stream remains the source of truth. The controller picks up the writer via an injectable factory so tests keep persistence pointed at a temp directory. AgentSessionRegistry wires the production factory; ad-hoc controllers (used by all existing unit tests) default to writerFactory == nil so the test suite never touches the real app-support directory. Sets up the replay surface for a future Transcript viewer and CI-artifact ingest. Six new tests cover writer behaviour and the controller-side persistence wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
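A minimal sketch of an append-mode writer along these lines, assuming the caller passes in the base directory (in production that would be the app-support folder). The initializer shape and method names are illustrative, not the actual `TrajectoryWriter` API.

```swift
import Foundation

final class TrajectoryWriter {
    let fileURL: URL
    private var handle: FileHandle?

    // Resolves <base>/agent-sessions/<sessionId>/trajectory.jsonl and
    // opens it for appending. Any failure leaves the writer inert.
    init(sessionId: String, baseDirectory: URL) {
        let directory = baseDirectory
            .appendingPathComponent("agent-sessions")
            .appendingPathComponent(sessionId)
        fileURL = directory.appendingPathComponent("trajectory.jsonl")
        do {
            try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
            if !FileManager.default.fileExists(atPath: fileURL.path) {
                _ = FileManager.default.createFile(atPath: fileURL.path, contents: nil)
            }
            let opened = try FileHandle(forWritingTo: fileURL)
            _ = try opened.seekToEnd()
            handle = opened
        } catch {
            handle = nil // failures are silent; the in-memory stream stays authoritative
        }
    }

    // Writes one event per line; write errors are swallowed.
    func append(payloadJSON: String) {
        guard let handle = handle else { return }
        try? handle.write(contentsOf: Data((payloadJSON + "\n").utf8))
    }

    func close() {
        try? handle?.close()
        handle = nil
    }
}
```

Passing the base directory in is what lets tests point persistence at a temporary folder.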
Fourteenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). When the Inspector opens for an agent card, the controller now reads the per-session trajectory.jsonl off disk (if one exists) and populates the events array. Effect: past sessions render in the trajectory list across app restarts and when navigating between agent cards. loadPersistedEvents() is a no-op when events are already populated, when the session is currently running, when the card has no agent config, or when no file exists — so it's safe to call from .onAppear and .onChange(of:) without guards in the view. TrajectoryWriter exposes a public fileURL(for:baseDirectory:) helper for symmetric read access. The controller takes a transcriptBaseURL override for tests; production resolves the same default location as the writer. Four new tests cover the populate path, no-op cases, and a card without agent config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).
Implements the safety-model plan/act flow: when the loop driver emits
a `plan` event, the controller flips card status to
`.awaitingPlanApproval` and the Inspector renders an orange-tinted
panel above the trajectory containing the plan content (monospaced,
selectable, scrollable up to 320pt) plus Reject / Approve buttons
(Approve bound to the default action).
Two new controller methods, both gated on the awaitingPlanApproval
state:
approvePlan() — writes {"action":"approve_plan"} to the driver's
stdin and flips status back to .running; the driver is expected to
resume work.
rejectPlan() — writes {"action":"reject_plan"}, then SIGTERMs the
driver and flips to .errored.
This commits the TermQ↔ynh-agent control protocol: stdin carries
NDJSON action messages. Future actions (interrupt, edit-feedback,
sensor-overlay-replace) extend the same surface.
Four tests cover: plan event flips status; approvePlan returns to
running; rejectPlan terminates with .errored; approvePlan no-op when
not awaiting. The two integration tests use a long-running stub
process and force the status manually rather than try to game shell
pipe-buffering after `read`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
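The approve/reject gating and the stdin message shape could be sketched as follows. `PlanGate` and its `sent` array are hypothetical stand-ins: the array plays the role of the driver's stdin, and the SIGTERM that follows a rejection is deliberately left out.

```swift
import Foundation

struct ControlMessage: Encodable { let action: String }

// One NDJSON line per control action, e.g. {"action":"approve_plan"}.
func ndjsonLine(action: String) -> String {
    let data = (try? JSONEncoder().encode(ControlMessage(action: action))) ?? Data()
    return String(decoding: data, as: UTF8.self) + "\n"
}

enum SessionStatus { case running, awaitingPlanApproval, errored }

// Hypothetical model of the controller's plan gate.
struct PlanGate {
    var status: SessionStatus
    var sent: [String] = []

    // No-op unless a plan is awaiting approval.
    mutating func approvePlan() {
        guard status == .awaitingPlanApproval else { return }
        sent.append(ndjsonLine(action: "approve_plan"))
        status = .running
    }

    // The real controller also terminates the driver after this write.
    mutating func rejectPlan() {
        guard status == .awaitingPlanApproval else { return }
        sent.append(ndjsonLine(action: "reject_plan"))
        status = .errored
    }
}
```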
Seventeenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md).
Lets the user edit a card's agentConfig via Settings → Edit Card. The new Agent tab is gated on viewModel.hasAgentConfig so it only shows for cards launched as agent sessions; non-agent cards see no change.
Editable knobs:
- Backend (Claude Code / Codex)
- Mode (plan / act)
- Interaction (auto / confirm / tweak)
- Max turns (Stepper 1–500, step 5)
- Max tokens (Stepper, displayed as 200k / 1M)
- Max wall-clock minutes (Stepper 1–720, step 5)
Identity fields (sessionId, harness, status) are NOT exposed and are preserved on save — verified by a test. A non-agent card with non-default values in the agent form fields does NOT have an agentConfig injected on save — also verified.
Localization: 9 new keys in en.lproj only; other locales fall back to English. /localization fan-out is a follow-up slice when convenient.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eighteenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md).
Fans the nine `editor.agent.*` keys added in slice 17 out to all 39 non-English locales via the localizer agent. Each file gets a "// MARK: - Editor — Agent" block placed immediately before the existing "// MARK: - Settings — Agent Loop" block, with native translations matching the tone of nearby editor strings.
Notable judgement calls (preserved here for traceability):
- Term "Agent" kept as English loanword in CJK, RTL, Greek, and most European locales where settings.agent.section already treats it as technical; translated only where existing patterns Koreanise/Japanise/Cyrillicise it (ja エージェント, ko 에이전트, ru/uk Агент, etc.).
- "Backend" reused per-locale from harnesses.launch.backend so the editor matches the harness launch sheet.
- "tokens" left untransliterated except in CJK (トークン数 / 토큰 수) and Arabic (الرموز).
- Wall-clock minute abbreviation matches each locale's convention: (мин), (хв), (分), (분), (د), (λεπτ.), (dk), (דק׳), (นาที), (phút), (mnt), (perc), (मिनट), default (min).
Also fixes a flaky test that surfaced once the broader suite ran: testEvent_planFlipsCardToAwaitingApproval relied on observing an intermediate status while a transient subprocess was alive, but the stub exits before the polling loop can catch it, after which handleStreamEnd flips status to .converged. Reworked as a direct unit test that injects a synthetic plan event into handleEventForCardStatus (internal access via @testable). The race is a test artefact only — in production the loop driver doesn't exit while awaiting plan approval.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nineteenth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). The handleStreamEnd no-downgrade guard previously checked only .converged / .stuck / .errored. If the loop driver died while the user was still deciding on a plan (.awaitingPlanApproval), the default path inferred .converged on exit 0 — visually marking a hung session as successful, which is wrong. Adds an explicit awaitingPlanApproval branch: if the stream ends mid-approval the session is dead and the card flips to .errored regardless of exit code. Surfaced as a side observation when slice 18 fixed the related test flake. Test verifies the new branch end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twentieth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md).
Adds loopDriverCommand: String to AgentConfig — empty string means "inherit the global agent.loopDriverCommand UserDefault". Custom init(from:) keeps backward compatibility with pre-slice-20 saved JSON (decodeIfPresent ?? "").
Editor surfaces it in the Agent tab as a multi-line monospace TextField with placeholder "Inherit global default" and a help tooltip that explains the inheritance rule. Identity fields (sessionId, harness, status) remain non-editable; budget and mode/interaction/backend pickers from slice 17 are unchanged.
Inspector now resolves the effective command:
- per-card (trimmed, non-empty) wins
- else fall back to the global @AppStorage
canRun and the Run button click both go through effectiveCommand. Help text on disabled Run reflects both surfaces.
Four new tests:
- AgentConfig defaults loopDriverCommand to ""
- AgentConfig backward compat: legacy JSON missing the field decodes
- Editor load/save round-trips the per-card override
- Editor load on a non-agent card clears any stale string
English-only string for the new field; locale fan-out is a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
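The two rules in this slice, legacy-tolerant decoding and per-card-over-global resolution, can be sketched as below. `AgentConfigSketch` carries only two fields for illustration, and the free function `effectiveCommand(perCard:global:)` is a hypothetical shape rather than the Inspector's actual property.

```swift
import Foundation

// Cut-down config: just enough to show the decodeIfPresent fallback.
struct AgentConfigSketch: Codable {
    var harness: String
    var loopDriverCommand: String

    init(harness: String, loopDriverCommand: String = "") {
        self.harness = harness
        self.loopDriverCommand = loopDriverCommand
    }

    // JSON saved before the field existed decodes with an empty command.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        harness = try container.decode(String.self, forKey: .harness)
        loopDriverCommand = try container.decodeIfPresent(String.self, forKey: .loopDriverCommand) ?? ""
    }
}

// Per-card override wins when it is non-empty after trimming; otherwise
// the global default is used. nil means there is nothing runnable.
func effectiveCommand(perCard: String, global: String) -> String? {
    let card = perCard.trimmingCharacters(in: .whitespacesAndNewlines)
    if !card.isEmpty { return card }
    let fallback = global.trimmingCharacters(in: .whitespacesAndNewlines)
    return fallback.isEmpty ? nil : fallback
}
```

Returning an optional gives a natural hook for gating the Run button: no effective command, no run.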
Twenty-second slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).
Fans the three loop-driver-command keys added in slice 20 out to all
39 non-English locales via the localizer agent. Three keys per file
(label, placeholder, help) added inside the existing
"// MARK: - Editor — Agent" block.
Notable judgement calls:
- ja: placeholder uses "グローバルデフォルトを使用" ("use global
default") rather than literal "inherit" — Japanese UI copy avoids
the inheritance metaphor for settings fields.
- ar: "inherit" has no concise UI idiom; rendered as
"استخدام الإعداد الافتراضي العام" (use the global default).
- de: "Binary" kept as English borrowed noun in help text (matches
the terse style of existing German agent strings).
- "ynh-agent" and "agent.loopDriverCommand" preserved verbatim
across all locales (product/key names).
Zero NEEDS TRANSLATION markers per project policy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twenty-third slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md).
Adds an MCP tool `termq_agents` so a Claude Code (or other MCP-client) session running in TermQ can discover and inspect the agent sessions running alongside it. Returns each card's id, name, column, working directory, and an `agent` block carrying sessionId, harness, backend, mode, interactionMode, status, budget, and any per-card loopDriverCommand override. Optional `status` argument filters by AgentStatus raw value.
To support this:
- New `AgentConfigSummary` (Sendable, in TermQShared) parallels TermQCore.AgentConfig the same way Card parallels TerminalCard. Wire format is identical so a single board.json agentConfig block decodes cleanly into either type.
- `Card` (TermQShared) gains an optional `agentConfig` field; custom decoder uses decodeIfPresent so legacy cards still parse.
- `AgentSessionOutput` in OutputTypes provides the JSON shape the tool returns; failable initialiser collapses non-agent cards to nil at the call site.
Six new TermQSharedTests cover round-trip, legacy JSON, Card decoding with and without agentConfig, and AgentSessionOutput's nil-for-non-agent behaviour. Existing toolCount assertions updated 10 → 11.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
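The "failable initialiser collapses non-agent cards to nil" pattern, together with the optional status filter, might look like the sketch below. The types are heavily reduced, hypothetical stand-ins for `Card` and `AgentSessionOutput`, carrying only the fields needed to show the idea.

```swift
// Hypothetical reduced card: agentStatus is nil for non-agent cards.
struct CardSketch {
    let id: String
    let name: String
    let agentStatus: String?
}

struct AgentSessionOutputSketch: Equatable {
    let id: String
    let name: String
    let status: String

    // Fails for cards without agent config, so callers can compactMap.
    init?(card: CardSketch) {
        guard let status = card.agentStatus else { return nil }
        self.id = card.id
        self.name = card.name
        self.status = status
    }
}

// Non-agent cards drop out; the optional filter matches the raw status value.
func agentSessions(in cards: [CardSketch], status filter: String? = nil) -> [AgentSessionOutputSketch] {
    cards
        .compactMap { AgentSessionOutputSketch(card: $0) }
        .filter { filter == nil || $0.status == filter }
}
```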
Twenty-fourth slice of the agent loop capability (see .claude/plans/2026-04-29-feat-agent-loop.md). stop() now sends an `interrupt` action over stdin (per the slice-16 control protocol) and waits a short grace period before falling back to SIGTERM. Default grace 1.5s; tunable via graceSeconds parameter. Effect: the loop driver gets a chance to flush buffered trajectory state (incomplete sensor runs, pending feedback) and emit a clean session_end event before being killed. Reserved for the user-driven Stop button; the rejectPlan flow still fires SIGTERM directly because rejection is decisive. Test verifies the graceful path: a stub that exits on interrupt sees exit code 0, never receives SIGTERM. Existing tests that use stop() purely as cleanup pass `graceSeconds: 0` so they don't add wall-clock to the suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
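A blocking sketch of the interrupt-then-SIGTERM sequence, written as a free function over Foundation's `Process`. This is an assumption-laden simplification: the real `stop()` lives on an actor and is presumably asynchronous, and `stopGracefully` is a hypothetical name.

```swift
import Foundation

// Sends the interrupt action over stdin, polls for exit during the grace
// period, then falls back to SIGTERM if the process is still alive.
func stopGracefully(_ process: Process, stdin: FileHandle, graceSeconds: TimeInterval = 1.5) {
    guard process.isRunning else { return }
    try? stdin.write(contentsOf: Data("{\"action\":\"interrupt\"}\n".utf8))
    let deadline = Date().addingTimeInterval(graceSeconds)
    while process.isRunning && Date() < deadline {
        Thread.sleep(forTimeInterval: 0.02)
    }
    if process.isRunning {
        process.terminate() // delivers SIGTERM
    }
    process.waitUntilExit()
}
```

Passing `graceSeconds: 0` skips the polling loop entirely, which matches how cleanup-only callers avoid adding wall-clock time.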
Replaces the flat raw-event trajectory list with a turn-grouped view: sensor results now render with colour-coded ✓/✗ icons, duration, and summary text; each turn gets a header row showing pass/fail counts. A "Last Sensors" strip above the trajectory shows the most recent run results at a glance without scrolling. Adds the sensor overlay editor: a sheet that loads declared sensors from ynh, lets the user set a role override and (for focus sensors) edit the prompt inline against the harness baseline, and persists the result as sensor-overlays.json in the session directory. Wire format matches the --sensor-overlay flag the ynh-agent loop driver will consume in Phase 1. Also fixes a rebase regression where addAgentTerminal referenced safePaste and backend members removed from NewTerminalDefaults in the 0.10.0 merge. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
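The turn grouping can be sketched as a single pass over the event list. The types below are simplified, hypothetical stand-ins, and this `buildTurnGroups` only illustrates the idea (a `turn_start` opens a group, later events attach to it, sensor results feed the header counts); it is not the function extracted into the shared components file.

```swift
// Reduced event: just the fields the grouping needs.
struct EventSketch: Equatable {
    let type: String
    var turn: Int? = nil
    var passed: Bool? = nil
}

struct TurnGroupSketch: Equatable {
    let turn: Int
    var events: [EventSketch] = []
    var passCount = 0
    var failCount = 0
}

func buildTurnGroups(_ events: [EventSketch]) -> [TurnGroupSketch] {
    var groups: [TurnGroupSketch] = []
    for event in events {
        if event.type == "turn_start", let turn = event.turn {
            groups.append(TurnGroupSketch(turn: turn))
            continue
        }
        // Assumption: events before the first turn are not grouped here.
        guard !groups.isEmpty else { continue }
        let last = groups.count - 1
        groups[last].events.append(event)
        if event.type == "sensor_result", let passed = event.passed {
            if passed {
                groups[last].passCount += 1
            } else {
                groups[last].failCount += 1
            }
        }
    }
    return groups
}
```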
…cales 22 new keys for the per-turn Inspector UI (last sensors strip, turn headers, sensor pass/fail labels) and the sensor overlay editor modal (title, role picker, prompt fields, source kind badges, error states). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the awaitingTurnApproval UI flow: the loop driver emits a turn_approval_required event carrying the synthesized feedback it would inject as the next worker turn; TermQ surfaces it in an editable section above the trajectory. The user can review, edit, and send — the edited text is transmitted via replace_feedback before approve_turn, so older loop driver builds ignore the edit safely while Phase 1 will honour it. handleStreamEnd now treats awaitingTurnApproval the same as awaitingPlanApproval: driver death mid-approval flips the card to errored rather than inferring from exit code. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 new keys: turn approval section title (with turn number), Send button, and Reset button. Localizer flagged Greek "ανατροδοότησης" as an unusual back-formation for "feedback" — native review advised. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds parallel fleet sessions: N cards share a fleetId, each launched with a distinct worktree path. The fleet sidebar groups sessions under a collapsible header showing aggregate status and a "Promote winner" shortcut for converged sessions. AgentSessionController.resolveCommand() injects the active sensor overlay JSON as --sensor-overlay before launching ynh agent run, using single-quote shell escaping for safe /bin/sh -c invocation. Adds cursor to AgentBackend and fleetId to AgentConfig (backward-compatible JSON decode, defaults to nil). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
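Single-quote shell escaping for a `/bin/sh -c` command line is a standard technique: wrap the value in single quotes and rewrite each embedded single quote as `'\''`. The sketch below shows it; the helper names and the way the flag is appended are illustrative assumptions, not the actual `resolveCommand()` body.

```swift
import Foundation

// Inside single quotes the shell treats every character literally, so the
// only character needing care is the single quote itself: close the
// quote, emit an escaped quote, reopen the quote.
func shellSingleQuoted(_ value: String) -> String {
    "'" + value.replacingOccurrences(of: "'", with: "'\\''") + "'"
}

// Hypothetical helper: appends the overlay JSON as one shell-safe argument.
func appendingSensorOverlay(to command: String, overlayJSON: String) -> String {
    command + " --sensor-overlay " + shellSingleQuoted(overlayJSON)
}
```

JSON is full of double quotes, braces, and occasionally `$`, all of which single quoting leaves untouched.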
Adds the Fleet section (21 strings) to all 39 locale files including aggregate status, promote-winner labels, and launch sheet fields. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
a1ab07b to e3dfc9f
Adds a read-only trajectory replay view (AgentTranscriptViewerView) for reviewing past sessions or importing CI JSONL artifacts. An "Open Transcript" button in the Agent Sessions sidebar header triggers a file importer; the viewer shows the same turn-grouped sensor/event layout as the live Inspector. Extracts shared trajectory view components (TurnGroupView, SensorResultRow, TrajectoryEventRow, TurnGroup, buildTurnGroups, eventSummary) into AgentTrajectoryComponents.swift so both the live Inspector and the replay viewer share one implementation. Adds .github/workflows/agent.yml — manual-dispatch workflow that runs ynh agent run in the ghcr.io/eyelock/ynh Docker image and uploads the trajectory JSONL as an artifact for local replay. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a read-only trajectory replay view (AgentTranscriptViewerView) for reviewing past sessions or importing CI JSONL artifacts. An "Open Transcript" button in the Agent Sessions sidebar header triggers a file importer; the viewer shows the same turn-grouped sensor/event layout as the live Inspector. Extracts shared trajectory view components (TurnGroupView, SensorResultRow, TrajectoryEventRow, buildTurnGroups) into AgentTrajectoryComponents.swift — shared by Inspector and viewer. Adds .github/workflows/agent.yml — manual-dispatch workflow that runs ynh agent run in the ghcr.io/eyelock/ynh Docker image and uploads the trajectory JSONL as an artifact for local replay. Fleet i18n: adds fleet.open.transcript.help to all 39 locales. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Full implementation of the TermQ agent loop layer, wiring `ynh agent run` as the loop driver through a native SwiftUI Inspector, approval gates, sensor overlay injection, parallel fleet sessions, and a transcript viewer for replaying CI artifacts locally.
Changes
Core types (`TermQCore`)
- `AgentConfig`: adds `fleetId: UUID?` (backward-compatible JSON decode), `cursor` backend, `loopDriverCommand` per-card override
- `TrajectoryEventPayload` schema matching `ynh agent run` NDJSON wire format
Services
- `AgentLoopProcess`: subprocess launcher wrapping `ynh agent run`, NDJSON stdout parser, stdin control channel
- `AgentSessionController.resolveCommand()`: injects active sensor overlay JSON as `--sensor-overlay` before launch, single-quote shell-safe
- `.jsonl` sidecar; hydrated on Inspector open
- `AgentConfig.status` on the card
UI — Inspector
- `awaitingPlanApproval` shows approve/reject sheet
- `confirm` interaction mode
UI — Fleet
- `AgentFleetLaunchSheet`: harness ID, task, session count (2–5), base worktree path
- `BoardViewModel.createFleet()`: creates N cards sharing a `fleetId`, each with a distinct worktree path and pre-built `loopDriverCommand`
UI — Transcript viewer (Phase 6)
- `AgentTranscriptViewerView`: read-only replay of any `.jsonl` trajectory file
- `AgentTrajectoryComponents.swift` (reused by live Inspector and viewer)
Settings
- `@AppStorage("agent.loopDriverCommand")`) with per-card override
MCP
- `termq_agents` tool exposes live agent sessions to MCP clients
CI
- `.github/workflows/agent.yml`: manual-dispatch workflow running `ynh agent run` in the `ghcr.io/eyelock/ynh` Docker image; trajectory JSONL uploaded as artifact (30-day retention) for local replay in the transcript viewer
Localization
Wire format contract (YNH side)
`ynh agent run` emits NDJSON with discriminator `"type"`, timestamp `"timestamp"`. Events: `session_start`, `turn_start`, `turn_end`, `plan_ready`, `turn_approval_required`, `budget_exceeded`, `session_end`, `error`.
Control stdin: `approve_plan`, `reject_plan`, `approve_turn`, `replace_feedback`, `interrupt`.
Testing
- `make check` passes (build + lint + format + tests)
- `ynh agent run` against a real harness (scheduled for tonight's test session)
Related
- `feat/yna-loop-agent` PR feat(ui): Harness and Marketplace sidebars adopt tree layout with grouping #154 — loop driver, wire format, Claude Code / Codex / Cursor backends, `--sensor-overlay` flag (installed locally, test before merging)
🤖 Generated with Claude Code