
feat(agent): agent loop — ynh agent run integration, fleet UI, and Inspector #248

Draft
eyelock wants to merge 30 commits into develop from feat/coding-agent

@eyelock eyelock commented Apr 29, 2026

Summary

Full implementation of the TermQ agent loop layer, wiring `ynh agent run` as the loop driver through a native SwiftUI Inspector, approval gates, sensor overlay injection, parallel fleet sessions, and a transcript viewer for replaying CI artifacts locally.

Changes

Core types (TermQCore)

  • AgentConfig: adds fleetId: UUID? (backward-compatible JSON decode), cursor backend, loopDriverCommand per-card override
  • Typed TrajectoryEventPayload schema matching ynh agent run NDJSON wire format

Services

  • AgentLoopProcess: subprocess launcher wrapping ynh agent run, NDJSON stdout parser, stdin control channel
  • AgentSessionController.resolveCommand(): injects active sensor overlay JSON as --sensor-overlay before launch, single-quote shell-safe
  • Trajectory persistence: events written to .jsonl sidecar; hydrated on Inspector open
  • Status writer: trajectory events drive AgentConfig.status on the card

UI — Inspector

  • Live trajectory view with turn-by-turn breakdown
  • Sensor overlay editor (per-session, per-sensor JSON overrides)
  • Plan/act approval gate: awaitingPlanApproval shows approve/reject sheet
  • Turn approval section for confirm interaction mode
  • Graceful stop: interrupt signal then SIGTERM fallback

UI — Fleet

  • AgentFleetLaunchSheet: harness ID, task, session count (2–5), base worktree path
  • BoardViewModel.createFleet(): creates N cards sharing a fleetId, each with a distinct worktree path and pre-built loopDriverCommand
  • Sidebar fleet grouping: collapsible headers with aggregate status colour + "Promote winner" shortcut for converged sessions

UI — Transcript viewer (Phase 6)

  • AgentTranscriptViewerView: read-only replay of any .jsonl trajectory file
  • "Open Transcript" button in sidebar header triggers a file importer
  • Shows session summary (harness, total turns/tokens, exit code) + full turn-grouped event list
  • Shared trajectory view components extracted to AgentTrajectoryComponents.swift (reused by live Inspector and viewer)

Settings

  • Agent Loop section in Settings → Tools
  • Global loop driver command (@AppStorage("agent.loopDriverCommand")) with per-card override

MCP

  • termq_agents tool exposes live agent sessions to MCP clients

CI

  • .github/workflows/agent.yml: manual-dispatch workflow running ynh agent run in the ghcr.io/eyelock/ynh Docker image; trajectory JSONL uploaded as artifact (30-day retention) for local replay in the transcript viewer

Localization

  • All strings translated to 39 locales

Wire format contract (YNH side)

ynh agent run emits NDJSON with discriminator "type", timestamp "timestamp". Events: session_start, turn_start, turn_end, plan_ready, turn_approval_required, budget_exceeded, session_end, error.
Control stdin: approve_plan, reject_plan, approve_turn, replace_feedback, interrupt.

Testing

  • make check passes (build + lint + format + tests)
  • Overlay injection unit tests (4 cases including single-quote escaping)
  • Fleet codable round-trip and backward-compat decode tests
  • Cursor backend raw-value stability test
  • End-to-end: run ynh agent run against a real harness (scheduled for tonight's test session)

Related

🤖 Generated with Claude Code

@eyelock eyelock marked this pull request as draft April 29, 2026 15:53
David Collie and others added 28 commits May 11, 2026 10:35
First slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md). Pure data model; no
behavior changes.

Adds AgentBackend, AgentMode, AgentInteractionMode, AgentStatus,
AgentBudget, and AgentConfig as Codable/Sendable value types in
TermQCore. Extends TerminalCard with one optional `agentConfig` field —
non-agent cards stay nil; pre-agent JSON decodes cleanly.

Tests cover defaults, Codable round-trip, stable raw-value wire format,
and legacy-JSON backward compatibility.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md). Pure UI scaffolding;
feature-flagged off by default behind `feature.agentTab`.

Adds an AgentSessionsSidebarTab placeholder view (header + empty state)
and refactors SidebarView's tab picker around a `visibleTabs` computed
property so optional tabs gated by feature flags compose without
touching the picker each time. Live data wiring lands in a later slice.
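
The `visibleTabs` idea can be sketched roughly as follows. This is a minimal illustration, not the actual SidebarView code: the `SidebarTab` enum and its cases are hypothetical, and only the `feature.agentTab` key comes from the commit message.

```swift
import Foundation

// Hypothetical tab enum; the real SidebarView cases are not shown in this PR text.
enum SidebarTab: String, CaseIterable {
    case boards, harnesses, agentSessions
}

/// Returns the tabs to show, hiding optional tabs whose feature flag is off.
func visibleTabs(defaults: UserDefaults = .standard) -> [SidebarTab] {
    SidebarTab.allCases.filter { tab in
        switch tab {
        case .agentSessions:
            // Off by default: bool(forKey:) returns false for a missing key.
            return defaults.bool(forKey: "feature.agentTab")
        default:
            return true
        }
    }
}
```

With this shape, adding another flag-gated tab only needs one more `case` in the filter; the picker itself iterates whatever `visibleTabs` returns.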

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md). UI-only.

The Agent Sessions tab now reads from BoardViewModel and renders one row
per card whose agentConfig is set. Each row shows title, qualified
harness name, and a colour-coded status pill reflecting AgentStatus.
Empty state preserved when no agent cards exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Adds BoardViewModel.createAgentCard which creates a TerminalCard with
agentConfig populated from a harness identifier. Surfaces a
"Launch as Agent" context-menu item on harness sidebar rows; gated on
the feature.agentTab flag (the closure is nil when the agent tab is
off, so the menu item disappears). After creation, the sidebar
switches to the Agent Sessions tab so the new card is immediately
visible.

The new BoardViewModel method takes primitives (harnessId, title,
description) rather than the Harness struct to avoid a Column type
collision between TermQCore.Column (class) and TermQShared.Column
(struct), which would surface as ambiguous-symbol errors elsewhere in
the file if BoardViewModel imported TermQShared.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md). UI-only.

Wraps each agent-session row in a plain-style Button so clicking
selects the card via the existing BoardViewModel.selectCard plumbing.
Selected rows highlight with an accent-coloured background and
elevated harness-subtitle contrast, matching the visual idiom of
other selectable list rows in TermQ.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sixth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md). UI-only.

Selecting a card whose agentConfig is set now routes to a new
AgentInspectorView in the main pane instead of attempting to attach a
terminal. The skeleton surfaces session config (backend, mode,
interaction, formatted budget, short session id) and a trajectory
placeholder. Run / Stop buttons are present but disabled with a
"Loop driver not yet wired" tooltip; later slices fill in the live
per-turn sensor + feedback wire as the loop driver lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seventh slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Surfaces the feature.agentTab flag through Settings → Tools instead of
requiring `defaults write`. Mirrors the Strings.Settings.Ynh /
ynhSection pattern: a short description, single toggle bound to
feature.agentTab, help tooltip explaining what the tab adds.

English-only strings landed; localization fan-out to the ~30 other
locales is a follow-up via /localization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eighth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Fans the four `settings.agent.*` keys added in the previous slice out
to all 39 non-English locales via the localizer agent. Each file gets
a "// MARK: - Settings — Agent Loop" block immediately before the
existing YNH section, with native translations matching the tone of
nearby strings.

Items flagged for native review by the localizer agent:
 - zh-HK: 测證器 (sensors) — confirm preferred term vs. zh-Hans/Hant
 - he: niqqud on "Experimental" may be inconsistent with file style
 - ko: kept "Agent" as English token to match surrounding YNH strings

Zero NEEDS TRANSLATION markers per project policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ninth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

AgentLoopProcess is an actor that spawns a long-running NDJSON-emitting
subprocess (the future ynh-agent binary, or a test stub today) and
streams parsed TrajectoryEvent values via AsyncStream. Supports
send(line:) for stdin feedback and stop() for SIGTERM termination.

Also adds the TrajectoryEvent value type to TermQCore: type and
timestamp parsed from the line; the original JSON preserved in
payloadJSON for downstream typed decoding (typed event schemas
land in the next slice).
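
As an illustration of the line-level parsing described here, a minimal sketch might look like the following. The struct is a hypothetical mirror of TrajectoryEvent, and treating the timestamp as a raw string is an assumption; the commit message does not state its encoding.

```swift
import Foundation

// Hypothetical mirror of the TrajectoryEvent value type described above.
struct TrajectoryEvent {
    let type: String
    let timestamp: String   // assumption: kept as the raw string from the wire
    let payloadJSON: String // original line, preserved for later typed decoding
}

/// Parses one NDJSON line; returns nil for blank or invalid lines so callers can drop them.
func parseLine(_ line: String) -> TrajectoryEvent? {
    let trimmed = line.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty,
          let object = try? JSONSerialization.jsonObject(with: Data(trimmed.utf8)),
          let dict = object as? [String: Any],
          let type = dict["type"] as? String,
          let timestamp = dict["timestamp"] as? String
    else { return nil }
    return TrajectoryEvent(type: type, timestamp: timestamp, payloadJSON: trimmed)
}
```

Keeping the original line in `payloadJSON` means a later, stricter decoder can run over the same bytes without re-reading the subprocess output.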

Lifecycle synchronisation point worth noting: stdout drain and process
termination race naturally. Initial implementation finished the stream
from the termination handler and lost trailing events. Fixed via a
termination continuation so readStdoutLines awaits termination after
EOF — guaranteeing status == .exited by the time consumers see the
stream end.

Tests cover parseLine on its own plus subprocess streaming, invalid
lines dropped, double-start, send-before-start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Defines the wire-format contract between TermQ and the future ynh-agent
loop driver binary. TrajectoryEventPayload is an enum with typed cases
for the eight events whose shapes are confident at this stage —
session_start, plan, turn_start, sensor_result, stuck_detected,
budget_exceeded, converged, session_end — plus a forward-compatible
.other(type:json:) fallback so loop-driver upgrades can introduce new
event types without breaking TermQ.

TrajectoryEvent.decoded() dispatches on the type discriminator and
gracefully degrades to .other when known types have missing or
malformed required fields.
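
The dispatch-with-fallback behaviour can be sketched as below. Only three cases are shown, and the `turn` field name is an illustrative assumption; the actual payload shapes are defined by the loop driver and are not reproduced in this commit message.

```swift
import Foundation

// Hypothetical subset of the payload enum; field names are illustrative assumptions.
enum TrajectoryEventPayload: Equatable {
    case turnStart(turn: Int)
    case converged
    case other(type: String, json: String)
}

/// Dispatches on the "type" discriminator and degrades to .other on anything unexpected.
func decodePayload(_ json: String) -> TrajectoryEventPayload {
    let dict = (try? JSONSerialization.jsonObject(with: Data(json.utf8))) as? [String: Any] ?? [:]
    let type = dict["type"] as? String ?? "unknown"
    switch type {
    case "turn_start":
        // Required field missing or malformed: degrade rather than fail.
        guard let turn = dict["turn"] as? Int else { return .other(type: type, json: json) }
        return .turnStart(turn: turn)
    case "converged":
        return .converged
    default:
        // Unknown event types from newer loop drivers land here.
        return .other(type: type, json: json)
    }
}
```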

Wire format snake_case keys match the YNH structured-output convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eleventh slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Adds AgentSessionController (@MainActor ObservableObject, one per agent
card) that wraps AgentLoopProcess and exposes @Published events and
status arrays for SwiftUI binding. AgentSessionRegistry vends one
controller per card.id so the Inspector and any other surface looking
at the same card observe the same stream.

Inspector now reads agent.loopDriverCommand from UserDefaults; Run
spawns `/bin/sh -c "<cmd>"` so the same key works for both production
(path to ynh-agent + args) and development (a one-line stub script).
Run / Stop buttons are gated on canRun + isRunning. The trajectory
section renders incoming events live with timestamp + type + a typed
per-variant summary line built from TrajectoryEvent.decoded().

Tests cover the controller's stream-into-events lifecycle, event
clearing on re-run, reset(), and registry per-id semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twelfth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

The controller now mirrors the loop driver's terminal-state events into
the card's agentConfig.status so the sidebar status pill reflects the
session in real time. Mapping:

  start()                  → .running
  event "converged"        → .converged
  event "stuck_detected"   → .stuck
  event "budget_exceeded"  → .errored
  stream ended, exit 0,
    no terminal event seen → .converged (inferred)
  stream ended, exit != 0,
    no terminal event seen → .errored (inferred)
  stream ended after a
    terminal event already
    flipped the status     → unchanged (no-downgrade guard)
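
The mapping table above can be read as two small pure functions. The sketch below uses a hypothetical, trimmed-down `AgentStatus` and free functions rather than the controller's real methods.

```swift
// Hypothetical subset of AgentStatus, enough to express the table above.
enum AgentStatus: Equatable { case idle, running, converged, stuck, errored }

/// Status after an event of the given type; non-terminal events leave it unchanged.
func status(afterEvent type: String, current: AgentStatus) -> AgentStatus {
    switch type {
    case "converged": return .converged
    case "stuck_detected": return .stuck
    case "budget_exceeded": return .errored
    default: return current
    }
}

/// Status once the stream has ended, inferred from the exit code when no terminal event was seen.
func status(afterStreamEndWithExitCode exitCode: Int32, current: AgentStatus) -> AgentStatus {
    switch current {
    case .converged, .stuck, .errored:
        return current // no-downgrade guard: a terminal event already decided the outcome
    default:
        return exitCode == 0 ? .converged : .errored
    }
}
```

Keeping the guard in the stream-end function means a `stuck_detected` followed by a clean exit still reads as stuck rather than converged.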

Card lookup goes through an injectable closure (default uses
BoardViewModel.shared.card(for:)) so the seven new tests can drive a
synthetic TerminalCard without touching the singleton.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thirteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Adds TrajectoryWriter — opens an append-mode FileHandle on
<appSupport>/TermQ[-Debug]/agent-sessions/<sessionId>/trajectory.jsonl
and writes each event's payloadJSON one line at a time. Failures are
silent; the in-memory event stream remains the source of truth.
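
A minimal sketch of an append-mode JSONL writer with silent failures, assuming a hypothetical initializer that takes the file URL directly (the real TrajectoryWriter resolves the app-support path itself):

```swift
import Foundation

/// Hypothetical append-only JSONL writer; every failure is swallowed, as described above.
final class TrajectoryWriter {
    private let handle: FileHandle?

    init(fileURL: URL) {
        let fm = FileManager.default
        try? fm.createDirectory(at: fileURL.deletingLastPathComponent(),
                                withIntermediateDirectories: true)
        if !fm.fileExists(atPath: fileURL.path) {
            _ = fm.createFile(atPath: fileURL.path, contents: nil)
        }
        handle = try? FileHandle(forWritingTo: fileURL)
        _ = try? handle?.seekToEnd() // append mode: start writing after existing content
    }

    /// Writes one event per line; a nil handle or write error is ignored.
    func append(payloadJSON: String) {
        try? handle?.write(contentsOf: Data((payloadJSON + "\n").utf8))
    }

    deinit { try? handle?.close() }
}
```

Because nothing here throws to the caller, a full disk or missing directory cannot interrupt the live event stream, which matches the "in-memory stream is the source of truth" rule.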

The controller picks up the writer via an injectable factory so tests
keep persistence pointed at a temp directory. AgentSessionRegistry
wires the production factory; ad-hoc controllers (used by all existing
unit tests) default to writerFactory == nil so the test suite never
touches the real app-support directory.

Sets up the replay surface for a future Transcript viewer and
CI-artifact ingest. Six new tests cover writer behaviour and the
controller-side persistence wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

When the Inspector opens for an agent card, the controller now reads
the per-session trajectory.jsonl off disk (if one exists) and
populates the events array. Effect: past sessions render in the
trajectory list across app restarts and when navigating between
agent cards.

loadPersistedEvents() is a no-op when events are already populated,
when the session is currently running, when the card has no agent
config, or when no file exists — so it's safe to call from .onAppear
and .onChange(of:) without guards in the view.

TrajectoryWriter exposes a public fileURL(for:baseDirectory:) helper
for symmetric read access. The controller takes a transcriptBaseURL
override for tests; production resolves the same default location as
the writer.

Four new tests cover the populate path, no-op cases, and a card
without agent config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Implements the safety-model plan/act flow: when the loop driver emits
a `plan` event, the controller flips card status to
`.awaitingPlanApproval` and the Inspector renders an orange-tinted
panel above the trajectory containing the plan content (monospaced,
selectable, scrollable up to 320pt) plus Reject / Approve buttons
(Approve bound to the default action).

Two new controller methods, both gated on the awaitingPlanApproval
state:

  approvePlan() — writes {"action":"approve_plan"} to the driver's
  stdin and flips status back to .running; the driver is expected to
  resume work.

  rejectPlan() — writes {"action":"reject_plan"}, then SIGTERMs the
  driver and flips to .errored.

This commits the TermQ↔ynh-agent control protocol: stdin carries
NDJSON action messages. Future actions (interrupt, edit-feedback,
sensor-overlay-replace) extend the same surface.
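
For illustration, an NDJSON control message of the shape shown above could be produced like this. The `ControlMessage` type and `ndjsonLine` helper are hypothetical names; only the `{"action": ...}` shape and the action strings come from this PR.

```swift
import Foundation

// Hypothetical message type; the single "action" field matches the approve_plan example above.
struct ControlMessage: Encodable {
    let action: String
}

/// Encodes one control message as a single NDJSON line (JSON object followed by a newline).
func ndjsonLine(for message: ControlMessage) throws -> Data {
    var data = try JSONEncoder().encode(message)
    data.append(0x0A) // newline terminates the record
    return data
}
```

The trailing newline matters: a line-oriented reader on the driver side would otherwise wait for more input before acting on the message.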

Four tests cover: plan event flips status; approvePlan returns to
running; rejectPlan terminates with .errored; approvePlan no-op when
not awaiting. The two integration tests use a long-running stub
process and force the status manually rather than try to game shell
pipe-buffering after `read`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seventeenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Lets the user edit a card's agentConfig via Settings → Edit Card. The
new Agent tab is gated on viewModel.hasAgentConfig so it only shows
for cards launched as agent sessions; non-agent cards see no change.

Editable knobs:
 - Backend (Claude Code / Codex)
 - Mode (plan / act)
 - Interaction (auto / confirm / tweak)
 - Max turns (Stepper 1–500, step 5)
 - Max tokens (Stepper, displayed as 200k / 1M)
 - Max wall-clock minutes (Stepper 1–720, step 5)

Identity fields (sessionId, harness, status) are NOT exposed and are
preserved on save — verified by a test. A non-agent card with non-
default values in the agent form fields does NOT have an agentConfig
injected on save — also verified.

Localization: 9 new keys in en.lproj only; other locales fall back to
English. /localization fan-out is a follow-up slice when convenient.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eighteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Fans the nine `editor.agent.*` keys added in slice 17 out to all 39
non-English locales via the localizer agent. Each file gets a
"// MARK: - Editor — Agent" block placed immediately before the
existing "// MARK: - Settings — Agent Loop" block, with native
translations matching the tone of nearby editor strings.

Notable judgement calls (preserved here for traceability):
 - Term "Agent" kept as English loanword in CJK, RTL, Greek, and most
   European locales where settings.agent.section already treats it as
   technical; translated only where existing patterns Koreanise/
   Japanise/Cyrillicise it (ja エージェント, ko 에이전트, ru/uk Агент,
   etc.).
 - "Backend" reused per-locale from harnesses.launch.backend so the
   editor matches the harness launch sheet.
 - "tokens" left untransliterated except in CJK (トークン数 / 토큰 수)
   and Arabic (الرموز).
 - Wall-clock minute abbreviation matches each locale's convention:
   (мин), (хв), (分), (분), (د), (λεπτ.), (dk), (דק׳), (นาที),
   (phút), (mnt), (perc), (मिनट), default (min).

Also fixes a flaky test that surfaced once the broader suite ran:
testEvent_planFlipsCardToAwaitingApproval relied on observing an
intermediate status while a transient subprocess was alive, but the
stub exits before the polling loop can catch it, after which
handleStreamEnd flips status to .converged. Reworked as a direct unit
test that injects a synthetic plan event into handleEventForCardStatus
(internal access via @testable). The race is a test artefact only —
in production the loop driver doesn't exit while awaiting plan
approval.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nineteenth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

The handleStreamEnd no-downgrade guard previously checked only
.converged / .stuck / .errored. If the loop driver died while the
user was still deciding on a plan (.awaitingPlanApproval), the
default path inferred .converged on exit 0 — visually marking a hung
session as successful, which is wrong.

Adds an explicit awaitingPlanApproval branch: if the stream ends
mid-approval the session is dead and the card flips to .errored
regardless of exit code. Surfaced as a side observation when slice 18
fixed the related test flake.

Test verifies the new branch end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twentieth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Adds loopDriverCommand: String to AgentConfig — empty string means
"inherit the global agent.loopDriverCommand UserDefault". Custom
init(from:) keeps backward compatibility with pre-slice-20 saved JSON
(decodeIfPresent ?? "").
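
The backward-compatible decode can be sketched as follows, using a hypothetical two-field stand-in for AgentConfig (the real type carries more fields):

```swift
import Foundation

// Hypothetical trimmed-down AgentConfig; only the fields needed for the example.
struct AgentConfig: Codable {
    var harness: String
    var loopDriverCommand: String

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        harness = try container.decode(String.self, forKey: .harness)
        // Older saved JSON has no such key: fall back to "" (inherit the global default).
        loopDriverCommand = try container.decodeIfPresent(String.self, forKey: .loopDriverCommand) ?? ""
    }
}
```

Encoding stays synthesized, so new saves always include the field while old saves keep decoding.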

Editor surfaces it in the Agent tab as a multi-line monospace
TextField with placeholder "Inherit global default" and a help
tooltip that explains the inheritance rule. Identity fields (sessionId,
harness, status) remain non-editable; budget and mode/interaction/
backend pickers from slice 17 are unchanged.

Inspector now resolves the effective command:
  - per-card (trimmed, non-empty) wins
  - else fall back to the global @AppStorage
canRun and the Run button click both go through effectiveCommand.
Help text on disabled Run reflects both surfaces.
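
The resolution rule is small enough to show as a pure function. This is a sketch under the assumption that only the per-card value is trimmed; the function name mirrors `effectiveCommand` but the signature is hypothetical.

```swift
import Foundation

/// Per-card override wins when it is non-empty after trimming; otherwise use the global default.
func effectiveCommand(perCard: String, global: String) -> String {
    let trimmed = perCard.trimmingCharacters(in: .whitespacesAndNewlines)
    return trimmed.isEmpty ? global : trimmed
}
```

A `canRun` check can then reduce to testing whether the resolved string is empty, so both the button state and the click handler agree on the same value.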

Four new tests:
  - AgentConfig defaults loopDriverCommand to ""
  - AgentConfig backward compat: legacy JSON missing the field decodes
  - Editor load/save round-trips the per-card override
  - Editor load on a non-agent card clears any stale string

English-only string for the new field; locale fan-out is a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twenty-second slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Fans the three loop-driver-command keys added in slice 20 out to all
39 non-English locales via the localizer agent. Three keys per file
(label, placeholder, help) added inside the existing
"// MARK: - Editor — Agent" block.

Notable judgement calls:
 - ja: placeholder uses "グローバルデフォルトを使用" ("use global
   default") rather than literal "inherit" — Japanese UI copy avoids
   the inheritance metaphor for settings fields.
 - ar: "inherit" has no concise UI idiom; rendered as
   "استخدام الإعداد الافتراضي العام" (use the global default).
 - de: "Binary" kept as English borrowed noun in help text (matches
   the terse style of existing German agent strings).
 - "ynh-agent" and "agent.loopDriverCommand" preserved verbatim
   across all locales (product/key names).

Zero NEEDS TRANSLATION markers per project policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twenty-third slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

Adds an MCP tool `termq_agents` so a Claude Code (or other MCP-client)
session running in TermQ can discover and inspect the agent sessions
running alongside it. Returns each card's id, name, column, working
directory, and an `agent` block carrying sessionId, harness, backend,
mode, interactionMode, status, budget, and any per-card
loopDriverCommand override. Optional `status` argument filters by
AgentStatus raw value.

To support this:
 - New `AgentConfigSummary` (Sendable, in TermQShared) parallels
   TermQCore.AgentConfig the same way Card parallels TerminalCard.
   Wire format is identical so a single board.json agentConfig block
   decodes cleanly into either type.
 - `Card` (TermQShared) gains an optional `agentConfig` field;
   custom decoder uses decodeIfPresent so legacy cards still parse.
 - `AgentSessionOutput` in OutputTypes provides the JSON shape the
   tool returns; failable initialiser collapses non-agent cards to
   nil at the call site.

Six new TermQSharedTests cover round-trip, legacy JSON, Card decoding
with and without agentConfig, and AgentSessionOutput's nil-for-non-
agent behaviour.

Existing toolCount assertions updated 10 → 11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Twenty-fourth slice of the agent loop capability (see
.claude/plans/2026-04-29-feat-agent-loop.md).

stop() now sends an `interrupt` action over stdin (per the slice-16
control protocol) and waits a short grace period before falling back
to SIGTERM. Default grace 1.5s; tunable via graceSeconds parameter.
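
A rough sketch of the interrupt-then-SIGTERM sequence, written against Foundation's `Process` rather than the actual AgentLoopProcess actor. The `stopGracefully` helper and the polling loop are assumptions for illustration; the stub child simply exits once it reads a line on stdin.

```swift
import Foundation

/// Hypothetical helper: ask the child to stop via stdin, then fall back to SIGTERM.
func stopGracefully(_ process: Process, stdin: FileHandle, graceSeconds: TimeInterval = 1.5) {
    // Ask politely first: one NDJSON control line on stdin.
    try? stdin.write(contentsOf: Data("{\"action\":\"interrupt\"}\n".utf8))

    // Poll until the grace period elapses or the child exits on its own.
    let deadline = Date().addingTimeInterval(graceSeconds)
    while process.isRunning && Date() < deadline {
        Thread.sleep(forTimeInterval: 0.05)
    }

    // Still alive after the grace period: Process.terminate() sends SIGTERM.
    if process.isRunning {
        process.terminate()
    }
    process.waitUntilExit()
}

// Usage with a stub that exits cleanly as soon as it receives any line.
let child = Process()
child.executableURL = URL(fileURLWithPath: "/bin/sh")
child.arguments = ["-c", "read line; exit 0"]
let stdinPipe = Pipe()
child.standardInput = stdinPipe
try child.run()
stopGracefully(child, stdin: stdinPipe.fileHandleForWriting)
```

Passing `graceSeconds: 0` skips the wait entirely, which is how cleanup-only callers avoid adding wall-clock time.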

Effect: the loop driver gets a chance to flush buffered trajectory
state (incomplete sensor runs, pending feedback) and emit a clean
session_end event before being killed. Reserved for the user-driven
Stop button; the rejectPlan flow still fires SIGTERM directly because
rejection is decisive.

Test verifies the graceful path: a stub that exits on interrupt sees
exit code 0, never receives SIGTERM. Existing tests that use stop()
purely as cleanup pass `graceSeconds: 0` so they don't add wall-clock
to the suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the flat raw-event trajectory list with a turn-grouped view:
sensor results now render with colour-coded ✓/✗ icons, duration, and
summary text; each turn gets a header row showing pass/fail counts. A
"Last Sensors" strip above the trajectory shows the most recent run
results at a glance without scrolling.
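
The turn grouping can be sketched as a single pass over the event list. The `Event` and `TurnGroup` shapes here are hypothetical simplifications, and the `passed` field on sensor results is an assumed name.

```swift
// Hypothetical flattened event; the real TrajectoryEvent carries more fields.
struct Event: Equatable {
    let type: String
    let passed: Bool? // assumed field, only meaningful for sensor_result events
}

struct TurnGroup: Equatable {
    var events: [Event] = []
    // Counts shown in the per-turn header row.
    var passCount: Int { events.filter { $0.type == "sensor_result" && $0.passed == true }.count }
    var failCount: Int { events.filter { $0.type == "sensor_result" && $0.passed == false }.count }
}

/// Starts a new group at every turn_start; events before the first turn_start form a leading group.
func buildTurnGroups(_ events: [Event]) -> [TurnGroup] {
    var groups: [TurnGroup] = []
    for event in events {
        if event.type == "turn_start" || groups.isEmpty {
            groups.append(TurnGroup())
        }
        groups[groups.count - 1].events.append(event)
    }
    return groups
}
```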

Adds the sensor overlay editor: a sheet that loads declared sensors
from ynh, lets the user set a role override and (for focus sensors)
edit the prompt inline against the harness baseline, and persists
the result as sensor-overlays.json in the session directory. Wire
format matches the --sensor-overlay flag the ynh-agent loop driver
will consume in Phase 1.

Also fixes a rebase regression where addAgentTerminal referenced
safePaste and backend members removed from NewTerminalDefaults in
the 0.10.0 merge.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cales

22 new keys for the per-turn Inspector UI (last sensors strip, turn
headers, sensor pass/fail labels) and the sensor overlay editor modal
(title, role picker, prompt fields, source kind badges, error states).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the awaitingTurnApproval UI flow: the loop driver emits a
turn_approval_required event carrying the synthesized feedback it
would inject as the next worker turn; TermQ surfaces it in an
editable section above the trajectory. The user can review, edit,
and send — the edited text is transmitted via replace_feedback
before approve_turn, so older loop driver builds ignore the edit
safely while Phase 1 will honour it.

handleStreamEnd now treats awaitingTurnApproval the same as
awaitingPlanApproval: driver death mid-approval flips the card to
errored rather than inferring from exit code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 new keys: turn approval section title (with turn number), Send
button, and Reset button. Localizer flagged Greek "ανατροδοότησης"
as an unusual back-formation for "feedback" — native review advised.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds parallel fleet sessions: N cards share a fleetId, each launched
with a distinct worktree path. The fleet sidebar groups sessions under
a collapsible header showing aggregate status and a "Promote winner"
shortcut for converged sessions.
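
As a sketch of the fleet shape described above: N entries share one fleetId and each gets its own worktree path. The `FleetCard` type and the `-1`, `-2`, ... path suffix scheme are assumptions for illustration; the PR does not state how distinct paths are derived.

```swift
import Foundation

// Hypothetical card shape; only the fleet-related fields.
struct FleetCard {
    let fleetId: UUID
    let worktreePath: String
}

/// Builds `count` cards that share one fleetId, each with a distinct worktree path.
func makeFleet(count: Int, baseWorktreePath: String) -> [FleetCard] {
    precondition((2...5).contains(count), "launch sheet allows 2 to 5 sessions")
    let fleetId = UUID() // one id for the whole fleet, used for sidebar grouping
    return (1...count).map { index in
        // Assumed suffix scheme so parallel sessions never share a checkout.
        FleetCard(fleetId: fleetId, worktreePath: "\(baseWorktreePath)-\(index)")
    }
}
```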

AgentSessionController.resolveCommand() injects the active sensor
overlay JSON as --sensor-overlay before launching ynh agent run,
using single-quote shell escaping for safe /bin/sh -c invocation.
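
Single-quote escaping for `/bin/sh -c` follows a standard pattern: wrap the value in single quotes and replace each embedded single quote with `'\''`. A minimal sketch, with hypothetical helper names and a made-up overlay payload:

```swift
import Foundation

/// Wraps a value in single quotes so /bin/sh treats it as one literal argument.
func shellSingleQuoted(_ value: String) -> String {
    // Close the quote, emit an escaped quote, reopen the quote: ' -> '\''
    "'" + value.replacingOccurrences(of: "'", with: "'\\''") + "'"
}

/// Hypothetical stand-in for resolveCommand(): appends the overlay JSON as one quoted argument.
func commandWithOverlay(base: String, overlayJSON: String) -> String {
    "\(base) --sensor-overlay \(shellSingleQuoted(overlayJSON))"
}
```

Inside single quotes the shell performs no expansion at all, so double quotes, `$`, and backticks in the JSON need no further treatment.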

Adds cursor to AgentBackend and fleetId to AgentConfig (backward-
compatible JSON decode, defaults to nil).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the Fleet section (21 strings) to all 39 locale files including
aggregate status, promote-winner labels, and launch sheet fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@eyelock eyelock force-pushed the feat/coding-agent branch from a1ab07b to e3dfc9f May 12, 2026 05:37
@eyelock eyelock changed the title from "feat(agent): add Agent Loop capability foundation" to "feat(agent): agent loop — ynh agent run integration, fleet UI, and Inspector" May 12, 2026
David Collie and others added 2 commits May 12, 2026 07:05
Adds a read-only trajectory replay view (AgentTranscriptViewerView)
for reviewing past sessions or importing CI JSONL artifacts. An
"Open Transcript" button in the Agent Sessions sidebar header
triggers a file importer; the viewer shows the same turn-grouped
sensor/event layout as the live Inspector.

Extracts shared trajectory view components (TurnGroupView,
SensorResultRow, TrajectoryEventRow, TurnGroup, buildTurnGroups,
eventSummary) into AgentTrajectoryComponents.swift so both the live
Inspector and the replay viewer share one implementation.

Adds .github/workflows/agent.yml — manual-dispatch workflow that
runs ynh agent run in the ghcr.io/eyelock/ynh Docker image and
uploads the trajectory JSONL as an artifact for local replay.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a read-only trajectory replay view (AgentTranscriptViewerView)
for reviewing past sessions or importing CI JSONL artifacts. An
"Open Transcript" button in the Agent Sessions sidebar header
triggers a file importer; the viewer shows the same turn-grouped
sensor/event layout as the live Inspector.

Extracts shared trajectory view components (TurnGroupView,
SensorResultRow, TrajectoryEventRow, buildTurnGroups) into
AgentTrajectoryComponents.swift — shared by Inspector and viewer.

Adds .github/workflows/agent.yml — manual-dispatch workflow that
runs ynh agent run in the ghcr.io/eyelock/ynh Docker image and
uploads the trajectory JSONL as an artifact for local replay.

Fleet i18n: adds fleet.open.transcript.help to all 39 locales.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>