feat(agent): long-running task loop with LLM-planned proposals by robbie-c · Pull Request #2264 · PostHog/code

robbie-c · 2026-05-20T21:22:29Z

Summary

Adds an auto-continuing turn loop the agent can enter for tasks with an objective, measurable success criterion (test pass count, bundle size, error count, etc.). Inspired by Anthropic's Ralph Wiggum plugin but redesigned around the desktop agent framework instead of a bash loop.

How it works

User types /long-running-task <description> (or the agent self-initiates).
A brief planning prompt goes to the agent, with planning instructions attached as hidden _meta context. The agent explores first — reads relevant files, runs measurement commands, asks clarifying questions via AskUserQuestion only when intent is genuinely ambiguous.
The agent emits a proposal as a <long-running-task-config>{...}</long-running-task-config> JSON block (goal, successCriterion, marker, maxIterations, approach).
The renderer parses the block on end_turn and shows an inline confirmation card above the prompt input — editable fields, Start / Edit / Cancel.
On Start, the renderer calls a new tRPC mutation that sets server-side state, then sends a kickoff prompt.
After each end_turn while the task is active, the loop in claude-agent.ts pushes a continuation user message into session.input and keeps the while-loop alive instead of returning. The loop exits when:
- The agent emits the configured marker (default <TASK_COMPLETE>) in its assistant text
- The agent hits the max-iteration cap (default 20), after one final wrap-up turn
- The user clicks Stop loop in the pill
- The user cancels the prompt normally
Mid-loop user messages pause auto-continuation for that turn so the user can steer.
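
As a rough illustration of the exit conditions above, the per-turn decision could be modelled as a pure function. This is a minimal sketch, not the code in this PR: `LoopState`, `Step`, `decideStep`, the `wrapUpSent` flag, and the continuation wording are all hypothetical, and treating a steering message as a "pause" that leaves the state untouched is an assumption. A normal user cancel is assumed to go through the existing cancel path and is not modelled here.

```typescript
// Hypothetical shapes; the real helper's signature is not shown in this PR text.
interface LoopState {
  marker: string;         // completion sentinel, e.g. "<TASK_COMPLETE>"
  maxIterations: number;  // e.g. 20
  iteration: number;      // regular continuations injected so far
  wrapUpSent: boolean;    // true once the single wrap-up turn was injected
  stopRequested: boolean; // user clicked "Stop loop"
}

type Step =
  | { kind: "exit"; reason: "marker" | "stopped" | "cap" }
  | { kind: "pause" } // user steered this turn, so nothing is auto-injected
  | { kind: "continue"; text: string; wrapUp: boolean };

// Called after each end_turn with the latest assistant text.
function decideStep(
  state: LoopState,
  assistantText: string,
  userSteered: boolean,
): { step: Step; next: LoopState } {
  if (assistantText.includes(state.marker)) {
    return { step: { kind: "exit", reason: "marker" }, next: state };
  }
  if (state.stopRequested) {
    return { step: { kind: "exit", reason: "stopped" }, next: state };
  }
  if (state.wrapUpSent) {
    // The wrap-up turn has already run; the cap is final.
    return { step: { kind: "exit", reason: "cap" }, next: state };
  }
  if (userSteered) {
    return { step: { kind: "pause" }, next: state };
  }
  if (state.iteration >= state.maxIterations) {
    return {
      step: {
        kind: "continue",
        text: "Iteration cap reached. Summarize progress and stop.",
        wrapUp: true,
      },
      next: { ...state, wrapUpSent: true },
    };
  }
  const iteration = state.iteration + 1;
  return {
    step: {
      kind: "continue",
      text: `Continue (${iteration}/${state.maxIterations}). Emit ${state.marker} once the criterion is met.`,
      wrapUp: false,
    },
    next: { ...state, iteration },
  };
}
```

Keeping the decision free of side effects means the caller chooses how to deliver the continuation text, and the cap behaviour can be unit-tested without a live session.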

A pill above the prompt input shows iteration count + goal + Stop button while a task is active.

Files of note

packages/agent/src/adapters/claude/long-running-task/utils.ts — schemas, continuation builder, proposal parser, broadcast helpers
packages/agent/src/adapters/claude/claude-agent.ts — loop hook in the result.subtype === "success" branch + new START_LONG_RUNNING_TASK / STOP_LONG_RUNNING_TASK extMethods
packages/agent/src/adapters/claude/session/instructions.ts — appended system prompt teaching the proposal workflow
apps/code/src/renderer/components/long-running-task/ — pill + proposal card
apps/code/src/renderer/stores/longRunningTaskStore.ts — Zustand store keyed by taskRunId
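
For readers unfamiliar with the proposal format, the parser listed for utils.ts might look roughly like the sketch below. It is hand-rolled to stay self-contained; the PR mentions schemas, so the real code likely validates differently. `TaskProposal`, `parseProposal`, and the choice of which fields are required are assumptions; only the tag name, the five field names, and the defaults (`<TASK_COMPLETE>`, 20) come from the description.

```typescript
// Hypothetical type mirroring the fields named in the PR description.
interface TaskProposal {
  goal: string;
  successCriterion: string;
  marker: string;
  maxIterations: number;
  approach: string;
}

const CONFIG_BLOCK =
  /<long-running-task-config>([\s\S]*?)<\/long-running-task-config>/;

// Returns the first well-formed proposal in the assistant text, or null.
function parseProposal(assistantText: string): TaskProposal | null {
  const match = CONFIG_BLOCK.exec(assistantText);
  if (!match) return null;

  let raw: unknown;
  try {
    raw = JSON.parse(match[1] ?? "");
  } catch {
    return null; // malformed JSON: treat as "no proposal"
  }
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;

  // Assumption: goal and successCriterion are required, the rest default.
  if (typeof r.goal !== "string" || typeof r.successCriterion !== "string") {
    return null;
  }
  const max = r.maxIterations;
  return {
    goal: r.goal,
    successCriterion: r.successCriterion,
    marker:
      typeof r.marker === "string" && r.marker.length > 0
        ? r.marker
        : "<TASK_COMPLETE>",
    maxIterations:
      typeof max === "number" && Number.isInteger(max) && max > 0 ? max : 20,
    approach: typeof r.approach === "string" ? r.approach : "",
  };
}
```

Returning null for anything malformed keeps the renderer's behaviour simple: no valid block, no confirmation card.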

Known limitations

Local sessions only. The slash command early-returns with a toast on cloud sessions — extMethod doesn't work over the cloud path.
Sentinel marker is trust-based. The agent could in theory emit the marker without actually verifying the success criterion. The system prompt addendum is explicit about this; the iteration cap is the hard safety net. We can revisit with structural verification later if the trust-based version misfires.

Test plan

/long-running-task with an iterative task (e.g. fix N failing tests). Agent explores, asks if needed, emits proposal block, card renders.
Approve the proposal card → pill appears with 0/20, loop kicks in.
Counter increments on each end_turn until marker is emitted or cap is hit.
Edit the proposal in the card before approving — edited values persist when starting.
Click Stop loop mid-flight — current turn finishes, no more continuations injected.
Send a steering message mid-loop — agent responds to user, loop stays active and continues after.
Hit the iteration cap deliberately (set max=2 in a clearly-impossible task) — wrap-up turn runs.
/long-running-task on a cloud task → toast says not supported.
Cancel the entire prompt mid-loop → loop exits cleanly.

Created with PostHog Code

Adds an auto-continuing turn loop that the agent enters when the user approves a proposal block. While active, the harness pushes a continuation user message after each end_turn until the agent emits a configured marker, hits a max-iteration cap, or the user steers/stops.

The /long-running-task slash command sends a brief planning prompt with hidden context; the agent explores the codebase, asks clarifying questions via AskUserQuestion when intent is ambiguous, then emits the config as a <long-running-task-config> JSON block. The renderer parses that block, shows an inline confirmation card (editable goal/criterion/marker/max), and on Start sets the server-side state via a new tRPC mutation + sends the kickoff prompt.

While a task is active, a pill above the prompt input shows iterations N/M + goal + a Stop button. Mid-loop user messages pause auto-continuation for that turn so the user can steer.

Generated-By: PostHog Code
Task-Id: c1333c34-d768-47e2-9dda-d2e456ab701a

Refactors the long-running task helpers to be adapter-agnostic and adds the same loop primitive to the codex adapter.

- decideLongRunningTaskStep now returns the continuation text instead of pushing it; caller routes it through the right turn mechanism (session.input push for claude, new PromptRequest for codex)
- LRT system-prompt instructions extracted to a shared module that both adapters append to their system prompt
- codex runPrompt wraps codexConnection.prompt() in a while loop: each iteration is a fresh ACP turn, with no broadcast on continuations, so the user only sees the original prompt in chat
- codex-client forwards every session notification to the session's notificationHistory so getLatestAssistantText finds the agent's text for marker + proposal detection
- codex extMethod handles START / STOP and emits the same update notifications the renderer already listens for
- codex spawn instructions get LRT instructions appended, mirroring the Claude APPENDED_INSTRUCTIONS path

Generated-By: PostHog Code
Task-Id: c1333c34-d768-47e2-9dda-d2e456ab701a
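
The codex-side routing described in this commit can be pictured with the sketch below, in which each continuation becomes a fresh prompt call and only the first prompt is broadcast. The names `PromptFn`, `runWithContinuations`, and the `broadcast` option are hypothetical stand-ins; the real code wraps codexConnection.prompt(), whose signature is not shown here.

```typescript
// Hypothetical stand-in for one agent turn: resolves with the assistant text.
type PromptFn = (text: string, opts: { broadcast: boolean }) => Promise<string>;

// `decide` plays the role of the shared decision helper: it returns the next
// continuation text, or null when the loop should end.
async function runWithContinuations(
  prompt: PromptFn,
  initialText: string,
  decide: (assistantText: string) => string | null,
): Promise<number> {
  let text: string | null = initialText;
  let turns = 0;
  while (text !== null) {
    // Only the user's original prompt is shown in chat; continuations are silent.
    const assistantText = await prompt(text, { broadcast: turns === 0 });
    turns += 1;
    text = decide(assistantText);
  }
  return turns;
}
```

Because the decision callback only returns text, an adapter that feeds an input queue can reuse it unchanged and push the same string instead of awaiting a new prompt.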

robbie-c added 2 commits May 20, 2026 22:40

robbie-c force-pushed the posthog-code/long-running-task branch from 9246890 to fa85fa1 Compare May 20, 2026 21:47

robbie-c closed this May 26, 2026

robbie-c wants to merge 2 commits into main from posthog-code/long-running-task
