Commit 3b809b9

Merge branch 'main' into run-store-write-adapter
2 parents 76f3494 + ca43ab8 commit 3b809b9

2 files changed

Lines changed: 41 additions & 1 deletion

docs/ai-chat/custom-agents.mdx

Lines changed: 32 additions & 0 deletions
@@ -179,6 +179,38 @@ for await (const turn of session) {
}
```

## Stopping generation

The frontend stops a turn with [`transport.stopGeneration(chatId)`](/ai-chat/frontend#stop-generation), which writes a stop signal to the session's input stream. It aborts the current turn's generation but keeps the run alive, so the next message continues on the same session.

`turn.signal` is a combined stop-and-cancel `AbortSignal`, fresh each turn. Pass it to `streamText` so the stop reaches the model, then let `turn.complete()` finish the turn:

```ts trigger/my-chat.ts
for await (const turn of session) {
  const result = streamText({
    model: anthropic("claude-sonnet-4-5"),
    messages: turn.messages,
    abortSignal: turn.signal, // fires on a user stop OR a run cancel
    stopWhen: stepCountIs(15),
  });

  await turn.complete(result);

  if (turn.stopped) {
    // user stopped this turn; the partial response is already accumulated
  }
}
```

On a stop, `turn.complete()` cleans up the aborted parts of the partial response, accumulates it as its own assistant message, and writes turn-complete. The run does not end; the loop continues to the next turn.

Read `turn.stopped` to tell a user stop from a full run cancel:

- **User stop** (`transport.stopGeneration`): `turn.signal` aborts, `turn.stopped` is `true`, the partial response is accumulated, and the run stays alive for the next message.
- **Run cancel** (cancelled, expired, or `maxDuration` exceeded): `turn.signal` aborts, `turn.stopped` is `false`, and `turn.complete()` returns without accumulating because the run is ending.

A hand-rolled loop wires this itself with `chat.createStopSignal()` and `chat.cleanupAbortedParts()`. Two things `createSession` handles for you are easy to get wrong there; see the [hand-rolled loop checklist](#hand-rolled-loop-checklist).
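
To make the idea of a combined stop-and-cancel signal concrete, here is a minimal sketch that merges two signals by hand using only the standard `AbortController` API. It is an illustration under stated assumptions, not the SDK's implementation: `combineSignals` and its `stopped` getter are hypothetical names, and the sketch does not call `chat.createStopSignal()`, whose exact shape is not shown in this section.

```typescript
// Hypothetical helper, not part of the SDK: merges a user-stop signal and a
// run-cancel signal into one, and remembers which source fired first.
type AbortSource = "stop" | "cancel";

function combineSignals(stop: AbortSignal, cancel: AbortSignal) {
  const controller = new AbortController();
  let source: AbortSource | undefined;

  // Returns a listener that records the first source and aborts the merged signal.
  const abortFrom = (from: AbortSource) => () => {
    if (source === undefined) {
      source = from;
      controller.abort();
    }
  };

  if (stop.aborted) {
    abortFrom("stop")();
  } else if (cancel.aborted) {
    abortFrom("cancel")();
  } else {
    stop.addEventListener("abort", abortFrom("stop"), { once: true });
    cancel.addEventListener("abort", abortFrom("cancel"), { once: true });
  }

  return {
    signal: controller.signal,
    // True only when the user stop fired first, mirroring `turn.stopped`.
    get stopped(): boolean {
      return source === "stop";
    },
  };
}

// Usage: a user stop aborts the merged signal and sets `stopped`.
const userStop = new AbortController();
const runCancel = new AbortController();
const combined = combineSignals(userStop.signal, runCancel.signal);
userStop.abort();
console.log(combined.signal.aborted, combined.stopped);
```

Recording which source fired first is what lets the loop tell a user stop (keep the run alive) from a run cancel (skip accumulation), as the two cases above describe.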

## Hand-rolled loop with primitives

For full control, skip `createSession` and compose the primitives directly:

docs/ai-chat/patterns/human-in-the-loop.mdx

Lines changed: 9 additions & 1 deletion
@@ -20,7 +20,7 @@ Turn N:
LLM streams text → calls askUser tool (no execute)
streamText ends with tool-call in `input-available` state
onTurnComplete fires (finishReason = "tool-calls")
-Agent idle
+Agent suspends (compute freed); maxDuration does not tick while paused

Frontend:
Renders question + option buttons from tool input
@@ -36,6 +36,14 @@ Turn N+1:

The AI SDK's `toUIMessageStream` automatically reuses the assistant message ID across the pause (we pass `originalMessages` internally), so `responseMessage` in the post-resume `onTurnComplete` is the **full merged message** (the original text, the completed tool call, and any follow-up content), not just the new parts.

## Duration and cost while paused

A pause doesn't hold compute. After the model calls a no-execute tool, the turn finishes and the run stays warm for `idleTimeoutInSeconds` (default 30s), then **suspends** and frees its compute, the same way [`wait.for`](/wait-for) does. The user's `addToolOutput` wakes it back up.

Because the run is suspended while it waits, the human's thinking time is not billed and does **not** count against [`maxDuration`](/runs/max-duration). `maxDuration` measures active CPU time and excludes suspended waitpoint time, exactly like `wait.for`, so a user can take minutes, hours, or days to answer without the run hitting `maxDuration`. The only time that counts is each turn's actual compute plus the short warm window before each suspend.

You don't need to raise `maxDuration` or end the run to support long human waits. How long a single suspended pause stays open is governed by the run's suspend timeout, not `maxDuration`; if a wait outlives it, the run ends, and the next `addToolOutput` boots a fresh continuation that picks up the resolved tool result.
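
As a rough worked example of the accounting described above, the sketch below adds up the time that would count toward `maxDuration` for a run with two human pauses. The `Turn` shape and `activeSeconds` function are hypothetical and exist only for illustration. The sketch also assumes that when the user answers inside the warm window, only the time actually waited is counted.

```typescript
// Hypothetical model of the accounting; these names are not SDK APIs.
interface Turn {
  computeSeconds: number; // active compute spent on the turn
  humanWaitSeconds: number; // how long the user takes to answer afterwards
}

function activeSeconds(turns: Turn[], idleTimeoutInSeconds = 30): number {
  return turns.reduce((total, turn) => {
    // The run stays warm until the user answers or the idle timeout elapses;
    // anything beyond that is suspended time and is not counted.
    const warm = Math.min(turn.humanWaitSeconds, idleTimeoutInSeconds);
    return total + turn.computeSeconds + warm;
  }, 0);
}

const turns: Turn[] = [
  { computeSeconds: 12, humanWaitSeconds: 3600 }, // user takes an hour to answer
  { computeSeconds: 8, humanWaitSeconds: 10 }, // user answers inside the warm window
];

console.log(activeSeconds(turns)); // 12 + 30 + 8 + 10 = 60
```

Under these assumptions, an hour of human thinking time contributes only the 30-second warm window to the total.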

## Backend: define the tool

A human-in-the-loop (HITL) tool has an `inputSchema` describing what the model can ask, but **no `execute` function**. When the LLM calls it, `streamText` returns control to your agent.
