Version Packages #1789
Open
github-actions[bot] wants to merge 1 commit into
This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.
Releases
@cloudflare/ai-chat@0.9.0
Minor Changes
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now uses an event-driven auto-continuation barrier that parks indefinitely on an incomplete parallel tool batch instead of force-continuing
after a fixed timeout.
Previously, when a turn ended with several parallel client tool calls and only
some results had arrived,
AIChatAgent ran the completeness barrier inside the continuation turn and polled for up to 60s
(AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS), after which it continued inference against whatever results had landed, potentially a half-complete tool
batch. The barrier is now event-driven and runs before the continuation is
enqueued (converging onto @cloudflare/think's model): it fires only once every result in the batch has arrived, re-arms as each sibling result is applied and
when a streaming turn finalizes, guards against double-fire, and is gated on no
active stream. There is no orphan timeout — a batch with a never-arriving
sibling now parks budget-free until it completes (the same way a turn already
parks on a pending HITL/client interaction) rather than force-continuing with
missing results.
This is a behavior change for the rare stuck-tool case: a result that never
arrives no longer triggers a continuation after 60s; it parks until the missing
result lands (or a later user turn / chat recovery repairs the transcript). A
parked continuation leaves the same on-disk signature as a HITL park, so a
deploy/crash mid-park recovers by re-arming rather than terminalizing.
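The barrier semantics described above can be illustrated with a small sketch. This is not the AIChatAgent implementation: ToolBatchBarrier and its methods are hypothetical names invented here, and the sketch only models the four stated properties (fire once every result has arrived, re-check on each sibling result and on stream finalize, guard against double-fire, stay gated while a stream is active, with no timeout).

```typescript
// Hypothetical sketch: none of these names come from @cloudflare/ai-chat.
class ToolBatchBarrier {
  private readonly pending: Set<string>;
  private readonly onComplete: () => void;
  private fired = false;
  private streamActive = false;

  constructor(toolCallIds: string[], onComplete: () => void) {
    this.pending = new Set(toolCallIds);
    this.onComplete = onComplete;
  }

  // Called as each sibling tool result is applied; re-checks the barrier.
  applyResult(toolCallId: string): void {
    this.pending.delete(toolCallId);
    this.tryFire();
  }

  // Called when a streaming turn starts or finalizes; finalizing re-checks the barrier.
  setStreamActive(active: boolean): void {
    this.streamActive = active;
    if (!active) this.tryFire();
  }

  private tryFire(): void {
    if (this.fired) return;            // double-fire guard
    if (this.streamActive) return;     // gated on no active stream
    if (this.pending.size > 0) return; // park: there is no timeout, just wait
    this.fired = true;
    this.onComplete();
  }
}

let continuations = 0;
const barrier = new ToolBatchBarrier(["call-a", "call-b"], () => {
  continuations += 1;
});
barrier.applyResult("call-a"); // still parked: "call-b" has not arrived
barrier.applyResult("call-b"); // batch complete, continuation fires
barrier.applyResult("call-b"); // a repeated result does not fire again
```

If "call-b" never arrived, the callback would simply never run, which mirrors the parked state rather than the old 60s force-continue.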
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now replays the live "recovering…" status on connect (#1620).
Previously the cf_agent_chat_recovering frame was only broadcast live, so a client that connected (or reconnected) while a durable turn was mid-recovery,
between a scheduled continuation and its first chunk, saw nothing and appeared
frozen until the turn resumed or failed. It now receives the recovering status
directly on connect (when no stream is active to resume), so
useAgentChat's isRecovering reflects the in-progress recovery immediately. This converges AIChatAgent onto @cloudflare/think's behavior. The status is still cleared on completion, exhaustion, or any terminal outcome, and stale records (older than
the recovering-flag TTL) are skipped so a recovery abandoned without a terminal
cannot show "recovering…" forever.
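The on-connect decision described above reduces to three checks. The following is a minimal sketch under assumptions: RecoveringRecord, its startedAt field, and shouldReplayRecovering are hypothetical names, and the actual record shape and TTL value are not given in this changelog.

```typescript
// Hypothetical record shape and helper; not the package's actual types.
interface RecoveringRecord {
  startedAt: number; // epoch ms at which the recovering flag was set
}

function shouldReplayRecovering(
  record: RecoveringRecord | undefined,
  hasActiveStream: boolean,
  nowMs: number,
  ttlMs: number
): boolean {
  if (record === undefined) return false;   // nothing is recovering
  if (hasActiveStream) return false;        // the client resumes the live stream instead
  return nowMs - record.startedAt <= ttlMs; // records older than the TTL are skipped
}

// A client connecting 5s into a recovery, with an assumed 60s TTL, gets the status.
const replayOnConnect = shouldReplayRecovering({ startedAt: 1_000 }, false, 6_000, 60_000);
```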
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now compacts oversized tool outputs structurally instead of replacing them with a flat summary string.
Previously, when a persisted assistant message exceeded the SQLite row-size
limit, AIChatAgent replaced each large tool output with a single English summary string
("This tool output was too large to persist… Preview: …"), discarding the original shape. It now uses the shared shape-preserving
truncateToolOutput compactor (the same one @cloudflare/think already used): objects and arrays keep their structure, long strings are truncated in place
with a "... [truncated N chars]" marker, and only genuinely unrepresentable nesting collapses to a marker object. This makes a compacted tool result far
easier for the model to keep reasoning about, and converges AIChatAgent and @cloudflare/think onto one row-size compaction path. The metadata.compactedToolOutputs / metadata.compactedTextParts annotations and the compaction
console.warn calls are unchanged.
#1788
3b2af54 Thanks @threepointone! - AIChatAgent can now detect and recover from a hung model/transport stream via the opt-in chatStreamStallTimeoutMs watchdog (#1626).
Set chatStreamStallTimeoutMs (a class field, like chatRecovery) to the maximum number of milliseconds allowed between stream chunks. If a turn parks
longer than that — a hung provider or a stalled transport — the watchdog aborts
the live stream instead of leaving the turn spinning forever. When chatRecovery
is enabled, the stall is routed into the same bounded-recovery machinery a
deploy/eviction interruption uses: the partial generated so far is persisted and
a continuation is scheduled (or, once the recovery budget is spent, the
configured terminal message is delivered). With chatRecovery disabled, a stall
surfaces as a terminal stream error so the spinner is cleared.
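To make the "gap between chunks" rule concrete, here is a small simulation. It is only a sketch: findStallAbortTime is a hypothetical helper, not the internal iterateWithStallWatchdog primitive, and it works on recorded arrival times instead of a live stream. It assumes a timeout of 0 disables the check, as the entry states for the default.

```typescript
// Hypothetical helper: given chunk arrival times (ms since the stream started),
// returns the moment a gap-based watchdog would abort, or null if it never trips.
function findStallAbortTime(
  chunkTimesMs: number[],
  stallTimeoutMs: number,
  observedUntilMs: number
): number | null {
  if (stallTimeoutMs <= 0) return null; // 0 disables the watchdog
  let lastActivityMs = 0; // stream start counts as the first activity
  for (const t of [...chunkTimesMs, observedUntilMs]) {
    if (t - lastActivityMs > stallTimeoutMs) {
      return lastActivityMs + stallTimeoutMs; // the timer expires this long after the last chunk
    }
    lastActivityMs = t;
  }
  return null;
}

// A ten-minute turn with one chunk per second never trips a 5s watchdog.
const steadyChunks = Array.from({ length: 600 }, (_, i) => (i + 1) * 1_000);
const steadyResult = findStallAbortTime(steadyChunks, 5_000, 600_000);

// A 7s silence after the chunk at t=2000 trips it at t=7000.
const stalledResult = findStallAbortTime([1_000, 2_000, 9_000], 5_000, 9_000);
```

The first call shows why total turn length is irrelevant: only the largest silence between consecutive chunks matters.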
The default is 0, which disables the watchdog (no behavior change unless you opt in), matching @cloudflare/think. Because the watchdog measures the gap between chunks, not total turn duration, a steadily streaming turn never trips
it regardless of overall length. Internally this is built on the shared
iterateWithStallWatchdog primitive both @cloudflare/ai-chat and @cloudflare/think consume (an internal agents/chat seam, not a public API), so this change ships under the @cloudflare/ai-chat bump alone.
Patch Changes
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now delivers the terminal banner before persisting the durable terminal record when chat recovery gives up, converging onto @cloudflare/think's broadcast-first ordering.
Previously _exhaustChatRecovery persisted the durable terminal record first and broadcast the banner second. A terminal-record write can reject in the
deploy/storage window a give-up runs in (#1730); under persist-first the throw
propagated before the banner was sent, so the live banner was dropped on that
pass and only delivered on the healthy re-run (potentially a different isolate,
after the affected connections had gone). Broadcasting first makes the banner
resilient to a failing storage write: the throw still propagates and the whole
give-up re-runs on a healthy isolate, which persists the record idempotently and
re-delivers the banner (the documented at-least-once edge). Persisting first
gained no durability — the re-run persists either way — while losing this banner
resilience, so both chat hosts now terminalize broadcast-first.
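The ordering argument above can be walked through with a toy simulation. Everything here is hypothetical (giveUpBroadcastFirst, the in-memory "storage", the banner text); only the broadcast-then-persist order and the at-least-once re-run behavior are taken from the entry.

```typescript
// Hypothetical helper names; only the broadcast-then-persist ordering comes from the changelog.
function giveUpBroadcastFirst(
  broadcastBanner: () => void,
  persistTerminalRecord: () => void
): void {
  broadcastBanner();       // connected clients get the banner even if the write below rejects
  persistTerminalRecord(); // may throw; the error propagates so the whole give-up can re-run
}

const bannersDelivered: string[] = [];
const terminalRecords = new Map<string, string>();
const storage = { healthy: false };

const broadcast = (): void => {
  bannersDelivered.push("recovery exhausted");
};
const persist = (): void => {
  if (!storage.healthy) throw new Error("storage write rejected");
  terminalRecords.set("req-1", "terminal"); // keyed write, so a re-run is idempotent
};

let firstPassFailed = false;
try {
  giveUpBroadcastFirst(broadcast, persist); // first pass: storage is unhealthy
} catch {
  firstPassFailed = true;
}

storage.healthy = true;
giveUpBroadcastFirst(broadcast, persist); // healthy re-run
```

After both passes the banner has been delivered twice (the at-least-once edge) and exactly one terminal record exists. Under persist-first, the first pass would have thrown before any banner was sent.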
#1788
3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.
Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta / reasoning-delta / tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
#1788
3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.
The stable-timeout/error give-up path that terminalizes an exhausted recovery
turn previously resolved the turn's orphaned stream id with an in-memory
first-match scan over all stream metadata, while the wake (restart) path already
used the newest durable row keyed by the recovery-root request id. These two
lookups are now a single seam, so both paths surface the same partial — the
newest stream the turn produced — when a request id spans more than one
recovery attempt. Single-attempt turns (one stream row per request id) are
unaffected.
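The difference between a first-match scan and a newest-row lookup only shows up when one request id owns several stream rows. A minimal sketch follows, assuming a hypothetical row shape (streamId, requestId, createdAt) and a hypothetical newestStreamFor helper; the real metadata schema is not shown in this changelog.

```typescript
// Hypothetical row shape; the actual stream metadata columns are an assumption.
interface StreamMetadataRow {
  streamId: string;
  requestId: string;
  createdAt: number;
}

// Returns the most recently created stream row for a request id, if any.
function newestStreamFor(
  rows: StreamMetadataRow[],
  requestId: string
): StreamMetadataRow | undefined {
  let newest: StreamMetadataRow | undefined;
  for (const row of rows) {
    if (row.requestId !== requestId) continue;
    if (newest === undefined || row.createdAt > newest.createdAt) newest = row;
  }
  return newest;
}

const metadataRows: StreamMetadataRow[] = [
  { streamId: "stream-a", requestId: "req-1", createdAt: 100 }, // first recovery attempt
  { streamId: "stream-b", requestId: "req-1", createdAt: 200 }, // second recovery attempt
  { streamId: "stream-c", requestId: "req-2", createdAt: 300 },
];

const firstMatch = metadataRows.find((row) => row.requestId === "req-1"); // stream-a
const newestMatch = newestStreamFor(metadataRows, "req-1");               // stream-b
```

With a single row per request id the two lookups agree, which is why single-attempt turns are unaffected.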
@cloudflare/think@0.11.0
Minor Changes
#1788
3b2af54 Thanks @threepointone! - Think now annotates and logs row-size compaction the same way @cloudflare/ai-chat does.
When a persisted message exceeds the SQLite row-size limit and Think compacts
its tool outputs or truncates its text parts to fit, the resulting message now
carries metadata.compactedToolOutputs (the compacted tool-call IDs) and/or metadata.compactedTextParts (the truncated text-part indices), and Think
emits a console.warn describing the compaction. The compaction itself is unchanged:
Think already used the shared shape-preserving truncateToolOutput compactor. This only adds the previously ai-chat-only annotations/warnings so a
client can tell that a stored message was compacted. Both packages now share one
enforceRowSizeLimit implementation.
Patch Changes
#1788
3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.
Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta / reasoning-delta / tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
#1788
3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.
The stable-timeout/error give-up path that terminalizes an exhausted recovery
turn previously resolved the turn's orphaned stream id with an in-memory
first-match scan over all stream metadata, while the wake (restart) path already
used the newest durable row keyed by the recovery-root request id. These two
lookups are now a single seam, so both paths surface the same partial — the
newest stream the turn produced — when a request id spans more than one
recovery attempt. Single-attempt turns (one stream row per request id) are
unaffected.
agents@0.16.3
Patch Changes
#1788
3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.
Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta / reasoning-delta / tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
#1788
3b2af54 Thanks @threepointone! - Export reconcileOrphanPartial from agents/chat.
This is the shared primitive that merges a freshly-reconstructed orphaned stream
partial onto an assistant message that already owns its target id (an early
persist at tool-approval time, or a continuation resuming the prior assistant
message). It keeps all existing parts, appends only reconstructed parts whose
toolCallId is not already present (so a recovery replay never duplicates a tool call), and overlays incoming metadata onto existing, preserving an
in-place tool result that lives only in storage rather than letting a replayed
chunk re-advance it.
@cloudflare/ai-chat's orphan-persist path now uses it; hosts whose orphan persist only runs at stream finalize don't need it (the
shared reconstruction is already idempotent by toolCallId).
Additive export only; no behavior change to existing APIs.
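The merge rule described in this entry can be sketched as below. This is not the exported reconcileOrphanPartial: its real signature and message types are not shown here, so mergeOrphanPartialSketch, PartSketch, and MessageSketch are simplified, hypothetical stand-ins. One further assumption is labeled in the code: reconstructed parts that carry no toolCallId are not appended.

```typescript
// Simplified, hypothetical message shapes (not the AI SDK's UIMessage).
interface PartSketch {
  type: string;
  toolCallId?: string;
  output?: unknown;
}
interface MessageSketch {
  id: string;
  parts: PartSketch[];
  metadata?: Record<string, unknown>;
}

function mergeOrphanPartialSketch(
  existing: MessageSketch,
  reconstructed: MessageSketch
): MessageSketch {
  const knownToolCalls = new Set<string>();
  for (const part of existing.parts) {
    if (part.toolCallId !== undefined) knownToolCalls.add(part.toolCallId);
  }
  // Assumption of this sketch: only tool parts with an unseen toolCallId are appended.
  const newToolParts = reconstructed.parts.filter(
    (part) => part.toolCallId !== undefined && !knownToolCalls.has(part.toolCallId)
  );
  return {
    ...existing,
    parts: [...existing.parts, ...newToolParts],                   // existing parts are kept as-is
    metadata: { ...existing.metadata, ...reconstructed.metadata }, // incoming overlays existing
  };
}

const storedMessage: MessageSketch = {
  id: "assistant-1",
  parts: [{ type: "tool", toolCallId: "call-1", output: { ok: true } }],
  metadata: { model: "example-model" },
};
const replayedPartial: MessageSketch = {
  id: "assistant-1",
  parts: [
    { type: "tool", toolCallId: "call-1" }, // replayed without its stored result
    { type: "tool", toolCallId: "call-2" },
  ],
  metadata: { recovered: true },
};
const mergedMessage = mergeOrphanPartialSketch(storedMessage, replayedPartial);
```

In the merged message, call-1 keeps the result that lived only in storage, and call-2 is appended once.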
#1788
3b2af54 Thanks @threepointone! - Export the OrphanPersistStore<M = UIMessage> type from agents/chat.
This is the minimal message store the chat-recovery orphan-persist write goes
through, namely the write subset of SessionProvider (the getMessage, appendMessage, and updateMessage methods). It is parameterized over the host's message type so the seam itself is not AI-SDK-specific: the AI-SDK chat
hosts (@cloudflare/ai-chat, @cloudflare/think) instantiate it at the UIMessage default, while SessionProvider satisfies it at its own SessionMessage. Both hosts now route their orphan-persist write through a host adapter typed against this interface, turning the previous by-convention
alignment into a type-enforced contract.
Additive export only — no behavior change to existing APIs.
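As an illustration of what a store interface parameterized over the message type buys, here is a rough sketch. The three method names come from the entry above, but their signatures are assumptions (the real ones are not shown in this changelog and may well be asynchronous), and MessageStoreSketch, InMemoryStore, persistOrphan, and ChatMessageSketch are hypothetical names.

```typescript
// Hypothetical shapes: the real OrphanPersistStore signatures are not shown in this changelog.
interface HasId {
  id: string;
}

interface MessageStoreSketch<M extends HasId> {
  getMessage(id: string): M | undefined;
  appendMessage(message: M): void;
  updateMessage(message: M): void;
}

// A toy host adapter; a real host would back this with durable storage.
class InMemoryStore<M extends HasId> implements MessageStoreSketch<M> {
  private readonly rows = new Map<string, M>();
  getMessage(id: string): M | undefined {
    return this.rows.get(id);
  }
  appendMessage(message: M): void {
    this.rows.set(message.id, message);
  }
  updateMessage(message: M): void {
    this.rows.set(message.id, message);
  }
}

// The write path depends only on the interface, so any host message type works.
function persistOrphan<M extends HasId>(
  store: MessageStoreSketch<M>,
  partial: M
): "appended" | "updated" {
  if (store.getMessage(partial.id) === undefined) {
    store.appendMessage(partial);
    return "appended";
  }
  store.updateMessage(partial);
  return "updated";
}

interface ChatMessageSketch extends HasId {
  text: string;
}
const messageStore = new InMemoryStore<ChatMessageSketch>();
const firstWrite = persistOrphan(messageStore, { id: "m1", text: "partial" });
const secondWrite = persistOrphan(messageStore, { id: "m1", text: "partial, longer" });
```

Because persistOrphan is typed against the interface, an adapter that omits one of the three methods fails to compile, which is the "type-enforced contract" idea in miniature.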