Version Packages #1789

Open
github-actions[bot] wants to merge 1 commit into main from changeset-release/main

github-actions bot commented Jun 20, 2026

Contributor

This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine; whenever you add more changesets to main, this PR will be updated.

Releases

@cloudflare/ai-chat@0.9.0

Minor Changes

  • #1788 3b2af54 Thanks @threepointone! - AIChatAgent now uses an event-driven auto-continuation barrier that parks
    indefinitely on an incomplete parallel tool batch instead of force-continuing
    after a fixed timeout.

    Previously, when a turn ended with several parallel client tool calls and only
    some results had arrived, AIChatAgent ran the completeness barrier inside
    the continuation turn and polled for up to 60s
    (AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS), after which it continued
    inference against whatever results had landed — potentially a half-complete tool
    batch. The barrier is now event-driven and runs before the continuation is
    enqueued (converging onto @cloudflare/think's model): it fires only once every
    result in the batch has arrived, re-arms as each sibling result is applied and
    when a streaming turn finalizes, guards against double-fire, and is gated on no
    active stream. There is no orphan timeout — a batch with a never-arriving
    sibling now parks budget-free until it completes (the same way a turn already
    parks on a pending HITL/client interaction) rather than force-continuing with
    missing results.

    This is a behavior change for the rare stuck-tool case: a result that never
    arrives no longer triggers a continuation after 60s; it parks until the missing
    result lands (or a later user turn / chat recovery repairs the transcript). A
    parked continuation leaves the same on-disk signature as a HITL park, so a
    deploy/crash mid-park recovers by re-arming rather than terminalizing.
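
A minimal sketch of the event-driven barrier described above, assuming a hypothetical `ToolBatchBarrier` class with injected `isStreamActive` and `enqueueContinuation` callbacks. None of these names come from `AIChatAgent`; the sketch only illustrates the fire-once, re-arm-on-event, no-timeout behavior.

```typescript
// Hypothetical sketch, not the AIChatAgent implementation.
class ToolBatchBarrier {
  private readonly pending: Set<string>;
  private readonly isStreamActive: () => boolean;
  private readonly enqueueContinuation: () => void;
  private fired = false;

  constructor(
    toolCallIds: string[],
    isStreamActive: () => boolean,
    enqueueContinuation: () => void
  ) {
    this.pending = new Set(toolCallIds);
    this.isStreamActive = isStreamActive;
    this.enqueueContinuation = enqueueContinuation;
  }

  // Re-arm point 1: called as each sibling tool result is applied.
  applyResult(toolCallId: string): void {
    this.pending.delete(toolCallId);
    this.tryFire();
  }

  // Re-arm point 2: called when a streaming turn finalizes.
  onStreamFinalized(): void {
    this.tryFire();
  }

  private tryFire(): void {
    if (this.fired) return; // guard against double-fire
    if (this.isStreamActive()) return; // gated on no active stream
    if (this.pending.size > 0) return; // park: no timer, no forced continuation
    this.fired = true;
    this.enqueueContinuation();
  }
}
```

Because there is no timer anywhere in the sketch, a batch with a missing sibling simply never reaches `enqueueContinuation`, which mirrors the parking behavior the changeset describes.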

  • #1788 3b2af54 Thanks @threepointone! - AIChatAgent now replays the live "recovering…" status on connect (#1620).

    Previously the cf_agent_chat_recovering frame was only broadcast live, so a
    client that connected (or reconnected) while a durable turn was mid-recovery —
    between a scheduled continuation and its first chunk — saw nothing and appeared
    frozen until the turn resumed or failed. It now receives the recovering status
    directly on connect (when no stream is active to resume), so useAgentChat's
    isRecovering reflects the in-progress recovery immediately. This converges
    AIChatAgent onto @cloudflare/think's behavior. The status is still cleared on
    completion, exhaustion, or any terminal outcome, and stale records (older than
    the recovering-flag TTL) are skipped so a recovery abandoned without a terminal
    outcome cannot show "recovering…" forever.
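
The connect-time decision can be sketched roughly as follows. The frame type `cf_agent_chat_recovering` is from the changeset; the record shape, the frame payload, the function name, and the TTL value are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real record and TTL live inside AIChatAgent.
type RecoveringRecord = { requestId: string; startedAtMs: number };
type RecoveringFrame = { type: "cf_agent_chat_recovering"; requestId: string };

// Assumed value, for illustration only.
const RECOVERING_FLAG_TTL_MS = 5 * 60_000;

function recoveringFrameOnConnect(
  record: RecoveringRecord | undefined,
  hasActiveStream: boolean,
  nowMs: number
): RecoveringFrame | undefined {
  if (record === undefined) return undefined; // nothing is recovering
  if (hasActiveStream) return undefined; // the client resumes the stream instead
  if (nowMs - record.startedAtMs > RECOVERING_FLAG_TTL_MS) return undefined; // stale record
  return { type: "cf_agent_chat_recovering", requestId: record.requestId };
}
```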

  • #1788 3b2af54 Thanks @threepointone! - AIChatAgent now compacts oversized tool outputs structurally instead of
    replacing them with a flat summary string.

    Previously, when a persisted assistant message exceeded the SQLite row-size
    limit, AIChatAgent replaced each large tool output with a single English
    summary string ("This tool output was too large to persist… Preview: …"),
    discarding the original shape. It now uses the shared shape-preserving
    truncateToolOutput compactor (the same one @cloudflare/think already used):
    objects and arrays keep their structure, long strings are truncated in place
    with a ... [truncated N chars] marker, and only genuinely unrepresentable
    nesting collapses to a marker object. This makes a compacted tool result far
    easier for the model to keep reasoning about, and converges AIChatAgent and
    @cloudflare/think onto one row-size compaction path. The
    metadata.compactedToolOutputs / metadata.compactedTextParts annotations and
    the compaction console.warns are unchanged.
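
To illustrate what "shape-preserving" means here, a rough sketch under stated assumptions: `truncateShape` is a hypothetical stand-in, not the shared `truncateToolOutput`; the marker object `{ truncated: true }` is invented; and N in the string marker is assumed to be the number of dropped characters.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical stand-in for a shape-preserving compactor.
function truncateShape(value: Json, maxStringChars: number, maxDepth: number): Json {
  if (typeof value === "string") {
    if (value.length <= maxStringChars) return value;
    const dropped = value.length - maxStringChars; // assumed meaning of N
    return `${value.slice(0, maxStringChars)}... [truncated ${dropped} chars]`;
  }
  if (value === null || typeof value !== "object") return value; // numbers, booleans, null
  if (maxDepth <= 0) return { truncated: true }; // assumed marker for too-deep nesting
  if (Array.isArray(value)) {
    return value.map((item) => truncateShape(item, maxStringChars, maxDepth - 1));
  }
  const out: { [key: string]: Json } = {};
  for (const [key, item] of Object.entries(value)) {
    out[key] = truncateShape(item, maxStringChars, maxDepth - 1);
  }
  return out;
}
```

Objects keep their keys and arrays keep their length, so the model still sees which fields existed even when their values were shortened.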

  • #1788 3b2af54 Thanks @threepointone! - AIChatAgent can now detect and recover from a hung model/transport stream via
    the opt-in chatStreamStallTimeoutMs watchdog (#1626).

    Set chatStreamStallTimeoutMs (a class field, like chatRecovery) to the
    maximum number of milliseconds allowed between stream chunks. If a turn parks
    longer than that — a hung provider or a stalled transport — the watchdog aborts
    the live stream instead of leaving the turn spinning forever. When chatRecovery
    is enabled, the stall is routed into the same bounded-recovery machinery a
    deploy/eviction interruption uses: the partial generated so far is persisted and
    a continuation is scheduled (or, once the recovery budget is spent, the
    configured terminal message is delivered). With chatRecovery disabled, a stall
    surfaces as a terminal stream error so the spinner is cleared.

    The default is 0, which disables the watchdog (no behavior change unless you
    opt in), matching @cloudflare/think. Because the watchdog measures the gap
    between chunks — not total turn duration — a steadily streaming turn never trips
    it regardless of overall length. Internally this is built on the shared
    iterateWithStallWatchdog primitive both @cloudflare/ai-chat and
    @cloudflare/think consume (an internal agents/chat seam, not a public API),
    so this change ships under the @cloudflare/ai-chat bump alone.
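
The gap-versus-duration distinction can be modelled with a small pure function. This is only a timing model over recorded chunk arrival times, with a hypothetical name; it is not `iterateWithStallWatchdog`, which acts on a live stream.

```typescript
// Returns the index of the first chunk that arrived after a stall, or -1 if none.
// Hypothetical helper: it models the rule "measure the gap between chunks".
function firstStalledChunk(
  chunkArrivalMs: number[],
  streamStartMs: number,
  stallTimeoutMs: number
): number {
  if (stallTimeoutMs <= 0) return -1; // 0 disables the watchdog
  let previousMs = streamStartMs;
  for (const [index, arrivalMs] of chunkArrivalMs.entries()) {
    // Only the gap since the previous chunk matters, never the total elapsed time.
    if (arrivalMs - previousMs > stallTimeoutMs) return index;
    previousMs = arrivalMs;
  }
  return -1;
}
```

In an agent, opting in would correspond to setting the class field, for example `chatStreamStallTimeoutMs = 30_000` next to `chatRecovery`; the 30-second value is only an example.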

Patch Changes

  • #1788 3b2af54 Thanks @threepointone! - AIChatAgent now delivers the terminal banner before persisting the durable
    terminal record when chat recovery gives up, converging onto
    @cloudflare/think's broadcast-first ordering.

    Previously _exhaustChatRecovery persisted the durable terminal record first
    and broadcast the banner second. A terminal-record write can reject in the
    deploy/storage window a give-up runs in (#1730); under persist-first the throw
    propagated before the banner was sent, so the live banner was dropped on that
    pass and only delivered on the healthy re-run (potentially a different isolate,
    after the affected connections had gone). Broadcasting first makes the banner
    resilient to a failing storage write: the throw still propagates and the whole
    give-up re-runs on a healthy isolate, which persists the record idempotently and
    re-delivers the banner (the documented at-least-once edge). Persisting first
    gained no durability — the re-run persists either way — while losing this banner
    resilience, so both chat hosts now terminalize broadcast-first.
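
The ordering argument can be sketched with two injected callbacks. The function name and the synchronous signatures are assumptions; the real `_exhaustChatRecovery` is internal and its shape is not shown in this PR.

```typescript
// Hypothetical sketch of broadcast-first terminalization.
function exhaustRecoveryBroadcastFirst(
  broadcastBanner: () => void,
  persistTerminalRecord: () => void
): void {
  broadcastBanner(); // delivered even if the write below throws
  persistTerminalRecord(); // a throw propagates so the whole give-up can re-run
}
```

If the write throws on the first pass, connected clients have already received the banner; the re-run then persists the record and delivers the banner again, which is the at-least-once edge mentioned above.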

  • #1788 3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.

    Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta/reasoning-delta/tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
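
A rough sketch of the crediting rule, assuming hypothetical names (`shouldCredit`, `CreditThrottle`). The delta chunk types are the ones named above; the milestone chunk-type names are assumed for illustration and may not match the real stream protocol.

```typescript
// Delta types are named in the changeset; the milestone names below are assumed.
const DELTA_TYPES = new Set(["text-delta", "reasoning-delta", "tool-input-delta"]);
const MILESTONE_TYPES = new Set([
  "text-start",
  "reasoning-start",
  "tool-input-available",
  "tool-output-available",
]);

// Hypothetical per-isolate throttle: grants at most one credit per window.
class CreditThrottle {
  private readonly windowMs: number;
  private lastCreditMs = Number.NEGATIVE_INFINITY;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.lastCreditMs < this.windowMs) return false;
    this.lastCreditMs = nowMs;
    return true;
  }
}

function shouldCredit(chunkType: string, throttle: CreditThrottle, nowMs: number): boolean {
  if (MILESTONE_TYPES.has(chunkType)) return true; // milestones credit unconditionally
  if (DELTA_TYPES.has(chunkType)) return throttle.tryAcquire(nowMs); // deltas are throttled
  return false; // other chunk types do not count as progress in this sketch
}
```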

  • #1788 3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.

    The stable-timeout/error give-up path that terminalizes an exhausted recovery
    turn previously resolved the turn's orphaned stream id with an in-memory
    first-match scan over all stream metadata, while the wake (restart) path already
    used the newest durable row keyed by the recovery-root request id. These two
    lookups are now a single seam, so both paths surface the same partial — the
    newest stream the turn produced — when a request id spans more than one
    recovery attempt. Single-attempt turns (one stream row per request id) are
    unaffected.
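
The difference between a first-match scan and a newest-row lookup can be shown with a small sketch. The row shape and function name are assumptions; the real metadata lives in durable storage.

```typescript
// Hypothetical row shape for stream metadata.
type StreamMetadataRow = { streamId: string; requestId: string; createdAtMs: number };

// Resolve the orphaned stream as the newest row for the recovery-root request id.
function resolveOrphanedStreamId(
  rows: StreamMetadataRow[],
  recoveryRootRequestId: string
): string | undefined {
  let newest: StreamMetadataRow | undefined;
  for (const row of rows) {
    if (row.requestId !== recoveryRootRequestId) continue;
    if (newest === undefined || row.createdAtMs > newest.createdAtMs) newest = row;
  }
  return newest?.streamId;
}
```

With one row per request id the result is the same as a first-match scan; the two only diverge once a request id has accumulated rows from more than one recovery attempt.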

@cloudflare/think@0.11.0

Minor Changes

  • #1788 3b2af54 Thanks @threepointone! - Think now annotates and logs row-size compaction the same way
    @cloudflare/ai-chat does.

    When a persisted message exceeds the SQLite row-size limit and Think compacts
    its tool outputs or truncates its text parts to fit, the resulting message now
    carries metadata.compactedToolOutputs (the compacted tool-call IDs) and/or
    metadata.compactedTextParts (the truncated text-part indices), and Think
    emits a console.warn describing the compaction. The compaction itself is
    unchanged — Think already used the shared shape-preserving truncateToolOutput
    compactor — this only adds the previously ai-chat-only annotations/warnings so a
    client can tell that a stored message was compacted. Both packages now share one
    enforceRowSizeLimit implementation.
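
On the client side, the annotations could be read along these lines. The metadata keys are the ones named above; the message type and the helper name are hypothetical.

```typescript
// Metadata keys come from the changeset; the surrounding types are assumed.
type CompactionMetadata = {
  compactedToolOutputs?: string[]; // compacted tool-call IDs
  compactedTextParts?: number[]; // truncated text-part indices
};
type StoredMessage = { id: string; metadata?: CompactionMetadata };

// Returns a short description if the stored message was compacted, else undefined.
function describeCompaction(message: StoredMessage): string | undefined {
  const toolIds = message.metadata?.compactedToolOutputs ?? [];
  const textParts = message.metadata?.compactedTextParts ?? [];
  if (toolIds.length === 0 && textParts.length === 0) return undefined;
  return `compacted: ${toolIds.length} tool output(s), ${textParts.length} text part(s)`;
}
```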

Patch Changes

  • #1788 3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.

    Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta/reasoning-delta/tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.

  • #1788 3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.

    The stable-timeout/error give-up path that terminalizes an exhausted recovery
    turn previously resolved the turn's orphaned stream id with an in-memory
    first-match scan over all stream metadata, while the wake (restart) path already
    used the newest durable row keyed by the recovery-root request id. These two
    lookups are now a single seam, so both paths surface the same partial — the
    newest stream the turn produced — when a request id spans more than one
    recovery attempt. Single-attempt turns (one stream row per request id) are
    unaffected.

agents@0.16.3

Patch Changes

  • #1788 3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.

    Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta/reasoning-delta/tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.

  • #1788 3b2af54 Thanks @threepointone! - Export reconcileOrphanPartial from agents/chat.

    This is the shared primitive that merges a freshly-reconstructed orphaned stream
    partial onto an assistant message that already owns its target id (an early
    persist at tool-approval time, or a continuation resuming the prior assistant
    message). It keeps all existing parts, appends only reconstructed parts whose
    toolCallId is not already present (so a recovery replay never duplicates a
    tool call), and overlays incoming metadata onto existing — preserving an
    in-place tool result that lives only in storage rather than letting a replayed
    chunk re-advance it. @cloudflare/ai-chat's orphan-persist path now uses it;
    hosts whose orphan persist only runs at stream finalize don't need it (the
    shared reconstruction is already idempotent by toolCallId).

    Additive export only — no behavior change to existing APIs.
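
The merge rule can be approximated as below. This is a sketch, not the exported `reconcileOrphanPartial`: the types and function name are hypothetical, and it assumes that reconstructed parts without a `toolCallId` are not appended, which the description above does not settle.

```typescript
// Hypothetical message and part shapes.
type Part = { type: string; toolCallId?: string; [key: string]: unknown };
type Msg = { id: string; parts: Part[]; metadata?: Record<string, unknown> };

function reconcileSketch(existing: Msg, reconstructed: Msg): Msg {
  // Tool calls the stored message already owns.
  const seen = new Set<string>();
  for (const part of existing.parts) {
    if (part.toolCallId !== undefined) seen.add(part.toolCallId);
  }
  // Append only tool parts that are new. Assumption: parts lacking a toolCallId are skipped.
  const appended = reconstructed.parts.filter(
    (part) => part.toolCallId !== undefined && !seen.has(part.toolCallId)
  );
  return {
    id: existing.id,
    parts: [...existing.parts, ...appended], // existing parts are kept untouched
    metadata: { ...existing.metadata, ...reconstructed.metadata }, // incoming overlays existing
  };
}
```

Because existing parts are never replaced, a tool result that was applied in storage stays as it is even when the replayed stream only reached an earlier state for the same `toolCallId`.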

  • #1788 3b2af54 Thanks @threepointone! - Export the OrphanPersistStore<M = UIMessage> type from agents/chat.

    This is the minimal message store the chat-recovery orphan-persist write goes
    through — the write subset of SessionProvider (the getMessage,
    appendMessage, and updateMessage methods). It is parameterized over the
    host's message type so the seam itself is not AI-SDK-specific: the AI-SDK chat
    hosts (@cloudflare/ai-chat, @cloudflare/think) instantiate it at the
    UIMessage default, while SessionProvider satisfies it at its own
    SessionMessage. Both hosts now route their orphan-persist write through a host
    adapter typed against this interface, turning the previous by-convention
    alignment into a type-enforced contract.

    Additive export only — no behavior change to existing APIs.
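
To make the seam concrete, a sketch of what such a store could look like. The three method names are from the changeset; their signatures (synchronous, keyed by `id`) are assumptions, and `OrphanPersistStoreSketch`, `InMemoryStore`, and `persistOrphan` are hypothetical names.

```typescript
type MessageLike = { id: string };

// Assumed signatures; the exported OrphanPersistStore may differ (for example, async methods).
interface OrphanPersistStoreSketch<M extends MessageLike> {
  getMessage(id: string): M | undefined;
  appendMessage(message: M): void;
  updateMessage(message: M): void;
}

// A trivial in-memory implementation, generic over the host's message type.
class InMemoryStore<M extends MessageLike> implements OrphanPersistStoreSketch<M> {
  private readonly byId = new Map<string, M>();

  getMessage(id: string): M | undefined {
    return this.byId.get(id);
  }
  appendMessage(message: M): void {
    this.byId.set(message.id, message);
  }
  updateMessage(message: M): void {
    this.byId.set(message.id, message);
  }
}

// An orphan-persist write typed against the interface rather than a concrete host.
function persistOrphan<M extends MessageLike>(
  store: OrphanPersistStoreSketch<M>,
  message: M
): "appended" | "updated" {
  if (store.getMessage(message.id) === undefined) {
    store.appendMessage(message);
    return "appended";
  }
  store.updateMessage(message);
  return "updated";
}
```

Typing the write against the interface is what lets two hosts with different message types share one code path while the compiler checks that each adapter supplies all three methods.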



devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.
