From d20367daf98aa570e95a0025fff3488edb801cc3 Mon Sep 17 00:00:00 2001
From: Mike Christensen
Date: Fri, 16 Jan 2026 14:53:12 +0000
Subject: [PATCH 1/2] docs: writing style guide

Adds additional friendly guidance to the writing style guide to stop LLMs
that consume it from falling back into bad habits.
---
 writing-style-guide.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/writing-style-guide.md b/writing-style-guide.md
index ee739e7c67..68054b1b2c 100644
--- a/writing-style-guide.md
+++ b/writing-style-guide.md
@@ -235,6 +235,7 @@ Note the following points:
* The list should always have a piece of text introducing the list followed by a colon, and then a blank line.
* Each sentence in the list is terminated by a full-stop (period).
* If each item in the list is a single word, a terminating period is not required.
+* Do not use bold formatting for prefixes in bullet points (for example, avoid patterns like "**Feature name:** description"), as this is a common indicator of AI-generated content.

## Codeblocks

@@ -244,6 +245,8 @@ When inserting example code in the text:
* Break the text before a codeblock with a colon, not a period (which is a hard stop in the mind of the reader, rather than a continuation).
* There should *not* be a space before the colon.
* Place a blank line after the colon and before the code block.
+* All headings must be followed by introductory text. Never place a code block, list, or other content immediately after a heading without explanatory text first.
+* For JavaScript and TypeScript code, prefer single quotes over double quotes for strings (excluding JSON, which must use double quotes per the specification).

## Acronyms

@@ -274,6 +277,16 @@ Make sure you write the correct case for product names:
* GitHub not Github
* macOS not Mac OS

+## Avoid AI-generated content fingerprints
+
+Technical documentation should maintain a natural, human writing style and avoid patterns commonly associated with AI-generated content:
+
+* Do not use em-dashes (—) in technical writing. Prefer standard hyphens (-) or restructure the sentence for better clarity.
+* Avoid bold prefixes in bullet points (for example, patterns like "**Feature:** Description" or "**Benefits:** Details"). This formatting style is a telltale sign of AI-generated content.
+* Avoid formulaic patterns and overly structured prose that may appear mechanical or template-driven.
+
+These guidelines help ensure documentation feels authentic and professionally written while maintaining readability and clarity.
+ ## Other considerations Some additional points to bear in mind: From 73f5eb985c94424829efc4ede3d4c98059561d71 Mon Sep 17 00:00:00 2001 From: Mike Christensen Date: Fri, 16 Jan 2026 15:21:52 +0000 Subject: [PATCH 2/2] ait: misc improvements --- .../javascript/README.md | 2 +- .../javascript/README.md | 4 +- src/pages/docs/ai-transport/index.mdx | 22 ++++--- .../messaging/chain-of-thought.mdx | 16 ++--- .../docs/ai-transport/messaging/citations.mdx | 58 +++++++++---------- .../ai-transport/messaging/tool-calls.mdx | 6 +- .../identifying-users-and-agents.mdx | 2 +- .../ai-transport/sessions-identity/index.mdx | 2 +- .../sessions-identity/resuming-sessions.mdx | 4 +- .../ai-transport/token-streaming/index.mdx | 5 +- .../token-streaming/message-per-response.mdx | 16 ++--- .../token-streaming/message-per-token.mdx | 16 ++--- .../token-rate-limits.mdx | 6 +- .../anthropic-message-per-response.mdx | 6 +- .../anthropic-message-per-token.mdx | 10 ++-- .../openai-message-per-response.mdx | 6 +- .../ai-transport/openai-message-per-token.mdx | 26 ++++----- 17 files changed, 106 insertions(+), 101 deletions(-) rename src/pages/docs/ai-transport/{messaging => token-streaming}/token-rate-limits.mdx (83%) diff --git a/examples/ai-transport-message-per-response/javascript/README.md b/examples/ai-transport-message-per-response/javascript/README.md index c0e3f41b66..ebb5760df6 100644 --- a/examples/ai-transport-message-per-response/javascript/README.md +++ b/examples/ai-transport-message-per-response/javascript/README.md @@ -18,7 +18,7 @@ Use the following methods to implement AI Transport message-per-response streami - [`channel.subscribe()`](/docs/channels#subscribe): subscribes to messages, handling `message.create`, `message.append`, and `message.update` actions. - [`channel.setOptions()`](/docs/channels/options) with [`rewind`](/docs/channels/options/rewind): enables seamless message recovery during reconnections, delivering historical messages as `message.update` events. -Find out more about [AI Transport](/docs/ai-transport) and [message appending](/docs/ai-transport/features/token-streaming/message-per-response). +Find out more about [AI Transport](/docs/ai-transport) and [message appending](/docs/ai-transport/token-streaming/message-per-response). ## Getting started diff --git a/examples/ai-transport-message-per-token/javascript/README.md b/examples/ai-transport-message-per-token/javascript/README.md index 1a2ade8dec..ffed30477a 100644 --- a/examples/ai-transport-message-per-token/javascript/README.md +++ b/examples/ai-transport-message-per-token/javascript/README.md @@ -17,7 +17,7 @@ Use the following methods to implement AI Transport token streaming: - [`channel.publish()`](/docs/channels#publish): publishes individual tokens as they arrive from the LLM service with response tracking headers. - [`channel.history()`](/docs/channels/history) with [`untilAttach`](/docs/channels/options#attach): enables seamless message recovery during reconnections, ensuring no tokens are lost. -Find out more about [AI Transport](/docs/ai-transport) and [message history](/docs/channels/history). +Find out more about [AI Transport](/docs/ai-transport), [token streaming](/docs/ai-transport/token-streaming), and [message history](/docs/storage-history/history). 
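+
+A minimal sketch of the publish-and-recover flow, assuming an illustrative `responseId` header name and `resp_123` identifier (your own conventions, not part of the Ably API):
+
+```javascript
+import * as Ably from 'ably';
+
+const realtime = new Ably.Realtime({ key: 'your-ably-api-key' });
+const channel = realtime.channels.get('ai:response');
+
+// Publish each token with a header that ties it to its response.
+await channel.publish({
+  name: 'token',
+  data: 'Hello',
+  extras: { headers: { responseId: 'resp_123' } }
+});
+
+// After reconnecting, fetch any messages sent before the channel attached,
+// so no tokens are missed.
+await channel.attach();
+const missed = await channel.history({ untilAttach: true });
+```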
## Getting started @@ -57,4 +57,4 @@ Find out more about [AI Transport](/docs/ai-transport) and [message history](/do ## Open in CodeSandbox -In CodeSandbox, rename the `.env.example` file to `.env.local` and update the value of your `VITE_ABLY_KEY` variable to use your Ably API key. \ No newline at end of file +In CodeSandbox, rename the `.env.example` file to `.env.local` and update the value of your `VITE_ABLY_KEY` variable to use your Ably API key. diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx index 0c906e9340..ceec10b83a 100644 --- a/src/pages/docs/ai-transport/index.mdx +++ b/src/pages/docs/ai-transport/index.mdx @@ -6,7 +6,7 @@ meta_description: "Learn more about Ably's AI Transport and the features that en AI Transport enables you to add a realtime delivery layer to your application, providing the infrastructure required to deliver modern, stateful AI experiences to users. It works seamlessly with any AI model or framework, such as OpenAI, Anthropic, Vercel or LangChain. -AI Transport runs on Ably's [fault-tolerant](/docs/platform/architecture/fault-tolerance) and highly-available platform. The platform enables data to be streamed between all internet-connected devices at [low latencies](/docs/platform/architecture/latency) across the globe. Its elastic global infrastructure delivers enterprise-scale messaging that [effortlessly scales](/docs/platform/architecture/platform-scalability) to meet demand. +AI Transport runs on Ably's [fault-tolerant](/docs/platform/architecture/fault-tolerance) and highly-available platform. The platform supports streaming data between all internet-connected devices at [low latencies](/docs/platform/architecture/latency) across the globe. Its elastic global infrastructure delivers enterprise-scale messaging that [effortlessly scales](/docs/platform/architecture/platform-scalability) to meet demand. Drop AI Transport into your applications to transform them into modern, bi-directional AI experiences that keep users engaged. AI Transport provides the building blocks to deliver reliable, resumable token streams with robust session management and state hydration to always keep your users and agents in sync. @@ -18,6 +18,8 @@ Start learning the basics of AI Transport right away with a getting started guid ### OpenAI +Use the following guides to get started with OpenAI: + {[ { @@ -37,6 +39,8 @@ Start learning the basics of AI Transport right away with a getting started guid ### Anthropic +Use the following guides to get started with Anthropic: + {[ { @@ -65,7 +69,7 @@ Token streaming is the core of how LLMs deliver their responses to users. Tokens Using AI Transport, your token streams are reliable and persistent. They survive modern environments where users change browser tabs, refresh the page or switch devices, and common interruptions such as temporary network loss. Your users can always reconnect and continue where they left off without having to start over. -[Read more about token streaming](/docs/ai-transport/features/token-streaming). +[Read more about token streaming](/docs/ai-transport/token-streaming). 
### Bi-directional communication @@ -73,9 +77,9 @@ AI Transport supports rich, bi-directional communication patterns between users Build sophisticated AI experiences with features like accepting user input for interactive conversations, streaming chain-of-thought reasoning for transparency, attaching citations to responses for verifiability, implementing human-in-the-loop workflows for sensitive operations, and exposing tool calls for generative UI and visibility. -These messaging features work seamlessly with token streaming to create complete, interactive AI experiences. +These messaging features work seamlessly with [token streaming](/docs/ai-transport/token-streaming) to create complete, interactive AI experiences. -[Read more about messaging features](/docs/ai-transport/features/messaging). +[Read more about messaging features](/docs/ai-transport/messaging/accepting-user-input). ### Durable sessions @@ -85,25 +89,25 @@ Communication shouldn't be tied to the connection state of either party. If a us Your users can start a conversation on their mobile and seamlessly continue it on their desktop. Similarly, multiple users can participate in the same conversation with a single agent and they will all remain in sync, in realtime. -[Read more about sessions and identity](/docs/ai-transport/features/sessions-identity). +[Read more about sessions and identity](/docs/ai-transport/sessions-identity). ### Automatic catch-up -AI Transport enables clients to hydrate conversation and session state from the channel, including message history and in-progress responses. +AI Transport enables clients to hydrate conversation and session state from the [channel](/docs/channels), including [message history](/docs/storage-history/history) and in-progress responses. Whether a user is briefly disconnected when they drive through a tunnel, or they're rejoining a conversation the following day of work, AI Transport allows clients to resynchronise the full conversation state, including both historical messages and in-progress responses. Your users are always up to date with the full conversation, in order, anywhere. -[Read more about client hydration](/docs/ai-transport/features/token-streaming/message-per-response#hydration). +[Read more about client hydration](/docs/ai-transport/token-streaming/message-per-response#hydration). ### Background processing AI Transport allows agents to process jobs in the background while users go offline, with full awareness of their online status through realtime presence tracking. -Users can work asynchronously by prompting an agent to perform a task without having to monitor its progress. They can go offline and receive a push notification when the agent has completed the task, or reconnect at any time to seamlessly resume and see all progress made while they were away using [state hydration](#hydration). +Users can work asynchronously by prompting an agent to perform a task without having to monitor its progress. They can go offline and receive a push notification when the agent has completed the task, or reconnect at any time to seamlessly resume and see all progress made while they were away using [state hydration](#catch-up). It also puts you in control of how you manage your application when there aren't any users online. For example, you can choose whether to pause a conversation when a user exits their browser tab, or allow the agent to complete its response for the user to view when they return. 
-[Read more about status-aware cost controls](/docs/ai-transport/features/sessions-identity/online-status). +[Read more about status-aware cost controls](/docs/ai-transport/sessions-identity/online-status). ### Enterprise controls diff --git a/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx b/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx index 7c76eecea3..15c40d0dc0 100644 --- a/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx +++ b/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx @@ -26,15 +26,15 @@ As an application developer, you decide how to surface chain-of-thought reasonin ### Inline pattern -In the inline pattern, reasoning messages are published to the same channel as model output messages. +In the inline pattern, agents publish reasoning messages to the same channel as model output messages. By publishing all content to a single channel, the inline pattern: - Simplifies channel management by consolidating all conversation content in one place -- Maintains relative order of reasoning and model output messages as they are generated +- Maintains relative order of reasoning and model output messages as the model generates them - Supports retrieving reasoning and response messages together from history -#### Publishing +#### Publish Publish both reasoning and model output messages to a single channel. @@ -86,7 +86,7 @@ To learn how to stream individual tokens as they are generated, see the [token s Set [`echoMessages`](/docs/api/realtime-sdk/types#client-options) to `false` on the agent's Ably client to prevent the agent from receiving its own reasoning and output messages, avoiding billing for [echoed messages](/docs/pub-sub/advanced#echo). -#### Subscribing +#### Subscribe Subscribe to both reasoning and model output messages on the same channel. @@ -140,7 +140,7 @@ To learn about hydrating responses from channel history, including using `rewind ### Threading pattern -In the threading pattern, reasoning messages are published to a separate channel from model output messages. The reasoning channel name is communicated to clients, allowing them to discover where to obtain reasoning content on demand. +In the threading pattern, agents publish reasoning messages to a separate channel from model output messages. The reasoning channel name is communicated to clients, allowing them to discover where to obtain reasoning content on demand. By separating reasoning into its own channel, the threading pattern: @@ -148,11 +148,11 @@ By separating reasoning into its own channel, the threading pattern: - Reduces bandwidth usage by delivering reasoning messages only when users choose to view them - Works well for long reasoning threads, where not all the detail needs to be immediately surfaced to the user, but is helpful to see on demand -#### Publishing +#### Publish Publish model output messages to the main conversation channel and reasoning messages to a dedicated reasoning channel. The reasoning channel name includes the response ID, creating a unique reasoning channel per response. -In the example below, a `start` control message is sent on the main channel at the beginning of each response, which includes the response ID in the message [`extras`](/docs/api/realtime-sdk/messages#extras). 
Clients can derive the reasoning channel name from the response ID, allowing them to discover and subscribe to the stream of reasoning messages on demand: +In the example below, the agent sends a `start` control message on the main channel at the beginning of each response, which includes the response ID in the message [`extras`](/docs/api/realtime-sdk/messages#extras). Clients can derive the reasoning channel name from the response ID, allowing them to discover and subscribe to the stream of reasoning messages on demand: -#### Subscribing +#### Subscribe Subscribe to the main conversation channel to receive control messages and model output. Subscribe to the reasoning channel on demand, for example in response to a click event. diff --git a/src/pages/docs/ai-transport/messaging/citations.mdx b/src/pages/docs/ai-transport/messaging/citations.mdx index 1944200959..53d473b0be 100644 --- a/src/pages/docs/ai-transport/messaging/citations.mdx +++ b/src/pages/docs/ai-transport/messaging/citations.mdx @@ -31,7 +31,7 @@ Use [message annotations](/docs/messages/annotations) to attach source metadata Message append functionality requires "Message annotations, updates, deletes and appends" to be enabled in a [channel rule](/docs/channels#rules) associated with the channel. To enable the channel rule: @@ -39,7 +39,7 @@ To enable the channel rule: 1. Go to the [Ably dashboard](https://www.ably.com/dashboard) and select your app. 2. Navigate to the "Configuration" > "Rules" section from the left-hand navigation bar. 3. Choose "Add new rule". -4. Enter a channel name or namespace pattern (e.g. `ai` for all channels starting with `ai:`). +4. Enter a channel name or namespace pattern (for example, `ai` for all channels starting with `ai:`). 5. Select the "Message annotations, updates, deletes and appends" option from the list. 6. Click "Create channel rule". @@ -94,49 +94,49 @@ In this example: Including character offsets in annotation data allow UIs to attach inline citation markers to specific portions of the response text. -## Publishing citations +## Publish citations Agents create citations by publishing [message annotations](/docs/messages/annotations) that reference the [`serial`](/docs/messages#properties) of the response message: ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}"); +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}'); // Publish the AI response message -const response = "The James Webb Space Telescope launched in December 2021 and its first images were released in July 2022."; -const { serials: [msgSerial] } = await channel.publish("response", response); +const response = 'The James Webb Space Telescope launched in December 2021 and its first images were released in July 2022.'; +const { serials: [msgSerial] } = await channel.publish('response', response); // Add citations by annotating the response message await channel.annotations.publish(msgSerial, { - type: "citations:multiple.v1", - name: "science.nasa.gov", + type: 'citations:multiple.v1', + name: 'science.nasa.gov', data: { - url: "https://science.nasa.gov/mission/webb/", - title: "James Webb Space Telescope - NASA Science", + url: 'https://science.nasa.gov/mission/webb/', + title: 'James Webb Space Telescope - NASA Science', startOffset: 43, endOffset: 56, - snippet: "Webb launched on Dec. 25th 2021" + snippet: 'Webb launched on Dec. 
25th 2021' } }); await channel.annotations.publish(msgSerial, { - type: "citations:multiple.v1", - name: "en.wikipedia.org", + type: 'citations:multiple.v1', + name: 'en.wikipedia.org', data: { - url: "https://en.wikipedia.org/wiki/James_Webb_Space_Telescope", - title: "James Webb Space Telescope - Wikipedia", + url: 'https://en.wikipedia.org/wiki/James_Webb_Space_Telescope', + title: 'James Webb Space Telescope - Wikipedia', startOffset: 95, endOffset: 104, - snippet: "The telescope's first image was released to the public on 11 July 2022." + snippet: 'The telescope\'s first image was released to the public on 11 July 2022.' } }); ``` -## Subscribing to summaries +## Subscribe to summaries Clients can display a summary of the citations attached to a response by using [annotation summaries](/docs/messages/annotations#annotation-summaries). Clients receive realtime updates to annotation summaries automatically when subscribing to a channel, which are [delivered as messages](/docs/messages/annotations#subscribe) with an `action` of `message.summary`. When using [`multiple.v1`](/docs/messages/annotations#multiple) summarization, counts are grouped by the annotation `name`. @@ -160,13 +160,13 @@ In the example below, the `name` is set to the domain name of the citation sourc ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}"); +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}'); await channel.subscribe((message) => { - if (message.action === "message.summary") { - const citations = message.annotations.summary["citations:multiple.v1"]; + if (message.action === 'message.summary') { + const citations = message.annotations.summary['citations:multiple.v1']; if (citations) { - console.log("Citation summary:", citations); + console.log('Citation summary:', citations); } } }); @@ -208,19 +208,19 @@ When agents publish citations with a [`clientId`](/docs/auth/identified-clients) The `clipped` field indicates whether the summary was truncated due to size limits. This only occurs when a large number of clients with distinct `clientId`s publish annotations. See [large summaries](/docs/messages/annotations#large-summaries) for more information. -## Subscribing to individual citations +## Subscribe to individual citations To access the full citation data, subscribe to [individual annotation events](/docs/messages/annotations#individual-annotations): ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}", { - modes: ["ANNOTATION_SUBSCRIBE"] +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}', { + modes: ['ANNOTATION_SUBSCRIBE'] }); await channel.annotations.subscribe((annotation) => { - if (annotation.action === "annotation.create" && - annotation.type === "citations:multiple.v1") { + if (annotation.action === 'annotation.create' && + annotation.type === 'citations:multiple.v1') { const { url, title } = annotation.data; console.log(`Citation: ${title} (${url})`); // Output: Citation: James Webb Space Telescope - Wikipedia (https://en.wikipedia.org/wiki/James_Webb_Space_Telescope) @@ -254,7 +254,7 @@ Each annotation event includes the `messageSerial` of the response message it is Subscribe to individual annotation events when you need the full citation data updated in realtime, such as for rendering clickable source links or attaching inline citation markers to specific portions of the response text as citations arrive. 
-## Retrieving citations on demand +## Retrieve citations on demand Annotations can also be retrieved via the [REST API](/docs/api/rest-api#annotations-list) without maintaining a realtime subscription. diff --git a/src/pages/docs/ai-transport/messaging/tool-calls.mdx b/src/pages/docs/ai-transport/messaging/tool-calls.mdx index d987d2bb26..15de573ff2 100644 --- a/src/pages/docs/ai-transport/messaging/tool-calls.mdx +++ b/src/pages/docs/ai-transport/messaging/tool-calls.mdx @@ -24,7 +24,7 @@ Surfacing tool calls supports: - Human-in-the-loop workflows: Expose tool calls [resolved by humans](/docs/ai-transport/messaging/human-in-the-loop) where users can review and approve tool execution before it happens - Generative UI: Build dynamic, contextual UI components based on the structured tool data -## Publishing tool calls +## Publish tool calls Publish tool call and model output messages to the channel. @@ -98,7 +98,7 @@ To learn how to stream individual tokens as they are generated, see the [token s Set [`echoMessages`](/docs/api/realtime-sdk/types#client-options) to `false` on the agent's Ably client to prevent the agent from receiving its own tool call messages, avoiding billing for [echoed messages](/docs/pub-sub/advanced#echo). -## Subscribing to tool calls +## Subscribe to tool calls Subscribe to tool call and model output messages on the channel. @@ -188,7 +188,7 @@ await channel.subscribe((message) => { ## Client-side tools diff --git a/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx b/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx index 4614747840..f762d4ae87 100644 --- a/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx @@ -317,7 +317,7 @@ await channel.subscribe((message) => { ## Adding roles and attributes diff --git a/src/pages/docs/ai-transport/sessions-identity/index.mdx b/src/pages/docs/ai-transport/sessions-identity/index.mdx index e1b9be5a05..828fdd6538 100644 --- a/src/pages/docs/ai-transport/sessions-identity/index.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/index.mdx @@ -15,7 +15,7 @@ A session is an interaction between a user (or multiple users) and an AI agent w - Recover from interruptions: Experience connection drops, browser refreshes, or network instability without losing conversation progress - Collaborate in shared sessions: Multiple users can participate in the same conversation simultaneously and remain in sync -These capabilities represent a fundamental shift from traditional request/response AI experiences to continuous, resumable interactions that remain accessible across all user devices and locations. Sessions have a lifecycle: they begin when a user starts interacting with an agent, remain active while the interaction continues, and can persist even when users disconnect - enabling truly asynchronous AI workflows. +These capabilities represent a fundamental shift from traditional request/response AI experiences to continuous, resumable interactions that are accessible across all user devices and locations. Sessions have a lifecycle: they begin when a user starts interacting with an agent, remain active while the interaction continues, and can persist even when users disconnect - enabling truly asynchronous AI workflows. 
Managing this lifecycle in AI Transport's decoupled architecture involves detecting when users are present, deciding when to stop or continue agent work, and handling scenarios where users disconnect and return. diff --git a/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx b/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx index e5e4fbd159..56c99c93d4 100644 --- a/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx @@ -37,9 +37,9 @@ For detailed examples of hydrating the token stream, see the token streaming doc When an agent restarts, it needs to resume from where it left off. This involves two distinct concerns: -1. **Recovering the agent's execution state**: The current step in the workflow, local variables, function call results, pending operations, and any other state needed to continue execution. This state is internal to the agent and typically not visible to users. +1. Recovering the agent's execution state: The current step in the workflow, local variables, function call results, pending operations, and any other state needed to continue execution. This state is internal to the agent and typically not visible to users. -2. **Catching up on session activity**: Any user messages, events, or other activity that occurred while the agent was offline. +2. Catching up on session activity: Any user messages, events, or other activity that occurred while the agent was offline. These are separate problems requiring different solutions. Agent execution state is handled by your application and you choose how to persist and restore the internal state your agent needs to resume. diff --git a/src/pages/docs/ai-transport/token-streaming/index.mdx b/src/pages/docs/ai-transport/token-streaming/index.mdx index b727224e71..9dfa3030b0 100644 --- a/src/pages/docs/ai-transport/token-streaming/index.mdx +++ b/src/pages/docs/ai-transport/token-streaming/index.mdx @@ -83,10 +83,11 @@ Example use cases: ## Message events -Different models and frameworks use different events to signal streaming state, for example start events, stop events, tool calls, and content deltas. When you publish a message to an Ably channel, you can set the [message name](/docs/messages#properties) to the event type your client expects, or encode the information in [message extras]((/docs/messages#properties)) or within the payload itself. This allows your frontend to handle each event type appropriately without parsing message content. +Different models and frameworks use different events to signal streaming state, for example start events, stop events, tool calls, and content deltas. When you publish a message to an Ably [channel](/docs/channels), you can set the [message name](/docs/messages#properties) to the event type your client expects, or encode the information in message [`extras`](/docs/messages#properties) or within the payload itself. This allows your frontend to handle each event type appropriately without parsing message content. 
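+
+As a minimal sketch, an agent might use the message name to carry the event type. The `start`/`token`/`stop` names and the `responseId` field below are illustrative conventions, not fixed by Ably:
+
+```javascript
+const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}');
+
+// Signal the start of a response, stream content deltas, then signal the end.
+await channel.publish('start', { responseId: 'resp_123' });
+await channel.publish('token', 'The James Webb');
+await channel.publish('token', ' Space Telescope launched in December 2021.');
+await channel.publish('stop', { responseId: 'resp_123' });
+```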
## Next steps

- Implement token streaming with [message-per-response](/docs/ai-transport/token-streaming/message-per-response) (recommended for most applications)
- Implement token streaming with [message-per-token](/docs/ai-transport/token-streaming/message-per-token) for sliding-window use cases
-- Explore the guides for integration with specific models and frameworks
+- Explore the [guides](/docs/guides/ai-transport/openai-message-per-response) for integration with specific models and frameworks
+- Learn about [sessions and identity](/docs/ai-transport/sessions-identity) in AI Transport applications
diff --git a/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx b/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
index 3442300f18..88d18426cc 100644
--- a/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
+++ b/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
@@ -7,7 +7,7 @@ Token streaming with message-per-response is a pattern where every token generat

This pattern is useful for chat-style applications where you want each complete AI response stored as a single message in history, making it easy to retrieve and display multi-response conversation history. Each agent response becomes a single message that grows as tokens are appended, allowing clients joining mid-stream to catch up efficiently without processing thousands of individual tokens.

-The message-per-response pattern includes [automatic rate limit protection](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) through rollups, making it the recommended approach for most token streaming use cases.
+The message-per-response pattern includes [automatic rate limit protection](/docs/ai-transport/token-streaming/token-rate-limits#per-response) through rollups, making it the recommended approach for most token streaming use cases.

## How it works

@@ -16,7 +16,7 @@ The message-per-response pattern includes [automatic rate limit protection](/doc
3. **Live delivery**: Clients subscribed to the channel receive each appended token in realtime, allowing them to progressively render the response.
4. **Compacted history**: The channel history contains only one message per agent response, which includes all tokens appended to it concatenated together.

-You do not need to mark the message or token stream as completed; the final message content will automatically include the full response constructed from all appended tokens.
+You do not need to mark the message or token stream as completed; the final message content automatically includes the full response constructed from all appended tokens.
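+
+As a client-side sketch, a subscriber can accumulate the stream by keying on the message `serial`. This assumes each `message.append` event carries the serial of the message it extends and the appended text in `message.data`:
+
+```javascript
+const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}');
+const responses = new Map();
+
+await channel.subscribe((message) => {
+  if (message.action === 'message.create') {
+    // A new response has started.
+    responses.set(message.serial, message.data);
+  } else if (message.action === 'message.append') {
+    // Concatenate each appended token onto the in-progress response.
+    const current = responses.get(message.serial) ?? '';
+    responses.set(message.serial, current + message.data);
+  } else if (message.action === 'message.update') {
+    // Updates deliver the full concatenated content, for example via rewind.
+    responses.set(message.serial, message.data);
+  }
+});
+```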