From d20367daf98aa570e95a0025fff3488edb801cc3 Mon Sep 17 00:00:00 2001
From: Mike Christensen
Date: Fri, 16 Jan 2026 14:53:12 +0000
Subject: [PATCH 1/2] docs: writing style guide

Adds additional friendly guidance to the writing style guide to stop LLMs
that consume it from falling back into bad habits.
---
 writing-style-guide.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/writing-style-guide.md b/writing-style-guide.md
index ee739e7c67..68054b1b2c 100644
--- a/writing-style-guide.md
+++ b/writing-style-guide.md
@@ -235,6 +235,7 @@ Note the following points:
* The list should always have a piece of text introducing the list followed by a colon, and then a blank line.
* Each sentence in the list is terminated by a full-stop (period).
* If each item in the list is a single word, a terminating period is not required.
+* Do not use bold formatting for prefixes in bullet points (for example, avoid patterns like "**Feature name:** description"), as this is a common indicator of AI-generated content.

## Codeblocks

@@ -244,6 +245,8 @@ When inserting example code in the text:
* Break the text before a codeblock with a colon, not a period (which is a hard stop in the mind of the reader, rather than a continuation).
* There should *not* be a space before the colon.
* Place a blank line after the colon and before the code block.
+* All headings must be followed by introductory text. Never place a code block, list, or other content immediately after a heading without explanatory text first.
+* For JavaScript and TypeScript code, prefer single quotes over double quotes for strings (excluding JSON, which must use double quotes per the specification).

## Acronyms

@@ -274,6 +277,16 @@ Make sure you write the correct case for product names:
* GitHub not Github
* macOS not Mac OS

+## Avoid AI-generated content fingerprints
+
+Technical documentation should maintain a natural, human writing style and avoid patterns commonly associated with AI-generated content:
+
+* Do not use em-dashes (—) in technical writing. Prefer standard hyphens (-) or restructure the sentence for better clarity.
+* Avoid bold prefixes in bullet points (for example, patterns like "**Feature:** Description" or "**Benefits:** Details"). This formatting style is a telltale sign of AI-generated content.
+* Avoid formulaic patterns and overly structured prose that may appear mechanical or template-driven.
+
+These guidelines help ensure documentation feels authentic and professionally written while maintaining readability and clarity.
+ ## Other considerations Some additional points to bear in mind: From 73f5eb985c94424829efc4ede3d4c98059561d71 Mon Sep 17 00:00:00 2001 From: Mike Christensen Date: Fri, 16 Jan 2026 15:21:52 +0000 Subject: [PATCH 2/2] ait: misc improvements --- .../javascript/README.md | 2 +- .../javascript/README.md | 4 +- src/pages/docs/ai-transport/index.mdx | 22 ++++--- .../messaging/chain-of-thought.mdx | 16 ++--- .../docs/ai-transport/messaging/citations.mdx | 58 +++++++++---------- .../ai-transport/messaging/tool-calls.mdx | 6 +- .../identifying-users-and-agents.mdx | 2 +- .../ai-transport/sessions-identity/index.mdx | 2 +- .../sessions-identity/resuming-sessions.mdx | 4 +- .../ai-transport/token-streaming/index.mdx | 5 +- .../token-streaming/message-per-response.mdx | 16 ++--- .../token-streaming/message-per-token.mdx | 16 ++--- .../token-rate-limits.mdx | 6 +- .../anthropic-message-per-response.mdx | 6 +- .../anthropic-message-per-token.mdx | 10 ++-- .../openai-message-per-response.mdx | 6 +- .../ai-transport/openai-message-per-token.mdx | 26 ++++----- 17 files changed, 106 insertions(+), 101 deletions(-) rename src/pages/docs/ai-transport/{messaging => token-streaming}/token-rate-limits.mdx (83%) diff --git a/examples/ai-transport-message-per-response/javascript/README.md b/examples/ai-transport-message-per-response/javascript/README.md index c0e3f41b66..ebb5760df6 100644 --- a/examples/ai-transport-message-per-response/javascript/README.md +++ b/examples/ai-transport-message-per-response/javascript/README.md @@ -18,7 +18,7 @@ Use the following methods to implement AI Transport message-per-response streami - [`channel.subscribe()`](/docs/channels#subscribe): subscribes to messages, handling `message.create`, `message.append`, and `message.update` actions. - [`channel.setOptions()`](/docs/channels/options) with [`rewind`](/docs/channels/options/rewind): enables seamless message recovery during reconnections, delivering historical messages as `message.update` events. -Find out more about [AI Transport](/docs/ai-transport) and [message appending](/docs/ai-transport/features/token-streaming/message-per-response). +Find out more about [AI Transport](/docs/ai-transport) and [message appending](/docs/ai-transport/token-streaming/message-per-response). ## Getting started diff --git a/examples/ai-transport-message-per-token/javascript/README.md b/examples/ai-transport-message-per-token/javascript/README.md index 1a2ade8dec..ffed30477a 100644 --- a/examples/ai-transport-message-per-token/javascript/README.md +++ b/examples/ai-transport-message-per-token/javascript/README.md @@ -17,7 +17,7 @@ Use the following methods to implement AI Transport token streaming: - [`channel.publish()`](/docs/channels#publish): publishes individual tokens as they arrive from the LLM service with response tracking headers. - [`channel.history()`](/docs/channels/history) with [`untilAttach`](/docs/channels/options#attach): enables seamless message recovery during reconnections, ensuring no tokens are lost. -Find out more about [AI Transport](/docs/ai-transport) and [message history](/docs/channels/history). +Find out more about [AI Transport](/docs/ai-transport), [token streaming](/docs/ai-transport/token-streaming), and [message history](/docs/storage-history/history). 
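+
+A minimal sketch of the publish-and-recover flow, assuming an illustrative `responseId` header name and `resp_123` identifier (your own conventions, not part of the Ably API):
+
+```javascript
+import * as Ably from 'ably';
+
+const realtime = new Ably.Realtime({ key: 'your-ably-api-key' });
+const channel = realtime.channels.get('ai:response');
+
+// Publish each token with a header that ties it to its response.
+await channel.publish({
+  name: 'token',
+  data: 'Hello',
+  extras: { headers: { responseId: 'resp_123' } }
+});
+
+// After reconnecting, fetch any messages sent before the channel attached,
+// so no tokens are missed.
+await channel.attach();
+const missed = await channel.history({ untilAttach: true });
+```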
## Getting started @@ -57,4 +57,4 @@ Find out more about [AI Transport](/docs/ai-transport) and [message history](/do ## Open in CodeSandbox -In CodeSandbox, rename the `.env.example` file to `.env.local` and update the value of your `VITE_ABLY_KEY` variable to use your Ably API key. \ No newline at end of file +In CodeSandbox, rename the `.env.example` file to `.env.local` and update the value of your `VITE_ABLY_KEY` variable to use your Ably API key. diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx index 0c906e9340..ceec10b83a 100644 --- a/src/pages/docs/ai-transport/index.mdx +++ b/src/pages/docs/ai-transport/index.mdx @@ -6,7 +6,7 @@ meta_description: "Learn more about Ably's AI Transport and the features that en AI Transport enables you to add a realtime delivery layer to your application, providing the infrastructure required to deliver modern, stateful AI experiences to users. It works seamlessly with any AI model or framework, such as OpenAI, Anthropic, Vercel or LangChain. -AI Transport runs on Ably's [fault-tolerant](/docs/platform/architecture/fault-tolerance) and highly-available platform. The platform enables data to be streamed between all internet-connected devices at [low latencies](/docs/platform/architecture/latency) across the globe. Its elastic global infrastructure delivers enterprise-scale messaging that [effortlessly scales](/docs/platform/architecture/platform-scalability) to meet demand. +AI Transport runs on Ably's [fault-tolerant](/docs/platform/architecture/fault-tolerance) and highly-available platform. The platform supports streaming data between all internet-connected devices at [low latencies](/docs/platform/architecture/latency) across the globe. Its elastic global infrastructure delivers enterprise-scale messaging that [effortlessly scales](/docs/platform/architecture/platform-scalability) to meet demand. Drop AI Transport into your applications to transform them into modern, bi-directional AI experiences that keep users engaged. AI Transport provides the building blocks to deliver reliable, resumable token streams with robust session management and state hydration to always keep your users and agents in sync. @@ -18,6 +18,8 @@ Start learning the basics of AI Transport right away with a getting started guid ### OpenAI +Use the following guides to get started with OpenAI: + {[ { @@ -37,6 +39,8 @@ Start learning the basics of AI Transport right away with a getting started guid ### Anthropic +Use the following guides to get started with Anthropic: + {[ { @@ -65,7 +69,7 @@ Token streaming is the core of how LLMs deliver their responses to users. Tokens Using AI Transport, your token streams are reliable and persistent. They survive modern environments where users change browser tabs, refresh the page or switch devices, and common interruptions such as temporary network loss. Your users can always reconnect and continue where they left off without having to start over. -[Read more about token streaming](/docs/ai-transport/features/token-streaming). +[Read more about token streaming](/docs/ai-transport/token-streaming). 
### Bi-directional communication @@ -73,9 +77,9 @@ AI Transport supports rich, bi-directional communication patterns between users Build sophisticated AI experiences with features like accepting user input for interactive conversations, streaming chain-of-thought reasoning for transparency, attaching citations to responses for verifiability, implementing human-in-the-loop workflows for sensitive operations, and exposing tool calls for generative UI and visibility. -These messaging features work seamlessly with token streaming to create complete, interactive AI experiences. +These messaging features work seamlessly with [token streaming](/docs/ai-transport/token-streaming) to create complete, interactive AI experiences. -[Read more about messaging features](/docs/ai-transport/features/messaging). +[Read more about messaging features](/docs/ai-transport/messaging/accepting-user-input). ### Durable sessions @@ -85,25 +89,25 @@ Communication shouldn't be tied to the connection state of either party. If a us Your users can start a conversation on their mobile and seamlessly continue it on their desktop. Similarly, multiple users can participate in the same conversation with a single agent and they will all remain in sync, in realtime. -[Read more about sessions and identity](/docs/ai-transport/features/sessions-identity). +[Read more about sessions and identity](/docs/ai-transport/sessions-identity). ### Automatic catch-up -AI Transport enables clients to hydrate conversation and session state from the channel, including message history and in-progress responses. +AI Transport enables clients to hydrate conversation and session state from the [channel](/docs/channels), including [message history](/docs/storage-history/history) and in-progress responses. Whether a user is briefly disconnected when they drive through a tunnel, or they're rejoining a conversation the following day of work, AI Transport allows clients to resynchronise the full conversation state, including both historical messages and in-progress responses. Your users are always up to date with the full conversation, in order, anywhere. -[Read more about client hydration](/docs/ai-transport/features/token-streaming/message-per-response#hydration). +[Read more about client hydration](/docs/ai-transport/token-streaming/message-per-response#hydration). ### Background processing AI Transport allows agents to process jobs in the background while users go offline, with full awareness of their online status through realtime presence tracking. -Users can work asynchronously by prompting an agent to perform a task without having to monitor its progress. They can go offline and receive a push notification when the agent has completed the task, or reconnect at any time to seamlessly resume and see all progress made while they were away using [state hydration](#hydration). +Users can work asynchronously by prompting an agent to perform a task without having to monitor its progress. They can go offline and receive a push notification when the agent has completed the task, or reconnect at any time to seamlessly resume and see all progress made while they were away using [state hydration](#catch-up). It also puts you in control of how you manage your application when there aren't any users online. For example, you can choose whether to pause a conversation when a user exits their browser tab, or allow the agent to complete its response for the user to view when they return. 
-[Read more about status-aware cost controls](/docs/ai-transport/features/sessions-identity/online-status). +[Read more about status-aware cost controls](/docs/ai-transport/sessions-identity/online-status). ### Enterprise controls diff --git a/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx b/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx index 7c76eecea3..15c40d0dc0 100644 --- a/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx +++ b/src/pages/docs/ai-transport/messaging/chain-of-thought.mdx @@ -26,15 +26,15 @@ As an application developer, you decide how to surface chain-of-thought reasonin ### Inline pattern -In the inline pattern, reasoning messages are published to the same channel as model output messages. +In the inline pattern, agents publish reasoning messages to the same channel as model output messages. By publishing all content to a single channel, the inline pattern: - Simplifies channel management by consolidating all conversation content in one place -- Maintains relative order of reasoning and model output messages as they are generated +- Maintains relative order of reasoning and model output messages as the model generates them - Supports retrieving reasoning and response messages together from history -#### Publishing +#### Publish Publish both reasoning and model output messages to a single channel. @@ -86,7 +86,7 @@ To learn how to stream individual tokens as they are generated, see the [token s Set [`echoMessages`](/docs/api/realtime-sdk/types#client-options) to `false` on the agent's Ably client to prevent the agent from receiving its own reasoning and output messages, avoiding billing for [echoed messages](/docs/pub-sub/advanced#echo). -#### Subscribing +#### Subscribe Subscribe to both reasoning and model output messages on the same channel. @@ -140,7 +140,7 @@ To learn about hydrating responses from channel history, including using `rewind ### Threading pattern -In the threading pattern, reasoning messages are published to a separate channel from model output messages. The reasoning channel name is communicated to clients, allowing them to discover where to obtain reasoning content on demand. +In the threading pattern, agents publish reasoning messages to a separate channel from model output messages. The reasoning channel name is communicated to clients, allowing them to discover where to obtain reasoning content on demand. By separating reasoning into its own channel, the threading pattern: @@ -148,11 +148,11 @@ By separating reasoning into its own channel, the threading pattern: - Reduces bandwidth usage by delivering reasoning messages only when users choose to view them - Works well for long reasoning threads, where not all the detail needs to be immediately surfaced to the user, but is helpful to see on demand -#### Publishing +#### Publish Publish model output messages to the main conversation channel and reasoning messages to a dedicated reasoning channel. The reasoning channel name includes the response ID, creating a unique reasoning channel per response. -In the example below, a `start` control message is sent on the main channel at the beginning of each response, which includes the response ID in the message [`extras`](/docs/api/realtime-sdk/messages#extras). 
Clients can derive the reasoning channel name from the response ID, allowing them to discover and subscribe to the stream of reasoning messages on demand: +In the example below, the agent sends a `start` control message on the main channel at the beginning of each response, which includes the response ID in the message [`extras`](/docs/api/realtime-sdk/messages#extras). Clients can derive the reasoning channel name from the response ID, allowing them to discover and subscribe to the stream of reasoning messages on demand: -#### Subscribing +#### Subscribe Subscribe to the main conversation channel to receive control messages and model output. Subscribe to the reasoning channel on demand, for example in response to a click event. diff --git a/src/pages/docs/ai-transport/messaging/citations.mdx b/src/pages/docs/ai-transport/messaging/citations.mdx index 1944200959..53d473b0be 100644 --- a/src/pages/docs/ai-transport/messaging/citations.mdx +++ b/src/pages/docs/ai-transport/messaging/citations.mdx @@ -31,7 +31,7 @@ Use [message annotations](/docs/messages/annotations) to attach source metadata Message append functionality requires "Message annotations, updates, deletes and appends" to be enabled in a [channel rule](/docs/channels#rules) associated with the channel. To enable the channel rule: @@ -39,7 +39,7 @@ To enable the channel rule: 1. Go to the [Ably dashboard](https://www.ably.com/dashboard) and select your app. 2. Navigate to the "Configuration" > "Rules" section from the left-hand navigation bar. 3. Choose "Add new rule". -4. Enter a channel name or namespace pattern (e.g. `ai` for all channels starting with `ai:`). +4. Enter a channel name or namespace pattern (for example, `ai` for all channels starting with `ai:`). 5. Select the "Message annotations, updates, deletes and appends" option from the list. 6. Click "Create channel rule". @@ -94,49 +94,49 @@ In this example: Including character offsets in annotation data allow UIs to attach inline citation markers to specific portions of the response text. -## Publishing citations +## Publish citations Agents create citations by publishing [message annotations](/docs/messages/annotations) that reference the [`serial`](/docs/messages#properties) of the response message: ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}"); +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}'); // Publish the AI response message -const response = "The James Webb Space Telescope launched in December 2021 and its first images were released in July 2022."; -const { serials: [msgSerial] } = await channel.publish("response", response); +const response = 'The James Webb Space Telescope launched in December 2021 and its first images were released in July 2022.'; +const { serials: [msgSerial] } = await channel.publish('response', response); // Add citations by annotating the response message await channel.annotations.publish(msgSerial, { - type: "citations:multiple.v1", - name: "science.nasa.gov", + type: 'citations:multiple.v1', + name: 'science.nasa.gov', data: { - url: "https://science.nasa.gov/mission/webb/", - title: "James Webb Space Telescope - NASA Science", + url: 'https://science.nasa.gov/mission/webb/', + title: 'James Webb Space Telescope - NASA Science', startOffset: 43, endOffset: 56, - snippet: "Webb launched on Dec. 25th 2021" + snippet: 'Webb launched on Dec. 
25th 2021' } }); await channel.annotations.publish(msgSerial, { - type: "citations:multiple.v1", - name: "en.wikipedia.org", + type: 'citations:multiple.v1', + name: 'en.wikipedia.org', data: { - url: "https://en.wikipedia.org/wiki/James_Webb_Space_Telescope", - title: "James Webb Space Telescope - Wikipedia", + url: 'https://en.wikipedia.org/wiki/James_Webb_Space_Telescope', + title: 'James Webb Space Telescope - Wikipedia', startOffset: 95, endOffset: 104, - snippet: "The telescope's first image was released to the public on 11 July 2022." + snippet: 'The telescope\'s first image was released to the public on 11 July 2022.' } }); ``` -## Subscribing to summaries +## Subscribe to summaries Clients can display a summary of the citations attached to a response by using [annotation summaries](/docs/messages/annotations#annotation-summaries). Clients receive realtime updates to annotation summaries automatically when subscribing to a channel, which are [delivered as messages](/docs/messages/annotations#subscribe) with an `action` of `message.summary`. When using [`multiple.v1`](/docs/messages/annotations#multiple) summarization, counts are grouped by the annotation `name`. @@ -160,13 +160,13 @@ In the example below, the `name` is set to the domain name of the citation sourc ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}"); +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}'); await channel.subscribe((message) => { - if (message.action === "message.summary") { - const citations = message.annotations.summary["citations:multiple.v1"]; + if (message.action === 'message.summary') { + const citations = message.annotations.summary['citations:multiple.v1']; if (citations) { - console.log("Citation summary:", citations); + console.log('Citation summary:', citations); } } }); @@ -208,19 +208,19 @@ When agents publish citations with a [`clientId`](/docs/auth/identified-clients) The `clipped` field indicates whether the summary was truncated due to size limits. This only occurs when a large number of clients with distinct `clientId`s publish annotations. See [large summaries](/docs/messages/annotations#large-summaries) for more information. -## Subscribing to individual citations +## Subscribe to individual citations To access the full citation data, subscribe to [individual annotation events](/docs/messages/annotations#individual-annotations): ```javascript -const channel = realtime.channels.get("ai:{{RANDOM_CHANNEL_NAME}}", { - modes: ["ANNOTATION_SUBSCRIBE"] +const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}', { + modes: ['ANNOTATION_SUBSCRIBE'] }); await channel.annotations.subscribe((annotation) => { - if (annotation.action === "annotation.create" && - annotation.type === "citations:multiple.v1") { + if (annotation.action === 'annotation.create' && + annotation.type === 'citations:multiple.v1') { const { url, title } = annotation.data; console.log(`Citation: ${title} (${url})`); // Output: Citation: James Webb Space Telescope - Wikipedia (https://en.wikipedia.org/wiki/James_Webb_Space_Telescope) @@ -254,7 +254,7 @@ Each annotation event includes the `messageSerial` of the response message it is Subscribe to individual annotation events when you need the full citation data updated in realtime, such as for rendering clickable source links or attaching inline citation markers to specific portions of the response text as citations arrive. 
-## Retrieving citations on demand +## Retrieve citations on demand Annotations can also be retrieved via the [REST API](/docs/api/rest-api#annotations-list) without maintaining a realtime subscription. diff --git a/src/pages/docs/ai-transport/messaging/tool-calls.mdx b/src/pages/docs/ai-transport/messaging/tool-calls.mdx index d987d2bb26..15de573ff2 100644 --- a/src/pages/docs/ai-transport/messaging/tool-calls.mdx +++ b/src/pages/docs/ai-transport/messaging/tool-calls.mdx @@ -24,7 +24,7 @@ Surfacing tool calls supports: - Human-in-the-loop workflows: Expose tool calls [resolved by humans](/docs/ai-transport/messaging/human-in-the-loop) where users can review and approve tool execution before it happens - Generative UI: Build dynamic, contextual UI components based on the structured tool data -## Publishing tool calls +## Publish tool calls Publish tool call and model output messages to the channel. @@ -98,7 +98,7 @@ To learn how to stream individual tokens as they are generated, see the [token s Set [`echoMessages`](/docs/api/realtime-sdk/types#client-options) to `false` on the agent's Ably client to prevent the agent from receiving its own tool call messages, avoiding billing for [echoed messages](/docs/pub-sub/advanced#echo). -## Subscribing to tool calls +## Subscribe to tool calls Subscribe to tool call and model output messages on the channel. @@ -188,7 +188,7 @@ await channel.subscribe((message) => { ## Client-side tools diff --git a/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx b/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx index 4614747840..f762d4ae87 100644 --- a/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/identifying-users-and-agents.mdx @@ -317,7 +317,7 @@ await channel.subscribe((message) => { ## Adding roles and attributes diff --git a/src/pages/docs/ai-transport/sessions-identity/index.mdx b/src/pages/docs/ai-transport/sessions-identity/index.mdx index e1b9be5a05..828fdd6538 100644 --- a/src/pages/docs/ai-transport/sessions-identity/index.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/index.mdx @@ -15,7 +15,7 @@ A session is an interaction between a user (or multiple users) and an AI agent w - Recover from interruptions: Experience connection drops, browser refreshes, or network instability without losing conversation progress - Collaborate in shared sessions: Multiple users can participate in the same conversation simultaneously and remain in sync -These capabilities represent a fundamental shift from traditional request/response AI experiences to continuous, resumable interactions that remain accessible across all user devices and locations. Sessions have a lifecycle: they begin when a user starts interacting with an agent, remain active while the interaction continues, and can persist even when users disconnect - enabling truly asynchronous AI workflows. +These capabilities represent a fundamental shift from traditional request/response AI experiences to continuous, resumable interactions that are accessible across all user devices and locations. Sessions have a lifecycle: they begin when a user starts interacting with an agent, remain active while the interaction continues, and can persist even when users disconnect - enabling truly asynchronous AI workflows. 
Managing this lifecycle in AI Transport's decoupled architecture involves detecting when users are present, deciding when to stop or continue agent work, and handling scenarios where users disconnect and return. diff --git a/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx b/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx index e5e4fbd159..56c99c93d4 100644 --- a/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx +++ b/src/pages/docs/ai-transport/sessions-identity/resuming-sessions.mdx @@ -37,9 +37,9 @@ For detailed examples of hydrating the token stream, see the token streaming doc When an agent restarts, it needs to resume from where it left off. This involves two distinct concerns: -1. **Recovering the agent's execution state**: The current step in the workflow, local variables, function call results, pending operations, and any other state needed to continue execution. This state is internal to the agent and typically not visible to users. +1. Recovering the agent's execution state: The current step in the workflow, local variables, function call results, pending operations, and any other state needed to continue execution. This state is internal to the agent and typically not visible to users. -2. **Catching up on session activity**: Any user messages, events, or other activity that occurred while the agent was offline. +2. Catching up on session activity: Any user messages, events, or other activity that occurred while the agent was offline. These are separate problems requiring different solutions. Agent execution state is handled by your application and you choose how to persist and restore the internal state your agent needs to resume. diff --git a/src/pages/docs/ai-transport/token-streaming/index.mdx b/src/pages/docs/ai-transport/token-streaming/index.mdx index b727224e71..9dfa3030b0 100644 --- a/src/pages/docs/ai-transport/token-streaming/index.mdx +++ b/src/pages/docs/ai-transport/token-streaming/index.mdx @@ -83,10 +83,11 @@ Example use cases: ## Message events -Different models and frameworks use different events to signal streaming state, for example start events, stop events, tool calls, and content deltas. When you publish a message to an Ably channel, you can set the [message name](/docs/messages#properties) to the event type your client expects, or encode the information in [message extras]((/docs/messages#properties)) or within the payload itself. This allows your frontend to handle each event type appropriately without parsing message content. +Different models and frameworks use different events to signal streaming state, for example start events, stop events, tool calls, and content deltas. When you publish a message to an Ably [channel](/docs/channels), you can set the [message name](/docs/messages#properties) to the event type your client expects, or encode the information in message [`extras`](/docs/messages#properties) or within the payload itself. This allows your frontend to handle each event type appropriately without parsing message content. 
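+
+As a minimal sketch, an agent might use the message name to carry the event type. The `start`/`token`/`stop` names and the `responseId` field below are illustrative conventions, not fixed by Ably:
+
+```javascript
+const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}');
+
+// Signal the start of a response, stream content deltas, then signal the end.
+await channel.publish('start', { responseId: 'resp_123' });
+await channel.publish('token', 'The James Webb');
+await channel.publish('token', ' Space Telescope launched in December 2021.');
+await channel.publish('stop', { responseId: 'resp_123' });
+```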
## Next steps

- Implement token streaming with [message-per-response](/docs/ai-transport/token-streaming/message-per-response) (recommended for most applications)
- Implement token streaming with [message-per-token](/docs/ai-transport/token-streaming/message-per-token) for sliding-window use cases
-- Explore the guides for integration with specific models and frameworks
+- Explore the [guides](/docs/guides/ai-transport/openai-message-per-response) for integration with specific models and frameworks
+- Learn about [sessions and identity](/docs/ai-transport/sessions-identity) in AI Transport applications
diff --git a/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx b/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
index 3442300f18..88d18426cc 100644
--- a/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
+++ b/src/pages/docs/ai-transport/token-streaming/message-per-response.mdx
@@ -7,7 +7,7 @@ Token streaming with message-per-response is a pattern where every token generat

This pattern is useful for chat-style applications where you want each complete AI response stored as a single message in history, making it easy to retrieve and display multi-response conversation history. Each agent response becomes a single message that grows as tokens are appended, allowing clients joining mid-stream to catch up efficiently without processing thousands of individual tokens.

-The message-per-response pattern includes [automatic rate limit protection](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) through rollups, making it the recommended approach for most token streaming use cases.
+The message-per-response pattern includes [automatic rate limit protection](/docs/ai-transport/token-streaming/token-rate-limits#per-response) through rollups, making it the recommended approach for most token streaming use cases.

## How it works

@@ -16,7 +16,7 @@ The message-per-response pattern includes [automatic rate limit protection](/doc
3. **Live delivery**: Clients subscribed to the channel receive each appended token in realtime, allowing them to progressively render the response.
4. **Compacted history**: The channel history contains only one message per agent response, which includes all tokens appended to it concatenated together.

-You do not need to mark the message or token stream as completed; the final message content will automatically include the full response constructed from all appended tokens.
+You do not need to mark the message or token stream as completed; the final message content automatically includes the full response constructed from all appended tokens.
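+
+As a client-side sketch, a subscriber can accumulate the stream by keying on the message `serial`. This assumes each `message.append` event carries the serial of the message it extends and the appended text in `message.data`:
+
+```javascript
+const channel = realtime.channels.get('ai:{{RANDOM_CHANNEL_NAME}}');
+const responses = new Map();
+
+await channel.subscribe((message) => {
+  if (message.action === 'message.create') {
+    // A new response has started.
+    responses.set(message.serial, message.data);
+  } else if (message.action === 'message.append') {
+    // Concatenate each appended token onto the in-progress response.
+    const current = responses.get(message.serial) ?? '';
+    responses.set(message.serial, current + message.data);
+  } else if (message.action === 'message.update') {
+    // Updates deliver the full concatenated content, for example via rewind.
+    responses.set(message.serial, message.data);
+  }
+});
+```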