From 1753aac4652e1b772daa29acbc932816ca74dd3d Mon Sep 17 00:00:00 2001
From: "global.eye.wu" <global.eye.wu@gmail.com>
Date: Thu, 14 May 2026 15:59:36 +0800
Subject: [PATCH 1/2] chore(ci): bump actions/checkout + actions/setup-node to
 @v5

GitHub annotated PR #6 with a deprecation warning: the @v4 versions
of these actions run on Node.js 20 internally, which GitHub is
phasing out as the runner default moves to Node.js 24. The @v5
versions are functionally identical for our usage but are built on
the new runtime.

No workflow behavior changes; cache and matrix config untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .github/workflows/ci.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 5d9e549..5621549 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -15,10 +15,10 @@ jobs:
       matrix:
         os: [ubuntu-latest, macos-latest, windows-latest]
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
 
       - name: Setup Node.js
-        uses: actions/setup-node@v4
+        uses: actions/setup-node@v5
         with:
           node-version: 20
           cache: npm

From 4eaae2967d4fe55c63ce0c0b8940fe441a1b5886 Mon Sep 17 00:00:00 2001
From: "global.eye.wu" <global.eye.wu@gmail.com>
Date: Thu, 14 May 2026 16:08:26 +0800
Subject: [PATCH 2/2] feat(model): enable Anthropic prompt caching end-to-end
 (M1.5b)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The wiring built in M1.5a (cache token accumulators, cost factors,
`myagent usage` CLI) now has something to count. This change toggles
the request side from "no cache_control anywhere" to "two cache
breakpoints: system + tools".

Type extensions:

- New `SystemTextBlock` in `model.ts`: `{type:"text", text, cache_control?}`.
- `ModelRequest.system` now accepts `string | readonly SystemTextBlock[]`.
- `ToolContext.system` and `QueryOptions.system` propagated the same way.
- `ForkTrace.systemPrompt` accepts both forms and hashes their text
  content, so the fork-trace identity stays stable across the legacy
  flat-string and structured-array representations.

Outbound request shape:

- `buildAgentSystemPrompt` (in cli/src/index.ts) returns a single
  `SystemTextBlock` containing base prompt + memory + skill context,
  marked `cache_control: { type: "ephemeral" }`. Identical content
  across every turn of a session → cache hit on every turn after the
  first.

- `toAnthropicTools` (in core/src/anthropic.ts) marks the *last* tool
  in the list with `cache_control: ephemeral`, turning the whole tool
  list into a single cache breakpoint. Tool definitions are stable
  across turns by construction, so the breakpoint reliably hits.

- `toAnthropicTools` and `toModelUsage` are now exported so the
  security suite can unit-test them.

Response parsing:

- `toModelUsage` extracts `cache_creation_input_tokens` and
  `cache_read_input_tokens` from the SDK's `message_start.message.usage`
  and `message_delta.usage`. Both fields are optional; non-cached
  turns leave them `undefined`, which `addTokenUsage` already treats
  as zero.

- `runAgentTurn` emits per-turn profile metrics
  `model.cache_creation_input_tokens` /
  `model.cache_read_input_tokens` and per-session counterparts
  `session.cache_creation_input_tokens` /
  `session.cache_read_input_tokens`.

Tests added in `packages/core/test/security/prompt-caching.test.ts`
(6 cases on `toAnthropicTools` + `toModelUsage`) and a CLI assertion
that the agent's outbound `request.system` is the structured form
with a `cache_control` marker. Catalog row added; CLAUDE.md updated.

Two pre-existing cli tests captured `request.system` as a string;
extracted a `systemToText` helper to flatten the array form during
assertions.

Also bundled the chore from PR #6's deprecation annotation:
actions/checkout and actions/setup-node bumped from @v4 to @v5.

Local: 161 tests, 3/3 runs green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 CLAUDE.md                                     |  2 +-
 packages/cli/src/index.ts                     | 45 +++++++++++-
 packages/cli/test/cli.test.ts                 | 52 ++++++++++++-
 packages/core/src/anthropic.ts                | 57 +++++++++++----
 packages/core/src/fork.ts                     | 15 +++-
 packages/core/src/model.ts                    | 19 ++++-
 packages/core/src/query.ts                    |  7 +-
 packages/core/src/types.ts                    |  4 +-
 packages/core/test/security/README.md         | 10 +++
 .../core/test/security/prompt-caching.test.ts | 73 +++++++++++++++++++
 10 files changed, 257 insertions(+), 27 deletions(-)
 create mode 100644 packages/core/test/security/prompt-caching.test.ts

diff --git a/CLAUDE.md b/CLAUDE.md
index f7f0507..1bdeda5 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -25,7 +25,7 @@ The CLI binary entry is `packages/cli/dist/index.js` (exposed as `myagent`). The
 ### Environment
 
 - `ANTHROPIC_API_KEY` — required for real model calls (`chat`, `agent`, `tui`). Read from process env or a local `.env` (parsed by `loadEnvironment` in `packages/cli/src/index.ts`; only an allow-listed set of keys is honored).
-- `ANTHROPIC_BASE_URL`, `MYAGENT_MODEL`, `MYAGENT_PERMISSION_MODE`, `MYAGENT_INPUT_USD_PER_MTOK`, `MYAGENT_OUTPUT_USD_PER_MTOK`, `MYAGENT_CACHE_WRITE_USD_PER_MTOK`, `MYAGENT_CACHE_READ_USD_PER_MTOK` — optional overrides. The two cache-rate vars feed `estimateUsageCostUsd` and surface in `myagent usage <sessionId>` once prompt caching is enabled in M1.5b.
+- `ANTHROPIC_BASE_URL`, `MYAGENT_MODEL`, `MYAGENT_PERMISSION_MODE`, `MYAGENT_INPUT_USD_PER_MTOK`, `MYAGENT_OUTPUT_USD_PER_MTOK`, `MYAGENT_CACHE_WRITE_USD_PER_MTOK`, `MYAGENT_CACHE_READ_USD_PER_MTOK` — optional overrides. Prompt caching is wired on outbound requests: the agent's system prompt is sent as a single `SystemTextBlock` with `cache_control: ephemeral`, and the tool list's last entry carries a matching marker so the whole tool block is cached. `cacheCreationInputTokens` / `cacheReadInputTokens` flow back through `ModelUsage` → `TokenUsage` → session record → `myagent usage <sessionId>` per-turn breakdown.
 - Offline tests use `FakeModel` and do not need an API key.
 - Runtime state (sessions, artifacts, profiles, tasks, fork traces, memory) is written under `.myagent/` in the cwd; gitignored.
 
diff --git a/packages/cli/src/index.ts b/packages/cli/src/index.ts
index e833973..09a24b4 100644
--- a/packages/cli/src/index.ts
+++ b/packages/cli/src/index.ts
@@ -46,6 +46,7 @@ import {
   type MemoryEntry,
   type SessionCompactionArchiver,
   type SessionEvent,
+  type SystemTextBlock,
   type ModelClient,
   type ModelStreamEvent,
   type PermissionDecision,
@@ -1576,6 +1577,18 @@ async function runAgentTurn(options: RunAgentTurnOptions): Promise<AgentTurnResu
           profile.addMetric("model.output_tokens", event.usage.outputTokens ?? 0, "tokens", {
             requestId: event.requestId
           });
+          profile.addMetric(
+            "model.cache_creation_input_tokens",
+            event.usage.cacheCreationInputTokens ?? 0,
+            "tokens",
+            { requestId: event.requestId }
+          );
+          profile.addMetric(
+            "model.cache_read_input_tokens",
+            event.usage.cacheReadInputTokens ?? 0,
+            "tokens",
+            { requestId: event.requestId }
+          );
           profile.addMetric("model.cost_usd", costDelta, "usd", {
             requestId: event.requestId,
             estimated: true
@@ -1638,6 +1651,16 @@ async function runAgentTurn(options: RunAgentTurnOptions): Promise<AgentTurnResu
     const finalState = getBootstrapState();
     profile.addMetric("session.input_tokens", finalState.tokenUsage.inputTokens, "tokens");
     profile.addMetric("session.output_tokens", finalState.tokenUsage.outputTokens, "tokens");
+    profile.addMetric(
+      "session.cache_creation_input_tokens",
+      finalState.tokenUsage.cacheCreationInputTokens,
+      "tokens"
+    );
+    profile.addMetric(
+      "session.cache_read_input_tokens",
+      finalState.tokenUsage.cacheReadInputTokens,
+      "tokens"
+    );
     profile.addMetric("session.cost_usd", finalState.costUsd, "usd", { estimated: true });
     await profileStore.save(profile.finish("completed")).catch(() => undefined);
     return { exitCode: 0, sessionId: bootstrap.sessionId };
@@ -1856,10 +1879,28 @@ function parseOptionalNumber(value: string | undefined): number | undefined {
   return Number.isFinite(parsed) && parsed >= 0 ? parsed : undefined;
 }
 
-function buildAgentSystemPrompt(memoryContext: string, skillContext: string): string {
-  return [READ_ONLY_AGENT_SYSTEM_PROMPT, memoryContext.trim(), skillContext.trim()]
+/**
+ * Returns the agent's system prompt as a structured block array so we
+ * can mark it as a prompt-cache breakpoint. The combined content is
+ * placed in a single text block with `cache_control: ephemeral`, which
+ * tells Anthropic to cache the entire system prompt: identical reuse
+ * across every turn of a session, since memory + skill snapshots are
+ * captured once at session start.
+ */
+function buildAgentSystemPrompt(
+  memoryContext: string,
+  skillContext: string
+): readonly SystemTextBlock[] {
+  const combined = [READ_ONLY_AGENT_SYSTEM_PROMPT, memoryContext.trim(), skillContext.trim()]
     .filter((part) => part.length > 0)
     .join("\n\n");
+  return [
+    {
+      type: "text",
+      text: combined,
+      cache_control: { type: "ephemeral" }
+    }
+  ];
 }
 
 const READ_ONLY_AGENT_SYSTEM_PROMPT = `You are myagent Week 18, a safety-first coding agent.
diff --git a/packages/cli/test/cli.test.ts b/packages/cli/test/cli.test.ts
index 38f1e99..f5be100 100644
--- a/packages/cli/test/cli.test.ts
+++ b/packages/cli/test/cli.test.ts
@@ -26,6 +26,12 @@ function captureWriter() {
   };
 }
 
+function systemToText(system: string | ReadonlyArray<{ type: "text"; text: string }> | undefined): string {
+  if (system === undefined) return "";
+  if (typeof system === "string") return system;
+  return system.map((block) => block.text).join("\n\n");
+}
+
 describe("myagent cli", () => {
   it("prints version without starting agent runtime", async () => {
     const stdout = captureWriter();
@@ -150,7 +156,7 @@ describe("myagent cli", () => {
         };
       },
       async *stream(request) {
-        systems.push(request.system ?? "");
+        systems.push(systemToText(request.system));
         yield {
           type: "assistant_message",
           message: {
@@ -227,7 +233,7 @@ describe("myagent cli", () => {
             };
           },
           async *stream(request) {
-            systems.push(request.system ?? "");
+            systems.push(systemToText(request.system));
             yield {
               type: "assistant_message",
               message: { role: "assistant", content: "Use real DB integration fixtures." },
@@ -525,6 +531,48 @@ describe("myagent cli", () => {
     expect(record.events.at(-1)?.type).toBe("compact");
   });
 
+  it("sends the agent's system prompt as a structured block with cache_control", async () => {
+    const cwd = mkdtempSync(join(tmpdir(), "myagent-cli-cache-system-"));
+    let capturedSystem: unknown;
+    const stdout = captureWriter();
+    const stderr = captureWriter();
+
+    const exitCode = await runCli(["agent", "summarize", "fixture"], stdout.writer, stderr.writer, {
+      cwd,
+      env: {},
+      createModelClient: () =>
+        ({
+          async create() {
+            return {
+              message: { role: "assistant", content: "ok" },
+              requestId: "req_cache"
+            };
+          },
+          async *stream(request) {
+            capturedSystem = request.system;
+            yield {
+              type: "assistant_message",
+              message: { role: "assistant", content: "fixture summary" },
+              requestId: "req_cache"
+            };
+          }
+        }) satisfies ModelClient
+    });
+
+    expect(exitCode).toBe(0);
+    expect(stderr.text()).toBe("");
+    expect(Array.isArray(capturedSystem)).toBe(true);
+    const systemBlocks = capturedSystem as Array<{
+      type: string;
+      text: string;
+      cache_control?: { type: string };
+    }>;
+    expect(systemBlocks).toHaveLength(1);
+    expect(systemBlocks[0]?.type).toBe("text");
+    expect(systemBlocks[0]?.text).toContain("safety-first coding agent");
+    expect(systemBlocks[0]?.cache_control).toEqual({ type: "ephemeral" });
+  });
+
   it("prints per-turn token + cost breakdown via myagent usage", async () => {
     const cwd = mkdtempSync(join(tmpdir(), "myagent-cli-usage-"));
     const sessionRootDir = join(cwd, ".myagent", "sessions");
diff --git a/packages/core/src/anthropic.ts b/packages/core/src/anthropic.ts
index 3869bfa..114f9d4 100644
--- a/packages/core/src/anthropic.ts
+++ b/packages/core/src/anthropic.ts
@@ -161,8 +161,19 @@ export class AnthropicModelClient implements ModelClient {
           stop_reason?: string | null;
           partial_json?: string;
         };
-        message?: { usage?: { input_tokens?: number; output_tokens?: number } };
-        usage?: { output_tokens?: number };
+        message?: {
+          usage?: {
+            input_tokens?: number;
+            output_tokens?: number;
+            cache_creation_input_tokens?: number;
+            cache_read_input_tokens?: number;
+          };
+        };
+        usage?: {
+          output_tokens?: number;
+          cache_creation_input_tokens?: number;
+          cache_read_input_tokens?: number;
+        };
       };
 
       if (typed.type === "message_start") {
@@ -227,7 +238,11 @@ export class AnthropicModelClient implements ModelClient {
         stopReason = typed.delta?.stop_reason;
         usage = {
           ...usage,
-          outputTokens: typed.usage?.output_tokens ?? usage?.outputTokens
+          outputTokens: typed.usage?.output_tokens ?? usage?.outputTokens,
+          cacheCreationInputTokens:
+            typed.usage?.cache_creation_input_tokens ?? usage?.cacheCreationInputTokens,
+          cacheReadInputTokens:
+            typed.usage?.cache_read_input_tokens ?? usage?.cacheReadInputTokens
         };
       }
     }
@@ -403,28 +418,40 @@ function toInternalContent(
   return content;
 }
 
-function toAnthropicTools(
+export function toAnthropicTools(
   tools: readonly ModelToolDefinition[] | undefined
 ): { tools?: Array<Record<string, unknown>> } {
   if (!tools || tools.length === 0) {
     return {};
   }
 
-  return {
-    tools: tools.map((tool) => ({
-      name: tool.name,
-      description: tool.description,
-      input_schema: tool.inputSchema
-    }))
-  };
+  const mapped: Array<Record<string, unknown>> = tools.map((tool) => ({
+    name: tool.name,
+    description: tool.description,
+    input_schema: tool.inputSchema
+  }));
+  // Mark the last tool with cache_control so the entire tool list becomes
+  // a single prompt-cache breakpoint. Tools rarely change across turns,
+  // so this caches the largest stable input segment after the system
+  // prompt. The marker is harmless on uncached calls.
+  const lastIndex = mapped.length - 1;
+  mapped[lastIndex] = { ...mapped[lastIndex], cache_control: { type: "ephemeral" } };
+  return { tools: mapped };
 }
 
 function isRecord(value: unknown): value is Record<string, unknown> {
   return typeof value === "object" && value !== null && !Array.isArray(value);
 }
 
-function toModelUsage(
-  usage: { input_tokens?: number; output_tokens?: number } | undefined
+export function toModelUsage(
+  usage:
+    | {
+        input_tokens?: number;
+        output_tokens?: number;
+        cache_creation_input_tokens?: number;
+        cache_read_input_tokens?: number;
+      }
+    | undefined
 ): ModelUsage | undefined {
   if (!usage) {
     return undefined;
@@ -432,7 +459,9 @@ function toModelUsage(
 
   return {
     inputTokens: usage.input_tokens,
-    outputTokens: usage.output_tokens
+    outputTokens: usage.output_tokens,
+    cacheCreationInputTokens: usage.cache_creation_input_tokens,
+    cacheReadInputTokens: usage.cache_read_input_tokens
   };
 }
 
diff --git a/packages/core/src/fork.ts b/packages/core/src/fork.ts
index f5a3f13..4abf3be 100644
--- a/packages/core/src/fork.ts
+++ b/packages/core/src/fork.ts
@@ -1,5 +1,6 @@
 import { createHash } from "node:crypto";
 
+import type { SystemTextBlock } from "./model.js";
 import type { Message, ToolDefinition } from "./types.js";
 
 export type ForkTrace = {
@@ -18,15 +19,25 @@ export type ForkTraceInput = {
   parentDepth: number;
   subagentType: string;
   model: string;
-  systemPrompt?: string;
+  systemPrompt?: string | readonly SystemTextBlock[];
   tools: readonly ToolDefinition[];
   prefixMessages: readonly Message[];
   directive: string;
   previous?: ForkTrace;
 };
 
+function systemPromptToHashable(systemPrompt: ForkTraceInput["systemPrompt"]): string {
+  if (systemPrompt === undefined) {
+    return "";
+  }
+  if (typeof systemPrompt === "string") {
+    return systemPrompt;
+  }
+  return systemPrompt.map((block) => block.text).join("\n\n");
+}
+
 export function createForkTrace(input: ForkTraceInput): ForkTrace {
-  const systemPromptHash = sha256(input.systemPrompt ?? "");
+  const systemPromptHash = sha256(systemPromptToHashable(input.systemPrompt));
   const toolHash = hashToolDefinitions(input.tools);
   const prefixHash = hashMessages(input.prefixMessages);
   const directiveHash = sha256(input.directive);
diff --git a/packages/core/src/model.ts b/packages/core/src/model.ts
index a80c267..81aa916 100644
--- a/packages/core/src/model.ts
+++ b/packages/core/src/model.ts
@@ -23,11 +23,28 @@ export type ModelUsage = {
   cacheReadInputTokens?: number;
 };
 
+/**
+ * A single text block in a structured system prompt. The optional
+ * `cache_control` marker turns this block into an Anthropic prompt-cache
+ * breakpoint: the cumulative content up to and including this block is
+ * cached and reused across requests that share the same prefix.
+ */
+export type SystemTextBlock = {
+  type: "text";
+  text: string;
+  cache_control?: { type: "ephemeral" };
+};
+
 export type ModelRequest = {
   messages: readonly Message[];
   model?: string;
   maxTokens?: number;
-  system?: string;
+  /**
+   * The system prompt. A plain string preserves the legacy flat form
+   * (no caching). An array of `SystemTextBlock`s enables structured
+   * caching when at least one block carries `cache_control`.
+   */
+  system?: string | readonly SystemTextBlock[];
   requestId?: string;
   timeoutMs?: number;
   signal?: AbortSignal;
diff --git a/packages/core/src/query.ts b/packages/core/src/query.ts
index fde8b6b..b09fefa 100644
--- a/packages/core/src/query.ts
+++ b/packages/core/src/query.ts
@@ -10,7 +10,8 @@ import {
   type ModelClient,
   type ModelErrorKind,
   type ModelStreamEvent,
-  type ModelUsage
+  type ModelUsage,
+  type SystemTextBlock
 } from "./model.js";
 import { executeToolBatch, partitionToolCalls } from "./scheduler.js";
 import { toModelToolDefinition } from "./tool.js";
@@ -32,7 +33,7 @@ export type QueryOptions = {
   initialMessages: readonly Message[];
   tools: readonly ToolDefinition[];
   toolContext: ToolContext;
-  system?: string;
+  system?: string | readonly SystemTextBlock[];
   modelName?: string;
   maxTokens?: number;
   maxTurns?: number;
@@ -270,7 +271,7 @@ type CollectModelTurnWithRetryOptions = {
   messages: readonly Message[];
   modelName: string;
   maxTokens: number;
-  system?: string;
+  system?: string | readonly SystemTextBlock[];
   signal?: AbortSignal;
   tools: readonly ModelToolDefinition[];
   contextBudgetTokens: number;
diff --git a/packages/core/src/types.ts b/packages/core/src/types.ts
index e1af982..4dde22b 100644
--- a/packages/core/src/types.ts
+++ b/packages/core/src/types.ts
@@ -1,5 +1,5 @@
 import type { z } from "zod";
-import type { ModelClient, ModelUsage } from "./model.js";
+import type { ModelClient, ModelUsage, SystemTextBlock } from "./model.js";
 import type { ForkTrace } from "./fork.js";
 import type { ProfileRecorder } from "./profile.js";
 import type { TaskStore } from "./task.js";
@@ -78,7 +78,7 @@ export type ToolContext = {
   model?: ModelClient;
   modelName?: string;
   maxTokens?: number;
-  system?: string;
+  system?: string | readonly SystemTextBlock[];
   parentMessages?: readonly Message[];
   tools?: readonly ToolDefinition[];
   taskStore?: TaskStore;
diff --git a/packages/core/test/security/README.md b/packages/core/test/security/README.md
index d2bb431..3e46c18 100644
--- a/packages/core/test/security/README.md
+++ b/packages/core/test/security/README.md
@@ -73,6 +73,16 @@ Tests live in two trees because of the package boundary
 | `executeToolBatch` never overlaps two non-concurrency-safe tools | `packages/core/test/security/scheduler-write-serialization.test.ts` |
 | Sibling read tools cancel when a Bash sibling errors with cancel-on-error | `packages/core/test/scheduler.test.ts` |
 
+### Prompt caching plumbing
+
+| Invariant | Test |
+|---|---|
+| `toAnthropicTools` marks the *last* tool with `cache_control: { type: "ephemeral" }` so the full tool list becomes a single cache breakpoint | `packages/core/test/security/prompt-caching.test.ts` |
+| `toAnthropicTools` returns `{}` (no tools, no spurious cache marker) on an empty/undefined input | `packages/core/test/security/prompt-caching.test.ts` |
+| `toModelUsage` extracts `cache_creation_input_tokens` + `cache_read_input_tokens` when the SDK provides them | `packages/core/test/security/prompt-caching.test.ts` |
+| `toModelUsage` leaves cache fields `undefined` on non-cached turns (the SDK omits them) | `packages/core/test/security/prompt-caching.test.ts` |
+| The agent's outbound `request.system` is a `SystemTextBlock[]` (not a string) with `cache_control: ephemeral` on the block | `packages/cli/test/cli.test.ts` |
+
 ### Cache token accounting
 
 | Invariant | Test |
diff --git a/packages/core/test/security/prompt-caching.test.ts b/packages/core/test/security/prompt-caching.test.ts
new file mode 100644
index 0000000..ba85a78
--- /dev/null
+++ b/packages/core/test/security/prompt-caching.test.ts
@@ -0,0 +1,73 @@
+import { describe, expect, it } from "vitest";
+import {
+  toAnthropicTools,
+  toModelUsage,
+  type ModelToolDefinition
+} from "../../src/index.js";
+
+function makeTool(name: string): ModelToolDefinition {
+  return {
+    name,
+    description: `${name} fixture`,
+    inputSchema: { type: "object", properties: {}, additionalProperties: false }
+  };
+}
+
+describe("security: prompt caching wiring", () => {
+  it("toAnthropicTools marks the last tool with cache_control ephemeral", () => {
+    const { tools } = toAnthropicTools([makeTool("Read"), makeTool("Glob"), makeTool("Edit")]);
+    expect(tools).toBeDefined();
+    expect(tools).toHaveLength(3);
+    // First two are plain.
+    expect(tools![0]).not.toHaveProperty("cache_control");
+    expect(tools![1]).not.toHaveProperty("cache_control");
+    // Last one carries the cache breakpoint.
+    expect(tools![2]).toMatchObject({
+      name: "Edit",
+      cache_control: { type: "ephemeral" }
+    });
+  });
+
+  it("toAnthropicTools handles a single-tool list (the only tool is the cache breakpoint)", () => {
+    const { tools } = toAnthropicTools([makeTool("Read")]);
+    expect(tools).toHaveLength(1);
+    expect(tools![0]).toMatchObject({
+      name: "Read",
+      cache_control: { type: "ephemeral" }
+    });
+  });
+
+  it("toAnthropicTools returns no tools object for an empty list (no spurious cache marker)", () => {
+    expect(toAnthropicTools([])).toEqual({});
+    expect(toAnthropicTools(undefined)).toEqual({});
+  });
+
+  it("toModelUsage extracts cache_creation_input_tokens and cache_read_input_tokens", () => {
+    const usage = toModelUsage({
+      input_tokens: 10,
+      output_tokens: 5,
+      cache_creation_input_tokens: 100,
+      cache_read_input_tokens: 200
+    });
+    expect(usage).toEqual({
+      inputTokens: 10,
+      outputTokens: 5,
+      cacheCreationInputTokens: 100,
+      cacheReadInputTokens: 200
+    });
+  });
+
+  it("toModelUsage leaves cache fields undefined when the SDK omits them (non-cached turns)", () => {
+    const usage = toModelUsage({ input_tokens: 10, output_tokens: 5 });
+    expect(usage).toEqual({
+      inputTokens: 10,
+      outputTokens: 5,
+      cacheCreationInputTokens: undefined,
+      cacheReadInputTokens: undefined
+    });
+  });
+
+  it("toModelUsage returns undefined when the SDK provides no usage block", () => {
+    expect(toModelUsage(undefined)).toBeUndefined();
+  });
+});