feat(usage): cache-token accounting + myagent usage CLI (M1.5a)#6
Merged
Conversation
Pure-additive groundwork for prompt caching. After this change the runtime *tracks* and *renders* prompt-cache token + cost numbers; the actual `cache_control` plumbing on the Anthropic request lands in M1.5b. Until then `cacheCreationInputTokens` / `cacheReadInputTokens` stay zero on every turn but the visibility surface is ready. Type extensions: - `ModelUsage` gains optional `cacheCreationInputTokens` / `cacheReadInputTokens`. Required on the SDK side (these are the fields Anthropic returns once cache_control is in use), optional in our internal shape so non-cached turns omit them. - `TokenUsage` gains required `cacheCreationInputTokens` / `cacheReadInputTokens` with default 0. This is the accumulator shape; required + always-zero is the safest invariant. - `addTokenUsage` and `createBootstrapState` updated to sum / init the new fields. `normalizeBootstrap` defaults them to 0 when loading sessions persisted before this change, so existing `.myagent/sessions/*.json` keep working. Cost rates: - `CostRates` gains optional `cacheWriteUsdPerMillionTokens` / `cacheReadUsdPerMillionTokens`. - `estimateUsageCostUsd` factors all four streams. Cache write defaults to the base input rate (small premium varies by model and is on average within rounding); cache read defaults to 0. - CLI env: `MYAGENT_CACHE_WRITE_USD_PER_MTOK` / `MYAGENT_CACHE_READ_USD_PER_MTOK`. Threaded through `pricingFromEnv` and `loadEnvironment`. CLI surface: - New `myagent usage <sessionId>` subcommand. Walks `assistant_message` events on the saved session, prints a per-turn table (#, requestId, in, out, cache_w, cache_r, cost_usd) plus totals + the bootstrap-recorded session cost. Hints the env vars when pricing is unset. Help text + router wired. Tests added under `packages/core/test/security/cache-accounting.test.ts` (8 cases pinning the accumulator, init, and pricing semantics) plus a `myagent usage` CLI e2e (2 cases: happy path + missing session) and a `normalizeBootstrap` regression for pre-M1.5a session JSON. Catalog row added under "Cache token accounting" in the security README; CLAUDE.md updated. Two existing `toEqual` assertions in `state.test.ts` that pinned the shape of `tokenUsage` had to be updated to include the new fields. Verified 3/3 local runs (154 tests) green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
wusijian007
added a commit
that referenced
this pull request
May 14, 2026
* chore(ci): bump actions/checkout + actions/setup-node to @v5 GitHub annotated PR #6 with a deprecation warning: the @v4 versions of these actions run on Node.js 20 internally, which GitHub is phasing out as the runner default moves to Node.js 24. The @v5 versions are functionally identical for our usage but are built on the new runtime. No workflow behavior changes; cache and matrix config untouched. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(model): enable Anthropic prompt caching end-to-end (M1.5b) The wiring built in M1.5a (cache token accumulators, cost factors, `myagent usage` CLI) now has something to count. This change toggles the request side from "no cache_control anywhere" to "two cache breakpoints: system + tools". Type extensions: - New `SystemTextBlock` in `model.ts`: `{type:"text", text, cache_control?}`. - `ModelRequest.system` now accepts `string | readonly SystemTextBlock[]`. - `ToolContext.system` and `QueryOptions.system` propagated the same way. - `ForkTrace.systemPrompt` accepts both forms and hashes their text content, so the fork-trace identity stays stable across the legacy flat-string and structured-array representations. Outbound request shape: - `buildAgentSystemPrompt` (in cli/src/index.ts) returns a single `SystemTextBlock` containing base prompt + memory + skill context, marked `cache_control: { type: "ephemeral" }`. Identical content across every turn of a session → cache hit on every turn after the first. - `toAnthropicTools` (in core/src/anthropic.ts) marks the *last* tool in the list with `cache_control: ephemeral`, turning the whole tool list into a single cache breakpoint. Tool definitions are stable across turns by construction, so the breakpoint reliably hits. - `toAnthropicTools` and `toModelUsage` are now exported so the security suite can unit-test them. Response parsing: - `toModelUsage` extracts `cache_creation_input_tokens` and `cache_read_input_tokens` from the SDK's `message_start.message.usage` and `message_delta.usage`. Both fields are optional; non-cached turns leave them `undefined`, which `addTokenUsage` already treats as zero. - `runAgentTurn` emits per-turn profile metrics `model.cache_creation_input_tokens` / `model.cache_read_input_tokens` and per-session counterparts `session.cache_creation_input_tokens` / `session.cache_read_input_tokens`. Tests added in `packages/core/test/security/prompt-caching.test.ts` (6 cases on `toAnthropicTools` + `toModelUsage`) and a CLI assertion that the agent's outbound `request.system` is the structured form with a `cache_control` marker. Catalog row added; CLAUDE.md updated. Two pre-existing cli tests captured `request.system` as a string; extracted a `systemToText` helper to flatten the array form during assertions. Also bundled the chore from PR #6's deprecation annotation: actions/checkout and actions/setup-node bumped from @v4 to @v5. Local: 161 tests, 3/3 runs green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pure-additive groundwork for prompt caching. After this change the runtime tracks and renders prompt-cache token + cost numbers; the actual
cache_controlplumbing on the Anthropic request lands in M1.5b. Until thencacheCreationInputTokens/cacheReadInputTokensstay zero on every turn but the visibility surface is ready.Type extensions:
ModelUsagegains optionalcacheCreationInputTokens/cacheReadInputTokens. Required on the SDK side (these are the fields Anthropic returns once cache_control is in use), optional in our internal shape so non-cached turns omit them.TokenUsagegains requiredcacheCreationInputTokens/cacheReadInputTokenswith default 0. This is the accumulator shape; required + always-zero is the safest invariant.addTokenUsageandcreateBootstrapStateupdated to sum / init the new fields.normalizeBootstrapdefaults them to 0 when loading sessions persisted before this change, so existing.myagent/sessions/*.jsonkeep working.Cost rates:
CostRatesgains optionalcacheWriteUsdPerMillionTokens/cacheReadUsdPerMillionTokens.estimateUsageCostUsdfactors all four streams. Cache write defaults to the base input rate (small premium varies by model and is on average within rounding); cache read defaults to 0.MYAGENT_CACHE_WRITE_USD_PER_MTOK/MYAGENT_CACHE_READ_USD_PER_MTOK. Threaded throughpricingFromEnvandloadEnvironment.CLI surface:
myagent usage <sessionId>subcommand. Walksassistant_messageevents on the saved session, prints a per-turn table (#, requestId, in, out, cache_w, cache_r, cost_usd) plus totals + the bootstrap-recorded session cost. Hints the env vars when pricing is unset. Help text + router wired.Tests added under
packages/core/test/security/cache-accounting.test.ts(8 cases pinning the accumulator, init, and pricing semantics) plus amyagent usageCLI e2e (2 cases: happy path + missing session) and anormalizeBootstrapregression for pre-M1.5a session JSON. Catalog row added under "Cache token accounting" in the security README; CLAUDE.md updated.Two existing
toEqualassertions instate.test.tsthat pinned the shape oftokenUsagehad to be updated to include the new fields.Verified 3/3 local runs (154 tests) green.