Skip to content

feat(usage): cache-token accounting + myagent usage CLI (M1.5a)#6

Merged
wusijian007 merged 1 commit into
mainfrom
feat/m1.5a-cache-accounting
May 14, 2026
Merged

feat(usage): cache-token accounting + myagent usage CLI (M1.5a)#6
wusijian007 merged 1 commit into
mainfrom
feat/m1.5a-cache-accounting

Conversation

@wusijian007

Copy link
Copy Markdown
Owner

Pure-additive groundwork for prompt caching. After this change the runtime tracks and renders prompt-cache token + cost numbers; the actual cache_control plumbing on the Anthropic request lands in M1.5b. Until then cacheCreationInputTokens / cacheReadInputTokens stay zero on every turn but the visibility surface is ready.

Type extensions:

  • ModelUsage gains optional cacheCreationInputTokens / cacheReadInputTokens. Required on the SDK side (these are the fields Anthropic returns once cache_control is in use), optional in our internal shape so non-cached turns omit them.

  • TokenUsage gains required cacheCreationInputTokens / cacheReadInputTokens with default 0. This is the accumulator shape; required + always-zero is the safest invariant.

  • addTokenUsage and createBootstrapState updated to sum / init the new fields. normalizeBootstrap defaults them to 0 when loading sessions persisted before this change, so existing .myagent/sessions/*.json keep working.

Cost rates:

  • CostRates gains optional cacheWriteUsdPerMillionTokens / cacheReadUsdPerMillionTokens.
  • estimateUsageCostUsd factors all four streams. Cache write defaults to the base input rate (small premium varies by model and is on average within rounding); cache read defaults to 0.
  • CLI env: MYAGENT_CACHE_WRITE_USD_PER_MTOK / MYAGENT_CACHE_READ_USD_PER_MTOK. Threaded through pricingFromEnv and loadEnvironment.

CLI surface:

  • New myagent usage <sessionId> subcommand. Walks assistant_message events on the saved session, prints a per-turn table (#, requestId, in, out, cache_w, cache_r, cost_usd) plus totals + the bootstrap-recorded session cost. Hints the env vars when pricing is unset. Help text + router wired.

Tests added under packages/core/test/security/cache-accounting.test.ts (8 cases pinning the accumulator, init, and pricing semantics) plus a myagent usage CLI e2e (2 cases: happy path + missing session) and a normalizeBootstrap regression for pre-M1.5a session JSON. Catalog row added under "Cache token accounting" in the security README; CLAUDE.md updated.

Two existing toEqual assertions in state.test.ts that pinned the shape of tokenUsage had to be updated to include the new fields.

Verified 3/3 local runs (154 tests) green.

Pure-additive groundwork for prompt caching. After this change the
runtime *tracks* and *renders* prompt-cache token + cost numbers; the
actual `cache_control` plumbing on the Anthropic request lands in
M1.5b. Until then `cacheCreationInputTokens` / `cacheReadInputTokens`
stay zero on every turn but the visibility surface is ready.

Type extensions:

- `ModelUsage` gains optional `cacheCreationInputTokens` /
  `cacheReadInputTokens`. Required on the SDK side (these are the
  fields Anthropic returns once cache_control is in use), optional in
  our internal shape so non-cached turns omit them.

- `TokenUsage` gains required `cacheCreationInputTokens` /
  `cacheReadInputTokens` with default 0. This is the accumulator
  shape; required + always-zero is the safest invariant.

- `addTokenUsage` and `createBootstrapState` updated to sum / init
  the new fields. `normalizeBootstrap` defaults them to 0 when
  loading sessions persisted before this change, so existing
  `.myagent/sessions/*.json` keep working.

Cost rates:

- `CostRates` gains optional `cacheWriteUsdPerMillionTokens` /
  `cacheReadUsdPerMillionTokens`.
- `estimateUsageCostUsd` factors all four streams. Cache write
  defaults to the base input rate (small premium varies by model and
  is on average within rounding); cache read defaults to 0.
- CLI env: `MYAGENT_CACHE_WRITE_USD_PER_MTOK` /
  `MYAGENT_CACHE_READ_USD_PER_MTOK`. Threaded through
  `pricingFromEnv` and `loadEnvironment`.

CLI surface:

- New `myagent usage <sessionId>` subcommand. Walks
  `assistant_message` events on the saved session, prints a per-turn
  table (#, requestId, in, out, cache_w, cache_r, cost_usd) plus
  totals + the bootstrap-recorded session cost. Hints the env vars
  when pricing is unset. Help text + router wired.

Tests added under `packages/core/test/security/cache-accounting.test.ts`
(8 cases pinning the accumulator, init, and pricing semantics) plus
a `myagent usage` CLI e2e (2 cases: happy path + missing session)
and a `normalizeBootstrap` regression for pre-M1.5a session JSON.
Catalog row added under "Cache token accounting" in the security
README; CLAUDE.md updated.

Two existing `toEqual` assertions in `state.test.ts` that pinned the
shape of `tokenUsage` had to be updated to include the new fields.

Verified 3/3 local runs (154 tests) green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@wusijian007 wusijian007 merged commit 12a44ec into main May 14, 2026
3 checks passed
@wusijian007 wusijian007 deleted the feat/m1.5a-cache-accounting branch May 14, 2026 07:55
wusijian007 added a commit that referenced this pull request May 14, 2026
* chore(ci): bump actions/checkout + actions/setup-node to @v5

GitHub annotated PR #6 with a deprecation warning: the @v4 versions
of these actions run on Node.js 20 internally, which GitHub is
phasing out as the runner default moves to Node.js 24. The @v5
versions are functionally identical for our usage but are built on
the new runtime.

No workflow behavior changes; cache and matrix config untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(model): enable Anthropic prompt caching end-to-end (M1.5b)

The wiring built in M1.5a (cache token accumulators, cost factors,
`myagent usage` CLI) now has something to count. This change toggles
the request side from "no cache_control anywhere" to "two cache
breakpoints: system + tools".

Type extensions:

- New `SystemTextBlock` in `model.ts`: `{type:"text", text, cache_control?}`.
- `ModelRequest.system` now accepts `string | readonly SystemTextBlock[]`.
- `ToolContext.system` and `QueryOptions.system` propagated the same way.
- `ForkTrace.systemPrompt` accepts both forms and hashes their text
  content, so the fork-trace identity stays stable across the legacy
  flat-string and structured-array representations.

Outbound request shape:

- `buildAgentSystemPrompt` (in cli/src/index.ts) returns a single
  `SystemTextBlock` containing base prompt + memory + skill context,
  marked `cache_control: { type: "ephemeral" }`. Identical content
  across every turn of a session → cache hit on every turn after the
  first.

- `toAnthropicTools` (in core/src/anthropic.ts) marks the *last* tool
  in the list with `cache_control: ephemeral`, turning the whole tool
  list into a single cache breakpoint. Tool definitions are stable
  across turns by construction, so the breakpoint reliably hits.

- `toAnthropicTools` and `toModelUsage` are now exported so the
  security suite can unit-test them.

Response parsing:

- `toModelUsage` extracts `cache_creation_input_tokens` and
  `cache_read_input_tokens` from the SDK's `message_start.message.usage`
  and `message_delta.usage`. Both fields are optional; non-cached
  turns leave them `undefined`, which `addTokenUsage` already treats
  as zero.

- `runAgentTurn` emits per-turn profile metrics
  `model.cache_creation_input_tokens` /
  `model.cache_read_input_tokens` and per-session counterparts
  `session.cache_creation_input_tokens` /
  `session.cache_read_input_tokens`.

Tests added in `packages/core/test/security/prompt-caching.test.ts`
(6 cases on `toAnthropicTools` + `toModelUsage`) and a CLI assertion
that the agent's outbound `request.system` is the structured form
with a `cache_control` marker. Catalog row added; CLAUDE.md updated.

Two pre-existing cli tests captured `request.system` as a string;
extracted a `systemToText` helper to flatten the array form during
assertions.

Also bundled the chore from PR #6's deprecation annotation:
actions/checkout and actions/setup-node bumped from @v4 to @v5.

Local: 161 tests, 3/3 runs green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant