
Feat/m1.5b prompt caching#7

Merged
wusijian007 merged 2 commits into main from feat/m1.5b-prompt-caching
May 14, 2026

@wusijian007

Owner

No description provided.

wusijian007 and others added 2 commits May 14, 2026 15:59
GitHub annotated PR #6 with a deprecation warning: the @v4 versions
of actions/checkout and actions/setup-node run on Node.js 20
internally, which GitHub is phasing out as the runner default moves
to Node.js 24. The @v5 versions are functionally identical for our
usage but are built on the new runtime.

No workflow behavior changes; cache and matrix config untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The wiring built in M1.5a (cache token accumulators, cost factors,
`myagent usage` CLI) now has something to count. This change toggles
the request side from "no cache_control anywhere" to "two cache
breakpoints: system + tools".

Type extensions:

- New `SystemTextBlock` in `model.ts`: `{type:"text", text, cache_control?}`.
- `ModelRequest.system` now accepts `string | readonly SystemTextBlock[]`.
- `ToolContext.system` and `QueryOptions.system` accept the same union.
- `ForkTrace.systemPrompt` accepts both forms and hashes their text
  content, so the fork-trace identity stays stable across the legacy
  flat-string and structured-array representations.
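
A minimal sketch of the type extension and the stable fork-trace
identity described above. Only `SystemTextBlock` and its fields come
from this change; `SystemPrompt`, `systemText`, `hashSystemPrompt`, the
empty-string join, and the FNV-1a hash are assumptions for illustration,
not the code in `model.ts` or `ForkTrace`.

```typescript
// Shape taken from the PR description: {type:"text", text, cache_control?}.
type CacheControl = { type: "ephemeral" };

interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Hypothetical alias for the union that `ModelRequest.system` accepts.
type SystemPrompt = string | readonly SystemTextBlock[];

// Flatten either representation to its text content.
// Joining blocks with no separator is an assumption.
function systemText(system: SystemPrompt): string {
  return typeof system === "string"
    ? system
    : system.map((block) => block.text).join("");
}

// FNV-1a (32-bit) over UTF-16 code units; a stand-in for whatever
// hash the real fork trace uses. cache_control never enters the hash.
function hashSystemPrompt(system: SystemPrompt): string {
  const text = systemText(system);
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}
```

Because only the text is hashed, a legacy string prompt and a
single-block array carrying the same text produce the same identity,
with or without a `cache_control` marker.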

Outbound request shape:

- `buildAgentSystemPrompt` (in cli/src/index.ts) returns a single
  `SystemTextBlock` containing base prompt + memory + skill context,
  marked `cache_control: { type: "ephemeral" }`. Identical content
  across every turn of a session → cache hit on every turn after the
  first.

- `toAnthropicTools` (in core/src/anthropic.ts) marks the *last* tool
  in the list with `cache_control: ephemeral`, turning the whole tool
  list into a single cache breakpoint. Tool definitions are stable
  across turns by construction, so the breakpoint reliably hits.

- `toAnthropicTools` and `toModelUsage` are now exported so the
  security suite can unit-test them.
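
The two breakpoints can be sketched roughly as follows. This is an
illustration under assumptions, not the code in `cli/src/index.ts` or
`core/src/anthropic.ts`: `ToolDefinition`, `markLastToolCacheable`,
`buildSystemSketch`, and the blank-line join are hypothetical names
and choices.

```typescript
type CacheControl = { type: "ephemeral" };
const ephemeral: CacheControl = { type: "ephemeral" };

interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Hypothetical internal tool shape; the real one may differ.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

interface AnthropicTool extends ToolDefinition {
  cache_control?: CacheControl;
}

// Breakpoint 1: one system block holding all stable prompt parts.
// Joining the parts with a blank line is an assumption.
function buildSystemSketch(parts: readonly string[]): SystemTextBlock[] {
  return [{ type: "text", text: parts.join("\n\n"), cache_control: ephemeral }];
}

// Breakpoint 2: only the last tool carries the marker, which makes
// the whole tool list a single cacheable prefix.
function markLastToolCacheable(tools: readonly ToolDefinition[]): AnthropicTool[] {
  return tools.map((tool, index): AnthropicTool =>
    index === tools.length - 1
      ? { ...tool, cache_control: ephemeral }
      : { ...tool },
  );
}
```

Copying each tool instead of mutating it keeps the caller's tool
registry free of request-specific markers; an empty tool list simply
yields no breakpoint.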

Response parsing:

- `toModelUsage` extracts `cache_creation_input_tokens` and
  `cache_read_input_tokens` from the SDK's `message_start.message.usage`
  and `message_delta.usage`. Both fields are optional; non-cached
  turns leave them `undefined`, which `addTokenUsage` already treats
  as zero.

- `runAgentTurn` emits per-turn profile metrics
  `model.cache_creation_input_tokens` /
  `model.cache_read_input_tokens` and per-session counterparts
  `session.cache_creation_input_tokens` /
  `session.cache_read_input_tokens`.
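
A rough sketch of the usage handling described above, assuming a
camelCase internal `TokenUsage` shape. `RawUsage`, `TokenUsage`,
`toUsageSketch`, and `addUsageSketch` are hypothetical; only the two
`cache_*_input_tokens` field names and the "undefined counts as zero"
rule come from this change.

```typescript
// Subset of the usage payload the sketch reads. The cache fields are
// optional and may also arrive as null.
interface RawUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

// Hypothetical internal shape; the real field names may differ.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationInputTokens?: number;
  cacheReadInputTokens?: number;
}

// Missing or null cache fields stay undefined on non-cached turns.
function toUsageSketch(raw: RawUsage): TokenUsage {
  return {
    inputTokens: raw.input_tokens ?? 0,
    outputTokens: raw.output_tokens ?? 0,
    cacheCreationInputTokens: raw.cache_creation_input_tokens ?? undefined,
    cacheReadInputTokens: raw.cache_read_input_tokens ?? undefined,
  };
}

// Accumulate per-session totals, treating undefined as zero.
function addUsageSketch(a: TokenUsage, b: TokenUsage): TokenUsage {
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
    cacheCreationInputTokens:
      (a.cacheCreationInputTokens ?? 0) + (b.cacheCreationInputTokens ?? 0),
    cacheReadInputTokens:
      (a.cacheReadInputTokens ?? 0) + (b.cacheReadInputTokens ?? 0),
  };
}
```

In a session where the first turn writes the cache and later turns
read it, the session total would show cache-creation tokens from
turn one only and cache-read tokens growing with each later turn.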

Tests added in `packages/core/test/security/prompt-caching.test.ts`
(6 cases on `toAnthropicTools` + `toModelUsage`) and a CLI assertion
that the agent's outbound `request.system` is the structured form
with a `cache_control` marker. Catalog row added; CLAUDE.md updated.

Two pre-existing CLI tests captured `request.system` as a string;
extracted a `systemToText` helper to flatten the array form during
assertions.

Also bundled the chore from PR #6's deprecation annotation:
actions/checkout and actions/setup-node bumped from @v4 to @v5.

Local: 161 tests, 3/3 runs green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@wusijian007 wusijian007 merged commit e1be0d5 into main May 14, 2026
3 checks passed
@wusijian007 wusijian007 deleted the feat/m1.5b-prompt-caching branch May 14, 2026 08:15