API-key users silently stuck on 5m cache TTL: bundled CLI in v0.1.59 predates ENABLE_PROMPT_CACHING_1H opt-in #826

@oneryalcin

Summary

claude-agent-sdk-python API-key users cannot opt into the 1-hour prompt cache TTL. The env var that enables it (ENABLE_PROMPT_CACHING_1H, added in Claude CLI 2.1.108) is not recognized by the CLI bundled with v0.1.59 (2.1.105), so setting it is a no-op.

We've been running the Claude Agent SDK in production for 8 months and only discovered this during a cost investigation: every cache write across hundreds of thousands of agent turns has been billed at the 5-minute tier. The re-creation cost compounds badly on long agent sessions (16–60 min), which is exactly the SDK's primary workload.
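To make the compounding concrete, here is a rough back-of-the-envelope model rather than a billing calculator. It assumes a fixed-size cached prefix, that a cache hit refreshes the TTL, and that writes cost about 1.25x (5m) or 2x (1h) the base input price while reads cost about 0.1x. Those multipliers are my assumption and should be checked against the current pricing page. The gap values are made up for illustration.

```python
def session_cost(gaps_min, ttl_min, write_mult, read_mult=0.1):
    """Return (writes, reads, cost) for one session.

    cost is expressed in multiples of the base input price of the cached
    prefix. gaps_min holds the idle minutes between consecutive turns.
    """
    writes, reads = 1, 0  # the first turn always writes the cache
    for gap in gaps_min:
        if gap > ttl_min:
            writes += 1  # cache expired during the gap: prefix is re-created
        else:
            reads += 1   # cache still warm: prefix is read at the cheap rate
    return writes, reads, writes * write_mult + reads * read_mult


# Hypothetical idle gaps (minutes) between turns in one long agent session.
gaps = [2, 7, 12, 3, 8]

print("5m TTL:", session_cost(gaps, ttl_min=5, write_mult=1.25))
print("1h TTL:", session_cost(gaps, ttl_min=60, write_mult=2.0))
```

In this model the 1h tier costs more per write but needs only one write per session, while the 5m tier pays for a new write every time a tool call or a human pause exceeds five minutes.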

As recently as November, Thariq (your dev advocate, I guess) was claiming that SDK customers get prompt-caching savings automatically, and encouraging us to use the SDK because prompt caching is a non-trivial problem to implement in a harness and the SDK handles it for us (since it uses the CLI):

"because of the scale we operate at, prompt caching is a p0 for us and luckily these savings are passed on to Claude Agent SDK customers without any more configuration"
— @trq212, Anthropic developer relations, Nov 10, 2025

Reality for API-key users of the SDK like us:

  1. Per bcherny in anthropics/claude-code#45381: "we are not defaulting API customers to 1h yet" — API customers must set ENABLE_PROMPT_CACHING_1H=1.
  2. That env var was added in CLI 2.1.108.
  3. SDK v0.1.59 pins __cli_version__ = "2.1.105" (source) and its wheel ships that CLI binary, so the env var is silently dropped by the subprocess.
  4. main has since merged bumps to 2.1.106 → 2.1.107 → 2.1.108 → 2.1.109 → 2.1.110 (commits e8bcd95, 4b17d02, acd62ea, aaac538) but no v0.1.60 release has been tagged.

The advertised "automatic savings" materially do not apply to API-key SDK users, and there is no documented workaround short of an unreleased git commit or CLI-path override.
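The version gate described in the list above can be sketched as a small check. This is only an illustration: the 2.1.108 threshold comes from this issue, the helper names are hypothetical, and the bundled version string has to be supplied by the caller (for example from whatever `claude --version` reports for the binary the SDK actually launches).

```python
MIN_CLI_FOR_1H_OPT_IN = "2.1.108"  # per this issue: first CLI with ENABLE_PROMPT_CACHING_1H


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.1.105' into (2, 1, 105)."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_1h_opt_in(cli_version: str) -> bool:
    """True if this CLI version is new enough to honor the opt-in env var."""
    return parse_version(cli_version) >= parse_version(MIN_CLI_FOR_1H_OPT_IN)


for version in ("2.1.105", "2.1.108", "2.1.110"):
    status = "honors" if supports_1h_opt_in(version) else "ignores"
    print(f"CLI {version} {status} ENABLE_PROMPT_CACHING_1H")
```

Comparing integer tuples rather than raw strings avoids the usual trap where "2.1.99" sorts after "2.1.108" lexically.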

Reproduction

1. SDK path (v0.1.59, Anthropic API key):

# sdk_repro.py (excerpt)
for label, env in [("baseline", {}), ("force_1h", {"ENABLE_PROMPT_CACHING_1H": "1"})]:
    async for msg in query(prompt="say hello", options=ClaudeAgentOptions(session_id=sid, ...)):
        ...

Running with pip install claude-agent-sdk==0.1.59:

SDK version: 0.1.59
Bundled CLI version: 2.1.105
baseline    ttl=5m   1h=0  5m=4133
force_1h    ttl=5m   1h=0  5m=4139    ← env var dropped, no effect

2. Bare CLI path (system claude 2.1.110), showing that the env var works once the CLI is ≥ 2.1.108:

baseline                            ttl=5m   1h=0      5m=59214
ENABLE_PROMPT_CACHING_1H=1 claude   ttl=1h   1h=59214  5m=0

Both verified via usage.cache_creation in ~/.claude/projects/*/<session>.jsonl.
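For anyone who wants to run the same check, here is a minimal sketch of how the per-tier totals can be summed from a session log. It assumes each JSONL record may carry `message.usage.cache_creation` with `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens` keys. That layout is my reading of the logs and may differ between CLI versions. The sample records are synthetic.

```python
import json
from typing import Iterable


def sum_cache_creation(lines: Iterable[str]) -> dict[str, int]:
    """Sum cache-creation tokens per TTL tier across JSONL records."""
    totals = {"5m": 0, "1h": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        creation = usage.get("cache_creation") if isinstance(usage, dict) else None
        if not isinstance(creation, dict):
            continue  # record has no cache-creation breakdown
        totals["5m"] += int(creation.get("ephemeral_5m_input_tokens") or 0)
        totals["1h"] += int(creation.get("ephemeral_1h_input_tokens") or 0)
    return totals


# Synthetic records for illustration only.
sample = [
    '{"message": {"usage": {"cache_creation": {"ephemeral_5m_input_tokens": 100, "ephemeral_1h_input_tokens": 0}}}}',
    '{"type": "user", "message": {"content": "hi"}}',
    '{"message": {"usage": {"cache_creation": {"ephemeral_5m_input_tokens": 50, "ephemeral_1h_input_tokens": 0}}}}',
]
print(sum_cache_creation(sample))
```

On a real log, pass the open file object instead of `sample`, for example `with open(path) as f: sum_cache_creation(f)`.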

I have to say I'm extremely disappointed that this went on for so many months. Please do the following as soon as possible:

  1. Tag v0.1.60 bundling CLI ≥ 2.1.108 so that setting ENABLE_PROMPT_CACHING_1H=1 actually takes effect in the subprocess of a default install. The bumps are already on main.
  2. Consider making 1h the SDK default for API customers. Agent sessions routinely exceed 5 minutes, so the current default is economically disastrous for the SDK's primary workload.
  3. Add a ClaudeAgentOptions.cache_ttl: Literal["5m", "1h"] field so users don't have to discover and plumb through an undocumented Claude Code env var to control SDK behavior.
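To illustrate request 3, here is a sketch of how such a field could map onto the existing env var. Nothing here exists in the SDK today: `HypotheticalOptions` and `build_subprocess_env` are made-up stand-ins, and only the env var name and the `cache_ttl` values come from this issue.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class HypotheticalOptions:
    """Stand-in for ClaudeAgentOptions with the proposed cache_ttl field."""
    cache_ttl: Literal["5m", "1h"] = "5m"
    env: dict[str, str] = field(default_factory=dict)


def build_subprocess_env(options: HypotheticalOptions) -> dict[str, str]:
    """Translate cache_ttl into the env var the CLI subprocess would read."""
    env = dict(options.env)  # copy so the caller's dict is not mutated
    if options.cache_ttl == "1h":
        # An explicit user-supplied value wins over the derived one.
        env.setdefault("ENABLE_PROMPT_CACHING_1H", "1")
    return env


print(build_subprocess_env(HypotheticalOptions(cache_ttl="1h")))
print(build_subprocess_env(HypotheticalOptions()))
```

The point of the design is that callers state the TTL they want and the SDK owns the mapping to whatever mechanism the bundled CLI supports.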
