
feat: complete reports tokens spent + saved (0.24.0)#42

Merged
Shahinyanm merged 1 commit into main from feat/complete-token-stats
Jun 13, 2026

Conversation

@Shahinyanm

Member

What

complete now reports what each run costs and how much it compresses, so the price/value trade-off is visible:

complete tj-x: retitled "#: 5" → "Voucher refund…"; closed | spent 1.5k tok ($0.0012) · saved ~88k→1.5k tok (59×)
…
Totals across 9 task(s): spent 14k tok ($0.011) · saved ~512k tok
  • Spent — exact. Pulled from the backend's own usage report: the claude -p JSON envelope's usage + total_cost_usd, Anthropic/OpenAI usage. Summed across the judge call and any --enrich calls. Cost shown only when the backend reports a non-zero price (it's $0 under a subscription).
  • Saved — estimate. Memory compression: the raw transcript size of the task's sessions vs its compact pack (≈ chars/4), as ~raw→pack tok (N×).
  • Batch prints a Totals across N task(s): line.
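
The "saved" arithmetic above can be sketched as follows. This is an illustration, not the code in this PR: only the name fmt_tokens comes from the Tests section, and its rounding rules here are assumed. estimate_tokens and saved_suffix are hypothetical helpers that apply the chars/4 estimate and build the ~raw→pack tok (N×) suffix.

```rust
/// Compact token count, e.g. 950 -> "950", 1500 -> "1.5k", 88000 -> "88k".
/// Assumed rounding: one decimal below 10k, whole thousands above.
fn fmt_tokens(n: u64) -> String {
    if n < 1_000 {
        return n.to_string();
    }
    let k = n as f64 / 1_000.0;
    if k < 10.0 {
        let s = format!("{:.1}", k);
        // Drop a trailing ".0" so 1000 renders as "1k", not "1.0k".
        let s = s.strip_suffix(".0").unwrap_or(&s).to_string();
        format!("{}k", s)
    } else {
        format!("{}k", k.round() as u64)
    }
}

/// Rough token estimate from a character count (chars / 4).
fn estimate_tokens(chars: usize) -> u64 {
    (chars / 4) as u64
}

/// Hypothetical helper: build the "saved" suffix from the raw transcript
/// size and the compact pack size, both in characters.
/// Returns None when either side estimates to zero tokens.
fn saved_suffix(raw_chars: usize, pack_chars: usize) -> Option<String> {
    let raw = estimate_tokens(raw_chars);
    let pack = estimate_tokens(pack_chars);
    if raw == 0 || pack == 0 {
        return None;
    }
    let ratio = (raw as f64 / pack as f64).round() as u64;
    Some(format!(
        "saved ~{}→{} tok ({}×)",
        fmt_tokens(raw),
        fmt_tokens(pack),
        ratio
    ))
}
```

Under these assumptions, a 352,000-character transcript against a 6,000-character pack gives 88k→1.5k tok, and 88000/1500 rounds to the 59× shown in the sample output.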

How

  • New LlmUsage { input_tokens, output_tokens, cost_usd } and a new LlmBackend::complete_usage method with a default that reports no usage — so the three real backends opt in by parsing their usage, while mocks and any custom backend keep working unchanged (no signature break).
  • finalize::judge returns usage; LlmDreamBackend accumulates usage across enrich chunks; the CLI sums judge + enrich into FinalizeOutcome and renders per-task + batch totals.
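
A minimal sketch of the opt-in pattern described above. LlmUsage, its three fields, and LlmBackend::complete_usage are named in this PR; everything else is an assumption for illustration: the field types, the complete signature, the String error type, and the MockBackend, MeteredBackend, and sum_usage names.

```rust
/// Usage reported by a backend for one call. Field types are assumed.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct LlmUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cost_usd: f64,
}

impl LlmUsage {
    /// Assumption: "spent" counts input plus output tokens.
    pub fn total_tokens(&self) -> u64 {
        self.input_tokens + self.output_tokens
    }

    /// Accumulate another call's usage, e.g. judge + enrich chunks.
    pub fn add(&mut self, other: LlmUsage) {
        self.input_tokens += other.input_tokens;
        self.output_tokens += other.output_tokens;
        self.cost_usd += other.cost_usd;
    }
}

pub trait LlmBackend {
    /// Pre-existing entry point (signature assumed for this sketch).
    fn complete(&self, prompt: &str) -> Result<String, String>;

    /// New method with a default that reports no usage, so existing
    /// implementors compile unchanged.
    fn complete_usage(&self, prompt: &str) -> Result<(String, LlmUsage), String> {
        self.complete(prompt).map(|text| (text, LlmUsage::default()))
    }
}

/// A mock that only implements `complete` and inherits the default.
pub struct MockBackend;

impl LlmBackend for MockBackend {
    fn complete(&self, _prompt: &str) -> Result<String, String> {
        Ok("ok".to_string())
    }
}

/// Hypothetical backend that opts in by overriding `complete_usage`.
pub struct MeteredBackend;

impl LlmBackend for MeteredBackend {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        self.complete_usage(prompt).map(|(text, _)| text)
    }

    fn complete_usage(&self, _prompt: &str) -> Result<(String, LlmUsage), String> {
        let usage = LlmUsage { input_tokens: 1_200, output_tokens: 300, cost_usd: 0.0012 };
        Ok(("ok".to_string(), usage))
    }
}

/// Sum usage across several calls, as the CLI does for judge + enrich.
pub fn sum_usage(backend: &dyn LlmBackend, prompts: &[&str]) -> Result<LlmUsage, String> {
    let mut total = LlmUsage::default();
    for prompt in prompts {
        let (_text, usage) = backend.complete_usage(prompt)?;
        total.add(usage);
    }
    Ok(total)
}
```

Because the new method has a default body, the mock reports zero usage with no code changes, while a backend that parses its own usage report overrides a single method.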

Tests

  • fmt_tokens unit; stats_suffix spent/saved formatting + empty cases.
  • E2E (Unix): the fake claude envelope now carries usage/total_cost_usd, and the test asserts spent 1.5k tok ($0.0012) appears.
  • Full local gate green: fmt --check, clippy --workspace --all-targets -D warnings, test --workspace, lean --no-default-features build.

🤖 Generated with Claude Code

Each finalize now prints what it cost and what it compresses:
  complete tj-x: … | spent 1.5k tok ($0.0012) · saved ~88k→1.5k tok (59×)
and a batch ends with a Totals line. Spent is exact, from the backend's
own usage report (claude -p envelope usage/total_cost_usd, Anthropic/
OpenAI usage), summed across judge + any --enrich calls. Saved estimates
memory compression: raw session transcript size vs compact pack (chars/4).

Backends expose usage via a new LlmBackend::complete_usage method with a
default that reports none, so mocks and custom backends are unchanged.

claude-memory-642

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Shahinyanm Shahinyanm merged commit 1917c82 into main Jun 13, 2026
7 checks passed