
Migrate burn compare to grouped SQL aggregates (#88) #130

Open
willwashburn wants to merge 1 commit into main from feat/compare-from-archive-88

Member

@willwashburn willwashburn commented Apr 26, 2026

Summary

  • burn compare now reads from archive.sqlite by default via a single grouped SELECT … GROUP BY model, activity, source (plus a per-cell median-retries follow-up), instead of streaming the full ledger and reducing in memory.
  • New compareFromArchive(query, opts) helper in @relayburn/analyze carries the SQL path; runCompare calls buildArchive() first so the archive reflects the freshly-ingested ledger tail.
  • Output (text, CSV, --json) is byte-identical to the legacy path for the parity fixture; all existing flags (--models, --since, --project, --session, --workflow, --agent, --min-sample) work through the SQL path.
  • New --no-archive flag (also honored via env RELAYBURN_ARCHIVE=0) keeps the in-memory path as a parity-validation / safety-net fallback.
  • Per-source reasoning-mode handling (Codex's included_in_output) is preserved by grouping on source alongside (model, activity).
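As a rough illustration of the grouped-aggregate approach described above, the sketch below builds one parameterized `SELECT … GROUP BY model, activity, source` from a set of filters and computes a median in application code. Everything named here is an assumption for illustration: the `CompareFilters` shape, the `turns` table, and its column names are placeholders, not the archive's actual schema or the real `compareFromArchive` signature.

```typescript
// Hypothetical filter shape; the real query type in @relayburn/analyze may differ.
interface CompareFilters {
  models?: string[];
  since?: string; // ISO timestamp lower bound
  project?: string;
  session?: string;
}

interface BuiltQuery {
  sql: string;
  params: string[];
}

// Build one grouped SELECT with bound parameters.
// Table and column names (turns, ts, project, session_id, ...) are placeholders.
function buildCompareQuery(f: CompareFilters): BuiltQuery {
  const where: string[] = [];
  const params: string[] = [];

  if (f.models && f.models.length > 0) {
    where.push(`model IN (${f.models.map(() => "?").join(", ")})`);
    params.push(...f.models);
  }
  if (f.since !== undefined) {
    where.push("ts >= ?");
    params.push(f.since);
  }
  if (f.project !== undefined) {
    where.push("project = ?");
    params.push(f.project);
  }
  if (f.session !== undefined) {
    where.push("session_id = ?");
    params.push(f.session);
  }

  const sql = [
    "SELECT model, activity, source,",
    "       COUNT(*) AS turns,",
    "       SUM(input_tokens) AS input_tokens,",
    "       SUM(output_tokens) AS output_tokens",
    "FROM turns",
    where.length > 0 ? `WHERE ${where.join(" AND ")}` : "",
    "GROUP BY model, activity, source",
  ]
    .filter((line) => line !== "")
    .join("\n");

  return { sql, params };
}

// Median of the retry counts fetched for one cell by the follow-up query.
// Returning 0 for an empty cell is an assumption made for this sketch.
function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Binding filter values as `?` parameters rather than interpolating them keeps user-supplied paths and model names out of the SQL text.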

Test plan

  • pnpm run test:ts (519 tests pass)
  • Parity test deep-equals every (model, category) cell + per-model totals against the in-memory buildCompareTable for a mixed fixture (multiple models, edit/non-edit categories, varied retries, cache-heavy turns, an unpriced model, and a Codex-source turn with reasoning tokens).
  • Filter-coverage tests for each user-facing flag — --models (incl. pre-seeded absent models), --since, --project (literal path AND projectKey), --session, --workflow/--agent (stamp-folded enrichment), --min-sample.
  • Empty-archive and single-cell edge cases.
  • Codex-reasoning regression guard (totalCost from the SQL path matches in-memory costForTurn for a Codex turn with reasoning_tokens > 0).
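The parity check in the plan above could look roughly like the following: compare two tables cell by cell and report every difference. This is a minimal sketch under stated assumptions. The `Cell` and `Table` types and the `diffTables` helper are hypothetical stand-ins, not the real `CompareTable` shape or the PR's actual test code.

```typescript
// Hypothetical cell shape; the real CompareTable may carry more fields.
interface Cell {
  turns: number;
  totalCost: number | null; // null models an unpriced model
  medianRetries: number;
}

// Keyed by `${model}|${category}` for this sketch.
type Table = Map<string, Cell>;

// Return one message per cell or field that differs between the two tables.
// Exact equality is used because the PR claims identical output on the fixture.
function diffTables(expected: Table, actual: Table): string[] {
  const problems: string[] = [];
  const keys = new Set<string>(
    Array.from(expected.keys()).concat(Array.from(actual.keys())),
  );

  for (const key of Array.from(keys).sort()) {
    const e = expected.get(key);
    const a = actual.get(key);
    if (e === undefined) {
      problems.push(`${key}: unexpected cell`);
      continue;
    }
    if (a === undefined) {
      problems.push(`${key}: missing cell`);
      continue;
    }
    for (const field of ["turns", "totalCost", "medianRetries"] as const) {
      if (e[field] !== a[field]) {
        problems.push(`${key}.${field}: expected ${e[field]}, got ${a[field]}`);
      }
    }
  }
  return problems;
}
```

An empty result means the SQL path and the in-memory path agree on every cell; reporting all mismatches at once, rather than stopping at the first, tends to make a failing fixture easier to diagnose.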

Refs

🤖 Generated with Claude Code



Read the per-(model, activity) compare table from `archive.sqlite` via a
single grouped `SELECT … GROUP BY model, activity, source` plus a tiny
per-cell median-retries follow-up, instead of streaming every turn
through `queryAll()` + an in-memory reduce. New `compareFromArchive`
helper in `@relayburn/analyze` carries the SQL path and returns the
same `CompareTable` shape; `runCompare` calls `buildArchive()` to
catch the archive up to the freshly-ingested ledger tail before
reading. Output (text, CSV, `--json`) is byte-identical to the legacy
path for the parity fixture; per-source reasoning-mode handling
(Codex's `included_in_output`) is preserved by grouping on `source`
alongside `(model, activity)`. All existing flags (`--models`,
`--since`, `--project`, `--session`, `--workflow`, `--agent`,
`--min-sample`) work through the SQL path. New `--no-archive` flag
(also honored via `RELAYBURN_ARCHIVE=0`) keeps the in-memory path as
a parity-validation / safety-net fallback.
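
One way the fallback switch described above might be resolved is sketched here. The `shouldUseArchive` helper is hypothetical; the source does not show how the CLI actually parses `--no-archive`, or whether values of `RELAYBURN_ARCHIVE` other than `0` have any effect, so this sketch treats only the literal `0` as a disable signal.

```typescript
// The environment is passed in explicitly so the function stays easy to test.
type Env = Record<string, string | undefined>;

// Archive (SQL) path is the default; either the flag or the env var opts out.
function shouldUseArchive(argv: string[], env: Env): boolean {
  if (argv.indexOf("--no-archive") !== -1) return false;
  if (env["RELAYBURN_ARCHIVE"] === "0") return false;
  return true;
}
```

In a real CLI the call site would presumably pass the process arguments and environment, for example `shouldUseArchive(process.argv.slice(2), process.env)`.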

Tests: parity fixture (deep-equal across every cell vs the in-memory
table), filter coverage for each flag, empty / single-cell edges,
and a Codex-reasoning regression guard.
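
To illustrate why grouping on `source` matters for the Codex-reasoning guard, the sketch below applies a per-source rule before folding rows into `(model, activity)` cells. It is an assumption-heavy illustration: only the `included_in_output` mode name comes from this PR, while the `separate` mode, the source names `codex` and `other`, the row shape, and both helper functions are hypothetical.

```typescript
// "included_in_output" is named in the PR; "separate" is assumed for illustration.
type ReasoningMode = "included_in_output" | "separate";

// Hypothetical row as it might come back from the grouped query.
interface SourceRow {
  model: string;
  activity: string;
  source: string;
  outputTokens: number;
  reasoningTokens: number;
}

// Assumed mapping from source to reasoning mode.
const MODE_BY_SOURCE: Record<string, ReasoningMode> = {
  codex: "included_in_output",
};

// If reasoning tokens are already counted in the output total, do not add them again.
function billableOutputTokens(row: SourceRow): number {
  const mode = MODE_BY_SOURCE[row.source] ?? "separate";
  return mode === "included_in_output"
    ? row.outputTokens
    : row.outputTokens + row.reasoningTokens;
}

// Fold source-level rows into (model, activity) cells after the per-source rule.
function foldBySource(rows: SourceRow[]): Map<string, number> {
  const cells = new Map<string, number>();
  for (const row of rows) {
    const key = `${row.model}|${row.activity}`;
    cells.set(key, (cells.get(key) ?? 0) + billableOutputTokens(row));
  }
  return cells;
}
```

If the query grouped only on `(model, activity)`, the rows from different sources would already be summed together and a per-source rule like this could no longer be applied.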

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.



Development

Successfully merging this pull request may close these issues.

Migrate burn compare to grouped SQL aggregates over archive

1 participant