QA: harden flight recorder, ROI and benchmark #47

Merged
00PrabalK00 merged 1 commit into main from qa/flight-roi-benchmark
May 31, 2026
@00PrabalK00
Owner

Fix three defects found while exercising the v0.9.0 flight/ROI/benchmark
modules against real stored state, and add edge-case coverage.

Defects fixed:
- benchmark.load_capture raised an uncaught FileNotFoundError on a missing
  capture path, producing a traceback instead of a clean CLI exit 1. It now
  raises ValueError (caught by cli.main) with a clear message, and reports
  malformed JSON and non-object JSON the same way.
- benchmark.compare_captures crashed with AttributeError when a capture had
  "metrics": None (the dict.get fallback only triggered on a missing key, not
  a None/non-dict value). Added _metrics_view to fall back to the flat capture
  whenever metrics is absent or not a mapping.
- roi.roi_summary crashed in provider aggregation if an event payload was not
  a dict. It now skips non-dict payloads defensively.

Tests added (tests/test_roi_benchmark.py, tests/test_evidence.py):
- fresh/uninitialized store: roi_summary and `roi --json` exit 0, no crash.
- flight-record on unknown task -> CLI exit 1 with "Unknown task".
- bare task (no worktree/gates/claims): task_metrics, capture, and flight
  record render without crashing.
- provider_usage aggregation across model-only, provider+model, and agent-only
  events, plus a non-dict payload that must be ignored.
- compare_captures fallback (flat capture, metrics=None) and all _verdict
  branches (tie/win/loss/merge_ready swing).
- benchmark compare CLI with missing and malformed JSON -> exit 1.
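
As a rough illustration of the provider aggregation cases listed above, here is a sketch under a hypothetical event schema. The `payload`, `provider`, `model`, and `agent` keys, the `provider/model` label format, and the function body are all assumptions; only the behavior of skipping non-dict payloads is taken from this PR:

```python
from collections import Counter


def provider_usage(events):
    """Count events per provider/model/agent label (hypothetical schema)."""
    usage = Counter()
    for event in events:
        payload = event.get("payload")
        if not isinstance(payload, dict):
            # Defensive skip: a string, list, or None payload is ignored
            # instead of raising AttributeError on .get().
            continue
        provider = payload.get("provider")
        model = payload.get("model")
        agent = payload.get("agent")
        if provider and model:
            label = f"{provider}/{model}"
        else:
            # Fall back to whichever single identifier is present.
            label = provider or model or agent
        if label:
            usage[label] += 1
    return dict(usage)
```

A test along the lines described would feed one event of each kind plus a non-dict payload and assert that only the three well-formed events are counted.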

Test count: 15 -> 32 in the targeted files; full suite 303 passing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@00PrabalK00 merged commit de6aa1b into main on May 31, 2026
12 checks passed