Summary
Databases created before the fix for #473 retain 35 orphan session_outcomes rows from the old sink code path. These rows have:
run_id = NULL
archetype = NULL
model = 'unknown'
cost = $0.00
They are exact duplicates of legitimate sessions (identical node_id, duration_ms, and token counts), except that their timestamps are shifted by the local UTC offset (see #478).
Evidence
Total sessions: 118 (real: 83, shadows: 35)
Session count inflation: 42% (35 shadows / 83 real sessions)
All 35 orphans match 1:1 with Run 1 sessions by node_id and duration_ms.
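The 1:1 match can be re-checked on any affected database. The following is a minimal sketch, assuming the store is SQLite and that session_outcomes has run_id, node_id, and duration_ms columns as named above. The helper name count_matched_orphans and the in-memory demo rows are hypothetical, not taken from the project.

```python
import sqlite3


def count_matched_orphans(conn):
    """Count orphan rows (run_id IS NULL) that have a twin carrying a run_id.

    A twin is a row with the same node_id and duration_ms.
    """
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM session_outcomes AS o
        WHERE o.run_id IS NULL
          AND EXISTS (
              SELECT 1
              FROM session_outcomes AS r
              WHERE r.run_id IS NOT NULL
                AND r.node_id = o.node_id
                AND r.duration_ms = o.duration_ms
          )
        """
    ).fetchone()
    return row[0]


# Hypothetical demo data: one real session, its shadow, and an unrelated session.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_outcomes "
    "(run_id TEXT, node_id TEXT, duration_ms INTEGER, model TEXT)"
)
conn.executemany(
    "INSERT INTO session_outcomes VALUES (?, ?, ?, ?)",
    [
        ("run-1", "n1", 1200, "some-model"),
        (None, "n1", 1200, "unknown"),
        ("run-2", "n2", 900, "some-model"),
    ],
)
matched = count_matched_orphans(conn)
```

If the returned count equals the total number of rows with run_id IS NULL, every orphan has a legitimate counterpart.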
Impact
Any query on session_outcomes that does not filter with WHERE run_id IS NOT NULL double-counts Run 1 data. The runs.total_sessions field is correct (35 + 48 = 83), so the discrepancy is row-level only.
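To illustrate the double-counting, here is a small sketch that compares an unfiltered count with a filtered one. It assumes SQLite and a reduced, hypothetical version of the table. The rows are made up for the demo and are not the real data.

```python
import sqlite3

# Hypothetical reduced schema and rows: two Run 1 sessions, their two
# shadows (run_id NULL), and one session from a later run.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_outcomes "
    "(run_id TEXT, node_id TEXT, duration_ms INTEGER)"
)
conn.executemany(
    "INSERT INTO session_outcomes VALUES (?, ?, ?)",
    [
        ("run-1", "a", 100),
        ("run-1", "b", 200),
        (None, "a", 100),
        (None, "b", 200),
        ("run-2", "c", 300),
    ],
)

# Without the filter, the shadows are counted as sessions.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM session_outcomes"
).fetchone()[0]

# With the filter, only rows attached to a run are counted.
filtered = conn.execute(
    "SELECT COUNT(*) FROM session_outcomes WHERE run_id IS NOT NULL"
).fetchone()[0]
```

On the affected database described above, the same two queries would be expected to return 118 and 83 respectively.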
Suggested Fix
Add a one-time migration that cleans up orphan rows:
DELETE FROM session_outcomes WHERE run_id IS NULL;
Or, if preservation matters, add a schema migration that marks them with a legacy_orphan flag.
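The preservation option could look roughly like the sketch below. It assumes SQLite, and the function name mark_legacy_orphans, the INTEGER 0/1 encoding of legacy_orphan, and the demo rows are all assumptions for illustration. The project's actual migration mechanism may differ.

```python
import sqlite3


def mark_legacy_orphans(conn):
    """Add a legacy_orphan flag if missing and set it on rows without a run_id.

    Safe to run more than once. Returns the number of rows newly flagged.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(session_outcomes)")}
    if "legacy_orphan" not in columns:
        conn.execute(
            "ALTER TABLE session_outcomes "
            "ADD COLUMN legacy_orphan INTEGER NOT NULL DEFAULT 0"
        )
    cur = conn.execute(
        "UPDATE session_outcomes SET legacy_orphan = 1 "
        "WHERE run_id IS NULL AND legacy_orphan = 0"
    )
    conn.commit()
    return cur.rowcount


# Hypothetical demo: one real session and one shadow.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_outcomes "
    "(run_id TEXT, node_id TEXT, duration_ms INTEGER)"
)
conn.executemany(
    "INSERT INTO session_outcomes VALUES (?, ?, ?)",
    [("run-1", "a", 100), (None, "a", 100)],
)
marked = mark_legacy_orphans(conn)
```

With the flag in place, queries can exclude shadows with WHERE legacy_orphan = 0 while the rows remain available for auditing.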
Related