
fix: orphan session_outcomes rows from pre-#473 dual-write path survive in databases #479

@mickume


Summary

Databases created before the fix for #473 retain 35 orphan session_outcomes rows from the old sink code path. These rows have:

  • run_id = NULL
  • archetype = NULL
  • model = 'unknown'
  • cost = $0.00

They are exact duplicates of legitimate sessions — identical node_id, duration_ms, and token counts — but with timestamps shifted by the local UTC offset (see #478).

Evidence

Total sessions: 118 (real: 83, shadows: 35)
Session count inflation: 42%

All 35 orphans match 1:1 with Run 1 sessions by node_id and duration_ms.
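The 1:1 match can be checked with a self-join on node_id and duration_ms. The sketch below is only an illustration: it uses an in-memory SQLite database and a made-up handful of rows, and the column types, the `id` column, and the `run-1` / `model-x` values are assumptions, not the project's real schema or data.

```python
import sqlite3

# Hypothetical, reduced schema: only the columns named in this issue.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE session_outcomes (
        id          INTEGER PRIMARY KEY,
        run_id      TEXT,
        node_id     TEXT,
        archetype   TEXT,
        model       TEXT,
        cost        REAL,
        duration_ms INTEGER
    )
    """
)

# Made-up rows: three legitimate sessions and two shadow copies of Run 1.
rows = [
    ("run-1", "node-a", "coder", "model-x", 0.12, 1500),
    ("run-1", "node-b", "coder", "model-x", 0.30, 2200),
    ("run-2", "node-c", "reviewer", "model-x", 0.05, 900),
    (None, "node-a", None, "unknown", 0.0, 1500),
    (None, "node-b", None, "unknown", 0.0, 2200),
]
conn.executemany(
    "INSERT INTO session_outcomes "
    "(run_id, node_id, archetype, model, cost, duration_ms) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Pair every orphan (run_id IS NULL) with the real session that shares
# its node_id and duration_ms.
MATCH_SQL = """
SELECT o.id AS orphan_id, r.id AS real_id, r.run_id
FROM session_outcomes AS o
JOIN session_outcomes AS r
  ON r.node_id = o.node_id
 AND r.duration_ms = o.duration_ms
 AND r.run_id IS NOT NULL
WHERE o.run_id IS NULL
ORDER BY o.id
"""
matches = conn.execute(MATCH_SQL).fetchall()
orphan_count = conn.execute(
    "SELECT COUNT(*) FROM session_outcomes WHERE run_id IS NULL"
).fetchone()[0]

# A 1:1 match means every orphan has exactly one partner.
print(len(matches), orphan_count)
```

If the number of matched pairs equals the number of orphans and every pair points at the same run, the orphans are pure duplicates and no unique data is lost by removing them.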

Impact

Any query on session_outcomes that does not filter on run_id IS NOT NULL double-counts Run 1 data. The runs.total_sessions field is correct (35 + 48 = 83), so the discrepancy is row-level only.
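To make the double-counting concrete, here is a small sketch that rebuilds the counts from the Evidence section with synthetic rows in an in-memory SQLite table. The table is reduced to three columns, and the node IDs, durations, and the `run-2` label for the 48-session run are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_outcomes "
    "(run_id TEXT, node_id TEXT, duration_ms INTEGER)"
)

# Synthetic rows sized to match the issue: 35 + 48 real sessions,
# plus one shadow row (run_id NULL) per Run 1 session.
run1 = [("run-1", f"node-{i}", 1000 + i) for i in range(35)]
run2 = [("run-2", f"node-{i}", 2000 + i) for i in range(48)]
shadows = [(None, node_id, duration_ms) for _, node_id, duration_ms in run1]
conn.executemany(
    "INSERT INTO session_outcomes VALUES (?, ?, ?)", run1 + run2 + shadows
)

# Unfiltered count includes the shadows; filtered count does not.
total = conn.execute("SELECT COUNT(*) FROM session_outcomes").fetchone()[0]
real = conn.execute(
    "SELECT COUNT(*) FROM session_outcomes WHERE run_id IS NOT NULL"
).fetchone()[0]
inflation_pct = round(100 * (total - real) / real)

print(total, real, inflation_pct)  # 118 83 42
```

Until a cleanup migration lands, adding `run_id IS NOT NULL` to aggregate queries is the safe workaround.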

Suggested Fix

Add a one-time migration that cleans up orphan rows:

DELETE FROM session_outcomes WHERE run_id IS NULL;

Or, if preservation matters, add a schema migration that marks them with a legacy_orphan flag.
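Both options could look roughly like the sketch below. It is written against SQLite through Python's standard sqlite3 module purely as an example. The function names, the `legacy_orphan` column type, and the stricter predicate (which also checks archetype and model, so a future row with a NULL run_id for some other reason is not swept up) are assumptions, not the project's actual migration code.

```python
import sqlite3

# Stricter than "run_id IS NULL" alone: match the full orphan signature
# described in the Summary. This predicate is an assumption.
ORPHAN_PREDICATE = "run_id IS NULL AND archetype IS NULL AND model = 'unknown'"


def flag_legacy_orphans(conn: sqlite3.Connection) -> int:
    """Add a legacy_orphan flag if missing and mark orphan rows. Idempotent."""
    columns = {row[1] for row in conn.execute("PRAGMA table_info(session_outcomes)")}
    with conn:  # commit on success, roll back on error
        if "legacy_orphan" not in columns:
            conn.execute(
                "ALTER TABLE session_outcomes "
                "ADD COLUMN legacy_orphan INTEGER NOT NULL DEFAULT 0"
            )
        cur = conn.execute(
            "UPDATE session_outcomes SET legacy_orphan = 1 "
            f"WHERE legacy_orphan = 0 AND {ORPHAN_PREDICATE}"
        )
    return cur.rowcount


def delete_legacy_orphans(conn: sqlite3.Connection) -> int:
    """Delete orphan rows and return how many were removed."""
    with conn:
        cur = conn.execute(f"DELETE FROM session_outcomes WHERE {ORPHAN_PREDICATE}")
    return cur.rowcount


# Demo on made-up data: one real session and its shadow copy.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_outcomes (run_id TEXT, node_id TEXT, "
    "archetype TEXT, model TEXT, cost REAL, duration_ms INTEGER)"
)
conn.executemany(
    "INSERT INTO session_outcomes VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("run-1", "node-a", "coder", "model-x", 0.12, 1500),
        (None, "node-a", None, "unknown", 0.0, 1500),
    ],
)

flagged_first = flag_legacy_orphans(conn)   # marks the shadow row
flagged_again = flag_legacy_orphans(conn)   # second run changes nothing
deleted = delete_legacy_orphans(conn)
remaining = conn.execute("SELECT COUNT(*) FROM session_outcomes").fetchone()[0]
print(flagged_first, flagged_again, deleted, remaining)
```

Returning the affected row count lets the migration log what it touched, which is useful for confirming that the number matches the expected orphan count before committing to the delete variant.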

Related

Labels

invalid (This doesn't seem right)