Add idempotent incremental checkpoints and cursor paging #1

Merged: kraftaa merged 3 commits into main from reliability-idempotency on Feb 25, 2026

@kraftaa commented Feb 25, 2026

Summary

This PR improves incremental sync correctness and rerun safety in rustream by:

  • checkpointing progress after each successful batch write
  • using composite cursor pagination for duplicate watermark values
  • persisting both watermark and cursor in SQLite state
  • supporting explicit append-only incremental mode when no tiebreaker is needed

Why

Previously, incremental sync could skip rows when many records shared the same watermark value (for example, identical updated_at). Retries after a partial failure also had weaker resume guarantees.
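To illustrate how rows can be skipped, here is a minimal in-memory sketch. It assumes the earlier behavior amounted to paging with a batch limit and a strict `updated_at > last_watermark` filter; the PR does not spell out the old query, so treat that as an assumption. `Row`, `fetch_watermark_only`, and `sync_watermark_only` are hypothetical names, not rustream code.

```rust
/// Hypothetical in-memory row; not rustream's actual type.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: u32,
    updated_at: u64,
}

/// Watermark-only paging (assumed old behavior): return up to `limit` rows
/// whose `updated_at` is strictly greater than the last seen watermark.
fn fetch_watermark_only(table: &[Row], last: Option<u64>, limit: usize) -> Vec<Row> {
    let mut rows: Vec<Row> = table
        .iter()
        .filter(|r| last.map_or(true, |w| r.updated_at > w))
        .cloned()
        .collect();
    rows.sort_by_key(|r| (r.updated_at, r.id));
    rows.truncate(limit);
    rows
}

/// Pages through the table and returns the ids that were actually synced.
fn sync_watermark_only(table: &[Row], limit: usize) -> Vec<u32> {
    let mut last = None;
    let mut seen = Vec::new();
    loop {
        let batch = fetch_watermark_only(table, last, limit);
        let Some(tail) = batch.last() else { break };
        // Advancing to the tail's watermark drops any unread rows that
        // share that same watermark value.
        last = Some(tail.updated_at);
        seen.extend(batch.iter().map(|r| r.id));
    }
    seen
}
```

With three rows sharing `updated_at = 100`, a fourth at `200`, and a batch size of 2, this loop syncs ids 1, 2, and 4 and never sees id 3.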

Changes

  1. Incremental checkpointing:
     • progress is saved batch by batch, after each successful output write
  2. Composite cursor pagination:
     • the query now advances by (watermark, cursor):
       • watermark > last_watermark
       • or watermark = last_watermark and cursor > last_cursor
     • deterministic ordering by watermark and cursor
  3. State model:
     • watermarks table now includes cursor_value
     • backward-compatible migration for existing state DBs
     • fail-fast when legacy state has a watermark but is missing a cursor for cursor-based incremental
  4. Config updates:
     • added tables[].incremental_tiebreaker_column
     • added tables[].incremental_column_is_unique for append-only watermark-only mode
  5. Safety/validation:
     • parameterized watermark query
     • validation for missing/invalid incremental/tiebreaker columns
     • validation preventing identical watermark and tiebreaker columns
  6. Docs/examples:
     • updated README.md and config.example.yaml with new incremental config modes
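The checkpointing and composite cursor rules above can be sketched in memory as follows. This is an illustration under stated assumptions, not the code from this PR: `Row`, `Checkpoint`, `fetch_after`, and `sync` are hypothetical names, the table is a slice rather than a SQL query, and the checkpoint is a plain struct standing in for the SQLite state row.

```rust
/// Hypothetical in-memory row; `updated_at` plays the watermark, `id` the cursor.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: u32,
    updated_at: u64,
}

/// Stand-in for the persisted (watermark, cursor) pair.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct Checkpoint {
    last: Option<(u64, u32)>,
}

/// Returns up to `limit` rows strictly after the checkpoint, using the rule
/// watermark > last_watermark OR (watermark = last_watermark AND cursor > last_cursor),
/// ordered deterministically by (watermark, cursor).
fn fetch_after(table: &[Row], cp: Checkpoint, limit: usize) -> Vec<Row> {
    let mut rows: Vec<Row> = table
        .iter()
        .filter(|r| match cp.last {
            None => true,
            Some((w, c)) => r.updated_at > w || (r.updated_at == w && r.id > c),
        })
        .cloned()
        .collect();
    rows.sort_by_key(|r| (r.updated_at, r.id));
    rows.truncate(limit);
    rows
}

/// Syncs batch by batch. The checkpoint only advances after `write` succeeds,
/// so a failed batch is fetched again on the next run.
fn sync<F>(table: &[Row], cp: &mut Checkpoint, limit: usize, mut write: F) -> Result<(), String>
where
    F: FnMut(&[Row]) -> Result<(), String>,
{
    loop {
        let batch = fetch_after(table, *cp, limit);
        let Some(tail) = batch.last() else { return Ok(()) };
        write(&batch)?;
        cp.last = Some((tail.updated_at, tail.id));
    }
}
```

If the writer fails on the second batch, the checkpoint stays at the last row of the first batch, and a rerun resumes from there without skipping rows that share a watermark value. Whether a rerun can rewrite rows from a batch that failed mid-write depends on the output sink, which this sketch does not model.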

Behavior notes

  • For mutable tables, configure both:
    • incremental_column
    • incremental_tiebreaker_column (recommended primary key)
  • For append-only unique columns (for example monotonic id), set:
    • incremental_column
    • incremental_column_is_unique: true
  • Tables without incremental_column remain full-sync each run.
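One way to read these notes is as a mapping from config fields to a sync mode. The sketch below is hypothetical: the field names come from this PR, but `TableConfig`, `SyncMode`, `resolve_mode`, and the error messages are illustrative. The PR does not say what happens when only incremental_column is set, when a tiebreaker is given without incremental_column, or when both a tiebreaker and incremental_column_is_unique are set; the sketch assumes the first two are rejected and the third resolves to cursor mode.

```rust
/// Hypothetical mirror of the per-table config fields named in this PR.
#[derive(Debug, Default, Clone)]
struct TableConfig {
    incremental_column: Option<String>,
    incremental_tiebreaker_column: Option<String>,
    incremental_column_is_unique: bool,
}

/// Illustrative sync modes; not rustream's actual types.
#[derive(Debug, PartialEq)]
enum SyncMode {
    Full,
    WatermarkOnly { watermark: String },
    Cursor { watermark: String, tiebreaker: String },
}

fn resolve_mode(cfg: &TableConfig) -> Result<SyncMode, String> {
    match (&cfg.incremental_column, &cfg.incremental_tiebreaker_column) {
        // No incremental column: full sync on each run.
        (None, None) => Ok(SyncMode::Full),
        // Assumption: a tiebreaker without a watermark column is a config error.
        (None, Some(_)) => {
            Err("incremental_tiebreaker_column requires incremental_column".to_string())
        }
        // Identical watermark and tiebreaker columns are rejected.
        (Some(w), Some(t)) if w == t => {
            Err("incremental_column and incremental_tiebreaker_column must differ".to_string())
        }
        (Some(w), Some(t)) => Ok(SyncMode::Cursor {
            watermark: w.clone(),
            tiebreaker: t.clone(),
        }),
        // Append-only mode: the watermark column itself is unique.
        (Some(w), None) if cfg.incremental_column_is_unique => {
            Ok(SyncMode::WatermarkOnly { watermark: w.clone() })
        }
        // Assumption: a bare incremental_column is rejected rather than guessed at.
        (Some(_), None) => Err(
            "set incremental_tiebreaker_column or incremental_column_is_unique: true".to_string(),
        ),
    }
}
```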

Testing

cargo test passed locally (44 passed, 0 failed)

Risk / Follow-up

  • Existing deployments with old watermark-only state may need state reset for tables moved to cursor-based incremental mode (fail-fast error explains this).
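The fail-fast check on legacy state could look roughly like the sketch below. `StoredState`, `Resume`, `resume_point_for_cursor_mode`, and the error text are hypothetical; only the rule itself (a stored watermark without a cursor is an error in cursor-based mode) comes from this PR. Treating a state row with no watermark as a fresh start is an assumption.

```rust
/// Hypothetical view of one table's row in the state DB.
#[derive(Debug, Default, Clone, PartialEq)]
struct StoredState {
    watermark: Option<String>,
    cursor_value: Option<String>,
}

/// Where a cursor-based incremental run should start.
#[derive(Debug, PartialEq)]
enum Resume {
    FromStart,
    After { watermark: String, cursor: String },
}

fn resume_point_for_cursor_mode(table: &str, state: &StoredState) -> Result<Resume, String> {
    match (&state.watermark, &state.cursor_value) {
        // Assumption: no watermark means nothing was synced yet.
        (None, _) => Ok(Resume::FromStart),
        (Some(w), Some(c)) => Ok(Resume::After {
            watermark: w.clone(),
            cursor: c.clone(),
        }),
        // Legacy watermark-only state: refuse to guess a cursor position.
        (Some(_), None) => Err(format!(
            "table {table}: state has a watermark but no cursor_value; \
             reset state for this table before running cursor-based incremental sync"
        )),
    }
}
```

Failing here, rather than resuming from the watermark alone, avoids skipping rows that share the stored watermark value.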

@kraftaa merged commit c79c69f into main on Feb 25, 2026
1 check passed