Skip to content

Support format-version 3 in icebergRewrite (preserve row lineage)#24

Merged
philcunliffe merged 2 commits into
masterfrom
v3-rewrite
Jun 11, 2026
Merged

Support format-version 3 in icebergRewrite (preserve row lineage)#24
philcunliffe merged 2 commits into
masterfrom
v3-rewrite

Conversation

@philcunliffe

Copy link
Copy Markdown
Contributor

Implements the v3 rewrite follow-up deferred in #23: icebergRewrite now works on format-version 3 tables and preserves row lineage exactly, instead of throwing format-version 2 only.

The problem

Iceberg v3 gives every row a stable _row_id and _last_updated_sequence_number, normally derived from the data file's first_row_id + row position. Rewrite reads every live row, sorts globally, and regroups — which destroys the positional ordering derivation relies on. Letting assignFirstRowIds hand the rewritten manifest a fresh id range would renumber every row, making lineage consumers see the whole table as deleted-and-reinserted.

The spec's rules for moved rows (format spec, Row Lineage): a row's existing non-null _row_id must be copied into the new data file, and — since rewrite doesn't modify rows — its existing non-null _last_updated_sequence_number must be copied too. Commit-time first_row_id assignment to a manifest "is optional if the writer knows that existing data files in the manifest have assigned first_row_id values."

What this PR does

All production changes are in src/write/rewrite.js; everything else already did its half (the read path surfaces lineage per row with stored-value-wins semantics, the parquet writer threads field ids and embeds iceberg.schema, and assignFirstRowIds skips manifests that already have a first_row_id).

  • Materialized lineage columns. For v3, the parquet writer gets an extended schema with _row_id / _last_updated_sequence_number (reserved field ids 2147483540 / 2147483539, per-spec). Values are copied through the sort untouched. The field ids land both in the parquet SchemaElements (external-reader interop) and in the embedded iceberg.schema kv metadata (icebird's own discovery path). Stats, partitioning, and the manifest's embedded schema stay user-only.
  • No id consumption for a pure rewrite. When every live row carries lineage, the new manifest's first_row_id is pinned to the minimum carried _row_id. Carried ids are distinct and below next-row-id, so min + row count ≤ next-row-id: assignFirstRowIds assigns nothing, the snapshot gets added-rows: 0, and next-row-id does not advance. Per-file first_row_id stays null; inherited values are never consulted because every stored _row_id is non-null.
  • Upgraded-table fallback. If any row reads with null lineage (a v2→v3 upgrade rewritten before any post-upgrade commit assigned ids), the manifest is left for spec-default commit-time assignment: stored ids still win on read, null rows get derived ids, and next-row-id advances by the manifest row count — matching the spec's worked example.
  • Conflict safety. buildSnapshotUpdate already emits assert-next-row-id for v3, so a rewrite staged against stale metadata fails on commit if a concurrent v3 append moved the id watermark (on top of the existing CAS on main).
  • The v1/other guard now mirrors prepareAppend (unsupported format-version).

Tests & verification

  • +8 tests (568 → 576), all passing; npm run lint and npm run build:types clean. v2 rewrite behavior unchanged.
  • The v3 gating test ("throws on v3") is replaced with real coverage: lineage preserved across a sorted rewrite (the sort scrambles positions, so values must come from the materialized columns — asserted), targetFileRows splits, deletion vectors consumed with survivors keeping their ids, appends after a rewrite continuing from the unchanged next-row-id with no collisions, rewrite-of-a-rewrite stability, the upgraded-table assignment fallback, and the assert-next-row-id requirement rejecting a moved watermark.
  • Manually verified the rewritten parquet files carry both lineage columns with the reserved field ids in the physical schema and kv metadata, and round-trip a scrambled v3 table with next-row-id unchanged.

Deliberate limitations

  • Lineage columns are always materialized on v3 rewrites, even in the fallback mode where they hold nulls — spec-equivalent to omitting them ("readers should treat both columns as if they exist and are set to null"), and required anyway for mixed tables where some rows carry ids.
  • Retry-on-conflict remains out of scope, as in Scan pruning and sort-on-write (#20, #21, #22) #23: a rewrite only rewrote the rows it read, so on conflict it throws rather than risk dropping concurrently-appended rows.

🤖 Generated with Claude Code

philcunliffe and others added 2 commits June 10, 2026 15:17
A v3 rewrite must not renumber existing rows: each surviving row's
_row_id and _last_updated_sequence_number are now materialized as
explicit columns (reserved field ids 2147483540 / 2147483539) in the
rewritten parquet files, since the global sort destroys the positional
ordering that derivation relies on. When every live row carries
lineage, the new manifest's first_row_id is pinned to the minimum
carried _row_id so assignFirstRowIds consumes no new ids and
next-row-id does not advance. Tables upgraded from v2 whose rows were
never assigned ids fall back to spec-default commit-time assignment.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@philcunliffe

Copy link
Copy Markdown
Contributor Author

Dual-agent review — approve

  • Verdict: approve
  • Risk class: low
  • Auto-merge advisory: 👎 thumbs down — minor findings present; human should confirm before merge

Advisory only: no merge was attempted.

Risk capstone

Cross-reference: reviewer findings vs high-risk surfaces

Source Finding (severity, evidence) Intersects
Claude README still documents icebergRewrite as v2-only (minor, README.md:176, README.md:227) No — doc-only; outside the impact surface
Codex review

Fix Validations

v3 rewrite previously rejected / row lineage not preserved

  • Status: correct
  • Evidence: src/write/rewrite.js:59, src/write/rewrite.js:117, src/write/rewrite.js:198, src/write/snapshot.js:72, test/write/rewrite.test.js:233
  • Assessment: The guard now permits format-version 3, v3 rewrites write reserved lineage columns, and pure rewrites pin the manifest so assignFirstRowIds assigns zero new rows. The tests cover sorted rewrites, split files, deletion vectors, rewrite-of-rewrite, upgraded-table fallback, and stale next-row-id rejection.

upgraded v2 to v3 rows without lineage get commit-time assignment

  • Status: correct
  • Evidence: src/write/rewrite.js:97, src/write/rewrite.js:198, src/write/snapshot.js:162, src/read.js:608, test/write/rewrite.test.js:339
  • Assessment: Mixed or null lineage leaves first_row_id unset on the new manifest, so the shared v3 assignment path assigns a fresh range and the read path derives values only where stored lineage is null.

Findings

No new issues identified.

No Finding

  1. Behavioral Correctness
  2. Contract & Interface Fidelity
  3. Change Impact / Blast Radius
  4. Concurrency, Ordering & State Safety
  5. Error Handling & Resilience
  6. Security Surface
  7. Resource Lifecycle & Cleanup
  8. Release Safety
  9. Test Evidence Quality
  10. Architectural Consistency
  11. Debuggability & Operability

Evidence Bundle

  • Changed hot paths: icebergStageRewrite v2/v3 gate and lineage write schema at src/write/rewrite.js:59, src/write/rewrite.js:117; manifest row-id pinning at src/write/rewrite.js:198; shared v3 row-id assignment at src/write/snapshot.js:72.
  • Impacted callers: one-call rewrite API exercises staged rewrite via test/write/rewrite.test.js:386; commit applies staged requirements and snapshot updates at src/write/commit.js:41, src/write/commit.js:222.
  • Impacted tests: v3 preservation test/write/rewrite.test.js:233; split output test/write/rewrite.test.js:262; deletion vector rewrite test/write/rewrite.test.js:276; append after rewrite test/write/rewrite.test.js:306; fallback assignment test/write/rewrite.test.js:339; next-row-id conflict test/write/rewrite.test.js:365.
  • Unresolved uncertainty: I did not independently verify the Parquet physical schema writer beyond the diff’s use of writeSchema; the provided tests also do not assert physical field IDs directly.
Claude review

Claude review

README still documents icebergRewrite as v2-only

  • Severity: minor
  • Confidence: 85
  • Evidence: README.md:176, README.md:227
  • Why it matters: The PR's headline feature is v3 rewrite support, but the user-facing docs still say "via a replace snapshot (v2 tables)" and "icebergRewrite compacts to sorted, non-overlapping files (v2)" — the restriction was added in the same commit as the now-removed guard (3edb15b), and this PR lifts the guard and the test but leaves the README advertising the restriction it removed.
  • Suggested fix: Update README.md lines 176 and 227 to state v2 and v3 are supported, noting that v3 rewrites preserve row lineage.

Reports: .git/dual-review/pr-24

@philcunliffe

Copy link
Copy Markdown
Contributor Author

🧭 Decision map — where to spend your attention

Companion to the dual-review verdict. This casts no verdict — it points at the 4 forks where the author made a real choice, so you can skim the rest.

Scanned: 10 hunks across 2 files. Most is mechanical: ~140 added test lines (the lineageById helper, formatVersion plumbing, seven new v3 test bodies) and JSDoc updates. The production change is ~50 lines, and nearly all its decision content is in four forks:

1. Pin the rewritten manifest's first_row_id to the minimum carried _row_id · magic value / contract

src/write/rewrite.js:204

  if (allLineage) {
    newManifest.first_row_id = minRowId
  }
  • Decision: a pure rewrite reuses the carried id range — pinning first_row_id to min(_row_id) makes assignFirstRowIds take its skip-if-assigned branch, so the snapshot gets added-rows: 0 and next-row-id does not advance.
  • Alternative not taken: leave first_row_id null and accept a fresh commit-time range — simpler and spec-default, but it renumbers every row (the exact bug this PR exists to fix).
  • Check: the safety argument is the comment's invariant — carried ids are distinct and below next-row-id, so min + row count ≤ next-row-id. That holds for any valid table but is trusted, not asserted: duplicate stored ids from a corrupt upstream writer would silently extend the pinned range past the watermark.

2. All-or-nothing preserve mode: one null-lineage row drops the whole rewrite to fallback · unhappy-path policy

src/write/rewrite.js:97

  const allLineage = rowLineage && liveRows.length > 0 &&
    liveRows.every(r => r._row_id != null && r._last_updated_sequence_number != null)
  • Decision: preserve mode requires every live row to carry both lineage values; otherwise the manifest is left for spec-default commit-time assignment (next-row-id advances by the full row count; carried rows survive only via stored-value-wins on read).
  • Alternative not taken: handle mixed tables in preserve mode — e.g. split carried vs null rows into separate manifests so only null rows consume fresh ids. Tighter id accounting, materially more complexity.
  • Check: the mixed case (v2→v3 upgrade + post-upgrade appends, then rewrite) is the one branch with no test — confirm the fallback semantics are what you intend there.

3. Lineage columns are always materialized on v3 — even all-null in fallback mode · contract / shape

src/write/rewrite.js:117

  const writeSchema = rowLineage
    ? { ...schema, fields: [ ...schema.fields, /* _row_id, _last_updated_sequence_number */ ] }
    : schema
  • Decision: keyed on formatVersion >= 3, not on allLineage — every v3 rewrite writes the two lineage columns, holding nulls in fallback mode.
  • Alternative not taken: extend the schema only in preserve mode — spec-equivalent (readers treat absent columns as null) and smaller files for upgraded tables.
  • Check: the PR body owns this as a deliberate limitation (uniformity + required for mixed tables); confirm the file-size/interop cost on upgraded tables is acceptable.

4. The extended schema reaches the parquet writer only · contract / shape

src/write/rewrite.js:136

      await writeParquet({ writer: dataWriter, schema: writeSchema, records: chunk, codec })
      const stats = computeColumnStats(chunk, schema)
  • Decision: lineage columns get parquet SchemaElement field ids and iceberg.schema kv metadata, but no column stats, no role in partitioning, and no presence in the manifest's embedded schema.
  • Alternative not taken: include them in stats/manifest schema so engines could prune scans on _row_id ranges — at the cost of leaking writer-internal columns into table metadata.
  • Check: the one-word schema vs writeSchema difference on adjacent lines is very easy to misread as a bug — it is deliberate; confirm no downstream consumer expects lineage bounds in manifests.

Honorable mentions (real but lower-stakes): src/write/rewrite.js:59 — v1 rejection message harmonized to unsupported format-version: N, dropping the old self-naming error string; src/write/rewrite.js:122 — reserved field ids 2147483540/2147483539 are spec-fixed and match src/read.js:533, but a swap would be silent corruption; src/write/rewrite.js:97liveRows.length > 0 routes a fully-deleted table to fallback rather than crashing, untested; test/write/rewrite.test.js:365 — the assert-next-row-id conflict is exercised via hand-mutated metadata rather than a real interleaved commit (the in-test comment owns this).

Generated by /decision-map. Advisory — directs attention, casts no verdict.

@philcunliffe philcunliffe requested a review from platypii June 11, 2026 17:50
@philcunliffe philcunliffe merged commit 43e0ece into master Jun 11, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants