perf(data-interp): default-on multithreading for cell<->point data interpolation#78

Merged: akaszynski merged 1 commit into main from feat/datainterp-threading on Jun 19, 2026
@akaszynski

Summary

Threads vtkCellDataToPointData and vtkPointDataToCellData by default (under the audited, 4-thread-capped fvtk policy) by wrapping their existing per-output-element vtkSMPTools::For loops in fvtk::RunSafeFilterParallel. fvtk's default vtkSMPTools backend is Sequential, so these loops previously ran serially; this change only enables threading and does not alter the algorithm.

  • vtkCellDataToPointData: 5 dispatch sites (For(0, numPts, …) across the abstract / static-cell-links fast paths).
  • vtkPointDataToCellData: 2 sites (For(0, numCells, …), categorical + non-categorical).

Why it's safe (bit-exactness)

Each output element is an independent average of read-only input: a point gathers from its incident cells (via the deterministically-built cell links), a cell gathers from its points — each summed in a fixed link/point order and written to its own pre-sized slot. No shared scatter, no cross-element mutable state, no insertion-order dependence ⇒ bit-identical under any thread count and byte-identical to serial stock VTK. Only the For compute loops are wrapped; the cell-links build stays outside the parallel scope.
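The gather pattern described above can be illustrated with a minimal, standard-library-only sketch. It is not the VTK or fvtk implementation: `quad_grid`, `build_cell_links`, `gather_range`, and `cell_to_point` are hypothetical names, the mesh is a small structured quad grid, and a plain unweighted average stands in for the filter's arithmetic. It only shows why summing in a fixed link order into a per-element slot yields the same bytes however the range is split across workers.

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def quad_grid(nx, ny):
    """Return (cells, num_points) for an nx-by-ny grid of quads (illustrative mesh)."""
    cells = []
    for j in range(ny):
        for i in range(nx):
            p0 = j * (nx + 1) + i
            cells.append((p0, p0 + 1, p0 + nx + 2, p0 + nx + 1))
    return cells, (nx + 1) * (ny + 1)


def build_cell_links(cells, num_points):
    """For each point, list the cells that use it, in ascending cell id."""
    links = [[] for _ in range(num_points)]
    for cell_id, point_ids in enumerate(cells):
        for pid in point_ids:
            links[pid].append(cell_id)
    return links


def gather_range(begin, end, links, cell_values, out):
    """Average incident cell values for points [begin, end) into their own slots."""
    for pid in range(begin, end):
        incident = links[pid]
        total = 0.0
        for cell_id in incident:  # fixed link order, independent of the chunking
            total += cell_values[cell_id]
        out[pid] = total / len(incident) if incident else 0.0


def cell_to_point(cells, cell_values, num_points, workers):
    """Return the point-data result as raw bytes, computed with `workers` threads."""
    links = build_cell_links(cells, num_points)  # built serially, outside the threaded loop
    out = [0.0] * num_points                     # pre-sized output, one slot per point
    chunk = max(1, -(-num_points // workers))    # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(gather_range, b, min(b + chunk, num_points),
                        links, cell_values, out)
            for b in range(0, num_points, chunk)
        ]
        for fut in futures:
            fut.result()  # re-raise any worker exception
    return struct.pack(f"{num_points}d", *out)


cells, num_points = quad_grid(12, 9)
cell_values = [1.0 / 3.0 + 0.1 * (c % 7) for c in range(len(cells))]
serial = cell_to_point(cells, cell_values, num_points, workers=1)
threaded = cell_to_point(cells, cell_values, num_points, workers=4)
```

Because no worker ever adds into a slot owned by another point, the floating-point summation order for each output is fixed by the links alone, so `serial` and `threaded` compare equal byte for byte in this sketch.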

Validation

Built the cp312-abi3 wheel on the manylinux_2_28 executor and ran the full bit-exact gate in-container against stock vtk==9.6.2: 266 passed.

  • Byte-exact-vs-stock: cell2point, point2cell, point2cell_ugrid cases green.
  • New test_large_interp_threaded_path_is_deterministic (sphere 700×700 ≈ 490k pts / ≈980k cells, > vtkSMPTools::THRESHOLD) hashes identically at 1 / 4 / 8 threads for both cd2pd and pd2cd.
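The hash-at-several-thread-counts check can be sketched as follows. This is an assumption-laden stand-in, not the actual `test_large_interp_threaded_path_is_deterministic`: it uses a small pure-Python point-to-cell gather instead of the fvtk wheel and the 700×700 sphere, and `make_grid` and `point_to_cell_digest` are hypothetical helpers introduced only for illustration.

```python
import hashlib
import math
import struct
from concurrent.futures import ThreadPoolExecutor


def make_grid(nx, ny):
    """Return (cells, num_points) for an nx-by-ny grid of quads (illustrative mesh)."""
    cells = []
    for j in range(ny):
        for i in range(nx):
            p0 = j * (nx + 1) + i
            cells.append((p0, p0 + 1, p0 + nx + 2, p0 + nx + 1))
    return cells, (nx + 1) * (ny + 1)


def point_to_cell_digest(cells, point_values, workers):
    """Average each cell's point values with `workers` threads; return a SHA-256 hex digest."""
    out = [0.0] * len(cells)  # pre-sized output, one slot per cell

    def run(begin, end):
        for cid in range(begin, end):
            total = 0.0
            for pid in cells[cid]:  # fixed point order within the cell
                total += point_values[pid]
            out[cid] = total / len(cells[cid])

    chunk = max(1, -(-len(cells) // workers))  # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run, b, min(b + chunk, len(cells)))
            for b in range(0, len(cells), chunk)
        ]
        for fut in futures:
            fut.result()  # re-raise any worker exception
    return hashlib.sha256(struct.pack(f"{len(out)}d", *out)).hexdigest()


cells, num_points = make_grid(40, 30)
point_values = [math.sin(0.37 * p) for p in range(num_points)]
# Same thread counts as the validation above: 1, 4, and 8.
digests = {w: point_to_cell_digest(cells, point_values, w) for w in (1, 4, 8)}
```

Hashing the packed output rather than comparing with a tolerance matches the bit-exact intent of the gate: any difference in summation order that changed even one bit would change the digest.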

🤖 Generated with Claude Code

…lation

Wrap the per-output-element vtkSMPTools::For loops of vtkCellDataToPointData
(5 dispatch sites over numPts) and vtkPointDataToCellData (2 sites over numCells)
in fvtk::RunSafeFilterParallel so they thread under the audited 4-thread-capped
default policy instead of the Sequential backend.

Each output element is an independent average of read-only input: a point
gathers from its incident cells (via the deterministically-built cell links) and
a cell gathers from its points, each summed in a fixed link/point order and
written to its own pre-sized slot. No shared scatter, no cross-element state =>
bit-identical under any thread count and to serial stock VTK. Only the For
compute loops are wrapped; the cell-links build is left outside the scope.

Adds a >THRESHOLD (sphere 700x700 ~490k pts / ~980k cells) determinism case
parametrized over cd2pd/pd2cd. Byte-exactness vs stock is covered by the
existing op_cell2point / op_point2cell / op_point2cell_ugrid bitexact cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@akaszynski akaszynski force-pushed the feat/datainterp-threading branch from 54fb7d9 to 3fdf0fe Compare June 19, 2026 21:28
@akaszynski akaszynski merged commit 3fdf0fe into main Jun 19, 2026
9 checks passed
@akaszynski akaszynski deleted the feat/datainterp-threading branch June 19, 2026 22:19