perf(data-interp): default-on multithreading for cell<->point data interpolation by akaszynski · Pull Request #78 · pyvista/fvtk

akaszynski · 2026-06-19T19:03:09Z

Summary

Enables multithreading by default for vtkCellDataToPointData and vtkPointDataToCellData under fvtk's audited, 4-thread-capped policy, by wrapping their existing per-output-element vtkSMPTools::For loops in fvtk::RunSafeFilterParallel. fvtk's default vtkSMPTools backend is Sequential, so these loops previously ran serially; this change only opts them into threading and does not alter the algorithm.

vtkCellDataToPointData: 5 dispatch sites (For(0, numPts, …) across the abstract / static-cell-links fast paths).
vtkPointDataToCellData: 2 sites (For(0, numCells, …), categorical + non-categorical).
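
For readers unfamiliar with the pattern, the following is a rough Python sketch of a range-based parallel-for with a thread cap. It is not fvtk's or VTK's code: `parallel_for`, `MAX_THREADS`, and the chunking scheme are hypothetical stand-ins chosen only to illustrate how a `For(0, n, …)` loop can be dispatched over disjoint sub-ranges, with a sequential fallback.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 4  # assumption: stands in for the "4-thread-capped" policy


def parallel_for(first, last, body, requested_threads=MAX_THREADS):
    """Call body(begin, end) over disjoint sub-ranges of [first, last)."""
    n_threads = max(1, min(requested_threads, MAX_THREADS))
    n = last - first
    if n <= 0:
        return
    if n_threads == 1:
        body(first, last)  # sequential fallback: one call over the whole range
        return
    chunk = -(-n // n_threads)  # ceiling division
    ranges = [(b, min(b + chunk, last)) for b in range(first, last, chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for all chunks and re-raises any worker exception
        list(pool.map(lambda r: body(*r), ranges))


# Each index writes only to its own slot of a pre-sized output.
out = [0] * 10


def fill(begin, end):
    for i in range(begin, end):
        out[i] = i * i


parallel_for(0, 10, fill, requested_threads=8)  # capped to 4 workers
```

Because every sub-range touches a disjoint set of output slots, the result does not depend on how the range is split or on which worker runs which chunk.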

Why it's safe (bit-exactness)

Each output element is an independent average of read-only input: a point gathers from its incident cells (via the deterministically-built cell links), a cell gathers from its points — each summed in a fixed link/point order and written to its own pre-sized slot. No shared scatter, no cross-element mutable state, no insertion-order dependence ⇒ bit-identical under any thread count and byte-identical to serial stock VTK. Only the For compute loops are wrapped; the cell-links build stays outside the parallel scope.
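
The gather argument above can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the VTK implementation: the mesh, `links`, `average_range`, and `cell_to_point` are hypothetical names, and the links are built serially before any threaded work, mirroring the note that the cell-links build stays outside the parallel scope.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Floating-point addition is not associative, so summation order matters.
order_matters = (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)

# Toy mesh (hypothetical): 4 triangles over 5 points, one scalar per cell.
cells = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (0, 2, 4)]
cell_data = [0.1, 0.2, 0.3, 0.7]
n_points = 5

# Built serially: point -> incident cell ids, in ascending cell order.
links = [[] for _ in range(n_points)]
for cid, pts in enumerate(cells):
    for p in pts:
        links[p].append(cid)


def average_range(out, begin, end):
    """Gather: each point sums its incident cells in fixed link order."""
    for p in range(begin, end):
        total = 0.0
        for cid in links[p]:
            total += cell_data[cid]
        out[p] = total / len(links[p])  # write to this point's own slot


def cell_to_point(n_threads):
    """Return the raw bytes of the point-data result for a thread count."""
    out = [0.0] * n_points  # pre-sized output
    chunk = -(-n_points // n_threads)  # ceiling division
    ranges = [(b, min(b + chunk, n_points)) for b in range(0, n_points, chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(lambda r: average_range(out, *r), ranges))
    return struct.pack(f"{n_points}d", *out)


serial_bytes = cell_to_point(1)
threaded_bytes = cell_to_point(4)
```

Each point's sum is computed by a single worker in link order, so the bytes match across thread counts. A scatter formulation, where cells add into shared point slots, would make the summation order depend on scheduling and would not give this guarantee.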

Validation

Built the cp312-abi3 wheel on the manylinux_2_28 executor and ran the full bit-exact gate in-container against stock vtk==9.6.2: 266 passed.

Byte-exact-vs-stock: cell2point, point2cell, point2cell_ugrid cases green.
New test_large_interp_threaded_path_is_deterministic (sphere 700×700 ≈ 490k pts / ≈980k cells, > vtkSMPTools::THRESHOLD) hashes identically at 1 / 4 / 8 threads for both cd2pd and pd2cd.
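
As a rough illustration of what a hash-based determinism check of this kind looks like, here is a self-contained Python sketch. It does not reproduce the actual test: `make_grid`, `point_to_cell_hash`, the small quad grid, and the SHA-256 choice are all assumptions made for illustration, and the real test runs the fvtk filters on a much larger sphere.

```python
import hashlib
from array import array
from concurrent.futures import ThreadPoolExecutor


def make_grid(n):
    """Quad cells of an n x n point grid, plus deterministic point data."""
    cells = [
        (j * n + i, j * n + i + 1, (j + 1) * n + i + 1, (j + 1) * n + i)
        for j in range(n - 1)
        for i in range(n - 1)
    ]
    point_data = [((k * 2654435761) % 1000003) / 7.0 for k in range(n * n)]
    return cells, point_data


def point_to_cell_hash(cells, point_data, n_threads):
    """Average point data onto cells with n_threads; hash the raw bytes."""
    out = array("d", [0.0]) * len(cells)  # pre-sized output buffer

    def work(begin, end):
        for c in range(begin, end):
            total = 0.0
            for p in cells[c]:  # fixed point order within the cell
                total += point_data[p]
            out[c] = total / len(cells[c])

    chunk = -(-len(cells) // n_threads)  # ceiling division
    starts = range(0, len(cells), chunk)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(lambda b: work(b, min(b + chunk, len(cells))), starts))
    return hashlib.sha256(out.tobytes()).hexdigest()


cells, point_data = make_grid(40)
digests = {t: point_to_cell_hash(cells, point_data, t) for t in (1, 4, 8)}
```

Comparing digests of the raw output bytes, rather than using a floating-point tolerance, is what makes the check bit-exact: any reordering of a sum that changes even the last bit changes the hash.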

🤖 Generated with Claude Code

…lation

Wrap the per-output-element vtkSMPTools::For loops of vtkCellDataToPointData (5 dispatch sites over numPts) and vtkPointDataToCellData (2 sites over numCells) in fvtk::RunSafeFilterParallel so they thread under the audited 4-thread-capped default policy instead of the Sequential backend.

Each output element is an independent average of read-only input: a point gathers from its incident cells (via the deterministically-built cell links) and a cell gathers from its points, each summed in a fixed link/point order and written to its own pre-sized slot. No shared scatter, no cross-element state => bit-identical under any thread count and to serial stock VTK. Only the For compute loops are wrapped; the cell-links build is left outside the scope.

Adds a >THRESHOLD (sphere 700x700 ~490k pts / ~980k cells) determinism case parametrized over cd2pd/pd2cd. Byte-exactness vs stock is covered by the existing op_cell2point / op_point2cell / op_point2cell_ugrid bitexact cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

akaszynski force-pushed the feat/datainterp-threading branch from 54fb7d9 to 3fdf0fe on June 19, 2026 21:28

akaszynski merged commit 3fdf0fe into main Jun 19, 2026
9 checks passed

akaszynski deleted the feat/datainterp-threading branch June 19, 2026 22:19

akaszynski merged 1 commit into main from feat/datainterp-threading
