perf(data-interp): default-on multithreading for cell<->point data interpolation #78
Merged
…lation

Wrap the per-output-element vtkSMPTools::For loops of vtkCellDataToPointData (5 dispatch sites over numPts) and vtkPointDataToCellData (2 sites over numCells) in fvtk::RunSafeFilterParallel so they thread under the audited 4-thread-capped default policy instead of the Sequential backend.

Each output element is an independent average of read-only input: a point gathers from its incident cells (via the deterministically-built cell links) and a cell gathers from its points, each summed in a fixed link/point order and written to its own pre-sized slot. No shared scatter, no cross-element state => bit-identical under any thread count and to serial stock VTK. Only the For compute loops are wrapped; the cell-links build is left outside the scope.

Adds a >THRESHOLD (sphere 700x700 ~490k pts / ~980k cells) determinism case parametrized over cd2pd/pd2cd. Byte-exactness vs stock is covered by the existing op_cell2point / op_point2cell / op_point2cell_ugrid bitexact cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
54fb7d9 to 3fdf0fe
Summary
Threads vtkCellDataToPointData and vtkPointDataToCellData by default (audited, 4-thread-capped fvtk policy), wrapping their existing per-output-element vtkSMPTools::For loops in fvtk::RunSafeFilterParallel. fvtk's default vtkSMPTools backend is Sequential, so these ran serial; this is a pure threading opt-in, no algorithm change.
- vtkCellDataToPointData: 5 dispatch sites (For(0, numPts, …) across the abstract / static-cell-links fast paths).
- vtkPointDataToCellData: 2 sites (For(0, numCells, …), categorical + non-categorical).

Why it's safe (bit-exactness)
Each output element is an independent average of read-only input: a point gathers from its incident cells (via the deterministically-built cell links) and a cell gathers from its points, each summed in a fixed link/point order and written to its own pre-sized slot. No shared scatter, no cross-element mutable state, no insertion-order dependence ⇒ bit-identical under any thread count and byte-identical to serial stock VTK. Only the For compute loops are wrapped; the cell-links build stays outside the parallel scope.

Validation
Built the cp312-abi3 wheel on the manylinux_2_28 executor and ran the full bit-exact gate in-container against stock vtk==9.6.2: 266 passed.
- cell2point, point2cell, point2cell_ugrid cases green.
- test_large_interp_threaded_path_is_deterministic (sphere 700×700 ≈ 490k pts / ≈980k cells, > vtkSMPTools::THRESHOLD) hashes identically at 1 / 4 / 8 threads for both cd2pd and pd2cd.
🤖 Generated with Claude Code
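
For readers who want to see the bit-exactness argument in isolation, here is a minimal pure-Python sketch of the gather pattern and of a hash-across-thread-counts check. It is not the VTK or fvtk code and not the actual test: `make_quad_grid`, `build_cell_links`, `cell_to_point`, and `digest` are hypothetical names, the mesh is a small quad grid rather than the 700×700 sphere, and Python threads stand in only for the range partitioning that a parallel For would do.

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor


def make_quad_grid(n):
    """Hypothetical test mesh: n x n quads over an (n+1) x (n+1) point grid."""
    stride = n + 1
    cells = [
        (j * stride + i, j * stride + i + 1,
         (j + 1) * stride + i + 1, (j + 1) * stride + i)
        for j in range(n) for i in range(n)
    ]
    return cells, stride * stride


def build_cell_links(cells, num_points):
    """Serial build of point -> incident cell ids, in ascending cell order."""
    links = [[] for _ in range(num_points)]
    for cell_id, point_ids in enumerate(cells):
        for pid in point_ids:
            links[pid].append(cell_id)
    return links


def cell_to_point(cell_values, links, num_threads):
    """Gather: each point averages its incident cells and writes only out[pid]."""
    num_points = len(links)
    out = [0.0] * num_points  # pre-sized, one slot per output element

    def work(begin, end):
        for pid in range(begin, end):
            incident = links[pid]
            total = 0.0
            for cid in incident:  # fixed link order, independent of threading
                total += cell_values[cid]
            out[pid] = total / len(incident) if incident else 0.0

    chunk = max(1, -(-num_points // num_threads))  # ceiling division
    ranges = [(b, min(b + chunk, num_points)) for b in range(0, num_points, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(lambda r: work(*r), ranges))  # list() surfaces exceptions
    return out


def digest(values):
    """Hash the raw little-endian float64 bytes, so any bit difference shows."""
    return hashlib.sha256(struct.pack(f"<{len(values)}d", *values)).hexdigest()


cells, num_points = make_quad_grid(40)
links = build_cell_links(cells, num_points)  # built once, outside the threaded loop
cell_values = [0.1 * (c % 7) + 1e-3 * c for c in range(len(cells))]

hashes = {n: digest(cell_to_point(cell_values, links, n)) for n in (1, 4, 8)}
assert len(set(hashes.values())) == 1  # same bytes at 1, 4, and 8 threads
```

The result does not depend on how the point range is split because every output slot is computed from read-only inputs in an order fixed by the links. A scatter formulation, where each cell adds into its points' slots, would make the summation order depend on thread scheduling, and floating-point addition is not associative.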