perf(transforms): default-on multithreading for the whole vtkTransform hierarchy #77
Merged
…vector transforms

Wrap the dispatch + scalar-fallback regions of TransformPoints, TransformNormals, and TransformVectors in fvtk::RunSafeFilterParallel so the existing vtkSMPTools::For loops (grain = THRESHOLD) run under the audited, 4-thread-capped default-threading policy instead of the Sequential backend.

Each output tuple is an independent matrix*tuple computation written to a pre-sized array, so the result is bit-identical under any thread count and identical to serial stock VTK. TransformPointsNormalsVectors delegates to the three wrapped methods, so vtkTransformFilter / vtkTransformPolyDataFilter get threaded for free.

Adds a transform determinism case above THRESHOLD (sphere 700x700, ~490k pts) to test_smp_determinism.py that hashes transformed points + normals and asserts byte-identical output at 1/4/8 threads. Byte-exactness vs stock is already covered by the bitexact suite's op_transform* cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ct transforms

Complete the transform hierarchy started for vtkLinearTransform: wrap the per-point SMP loops of vtkHomogeneousTransform (TransformPoints + TransformPointsNormalsVectors) and vtkAbstractTransform (TransformPoints + TransformPointsNormalsVectors) in fvtk::RunSafeFilterParallel.

Same bit-exactness argument as the linear case: each iteration reads input tuple ptId, computes a pure function of it (homogeneous matrix*point, or the read-only virtual InternalTransformPoint/InternalTransformDerivative against post-Update transform state), and writes to its own pre-sized output slot m+ptId. No reduction, no shared mutable state -> identical under any thread count and to serial stock VTK.

This threads vtkPerspectiveTransform / vtkThinPlateSplineTransform (and every other nonlinear transform) under vtkTransformFilter / vtkTransformPolyDataFilter.

Generalizes the large-mesh determinism test to all three families (linear / perspective / tps); byte-exactness vs stock is covered by the existing op_transform_perspective and op_transform_tps bitexact cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
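The bit-exactness argument in these commit messages can be illustrated with a small stand-alone sketch. It is not VTK or fvtk code: it uses only the Python standard library, and `MATRIX`, `transform_point`, `transform_points`, `digest`, and the grain of 1000 are hypothetical names and values chosen for illustration. The assumption being demonstrated is that each iteration is a pure function of input tuple `pt_id` and writes only its own pre-sized slot `m + pt_id`, so the output bytes do not depend on how the range is split across threads.

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor

# Hypothetical row-major 4x4 homogeneous matrix; the last row makes w depend on z
# (w = 0.25 * z + 1, which stays positive for the test points below).
MATRIX = (
    (1.0, 0.0, 0.0, 0.5),
    (0.0, 2.0, 0.0, -1.0),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.25, 1.0),
)


def transform_point(p):
    """Pure function of one input tuple: matrix * point, then the w-divide."""
    x, y, z = p
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in MATRIX]
    return (h[0] / h[3], h[1] / h[3], h[2] / h[3])


def transform_points(points, n_threads, grain=1000, m=0):
    """Transform points in chunks of `grain`, writing slot m + pt_id of a pre-sized output."""
    out = [None] * (m + len(points))  # pre-sized: no appends or resizes inside the loop

    def run_range(bounds):
        begin, end = bounds
        for pt_id in range(begin, end):
            # Each slot has exactly one writer; nothing is reduced or shared.
            out[m + pt_id] = transform_point(points[pt_id])

    ranges = [(b, min(b + grain, len(points))) for b in range(0, len(points), grain)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(run_range, ranges))  # list() re-raises any worker exception
    return out[m:]


def digest(tuples):
    """Hash the raw little-endian float64 bytes of every output tuple."""
    h = hashlib.sha256()
    for t in tuples:
        h.update(struct.pack("<3d", *t))
    return h.hexdigest()


points = [(0.001 * i, 0.5 * (i % 7), float(i % 13)) for i in range(20_000)]
serial = digest([transform_point(p) for p in points])
hashes = {n: digest(transform_points(points, n)) for n in (1, 4, 8)}
```

Comparing `hashes[1]`, `hashes[4]`, and `hashes[8]` against `serial` mirrors, in miniature, the kind of check the determinism test is described as making. A loop that accumulated into a shared value would not have this property, because floating-point addition order would then vary with scheduling.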
2eeb5c8 to 2dbbe80
Summary
Threads the entire `vtkTransform` hierarchy by default (audited, 4-thread-capped fvtk policy), wrapping the existing `vtkSMPTools::For` loops in `fvtk::RunSafeFilterParallel`. Stock VTK leaves these on the Sequential backend; this is a pure threading opt-in, no algorithm change.

Loops threaded (all per-point/per-tuple independent, pre-sized outputs):

| Base class | Loops wrapped | Concrete transforms covered |
| --- | --- | --- |
| `vtkLinearTransform` | `TransformPoints`, `TransformNormals`, `TransformVectors` | `vtkTransform` |
| `vtkHomogeneousTransform` | `TransformPoints`, `TransformPointsNormalsVectors` | `vtkPerspectiveTransform` (perspective divide) |
| `vtkAbstractTransform` | `TransformPoints`, `TransformPointsNormalsVectors` | `vtkThinPlateSplineTransform` + every nonlinear transform |

`TransformPointsNormalsVectors` delegates to or drives the per-point loop, so `vtkTransformFilter` and `vtkTransformPolyDataFilter` thread for free across every transform/render pipeline.

Why it's safe (bit-exactness)

- Each iteration reads input tuple `ptId`, computes a pure function of it (`matrix*tuple`, the homogeneous `w`-divide, or the read-only virtual `InternalTransformPoint`/`InternalTransformDerivative` against post-`Update` transform state), and writes its own pre-sized output slot `m+ptId`.
- `RunSafeFilterParallel` has a re-entrancy guard (runs inline if already in a parallel scope), so nested-pipeline use can't oversubscribe or change results.

Validation

Built the cp312-abi3 wheel on the manylinux_2_28 executor and ran the full bit-exact gate in-container against stock `vtk==9.6.2`:

- `transform*` byte-exact-vs-stock cases green (`transform`, `transform_nv`, `transform_pdf`, `transform_perspective`, `transform_tps`).
- `test_large_transform_threaded_path_is_deterministic`, parametrized over linear / perspective / tps (sphere 700×700 ≈ 490k pts, > `vtkSMPTools::THRESHOLD`, carries normals + vectors), hashes identically at 1 / 4 / 8 threads for all three families.

🤖 Generated with Claude Code
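
The re-entrancy guard mentioned under "Why it's safe" (run inline if already in a parallel scope) can be sketched as below. This is only an illustration of the general idea in standard-library Python, not the `fvtk::RunSafeFilterParallel` implementation, which this PR does not show. `run_safe_parallel`, `MAX_THREADS`, and the thread-local flag are hypothetical; the cap of 4 simply mirrors the 4-thread cap named in the summary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 4  # mirrors the 4-thread cap named in the summary
_scope = threading.local()  # per-thread "already inside a parallel scope" flag


def in_parallel_scope():
    return getattr(_scope, "active", False)


def run_safe_parallel(body, n_items, grain):
    """Hypothetical guard: call body(begin, end) over [0, n_items), inline if nested."""
    if in_parallel_scope():
        body(0, n_items)  # nested call: run on the calling worker, spawn nothing
        return "inline"

    def guarded(bounds):
        _scope.active = True  # mark this worker so nested calls fall back to inline
        try:
            body(*bounds)
        finally:
            _scope.active = False

    ranges = [(b, min(b + grain, n_items)) for b in range(0, n_items, grain)]
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
        list(pool.map(guarded, ranges))  # list() re-raises any worker exception
    return "parallel"


# Usage: an outer parallel loop whose body itself calls the guarded runner.
out = [0] * 8000
modes = []
modes_lock = threading.Lock()


def inner(begin, end):
    for i in range(begin, end):
        out[i] = i * i  # one writer per slot


def outer(begin, end):
    mode = run_safe_parallel(lambda b, e: inner(begin + b, begin + e), end - begin, grain=500)
    with modes_lock:
        modes.append(mode)


top_mode = run_safe_parallel(outer, len(out), grain=2000)
```

In this sketch the top-level call fans out over at most four workers, while every nested call made from inside a worker runs inline, so the total thread count stays bounded and each output slot is still written exactly once.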