perf(transforms): default-on multithreading for the whole vtkTransform hierarchy#77

Merged
akaszynski merged 2 commits into main from feat/linear-transform-threading on Jun 19, 2026

@akaszynski akaszynski commented Jun 19, 2026

Summary

Threads the entire vtkTransform hierarchy by default under the audited, 4-thread-capped fvtk policy by wrapping the existing vtkSMPTools::For loops in fvtk::RunSafeFilterParallel. Stock VTK leaves these loops on the Sequential backend; this change only enables threading and does not alter any algorithm.

Loops threaded (all per-point/per-tuple independent, pre-sized outputs):

| Class | Methods | Drives |
| --- | --- | --- |
| vtkLinearTransform | TransformPoints, TransformNormals, TransformVectors | affine vtkTransform |
| vtkHomogeneousTransform | TransformPoints, TransformPointsNormalsVectors | vtkPerspectiveTransform (perspective divide) |
| vtkAbstractTransform | TransformPoints, TransformPointsNormalsVectors | vtkThinPlateSplineTransform + every nonlinear transform |

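TransformPointsNormalsVectors either delegates to the wrapped methods (vtkLinearTransform) or drives its own wrapped per-point loop (vtkHomogeneousTransform, vtkAbstractTransform), so vtkTransformFilter and vtkTransformPolyDataFilter thread for free across every transform/render pipeline.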

Why it's safe (bit-exactness)

  • Each iteration reads input tuple ptId, computes a pure function of it — matrix*tuple, the homogeneous w-divide, or the read-only virtual InternalTransformPoint/InternalTransformDerivative against post-Update transform state — and writes its own pre-sized output slot m+ptId.
  • No reduction, no shared mutable state, no insertion-order dependence ⇒ bit-identical under any thread count and byte-identical to serial stock VTK.
  • RunSafeFilterParallel has a re-entrancy guard (runs inline if already in a parallel scope), so nested-pipeline use can't oversubscribe or change results.
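
The re-entrancy guard in the last bullet can be pictured with a small Python sketch. It is not the fvtk implementation: `run_safe_parallel`, `_scope`, and `MAX_THREADS` are hypothetical names, and the chunking scheme is an assumption made for illustration. It only shows the idea of running inline when the caller is already inside a parallel scope.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_scope = threading.local()  # per-thread flag: is this thread inside a parallel region?
MAX_THREADS = 4             # assumption: mirrors the 4-thread cap described above


def run_safe_parallel(fn, n_items, n_threads=MAX_THREADS):
    """Hypothetical guarded runner: calls fn(begin, end) over [0, n_items)."""
    if getattr(_scope, "active", False) or n_threads <= 1 or n_items == 0:
        fn(0, n_items)  # nested or trivial call: run inline on the calling thread
        return "inline"

    n_threads = min(n_threads, MAX_THREADS, n_items)
    chunk = -(-n_items // n_threads)  # ceiling division

    def worker(begin):
        _scope.active = True  # mark this worker thread as inside a parallel scope
        try:
            fn(begin, min(begin + chunk, n_items))
        finally:
            _scope.active = False

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(worker, range(0, n_items, chunk)))  # list() surfaces worker exceptions
    return "parallel"
```

A filter that calls `run_safe_parallel` from inside another guarded region therefore never spawns a second pool. The inner call runs over its full range on the worker thread that issued it, which is why nesting cannot oversubscribe in this sketch.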

Validation

Built the cp312-abi3 wheel on the manylinux_2_28 executor and ran the full bit-exact gate in-container against stock vtk==9.6.2:

  • All transform* byte-exact-vs-stock cases green (transform, transform_nv, transform_pdf, transform_perspective, transform_tps).
  • test_large_transform_threaded_path_is_deterministic parametrized over linear / perspective / tps (sphere 700×700 ≈ 490k pts, > vtkSMPTools::THRESHOLD, carries normals + vectors) hashes identically at 1 / 4 / 8 threads for all three families.
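
The determinism check described in the last bullet follows a simple pattern: run the same transform at several thread counts, hash the raw output bytes, and require one unique digest. A minimal standard-library sketch, assuming a toy scaling operation in place of the real VTK transforms (`scale_points` and `digest` are hypothetical helpers, not part of the test suite):

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor


def scale_points(values, factor, n_threads):
    """Toy stand-in for a transform: each output slot depends only on its own input."""
    out = [0.0] * len(values)  # pre-sized output

    def work(begin):
        for i in range(begin, min(begin + chunk, len(values))):
            out[i] = values[i] * factor

    chunk = max(1, -(-len(values) // n_threads))  # ceiling division, at least 1
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(work, range(0, len(values), chunk)))
    return out


def digest(values):
    """Hash the exact IEEE-754 bytes, so any bit-level difference changes the digest."""
    return hashlib.sha256(struct.pack(f"<{len(values)}d", *values)).hexdigest()


values = [i * 0.1 for i in range(10_000)]
hashes = {n: digest(scale_points(values, 1.7, n)) for n in (1, 4, 8)}
assert len(set(hashes.values())) == 1  # identical bytes at every thread count
```

Hashing the packed doubles rather than comparing with a tolerance is what makes this a bit-exactness check and not an approximate-equality check.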

🤖 Generated with Claude Code

@akaszynski akaszynski changed the title from "perf(vtkLinearTransform): default-on multithreading for point/normal/vector transforms" to "perf(transforms): default-on multithreading for the whole vtkTransform hierarchy" Jun 19, 2026
akaszynski and others added 2 commits June 19, 2026 14:38
…vector transforms

Wrap the dispatch + scalar-fallback regions of TransformPoints,
TransformNormals, and TransformVectors in fvtk::RunSafeFilterParallel so the
existing vtkSMPTools::For loops (grain = THRESHOLD) run under the audited,
4-thread-capped default-threading policy instead of the Sequential backend.

Each output tuple is an independent matrix*tuple computation written to a
pre-sized array, so the result is bit-identical under any thread count and
identical to serial stock VTK. TransformPointsNormalsVectors delegates to the
three wrapped methods, so vtkTransformFilter / vtkTransformPolyDataFilter get
threaded for free.
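
As an illustration of why the linear case is order-independent, here is a minimal Python sketch of the pattern: one affine matrix*tuple computation per point, written to a pre-sized output. It is not the VTK code; `transform_points` and the 3x4 row layout `[a, b, c, t]` are assumptions chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_points(matrix, points, n_threads):
    """Apply a 3x4 affine matrix (rows [a, b, c, t]) to each point independently."""
    n = len(points)
    out = [None] * n  # pre-sized: every slot exists before any thread starts

    def work(begin, end):
        for i in range(begin, end):
            x, y, z = points[i]
            # Fixed evaluation order per point, so the rounding is the same on any thread.
            out[i] = tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)

    chunk = max(1, -(-n // n_threads))  # ceiling division, at least 1
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(work, b, min(b + chunk, n)) for b in range(0, n, chunk)]
        for f in futures:
            f.result()  # re-raise any worker exception
    return out
```

Because slot `i` is written exactly once, by whichever chunk owns it, how the range is split across threads cannot affect the result.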

Adds a >THRESHOLD (sphere 700x700, ~490k pts) transform determinism case to
test_smp_determinism.py that hashes transformed points + normals and asserts
byte-identical output at 1/4/8 threads. Byte-exactness vs stock is already
covered by the bitexact suite's op_transform* cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ct transforms

Complete the transform hierarchy started for vtkLinearTransform: wrap the
per-point SMP loops of vtkHomogeneousTransform (TransformPoints +
TransformPointsNormalsVectors) and vtkAbstractTransform (TransformPoints +
TransformPointsNormalsVectors) in fvtk::RunSafeFilterParallel.

Same bit-exactness argument as the linear case: each iteration reads input
tuple ptId, computes a pure function of it (homogeneous matrix*point, or the
read-only virtual InternalTransformPoint/InternalTransformDerivative against
post-Update transform state), and writes to its own pre-sized output slot
m+ptId. No reduction, no shared mutable state -> identical under any thread
count and to serial stock VTK. This threads vtkPerspectiveTransform /
vtkThinPlateSplineTransform (and every other nonlinear transform) under
vtkTransformFilter / vtkTransformPolyDataFilter.
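
For the homogeneous case, the per-point "pure function" is a 4x4 matrix*point product followed by the w-divide. A hedged Python sketch of that function is below; `transform_point_homogeneous` is a hypothetical name, and multiplying by the reciprocal of w is an assumption of this sketch, not a statement about VTK's exact arithmetic.

```python
def transform_point_homogeneous(m, p):
    """Apply a 4x4 homogeneous matrix to a 3D point, then divide by w.

    Reads only its arguments and returns a new tuple, so calls for different
    points share no mutable state. Raises ZeroDivisionError when w == 0.
    """
    x, y, z = p
    hx, hy, hz, w = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)
    f = 1.0 / w  # the perspective divide, applied as a reciprocal multiply
    return (hx * f, hy * f, hz * f)


# Example: a matrix whose last row copies z into w.
perspective = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
print(transform_point_homogeneous(perspective, (2.0, 4.0, 2.0)))  # (1.0, 2.0, 1.0)
```

A nonlinear transform fits the same shape as long as its per-point method only reads transform state that was finalized before the loop started, which is the condition the commit message states for InternalTransformPoint and InternalTransformDerivative.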

Generalizes the large-mesh determinism test to all three families
(linear / perspective / tps); byte-exactness vs stock is covered by the
existing op_transform_perspective and op_transform_tps bitexact cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@akaszynski akaszynski force-pushed the feat/linear-transform-threading branch from 2eeb5c8 to 2dbbe80 Compare June 19, 2026 20:39
@akaszynski akaszynski merged commit 2dbbe80 into main Jun 19, 2026
9 checks passed
@akaszynski akaszynski deleted the feat/linear-transform-threading branch June 19, 2026 21:25