perf: opt-in (EnableFast) order-relaxed threading — cutter + contour by akaszynski · Pull Request #81 · pyvista/fvtk

akaszynski · 2026-06-20T03:33:33Z

Opt-in non-exact fast mode — default stays byte-exact

By default fvtk remains a byte-exact drop-in: this cutter runs serial and matches stock VTK exactly. Users opt in to the threaded fast path with fvtk.EnableFast() (or FVTK_FAST=1). This addresses the contract concern on the earlier revision — non-exactness is now strictly opt-in, not default.

import fvtk
fvtk.EnableFast()      # opt in to non-exact threaded fast paths
# ... vtkCutter on a linear UG now threads (order-relaxed) ...
fvtk.DisableFast()     # back to byte-exact (default)

Mechanism

fvtk::FastModeEnabled() / fvtk::RunFastFilterParallel() (Common/Core): the latter threads via the re-entrancy-guarded RunSafeFilterParallel only when fast mode is on (env FVTK_FAST, read live → runtime-toggleable); otherwise runs serially (byte-exact).
vtk3DLinearGridPlaneCutter's EXECUTE_SMPFOR macro now uses RunFastFilterParallel.
fvtk.EnableFast() / DisableFast() / IsFastEnabled() Python API set/clear FVTK_FAST.
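
For readers who want to see the gating in one place, here is a minimal pure-Python sketch of the mechanism described above. It is not the fvtk implementation: the lowercase function names, the batch size, the thread pool, and the choice to treat only the value "1" as "on" are assumptions made for illustration. The one thing taken from the description is the shape: an environment variable read live on every call, with a serial fallback when it is unset.

```python
import os
from concurrent.futures import ThreadPoolExecutor

_ENV = "FVTK_FAST"  # variable name from the PR; the accepted values are an assumption


def enable_fast():
    """Opt in: set the environment variable for this process."""
    os.environ[_ENV] = "1"


def disable_fast():
    """Opt out: clear the variable so the default (serial) path is used."""
    os.environ.pop(_ENV, None)


def is_fast_enabled():
    """Read the variable live on every call, so toggling works at runtime."""
    return os.environ.get(_ENV) == "1"


def run_fast_filter_parallel(body, n_items, n_threads=4, batch=1000):
    """Call body(begin, end) over [0, n_items).

    Hypothetical stand-in for the C++ dispatcher: one serial call when
    fast mode is off, batched calls on a thread pool when it is on.
    """
    if not is_fast_enabled() or n_threads <= 1:
        body(0, n_items)
        return "serial"
    ranges = [(b, min(b + batch, n_items)) for b in range(0, n_items, batch)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() drains the iterator so worker exceptions are re-raised here
        list(pool.map(lambda r: body(*r), ranges))
    return "threaded"
```

Because CPython forwards `os.environ` assignments to the process environment, a native `getenv` call made later in the same process would see the change, which is what makes a live-read toggle of this kind workable from Python.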

Why opt-in (the non-exactness)

Profiling (py-spy --native) put 35% of vtkCutter self-time in this filter's ExtractEdges. Threading it reorders the per-thread triangle buffers, so output cell order differs from stock and varies run-to-run. Points, interpolated point scalars, and the constant plane normal are thread-invariant. So fast mode trades cell-order reproducibility for the speedup — your call, per call site.
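
The reordering can be illustrated with a small simulation. This is a sketch under stated assumptions, not the filter's code: triangles are plain tuples, each "thread" owns one contiguous batch, and `finish_order` is a hypothetical parameter standing in for the nondeterministic order in which per-thread buffers are composited.

```python
from collections import Counter


def cut_serial(cells):
    """Reference path: triangles are emitted in input order."""
    return [tuple(c) for c in cells]


def cut_threaded(cells, n_batches, finish_order):
    """Simulated threaded path.

    Each batch fills its own buffer; the buffers are then concatenated in
    finish_order, which models thread scheduling (an assumption made so the
    example is reproducible).
    """
    size = -(-len(cells) // n_batches)  # ceiling division
    buffers = [
        [tuple(c) for c in cells[i:i + size]]
        for i in range(0, len(cells), size)
    ]
    return [tri for idx in finish_order for tri in buffers[idx]]


cells = [(i, i + 1, i + 2) for i in range(8)]
serial = cut_serial(cells)
threaded = cut_threaded(cells, n_batches=4, finish_order=[2, 0, 3, 1])

same_order = serial == threaded                       # False: order differs
same_multiset = Counter(serial) == Counter(threaded)  # True: same triangles
```

The two outputs contain exactly the same triangles, only in a different sequence, which is the property that an order-relaxed comparison checks.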

Validation (in-container)

Default (no EnableFast): byte-exact — cutter runs serial, identical to stock.
EnableFast: order-relaxed gate passes — op_cutter_linear (f32/f64 × sizes 30/40) sets FVTK_FAST=1; fvtk threads, stock stays serial; compared order-relaxed (points/point-data strict; same triangle multiset carrying cell-data). Thread-count invariance (1/4/8) holds.
strict compare (seq vs threaded) fails — confirms threading actually engages.

Test infra

compare.py: order-relaxed mesh-equality mode (per-op order_relaxed flag via the manifest).
ops.py: op_cutter_linear, order_relaxed=True.
test_smp_determinism.py: cutter_linear in THREADED_OPS.
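
A minimal sketch of what an order-relaxed mesh comparison can look like, assuming a mesh is represented as a plain dict of lists. The dict layout and function names are hypothetical and are not taken from compare.py. Points and point data are compared strictly; cells are compared as a multiset keyed by cell type and connectivity tuple, with each cell carrying its cell data. The width relaxation for integer cell data mentioned in the commit message is not modelled here.

```python
from collections import Counter


def cell_multiset(mesh):
    """Multiset of (cell type, connectivity, cell data) triples."""
    return Counter(
        (ctype, tuple(conn), tuple(data))
        for ctype, conn, data in zip(
            mesh["cell_types"], mesh["cells"], mesh["cell_data"]
        )
    )


def meshes_equal(a, b, order_relaxed=False):
    """Compare two meshes; cell order is ignored only when order_relaxed."""
    # Points and point data must match exactly in both modes.
    if a["points"] != b["points"] or a["point_data"] != b["point_data"]:
        return False
    if not order_relaxed:
        keys = ("cell_types", "cells", "cell_data")
        return all(a[k] == b[k] for k in keys)
    return cell_multiset(a) == cell_multiset(b)


reference = {
    "points": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    "point_data": [0.0, 1.0, 2.0, 3.0],
    "cell_types": [5, 5],              # 5 is VTK_TRIANGLE
    "cells": [[0, 1, 2], [1, 3, 2]],
    "cell_data": [[10], [20]],
}
# Same mesh with the two triangles (and their cell data) swapped.
reordered = dict(
    reference,
    cells=[[1, 3, 2], [0, 1, 2]],
    cell_data=[[20], [10]],
)
```

With these inputs, `meshes_equal(reference, reordered)` is false while `meshes_equal(reference, reordered, order_relaxed=True)` is true. Keying on the cell data as well means a reordering that detaches a cell from its data still fails.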

Deferred

Contour (vtkContour3DLinearGrid): surface-normal averaging is reduction-order-dependent → threading perturbs normal values, not just order → not order-relaxable even with EnableFast. Left serial.
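
The reason reordering changes normal values rather than just their order is that floating-point addition is not associative. The sketch below accumulates the same three per-cell contributions at one shared point in two different orders. The contribution values and the `accumulate` helper are made up for illustration and are not taken from the filter.

```python
def accumulate(contributions):
    """Sum per-cell normal contributions at one shared point, in list order."""
    total = [0.0, 0.0, 0.0]
    for n in contributions:
        total = [t + c for t, c in zip(total, n)]
    return tuple(total)


# Three cells contribute to the same point; only the visit order differs.
cell_normals = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]

in_cell_order = accumulate(cell_normals)            # (0.1 + 0.2) + 0.3
in_reversed_order = accumulate(cell_normals[::-1])  # (0.3 + 0.2) + 0.1

values_match = in_cell_order == in_reversed_order   # False
```

The two sums differ in the last bit of the x component, so a byte-exact comparison fails even though the inputs are identical. Sorting or multiset comparison cannot recover from that, which is why this case is not order-relaxable.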

🤖 Generated with Claude Code

…Cutter

Adds an OPT-IN non-exact fast mode. By default fvtk stays a byte-exact drop-in: the linear-grid plane cutter runs serial and matches stock VTK exactly. Calling fvtk.EnableFast() (or setting FVTK_FAST=1) opts in to the multithreaded fast path, whose output is order-relaxed (same cells/points, cell ORDER depends on thread scheduling).

Mechanism:
- fvtk::FastModeEnabled() / fvtk::RunFastFilterParallel() (Common/Core): RunFastFilterParallel threads via the re-entrancy-guarded RunSafeFilterParallel ONLY when FastModeEnabled() (env FVTK_FAST, read live so it is runtime-toggleable); otherwise runs the body serially (byte-exact).
- vtk3DLinearGridPlaneCutter: its EXECUTE_SMPFOR macro now uses RunFastFilterParallel. Profiling (py-spy --native) put 35% of vtkCutter self-time in this filter's ExtractEdges; under fvtk's Sequential SMP backend it ran serial.
- fvtk.EnableFast()/DisableFast()/IsFastEnabled() Python API (package init) set/clear FVTK_FAST.

Why opt-in: the threaded triangle emission composites per-thread buffers in thread order, so output CELL ORDER differs from the sequential reference AND varies run-to-run. Points, interpolated point scalars, and the constant plane normal are thread-INVARIANT. Default-off keeps byte-exactness for users who depend on it; EnableFast() trades cell-order reproducibility for the threaded speedup.

Test infra (order-relaxed mesh-equality gate):
- compare.py: order-relaxed mode (points + point-data strict; cells compared as a multiset keyed by (group/celltype, connectivity-tuple) carrying their cell-data; width-relaxed for int cell-data).
- run_ops.py: propagate per-op order_relaxed flag into the manifest.
- ops.py: op_cutter_linear -- large hex-UG plane cut (triangles ON) that sets FVTK_FAST=1 (stock ignores it) and drives the threaded path at batch-splitting sizes; order_relaxed=True, f32/f64, sizes 30/40.
- test_smp_determinism.py: cutter_linear in THREADED_OPS (thread-count-invariance gate, order-relaxed).
- test_bitexact.py: defensive failure formatting for relaxed mode.

Contour deferred: vtkContour3DLinearGrid normal averaging is reduction-order-dependent, so threading perturbs normal VALUES (not just order) -- not order-relaxable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…rid (normals-off)

Extends the EnableFast() opt-in fast lane to the linear-grid isocontour. Threads its EXECUTE_SMPFOR sites via RunFastFilterParallel, but ONLY when ComputeNormals is OFF: the call-site gate ORs in GetComputeNormals(), so a contour that computes normals stays serial / byte-exact. This is required because the surface-normal averaging sums cell-normals at shared points in cell order, so threaded (reordered) cells would perturb normal VALUES, not just order -- not order-relaxable. With normals off the merge path produces thread-invariant points + point scalars; only triangle emission order varies (order-relaxed, like the cutter).

- op_contour_linear: large hex-UG isocontour, ComputeNormals OFF, sets FVTK_FAST=1; order_relaxed=True, f32/f64, sizes 30/40.
- contour_linear added to THREADED_OPS (thread-count-invariance gate).
- __all__ exports EnableFast/DisableFast/IsFastEnabled.

Profiling put 44% of vtkContourFilter self-time in this filter's ExtractEdges.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

akaszynski · 2026-06-20T05:13:09Z

Added contour (vtkContour3DLinearGrid, 44% of vtkContourFilter self-time) to the same EnableFast lane.

Gated on !ComputeNormals: a contour that computes normals stays serial / byte-exact (its normal averaging sums cell-normals at shared points in cell order, so reordered threaded cells would perturb normal values, not just order). Normals-off threads order-relaxed like the cutter.

Validated in-container:

bitexact: 284 passed (+6 contour_linear: 4 order-relaxed + 2 thread-count-invariance).
Gate proof: contour with normals + EnableFast → stays serial, so output is deterministic run-to-run and byte-exact; without normals → threads (cell connectivity order varies run-to-run).

PR now covers cutter + contour under one opt-in contract.

akaszynski force-pushed the feat/cutter-threading branch from f07f271 to e2551f3 Compare June 20, 2026 03:34

akaszynski changed the title ~~perf(cutter): order-relaxed default-on threading for vtk3DLinearGridPlaneCutter~~ perf(cutter): opt-in (EnableFast) order-relaxed threading for vtk3DLinearGridPlaneCutter Jun 20, 2026

akaszynski force-pushed the feat/cutter-threading branch from e2551f3 to 11111a1 Compare June 20, 2026 04:13

akaszynski changed the title ~~perf(cutter): opt-in (EnableFast) order-relaxed threading for vtk3DLinearGridPlaneCutter~~ perf: opt-in (EnableFast) order-relaxed threading — cutter + contour Jun 20, 2026

akaszynski mentioned this pull request Jun 20, 2026

perf(geometry): opt-in fast UnstructuredGrid surface (vendored OpenMP extract_surface) #82

Open

akaszynski wants to merge 2 commits into main from feat/cutter-threading
