perf(geometry): opt-in fast UnstructuredGrid surface (vendored OpenMP extract_surface) by akaszynski · Pull Request #82 · pyvista/fvtk

akaszynski · 2026-06-20T06:47:54Z

Summary

Adds an opt-in fast boundary-surface path to vtkDataSetSurfaceFilter for unstructured grids, by vendoring pyvista-algorithms' extract_surface OpenMP kernel (MIT) and wiring a thin VTK adapter into UnstructuredGridExecute.

Stacked on #81 (cutter+contour EnableFast). Base is feat/cutter-threading; review/merge #81 first. The net-new diff here is the FiltersGeometry surface path + the points-relaxed comparator.

What it does

Filters/Geometry/pvaExtractSurface.h — vendored MIT kernel (namespace fse), self-contained (<std> + <omp.h>), excluded from the unity build.
Filters/Geometry/fvtkFastSurface.{h,cxx} — VTK adapter: validates a concrete vtkUnstructuredGrid of supported linear 3D cells (tetra/hex/voxel/wedge/pyramid), float/double points, <2³¹ points; zero-copy int32 connectivity (fvtk's width-relaxed default); calls fse::extract_surface + compact_points; builds output points/polys, copies point & cell data, optional OriginalPointIds/CellIds.
Hook in vtkDataSetSurfaceFilter::UnstructuredGridExecute, gated on fvtk::FastModeEnabled(). Default off → byte-exact. When the grid isn't eligible, returns false and the standard path runs.
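The eligibility rules and the fall-through to the standard path can be sketched as follows. This is an illustrative Python model, not the adapter's C++ code: `is_eligible` and `extract_surface` are hypothetical names, and the dtype strings are assumptions. The cell type ids are the standard values from vtkCellType.h.

```python
# VTK cell type ids for the supported linear 3D cells (from vtkCellType.h).
VTK_TETRA = 10
VTK_VOXEL = 11
VTK_HEXAHEDRON = 12
VTK_WEDGE = 13
VTK_PYRAMID = 14

SUPPORTED_CELL_TYPES = frozenset(
    {VTK_TETRA, VTK_VOXEL, VTK_HEXAHEDRON, VTK_WEDGE, VTK_PYRAMID}
)
MAX_POINTS = 2**31  # exclusive bound: point ids must fit in int32


def is_eligible(cell_types, n_points, point_dtype):
    """Hypothetical predicate mirroring the eligibility rules listed above."""
    if point_dtype not in ("float32", "float64"):
        return False
    if not 0 <= n_points < MAX_POINTS:
        return False
    return all(t in SUPPORTED_CELL_TYPES for t in cell_types)


def extract_surface(cell_types, n_points, point_dtype, fast_enabled,
                    fast_path, standard_path):
    """Run the fast path only when opted in AND eligible; otherwise fall back."""
    if fast_enabled and is_eligible(cell_types, n_points, point_dtype):
        return fast_path()
    return standard_path()
```

With `fast_enabled=False` the standard path always runs, which is what keeps the default output byte-exact.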

Why OpenMP directly (not vtkSMPTools)

The kernel is called directly with n_threads=0 (→ OMP_NUM_THREADS). Routing it through vtkSMPTools would add dispatch overhead and mutate the process-global LocalScope singleton, oversubscribing against the kernel's own OpenMP region. The TU is gated on FVTK_HAVE_OPENMP; without OpenMP the adapter compiles to a stub.
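As a rough mental model of what n_threads=0 means, the sketch below resolves a thread count the way a caller might reason about it. It is an assumption-laden approximation: the real decision is made by the OpenMP runtime, `resolve_thread_count` is a hypothetical helper, and falling back to the CPU count when OMP_NUM_THREADS is unset is typical but implementation-defined behavior.

```python
import os


def resolve_thread_count(n_threads, environ=None):
    """Approximate the effective thread count for a given n_threads request."""
    env = os.environ if environ is None else environ
    if n_threads > 0:
        # An explicit positive request wins.
        return n_threads
    # OMP_NUM_THREADS may be a comma-separated list (one entry per nesting
    # level); only the first entry applies to the outermost parallel region.
    raw = env.get("OMP_NUM_THREADS", "").split(",")[0].strip()
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    # Unset or malformed: assume the runtime picks the available core count.
    return os.cpu_count() or 1
```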

Correctness / gating

Output is order-relaxed AND point-order-relaxed (thread/hash-dependent cell + surface-point emission order). The bit-exact comparator gains point-order canonicalization (compare.py: _point_canonicalization/_remap_conn, relax_points), threaded through run_ops.py's points_relaxed manifest flag. New op_datasetsurface_fast (order+points relaxed) covers the path; a fast_mode() context manager restores FVTK_FAST so opt-in can't leak into byte-exact ops sharing the run_ops process.
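A minimal sketch of what point-order canonicalization could look like, in pure Python. This is not the code in compare.py: `canonicalize` is a hypothetical function, and it assumes surface points are unique (no coincident duplicates) and that vertex order within each cell is already deterministic, so only point order and cell order need relaxing.

```python
def canonicalize(points, cells):
    """Return (points, cells) in an order independent of emission order.

    points: list of (x, y, z) tuples
    cells:  list of tuples of point indices
    """
    # Sort point indices lexicographically by coordinates.
    order = sorted(range(len(points)), key=lambda i: points[i])
    # Map each old point id to its position in the sorted order.
    new_index = {old: new for new, old in enumerate(order)}
    canon_points = [points[i] for i in order]
    # Rewrite connectivity against the new ids, keeping in-cell vertex order.
    remapped = [tuple(new_index[i] for i in cell) for cell in cells]
    # Relax cell order by sorting the remapped cells.
    return canon_points, sorted(remapped)


# Two descriptions of the same pair of triangles, with points and cells permuted.
points_a = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
cells_a = [(0, 1, 2), (0, 2, 3)]
points_b = [(1, 1, 0), (0, 1, 0), (0, 0, 0), (1, 0, 0)]
cells_b = [(2, 0, 1), (2, 3, 0)]

same = canonicalize(points_a, cells_a) == canonicalize(points_b, cells_b)
```

After canonicalization the two outputs can be compared exactly, so the gate stays bit-exact on values while ignoring emission order.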

Validation

Built cp312-abi3 manylinux_2_28 wheel on the executor; confirmed libgomp is NEEDED by vtkFiltersGeometry.abi3.so (OpenMP genuinely linked, not the stub).
Bit-exact gate: 288 passed — the 4 new datasetsurface_fast cases match stock under the points-relaxed gate; the 4 standard datasetsurface_ugrid cases remain byte-exact.
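The byte-exact cases stay byte-exact only if the opt-in flag is restored after each fast op. A minimal sketch of a restoring context manager in the spirit of the fast_mode() helper mentioned under Correctness / gating is shown below. It is not the repository's implementation: the value "1" for FVTK_FAST is an assumption, and it presumes the flag is re-read from the environment rather than cached.

```python
import os
from contextlib import contextmanager


@contextmanager
def fast_mode(enabled=True, var="FVTK_FAST"):
    """Temporarily set (or clear) the opt-in variable, then restore it."""
    previous = os.environ.get(var)
    try:
        if enabled:
            os.environ[var] = "1"  # assumed truthy value
        else:
            os.environ.pop(var, None)
        yield
    finally:
        # Restore the prior state even if the wrapped op raises.
        if previous is None:
            os.environ.pop(var, None)
        else:
            os.environ[var] = previous
```

Usage would look like `with fast_mode(): run_op(...)`, where `run_op` stands in for whatever executes the op.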

🤖 Generated with Claude Code

…nMP kernel

Vendor pyvista-algorithms' extract_surface kernel (MIT) as Filters/Geometry/pvaExtractSurface.h and add a thin VTK adapter (fvtkFastSurface.{h,cxx}) wired into vtkDataSetSurfaceFilter's UnstructuredGridExecute.

The fast path is OPT-IN: it activates only when fvtk::FastModeEnabled() (env FVTK_FAST / fvtk.EnableFast()) and the grid is a concrete UnstructuredGrid of supported linear 3D cells (tetra/hex/voxel/wedge/pyramid), float/double points, <2^31 points; otherwise FastUnstructuredSurface returns false and the standard path runs (default byte-exact).

The kernel uses OpenMP directly (n_threads=0 -> OMP_NUM_THREADS), avoiding vtkSMPTools dispatch + LocalScope oversubscription. The TU is excluded from the unity build and gated on FVTK_HAVE_OPENMP; without OpenMP the adapter is a stub.

Output is order-relaxed AND point-order-relaxed (thread/hash-dependent cell and surface-point emission order). Extend the bit-exact comparator with point-order canonicalization (compare.py: _point_canonicalization/_remap_conn, relax_points) threaded through run_ops.py's points_relaxed manifest flag, and add op_datasetsurface_fast (order_relaxed + points_relaxed) covering the new path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

akaszynski · 2026-06-20T08:42:37Z

Force-pushed a critical correctness fix (486cb74 → e66d79b).

While validating the follow-up clean port I discovered the fast path here was silently falling back to the standard path for the common homogeneous-grid case — so the points-relaxed tests were passing on the byte-exact fallback, not on the kernel.

Root cause: vtkUnstructuredGrid::GetCellTypesArray() returns vtkUnsignedCharArray::FastDownCast(this->Types), which is null for homogeneous grids whose types are stored as an implicit vtkConstantArray (e.g. anything built via SetCells(int type, cells) — all-tet/all-hex meshes). The adapter treated a null types array as "bail", so it never engaged.

Fix: acquire per-cell types robustly — zero-copy AOS pointer when available, else a per-cell GetCellType(i) copy. Verified the kernel now genuinely engages (output point order is reordered vs the standard path, same point set) and still matches stock under the points-relaxed gate (292/292 bit-exact).
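The fix and its verification can be modeled in a few lines of Python. This is a stand-in, not the adapter's C++: `HomogeneousGrid`, `MixedGrid`, `acquire_cell_types`, and `same_points_reordered` are hypothetical names, with `explicit_types = None` playing the role of the failed down-cast described above. The reorder check is only a heuristic, since a fast path could in principle emit points in the same order as the standard one.

```python
class HomogeneousGrid:
    """Grid whose cell types are implicit (one constant), with no explicit array."""

    def __init__(self, cell_type, n_cells):
        self._type = cell_type
        self._n = n_cells
        self.explicit_types = None  # mimics the null down-cast result

    def number_of_cells(self):
        return self._n

    def cell_type(self, i):
        return self._type


class MixedGrid:
    """Grid that stores one explicit type entry per cell."""

    def __init__(self, types):
        self.explicit_types = list(types)

    def number_of_cells(self):
        return len(self.explicit_types)

    def cell_type(self, i):
        return self.explicit_types[i]


def acquire_cell_types(grid):
    """Use the explicit array when present, else copy types cell by cell."""
    if grid.explicit_types is not None:
        return grid.explicit_types  # zero-copy: the same object is returned
    return [grid.cell_type(i) for i in range(grid.number_of_cells())]


def same_points_reordered(standard_points, fast_points):
    """Heuristic engagement check: identical point set, different order."""
    return (sorted(standard_points) == sorted(fast_points)
            and list(standard_points) != list(fast_points))
```

A missing explicit array is no longer a reason to bail: the homogeneous case simply pays for one per-cell copy.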

akaszynski force-pushed the feat/fast-surface-port branch from 486cb74 to e66d79b on June 20, 2026 08:42

akaszynski mentioned this pull request Jun 20, 2026

perf(core): opt-in fast coincident-point merge (vendored OpenMP clean kernel) #83

Open

akaszynski wants to merge 1 commit into feat/cutter-threading from feat/fast-surface-port
