perf(geometry): opt-in fast UnstructuredGrid surface (vendored OpenMP extract_surface) #82
akaszynski wants to merge 1 commit into
…nMP kernel
Vendor pyvista-algorithms' extract_surface kernel (MIT) as
Filters/Geometry/pvaExtractSurface.h and add a thin VTK adapter
(fvtkFastSurface.{h,cxx}) wired into vtkDataSetSurfaceFilter's
UnstructuredGridExecute. The fast path is OPT-IN: it activates only when
fvtk::FastModeEnabled() (env FVTK_FAST / fvtk.EnableFast()) and the grid is a
concrete UnstructuredGrid of supported linear 3D cells (tetra/hex/voxel/wedge/
pyramid), float/double points, <2^31 points; otherwise FastUnstructuredSurface
returns false and the standard path runs (default byte-exact).
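The eligibility rule above can be modelled in a few lines. This is only a hedged Python sketch of the decision as the commit message describes it, not the C++ adapter itself: `is_fast_eligible` and its parameters are hypothetical names, the "concrete UnstructuredGrid" check is not modelled, and the treatment of an empty grid is an assumption. The cell type ids are the standard values from `vtkCellType.h`.

```python
# VTK cell type ids for the supported linear 3D cells (values from vtkCellType.h).
VTK_TETRA, VTK_VOXEL, VTK_HEXAHEDRON, VTK_WEDGE, VTK_PYRAMID = 10, 11, 12, 13, 14
SUPPORTED_CELL_TYPES = frozenset(
    {VTK_TETRA, VTK_VOXEL, VTK_HEXAHEDRON, VTK_WEDGE, VTK_PYRAMID}
)
MAX_POINTS = 2**31  # exclusive upper bound: the grid must have fewer points


def is_fast_eligible(fast_enabled, cell_types, n_points, point_dtype):
    """Hypothetical model of the opt-in gate; False means 'use the standard path'."""
    if not fast_enabled:  # FVTK_FAST / fvtk.EnableFast() not set: default path
        return False
    if point_dtype not in ("float32", "float64"):
        return False
    if n_points >= MAX_POINTS:
        return False
    # Every cell must be one of the supported linear 3D types.
    # Assumption: an empty grid passes this check; the real adapter may differ.
    return all(t in SUPPORTED_CELL_TYPES for t in cell_types)
```

Returning a plain boolean mirrors the described contract: any unmet condition yields `False`, and the caller falls through to the unchanged standard path.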
The kernel uses OpenMP directly (n_threads=0 -> OMP_NUM_THREADS), avoiding
vtkSMPTools dispatch + LocalScope oversubscription. The TU is excluded from the
unity build and gated on FVTK_HAVE_OPENMP; without OpenMP the adapter is a stub.
Output is order-relaxed AND point-order-relaxed (thread/hash-dependent cell and
surface-point emission order). Extend the bit-exact comparator with point-order
canonicalization (compare.py: _point_canonicalization/_remap_conn, relax_points)
threaded through run_ops.py's points_relaxed manifest flag, and add
op_datasetsurface_fast (order_relaxed + points_relaxed) covering the new path.
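To illustrate what point-order canonicalization means here, the following is a minimal Python sketch, not the code in compare.py. `canonicalize_surface` is a hypothetical name, and the sketch assumes that surface points have unique coordinates and that the vertex order within each face is deterministic, so that only the emission order of points and cells varies between runs.

```python
def canonicalize_surface(points, cells):
    """Return (points, cells) in a canonical order.

    points: list of (x, y, z) tuples.
    cells:  list of tuples of indices into points.
    Assumes unique point coordinates (ties would fall back to input order).
    """
    # Canonical point order: lexicographic sort on the coordinates.
    order = sorted(range(len(points)), key=lambda i: points[i])
    new_index = {old: new for new, old in enumerate(order)}
    canon_points = [points[i] for i in order]
    # Remap connectivity to the new point ids, then sort the cells so that
    # cell emission order no longer matters either.
    remapped = [tuple(new_index[i] for i in cell) for cell in cells]
    return canon_points, sorted(remapped)


# Two emissions of the same surface with different point and cell order.
points_a = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
cells_a = [(0, 1, 2), (0, 1, 3)]
points_b = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]
cells_b = [(3, 2, 0), (3, 2, 1)]

same = canonicalize_surface(points_a, cells_a) == canonicalize_surface(points_b, cells_b)
```

After canonicalization the two meshes compare equal, so a bit-exact comparison can still be applied to the canonical forms.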
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Force-pushed from 486cb74 to e66d79b
Force-pushed a critical correctness fix (…). While validating the follow-up clean port, I discovered that the fast path here was silently falling back to the standard path for the common homogeneous-grid case, so the points-relaxed tests were passing on the byte-exact fallback, not on the kernel. Root cause: … Fix: acquire per-cell types robustly, with a zero-copy AOS pointer when available, else a per-cell …
Summary
Adds an opt-in fast boundary-surface path to `vtkDataSetSurfaceFilter` for unstructured grids by vendoring pyvista-algorithms' `extract_surface` OpenMP kernel (MIT) and wiring a thin VTK adapter into `UnstructuredGridExecute`.
What it does
- `Filters/Geometry/pvaExtractSurface.h`: vendored MIT kernel (namespace `fse`), self-contained (`<std>` + `<omp.h>`), excluded from the unity build.
- `Filters/Geometry/fvtkFastSurface.{h,cxx}`: VTK adapter. Validates a concrete `vtkUnstructuredGrid` of supported linear 3D cells (tetra/hex/voxel/wedge/pyramid), float/double points, <2³¹ points; zero-copy int32 connectivity (fvtk's width-relaxed default); calls `fse::extract_surface` + `compact_points`; builds output points/polys, copies point & cell data, optional `OriginalPointIds`/`CellIds`.
- `vtkDataSetSurfaceFilter::UnstructuredGridExecute`, gated on `fvtk::FastModeEnabled()`. Default off → byte-exact. When the grid isn't eligible, returns false and the standard path runs.
Why OpenMP directly (not vtkSMPTools)
The kernel is called directly with `n_threads=0` (→ `OMP_NUM_THREADS`). Routing it through `vtkSMPTools` would add dispatch overhead and mutate the process-global `LocalScope` singleton, oversubscribing against the kernel's own OpenMP region. The TU is gated on `FVTK_HAVE_OPENMP`; without OpenMP the adapter compiles to a stub.
Correctness / gating
Output is order-relaxed AND point-order-relaxed (thread/hash-dependent cell + surface-point emission order). The bit-exact comparator gains point-order canonicalization (`compare.py`: `_point_canonicalization`/`_remap_conn`, `relax_points`), threaded through `run_ops.py`'s `points_relaxed` manifest flag. New `op_datasetsurface_fast` (order+points relaxed) covers the path; a `fast_mode()` context manager restores `FVTK_FAST` so opt-in can't leak into byte-exact ops sharing the run_ops process.
Validation
- `libgomp` is `NEEDED` by `vtkFiltersGeometry.abi3.so` (OpenMP genuinely linked, not the stub).
- `datasetsurface_fast` cases match stock under the points-relaxed gate; the 4 standard `datasetsurface_ugrid` cases remain byte-exact.
🤖 Generated with Claude Code
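For reference, the `fast_mode()` context manager described under Correctness / gating could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's implementation: it assumes that setting `FVTK_FAST=1` is what enables the fast path and that the environment variable is consulted on each filter run; the real helper may call `fvtk.EnableFast()` instead.

```python
import os
from contextlib import contextmanager


@contextmanager
def fast_mode(enabled=True):
    """Temporarily set FVTK_FAST, restoring the previous state on exit.

    Assumption: the value "1" enables the fast path.
    """
    previous = os.environ.get("FVTK_FAST")  # None means "was unset"
    try:
        if enabled:
            os.environ["FVTK_FAST"] = "1"
        else:
            os.environ.pop("FVTK_FAST", None)
        yield
    finally:
        # Restore exactly what was there before, even if the body raised,
        # so the opt-in cannot leak into later byte-exact ops.
        if previous is None:
            os.environ.pop("FVTK_FAST", None)
        else:
            os.environ["FVTK_FAST"] = previous
```

Restoring in a `finally` block is the point of the design: a failing fast-path op still leaves the shared process in its original, byte-exact configuration.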