perf(core): opt-in fast coincident-point merge (vendored OpenMP clean kernel)#83

Closed
akaszynski wants to merge 0 commits into main from feat/fast-clean-port


@akaszynski

Member

Summary

Adds an opt-in fast path to vtkStaticCleanUnstructuredGrid for the exact-merge default regime, by vendoring pyvista-algorithms' clean kernel (MIT) and wiring a thin adapter into RequestData.

Stacked on #82 (fast surface). Base is feat/fast-surface-port; review/merge #81 then #82 first. Net-new here is the FiltersCore clean path + staticclean_fast op.

What it does

  • Filters/Core/pvaClean.h (+ pvaHash96.h / pvaLinearProbeTable.h / pvaParallelRadixSort.h) — vendored MIT kernel (namespace pvu::clean), self-contained, excluded from the unity build.
  • Filters/Core/fvtkFastClean.{h,cxx} — adapter: validates the exact-merge default regime (effective tolerance 0, no point-data averaging, no merge-by-data-array, concrete float/double grid, <2³¹ points, no polyhedra) and calls run_clean(remove_degenerate_cells=false) so cells stay 1:1 in input order.
  • Hook in vtkStaticCleanUnstructuredGrid::RequestData, gated on fvtk::FastModeEnabled(). Default off → byte-exact.
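The regime check described above can be pictured as a single predicate that must hold before the fast kernel is called. The following is only a sketch: `CleanRequest`, its field names, and `eligible_for_fast_clean` are hypothetical and do not come from the PR or from VTK; the real adapter inspects the filter and grid directly.

```cpp
#include <cstdint>

// Hypothetical summary of the filter/input state the adapter would inspect.
// None of these names are taken from VTK or from the PR.
struct CleanRequest {
  double effective_tolerance;       // tolerance after any scaling is applied
  bool average_point_data;          // point-data averaging requested
  bool merge_by_data_array;         // merging keyed on a data array
  bool points_are_float_or_double;  // concrete float/double point storage
  std::int64_t num_points;
  bool has_polyhedra;
};

// True only when every condition of the exact-merge default regime holds.
// When this returns false the caller would fall through to the stock path.
inline bool eligible_for_fast_clean(const CleanRequest& r, bool fast_mode_enabled) {
  constexpr std::int64_t kMaxPoints = std::int64_t{1} << 31;  // < 2^31 points
  return fast_mode_enabled
      && r.effective_tolerance == 0.0
      && !r.average_point_data
      && !r.merge_by_data_array
      && r.points_are_float_or_double
      && r.num_points < kMaxPoints
      && !r.has_polyhedra;
}
```

Because the fast-mode flag is part of the predicate, the default (flag off) never reaches the kernel, which is what keeps the default path byte-exact.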

Faithful-to-stock details

  • Cells are kept 1:1 in input order → cell data passes through unchanged (matches stock PassData).
  • Merged-point data is copied from the kernel's canonical source (source_map); tolerance-0 merge means coincident points share exact coordinates, so output coordinates are bit-identical to stock.
  • Cell-type representation is preserved: a homogeneous grid keeps its implicit vtkConstantArray types (so the output's GetCellTypesArray() stays null, exactly like stock), a heterogeneous grid keeps its explicit per-cell array.
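To illustrate the first two points, here is a minimal serial sketch of a tolerance-0 merge that produces a `source_map` and rewrites connectivity without dropping or reordering cells. It is not the vendored kernel (which, judging by its headers, uses hashing and a parallel radix sort); `merge_exact`, `gather`, and `MergeResult` are hypothetical names, and an ordered map stands in for the real data structures. It assumes C++17 and no NaN coordinates.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

using Point = std::array<double, 3>;

struct MergeResult {
  std::vector<Point> points;      // unique output points
  std::vector<int> source_map;    // output point id -> input point id it was copied from
  std::vector<int> connectivity;  // input connectivity rewritten to output point ids
};

// Exact merge: the first occurrence of each coordinate becomes the canonical point.
// Comparison is by value, so -0.0 and 0.0 are treated as equal here.
inline MergeResult merge_exact(const std::vector<Point>& in_points,
                               const std::vector<int>& in_connectivity) {
  MergeResult out;
  std::map<Point, int> first_seen;
  std::vector<int> old_to_new(in_points.size());
  for (std::size_t i = 0; i < in_points.size(); ++i) {
    const int next_id = static_cast<int>(out.points.size());
    auto [it, inserted] = first_seen.try_emplace(in_points[i], next_id);
    if (inserted) {
      out.points.push_back(in_points[i]);  // coordinates copied verbatim
      out.source_map.push_back(static_cast<int>(i));
    }
    old_to_new[i] = it->second;
  }
  // Cells stay 1:1 in input order: only the point ids change.
  out.connectivity.reserve(in_connectivity.size());
  for (int id : in_connectivity) out.connectivity.push_back(old_to_new[id]);
  return out;
}

// Copy per-point data from each output point's canonical source.
template <class T>
std::vector<T> gather(const std::vector<T>& in, const std::vector<int>& source_map) {
  std::vector<T> out;
  out.reserve(source_map.size());
  for (int src : source_map) out.push_back(in[src]);
  return out;
}
```

Since no coordinate is averaged or snapped, every output coordinate is a verbatim copy of an input coordinate. This sketch emits points in first-occurrence order for simplicity; the PR notes that the actual kernel yields a different point order.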

Robust cell-type acquisition

GetCellTypesArray() returns null for homogeneous grids (implicit constant types). The adapter handles this: zero-copy AOS pointer when present, else a per-cell GetCellType(i) copy. (Same fix applied to the surface adapter in #82.)
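The borrow-or-copy pattern described here might look roughly like the sketch below. `GridView`, `CellTypes`, and `acquire_cell_types` are hypothetical stand-ins, not VTK types; in the real adapter the pointer would come from the grid's explicit cell-types array and the fallback from per-cell `GetCellType(i)` calls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the parts of a grid the adapter needs.
struct GridView {
  const unsigned char* explicit_types;  // null when cell types are implicit (homogeneous grid)
  std::size_t num_cells;
  unsigned char constant_type;          // the single type of a homogeneous grid
  unsigned char cell_type(std::size_t i) const {
    return explicit_types ? explicit_types[i] : constant_type;
  }
};

// Either borrows the grid's contiguous array or owns a materialized copy.
struct CellTypes {
  const unsigned char* borrowed = nullptr;
  std::vector<unsigned char> owned;
  const unsigned char* data() const { return borrowed ? borrowed : owned.data(); }
};

inline CellTypes acquire_cell_types(const GridView& g) {
  CellTypes t;
  if (g.explicit_types != nullptr) {
    t.borrowed = g.explicit_types;  // zero-copy path
    return t;
  }
  // Fallback: no explicit array, so build one by querying each cell.
  t.owned.resize(g.num_cells);
  for (std::size_t i = 0; i < g.num_cells; ++i) t.owned[i] = g.cell_type(i);
  return t;
}
```

Exposing the pointer through `data()` rather than storing it alongside the vector keeps the struct safe to copy, since the pointer is recomputed from whichever storage is active.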

Validation

  • Built cp312-abi3 manylinux_2_28 wheel on the executor; vtkFiltersCore.abi3.so links libgomp (OpenMP real, not stub).
  • Confirmed the kernel genuinely engages: under EnableFast() the output contains the same point set as the standard path but in a different order, so this is not a silent fallback.
  • Bit-exact gate: 292 passed — the 4 staticclean_fast cases match stock under the points-relaxed gate; standard cases stay byte-exact.
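The PR does not spell out what the points-relaxed gate checks. One plausible reading, and it is only an assumption, is that two outputs are accepted when they hold the same points up to reordering and every cell references the same coordinates in the same order. A sketch of such a comparison, with hypothetical `Mesh` and `equal_up_to_point_order` names:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Point = std::array<double, 3>;

struct Mesh {
  std::vector<Point> points;
  std::vector<int> connectivity;  // flat list of point ids, cells in input order
};

// Assumed meaning of "points-relaxed": identical point sets (exact coordinate
// equality, any order) and identical per-cell coordinates after resolving ids.
inline bool equal_up_to_point_order(const Mesh& a, const Mesh& b) {
  if (a.points.size() != b.points.size()) return false;
  if (a.connectivity.size() != b.connectivity.size()) return false;

  // Same multiset of coordinates.
  std::vector<Point> pa = a.points;
  std::vector<Point> pb = b.points;
  std::sort(pa.begin(), pa.end());
  std::sort(pb.begin(), pb.end());
  if (pa != pb) return false;

  // Each connectivity entry must resolve to the same coordinate in both meshes.
  for (std::size_t i = 0; i < a.connectivity.size(); ++i) {
    if (a.points[a.connectivity[i]] != b.points[b.connectivity[i]]) return false;
  }
  return true;
}
```

A check of this shape stays strict about coordinates and cell order while tolerating the point permutation that the fast kernel introduces.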

🤖 Generated with Claude Code

@akaszynski akaszynski changed the base branch from feat/fast-surface-port to main June 20, 2026 10:05
@akaszynski akaszynski closed this Jun 20, 2026
@akaszynski akaszynski reopened this Jun 20, 2026
@akaszynski akaszynski closed this Jun 20, 2026
@akaszynski akaszynski reopened this Jun 20, 2026
@akaszynski akaszynski closed this Jun 20, 2026
@akaszynski akaszynski force-pushed the feat/fast-clean-port branch from a266856 to 64fbfc1 Compare June 20, 2026 11:26
@akaszynski

Member Author

Merged to main via fast-forward (41fcfe9 clean kernel + 64fbfc1 MSVC gate fix) after rebasing onto the merged surface tip to drop duplicate lower-stack commits. The Windows build is green: on MSVC the vendored OpenMP clean kernel compiles to the bit-exact stub and falls back to stock vtkStaticClean. Live on main.
