Update: GPU knn BVH search for irregular grid by chaseshyu · Pull Request #56 · GeoFLAC/DynEarthSol

chaseshyu · 2026-04-15T13:29:59Z

This pull request introduces improvements to the KNN (k-nearest neighbors) search infrastructure, especially for GPU/OpenACC builds, and enhances the efficiency and flexibility of marker and node interpolation routines. The changes include adding the knn-bvh submodule for GPU-accelerated KNN, refactoring the KNN class API, updating the build system to support the new backend, and improving logging and batching for large-scale searches. These updates aim to improve performance, support larger datasets, and provide clearer diagnostics during execution.

KNN backend integration and build system updates:

The previous hash-grid search used for GPU KNN is no longer valid when elements differ in size, as they do in an irregular grid. The new knn-bvh search uses a bounding volume hierarchy (BVH), a tree-structured acceleration technique, to gather the k nearest neighbors efficiently. In the test below, this change cuts GPU remeshing time by about 61% (92.12 s to 35.67 s per remesh).
Tests: Regular grid | 200k elements | GPU
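To illustrate the idea, here is a minimal CPU-only sketch of a k-nearest-neighbor query over a bounding volume hierarchy. It is not the knn-bvh implementation, whose internals are not shown in this PR. The names `Pt`, `Node`, `Bvh`, and `knn` are hypothetical, and the median-split build is an assumption made for brevity (the commit message mentions LBVH, which is built differently).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; this is not the knn-bvh API.
using Pt = std::array<float, 3>;

struct Node {
    Pt lo, hi;                  // axis-aligned bounding box of the node's points
    int left = -1, right = -1;  // child node indices; -1 marks a leaf
    int begin = 0, end = 0;     // range of Bvh::order owned by this node
};

struct Bvh {
    std::vector<Pt> pts;
    std::vector<int> order;     // point indices, grouped so each node owns a contiguous range
    std::vector<Node> nodes;

    explicit Bvh(std::vector<Pt> p) : pts(std::move(p)), order(pts.size()) {
        assert(!pts.empty());
        std::iota(order.begin(), order.end(), 0);
        build(0, static_cast<int>(pts.size()));
    }

    // Recursively split the range [b, e) at the median of its widest axis.
    int build(int b, int e) {
        Node n;
        n.begin = b; n.end = e;
        n.lo = n.hi = pts[order[b]];
        for (int i = b; i < e; ++i)
            for (int d = 0; d < 3; ++d) {
                n.lo[d] = std::min(n.lo[d], pts[order[i]][d]);
                n.hi[d] = std::max(n.hi[d], pts[order[i]][d]);
            }
        const int id = static_cast<int>(nodes.size());
        nodes.push_back(n);
        if (e - b > 4) {  // leaves hold at most 4 points
            int ax = 0;
            for (int d = 1; d < 3; ++d)
                if (n.hi[d] - n.lo[d] > n.hi[ax] - n.lo[ax]) ax = d;
            const int m = (b + e) / 2;
            std::nth_element(order.begin() + b, order.begin() + m, order.begin() + e,
                [&](int i, int j) { return pts[i][ax] < pts[j][ax]; });
            const int l = build(b, m);  // build first, then index: nodes may reallocate
            const int r = build(m, e);
            nodes[id].left = l;
            nodes[id].right = r;
        }
        return id;
    }

    // Squared distance from q to the node's box (0 if q is inside).
    float box_dist2(const Node& n, const Pt& q) const {
        float s = 0.0f;
        for (int d = 0; d < 3; ++d) {
            const float c = std::max({n.lo[d] - q[d], 0.0f, q[d] - n.hi[d]});
            s += c * c;
        }
        return s;
    }

    // Return up to k (squared distance, point index) pairs, nearest first.
    std::vector<std::pair<float, int>> knn(const Pt& q, std::size_t k) const {
        if (k == 0) return {};
        std::priority_queue<std::pair<float, int>> best;  // max-heap: worst kept neighbor on top
        std::vector<int> stack{0};
        while (!stack.empty()) {
            const Node& n = nodes[stack.back()];
            stack.pop_back();
            // Prune boxes that cannot beat the current k-th neighbor.
            if (best.size() == k && box_dist2(n, q) >= best.top().first) continue;
            if (n.left < 0) {
                for (int i = n.begin; i < n.end; ++i) {
                    float d2 = 0.0f;
                    for (int d = 0; d < 3; ++d) {
                        const float t = pts[order[i]][d] - q[d];
                        d2 += t * t;
                    }
                    if (best.size() < k) best.emplace(d2, order[i]);
                    else if (d2 < best.top().first) { best.pop(); best.emplace(d2, order[i]); }
                }
            } else {
                // Push the farther child first so the nearer one is visited first.
                const bool left_near = box_dist2(nodes[n.left], q) <= box_dist2(nodes[n.right], q);
                stack.push_back(left_near ? n.right : n.left);
                stack.push_back(left_near ? n.left : n.right);
            }
        }
        std::vector<std::pair<float, int>> out;
        while (!best.empty()) { out.push_back(best.top()); best.pop(); }
        std::reverse(out.begin(), out.end());
        return out;
    }
};
```

Because each node's box adapts to the points it contains, the pruning test does not depend on a fixed cell size, which is the property a uniform hash grid lacks when element sizes vary.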
This change

  Execute : 000:01:18.300342
  Initiate: 000:00:33.291851 (42.52%)
  Compute : 000:00:01.207063 ( 1.54%)/ 40 = 0.030177 s/step
  Remesh  : 000:00:35.669887 (45.56%)/ 1 = 35.67 s/remesh
  Output  : 000:00:08.131541 (10.39%)/ 5 = 1.63 s/output

Commit 5b41399

  Execute : 000:02:28.117907
  Initiate: 000:00:44.912479 (30.32%)
  Compute : 000:00:01.300568 ( 0.88%)/ 40 = 0.032514 s/step
  Remesh  : 000:01:32.118626 (62.19%)/ 1 = 92.12 s/remesh
  Output  : 000:00:09.786234 ( 6.61%)/ 5 = 1.96 s/output
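
The quoted reduction can be checked directly from the two Remesh lines above. The helper below is a hypothetical one-liner written for this check, not part of the PR.

```cpp
#include <cassert>

// Fractional reduction of a timing relative to a baseline (both in seconds).
double reduction(double baseline_s, double new_s) {
    assert(baseline_s > 0.0);
    return (baseline_s - new_s) / baseline_s;
}
```

With the numbers reported above, `reduction(92.12, 35.67)` is about 0.613 for the remesh step, and `reduction(148.118, 78.300)` is about 0.471 for the total Execute time.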

Added the knn-bvh submodule and integrated its build process into the Makefile for OpenACC-enabled builds. The build now ensures the submodule is initialized and built as needed, and cleans up appropriately. (.gitmodules, Makefile, knn-bvh) [1] [2] [3] [4] [5]

KNN class and API refactoring:

Refactored the KNN class to support both CPU and GPU backends, added batching support, memory usage reporting, and a new search API that returns a pointer to the result buffer. The constructor now accepts a maximum capacity to minimize reallocations. (knn.hpp)
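
As a rough sketch of the two API ideas described here (a capacity hint at construction and a search call that returns a pointer into an internal result buffer), consider the following. `KnnSketch` and `Neighbor` are hypothetical stand-ins, the search is brute force over 1-D points for brevity, and none of this reproduces the actual `KNN` class in knn.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's neighbor type.
struct Neighbor { int idx; double dist2; };

class KnnSketch {
public:
    // `capacity` is the largest nquery * k expected; reserving it up front
    // avoids reallocating the result buffer on later searches.
    KnnSketch(std::vector<double> points_1d, std::size_t capacity)
        : points_(std::move(points_1d)) { results_.reserve(capacity); }

    // Brute-force 1-D search. Returns a pointer to nquery * k results laid out
    // query by query; the pointer is valid until the next call to search().
    const Neighbor* search(const std::vector<double>& queries, std::size_t k) {
        assert(k > 0 && k <= points_.size());
        results_.resize(queries.size() * k);
        std::vector<Neighbor> cand(points_.size());
        for (std::size_t q = 0; q < queries.size(); ++q) {
            for (std::size_t i = 0; i < points_.size(); ++i) {
                const double d = points_[i] - queries[q];
                cand[i] = {static_cast<int>(i), d * d};
            }
            std::partial_sort(cand.begin(), cand.begin() + k, cand.end(),
                [](const Neighbor& a, const Neighbor& b) { return a.dist2 < b.dist2; });
            std::copy(cand.begin(), cand.begin() + k, results_.begin() + q * k);
        }
        return results_.data();
    }

    std::size_t capacity() const { return results_.capacity(); }

private:
    std::vector<double> points_;
    std::vector<Neighbor> results_;
};
```

Returning a pointer rather than filling a caller-supplied vector lets the class own and reuse the buffer, at the cost that callers must not hold the pointer across searches.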

Performance and batching improvements for interpolation and marker routines:

Updated node and marker interpolation routines to use batched KNN queries, improving performance for large numbers of queries. The routines now print backend information, memory usage, and batch/block sizes for diagnostics. (brc-interpolation.cxx, markerset.cxx, nn-interpolation.cxx) [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]
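
Batching of this kind usually comes down to splitting the query index range into chunks that fit a memory budget. The sketch below shows only that splitting step; `make_batches` and `max_batch` are hypothetical names, and how the PR actually chooses its batch size is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split nquery queries into [begin, end) ranges of at most max_batch each.
std::vector<std::pair<std::size_t, std::size_t>>
make_batches(std::size_t nquery, std::size_t max_batch) {
    assert(max_batch > 0);
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    for (std::size_t begin = 0; begin < nquery; begin += max_batch)
        batches.emplace_back(begin, std::min(begin + max_batch, nquery));
    return batches;
}
```

A caller would then run one KNN search per range and consume that batch's results before starting the next, so the result buffer never has to hold more than `max_batch * k` entries.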

Auxiliary improvements and code cleanup:

Added a new method to Array2D for packing data into a GPU-friendly format (pack_to_xyz_float). (array2d.hpp)
Disabled the unused create_neighbor calls. [1] [2]

Documentation and CI:

Added a badge for the new nvc (NVIDIA HPC compiler) build workflow to the README.md. (README.md)
Added a Docker build-and-push workflow with a CUDA Dockerfile to stabilize the GitHub Actions CI build environment.

These changes collectively modernize and optimize the KNN infrastructure, especially for GPU-accelerated workflows, and improve the maintainability and scalability of the codebase.

Close #9

Copilot

Pull request overview

This PR upgrades the KNN infrastructure for OpenACC/GPU builds by switching the GPU backend to a BVH-based implementation (via the new knn-bvh submodule), and refactors interpolation/marker routines to use batched KNN queries with improved runtime diagnostics.

Changes:

Add and wire up the knn-bvh submodule + Makefile targets for OpenACC builds.
Refactor KNN API to support GPU BVH search, batching limits, and returning a results-buffer pointer.
Update interpolation and marker remapping to use batched GPU-direct KNN searches and improved logging/memory reporting.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `.gitmodules` | Adds `knn-bvh` submodule configuration. |
| `Makefile` | Integrates `knn-bvh` include/lib, adds `prepare` step, and adds clean/build rules for the submodule. |
| `README.md` | Adds CI badge for the new nvc build workflow. |
| `array2d.hpp` | Adds helper to pack coordinates into `float3` for GPU KNN. |
| `brc-interpolation.cxx` | Switches node interpolation to batched KNN queries and new KNN API. |
| `dynearthsol.cxx` | Disables `create_neighbor()` calls during init/restart. |
| `knn.hpp` | Refactors KNN interface to return a `neighbor*`, adds backend/memory reporting helpers. |
| `knn.cxx` | Replaces hash-grid GPU KNN with BVH-backed search; adds batch sizing and result-buffer handling. |
| `markerset.cxx` | Switches marker remap/replenish KNN calls to new API and batching with diagnostics. |
| `nn-interpolation.cxx` | Switches NN/ACM interpolation KNN calls to new API and batching with diagnostics. |
| `parameters.hpp` | Adjusts neighbor type definition for ACC builds and includes BVH header under ACC. |
| `remeshing.cxx` | Adds GPU memory reporting and additional `acc wait` synchronization points; disables neighbor rebuild. |
| `utils.hpp` | Adds number formatting helper for log output. |

Copilot · 2026-04-15T13:36:58Z

 class KNN
 {
 public:
-    KNN(const Param& param, const array_t& points_vec_, NANOKDTree& nano_kdtree_,
-            double resoTimes = 3);
+    KNN(const Param& param, const array_t& points_vec_, NANOKDTree& nano_kdtree_, bool is_msg_ = true,
+            int capacity = -1);
    ~KNN();

-    void search(const array_t& queries, neighbor_vec& neighbors, const int nquery,
-            int k, double resoTimes = 3);
+    // Search for k nearest neighbors.
+    neighbor* search(const array_t& queries, const int nquery, const int k_neig,
+            bool is_sync_to_host = true, const float* d_guess_radii_sq = nullptr);
+


knn.hpp relies on types like Param, array_t, NANOKDTree, and neighbor without including the headers (or forward declarations) that define them. This makes the header non-self-contained and brittle to include-order changes; add the necessary includes (or forward declare where possible) so #include "knn.hpp" compiles on its own.
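
A minimal sketch of the forward-declaration option follows. All names are hypothetical (`ParamSketch`, `NeighborSketch`, `KnnHeaderSketch`), and it assumes the types are plain classes or structs; if `array_t` or `NANOKDTree` are aliases or template instantiations in the project, they cannot be forward declared this simply, and including the defining header would be the safer choice.

```cpp
#include <cassert>

// --- what a self-contained header could look like (names hypothetical) ---
struct ParamSketch;      // forward declaration: only used by reference/pointer below
struct NeighborSketch;   // forward declaration: only used by pointer below

class KnnHeaderSketch {
public:
    explicit KnnHeaderSketch(const ParamSketch& param) : param_(&param) {}
    NeighborSketch* results() const { return results_; }
    const ParamSketch* param() const { return param_; }
private:
    const ParamSketch* param_;
    NeighborSketch* results_ = nullptr;
};

// --- full definitions, needed only where members are actually accessed ---
struct ParamSketch { int k; };
struct NeighborSketch { int idx; double dist2; };

// This function reads a member, so it must come after the full definition.
inline int requested_k(const KnnHeaderSketch& s) {
    assert(s.param() != nullptr);
    return s.param()->k;
}
```

The class declaration compiles with incomplete types because it stores and passes only references and pointers; anything that dereferences them belongs in the implementation file, after the real headers are included.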

Copilot · 2026-04-15T13:36:59Z

+#ifdef ACC
+    void pack_to_xyz_float(std::vector<float3>& buffer, std::size_t limit_size = 0) const {
+        std::size_t count = (limit_size > 0 && limit_size <= n_) ? limit_size : n_;
+
+        if (buffer.size() < count)
+            buffer.resize(count);
+
+        #pragma acc parallel loop gang vector
+        for (std::size_t i = 0; i < count; ++i) {
+            buffer[i].x = (float)(*this)[i][0];
+            buffer[i].y = (float)(*this)[i][1];
+#ifdef THREED
+            buffer[i].z = (float)(*this)[i][2];
+#else
+            buffer[i].z = 0.0;
+#endif
+        }
+    }


pack_to_xyz_float() uses the float3 type, but this header does not define or include a definition for float3. With current include order (array2d.hpp is included before knn_bvh.hpp), this will fail to compile in ACC builds. Consider defining a project-local POD type for 3 floats in a common header, or include the header that defines float3 before this method is compiled.
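
One way to read the first suggestion is sketched below: a project-local POD with three floats and a CPU-only packing routine that mirrors the logic of `pack_to_xyz_float`. `Float3Sketch` and `pack_to_xyz` are hypothetical names, the input is assumed to be a flat row-major coordinate array, and whether such a struct can be handed to the knn-bvh API in place of `float3` is an assumption that would need checking.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical project-local POD, so the header need not depend on CUDA's float3.
struct Float3Sketch { float x, y, z; };

// Pack row-major coordinates (ndims values per point, ndims = 2 or 3) into
// xyz triples, zero-filling z in 2-D. CPU-only: no OpenACC pragma here.
void pack_to_xyz(const std::vector<double>& coords, std::size_t ndims,
                 std::vector<Float3Sketch>& buffer) {
    assert(ndims == 2 || ndims == 3);
    assert(coords.size() % ndims == 0);
    const std::size_t count = coords.size() / ndims;
    if (buffer.size() < count) buffer.resize(count);  // grow only, as in the PR code
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i].x = static_cast<float>(coords[i * ndims]);
        buffer[i].y = static_cast<float>(coords[i * ndims + 1]);
        buffer[i].z = (ndims == 3) ? static_cast<float>(coords[i * ndims + 2]) : 0.0f;
    }
}
```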

Copilot · 2026-04-15T13:36:59Z


+$(KNN_BVH_LIB):
+	$(MAKE) -C $(KNN_BVH_DIR) NDIM=$(ndims)


$(KNN_BVH_LIB) is only set when openacc=1, but this unconditional rule still expands to a target even when the variable is empty (effectively creating a rule for :). Wrap this rule in the same ifeq ($(openacc), 1) guard (or give KNN_BVH_LIB a safe default) so non-OpenACC builds don't get an invalid/unintended make rule.

Suggested change

-$(KNN_BVH_LIB):
-	$(MAKE) -C $(KNN_BVH_DIR) NDIM=$(ndims)
+ifeq ($(openacc), 1)
+$(KNN_BVH_LIB):
+	$(MAKE) -C $(KNN_BVH_DIR) NDIM=$(ndims)
+endif

chaseshyu added 6 commits April 15, 2026 07:49

update: replace hashgrid search with LBVH for cuda knn to improve irregular spatial search

1fa1967

update: reformat remeshing message for interpolation

158094c

update: change knn-bvh submodule URL to GeoFLAC repository

3328ff1

update: comment out unused create_neighbor calls to optimize initialization and restart processes

3b793ca

update: add GPU memory usage logging during remeshing process

c6f5699

update: add nvc build badge to README

90204f5

chaseshyu requested review from Copilot, echoi and tan2 April 15, 2026 13:30

chaseshyu added the enhancement and bugfixes labels Apr 15, 2026

Copilot started reviewing on behalf of chaseshyu April 15, 2026 13:30

Copilot AI reviewed Apr 15, 2026

chaseshyu added 7 commits April 15, 2026 09:07

bugfix: fix git safe directory in nvc CI workflow

52aab29

bugfix: specify SM parameter for openacc builds in nvc CI workflow

e8ad185

bugfix: fix logging output in marker replenish to display correct new point count

6534658

bugfix: remove unused variable 'resoTimes' in nearest neighbor interpolation

bc55e6a

update: rename DEVELOPING to DEVELOPING.md

9796c6e

update: improve submodule handling and OpenACC build process in Makefile; add instructions for submodules in README and DEVELOPING

62a0e4f

bugfix: move constructor initialization to correct order to eliminate -fsanitize=address warnings

96fb15b

chaseshyu linked an issue Apr 16, 2026 that may be closed by this pull request

Attributes initialization order #59

Closed

bugfix: fix misleading indent of fflush(stdout)

090a848

chaseshyu force-pushed the update/GPU-knn-bvh-search branch 2 times, most recently from 1e7b1f9 to 8de695c April 16, 2026 18:00

feature: add Docker build and push workflow with CUDA Dockerfile

913708f

chaseshyu force-pushed the update/GPU-knn-bvh-search branch 3 times, most recently from d9d918c to 78280c7 April 16, 2026 19:28

update: upgrade NVHPC version in Dockerfile from multi to 12.9 and remove redundant environment variable in nvc-build workflow

70dd432

update: add concurrency nvc build workflow setting and caching for Docker build to improve build performance during PR

398e933

chaseshyu force-pushed the update/GPU-knn-bvh-search branch from 78280c7 to de19b62 April 16, 2026 19:44

chaseshyu added 2 commits April 16, 2026 15:01

feature: add matrix strategy for building multiple OpenACC targets in nvc-build workflow

bbb9f71

bugfix: add check for open PRs to skip duplicate CI runs on push events.

468c6d0

chaseshyu force-pushed the update/GPU-knn-bvh-search branch from a21485f to 468c6d0 April 16, 2026 20:03

update: refactor build steps in nvc-build workflow for clarity and efficiency

a6a8f78

chaseshyu force-pushed the update/GPU-knn-bvh-search branch from a08a12d to a6a8f78 April 16, 2026 20:28

tan2 approved these changes Apr 22, 2026

tan2 merged commit cd1dac4 into master Apr 22, 2026
15 checks passed

tan2 deleted the update/GPU-knn-bvh-search branch April 22, 2026 08:24

tan2 merged 20 commits into master from update/GPU-knn-bvh-search
