@lohedges
Contributor

This PR adds support for using OpenCL as the platform for performing GCMC sampling on the GPU, via pyopencl. The existing CUDA-specific kernels have been adapted to support both CUDA and OpenCL by using pre-processor macros to abstract platform-specific functionality. In the Python layer, the code has been refactored to use both pycuda and pyopencl for interfacing with each platform, i.e. memory setup/transfer and context handling are performed by the matching Python interface, with no change to the existing public API of the GCMCSampler.
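As a rough illustration of the macro approach (not the actual loch source), the sketch below prepends a platform-specific preamble to a shared kernel body. The macro names `KERNEL`, `GLOBAL` and `GLOBAL_ID`, the toy `scale` kernel, and the `build_kernel_source` helper are all hypothetical and chosen only for this example.

```python
# Hypothetical macro names; the macros used in loch may differ.
PREAMBLES = {
    "cuda": """
#define KERNEL extern "C" __global__
#define GLOBAL
#define GLOBAL_ID (blockIdx.x * blockDim.x + threadIdx.x)
""",
    "opencl": """
#define KERNEL __kernel
#define GLOBAL __global
#define GLOBAL_ID get_global_id(0)
""",
}

# A toy kernel written once against the macros above.
KERNEL_BODY = """
KERNEL void scale(GLOBAL float* x, const float a, const int n)
{
    const int i = GLOBAL_ID;
    if (i < n)
    {
        x[i] *= a;
    }
}
"""


def build_kernel_source(platform, body=KERNEL_BODY):
    """Return kernel source with the preamble for the given platform."""
    try:
        preamble = PREAMBLES[platform.lower()]
    except KeyError:
        raise ValueError(f"Unsupported platform: {platform!r}") from None
    return preamble + body
```

The resulting string would then be handed to the matching Python interface for compilation, for example `pycuda.compiler.SourceModule` for CUDA or `pyopencl.Program(...).build()` for OpenCL.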

Unlike CUDA, which provides cuRAND, OpenCL has no native random number support, so generation of random numbers has been moved to the host. These are now generated using NumPy and are passed to kernels at runtime as required. The number of random numbers needed per batch is small, so the overhead is low. To hide what overhead remains, batches of numbers are pre-computed in a background thread while the GPU kernels are working, such that numbers are immediately ready when needed. Using the same RNG for both platforms is also desirable from a testing perspective, since it allows us to directly compare CUDA and OpenCL results.
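The prefetching idea can be sketched as follows. This is a minimal illustration assuming a single producer thread and a bounded queue; the class name `RandomBatchPrefetcher` and its parameters are hypothetical and are not taken from loch.

```python
import queue
import threading

import numpy as np


class RandomBatchPrefetcher:
    """Pre-compute batches of uniform random numbers in a background thread."""

    def __init__(self, seed, batch_size, depth=2):
        # A single generator, used only by the worker thread, keeps the
        # sequence of batches deterministic for a given seed.
        self._rng = np.random.default_rng(seed)
        self._batch_size = batch_size
        self._queue = queue.Queue(maxsize=depth)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def _worker(self):
        while not self._stop.is_set():
            batch = self._rng.random(self._batch_size, dtype=np.float32)
            # Retry with a timeout so a stop request is noticed even
            # while the queue is full.
            while not self._stop.is_set():
                try:
                    self._queue.put(batch, timeout=0.1)
                    break
                except queue.Full:
                    continue

    def next_batch(self):
        """Return the next batch, blocking only if none is ready yet."""
        return self._queue.get()

    def close(self):
        """Stop the worker thread and wait for it to exit."""
        self._stop.set()
        self._thread.join()
```

Because the numbers are produced on the host from one seeded generator, the same batches can be fed to either platform, which is what makes a direct CUDA versus OpenCL comparison possible.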

The existing unit tests have now been parameterised over the two available platforms and an additional unit test has been added to confirm that, given the same RNG seed, single-point energies agree across both platforms. I have also tested all of the example scripts, which produce the same results.
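A cross-platform agreement check of this kind might look like the sketch below. The `compute_energy` callable is a hypothetical stand-in for whatever evaluates a single-point energy on a given platform; its signature, the default seed, and the tolerance are assumptions made for illustration only.

```python
import math


def energies_agree(compute_energy, platforms=("cuda", "opencl"), seed=42, rel_tol=1e-6):
    """Evaluate the energy on each platform with the same seed and compare.

    compute_energy is a hypothetical callable taking ``platform`` and
    ``seed`` keyword arguments and returning a float.
    """
    energies = {p: compute_energy(platform=p, seed=seed) for p in platforms}
    # Use the first platform as the reference value.
    reference = energies[platforms[0]]
    agree = all(
        math.isclose(e, reference, rel_tol=rel_tol) for e in energies.values()
    )
    return agree, energies
```

In a test suite, the same list of platforms could also drive a parameterised fixture so that every existing test runs once per platform.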

In the process of adding OpenCL support I also took the opportunity to profile and optimise the GPU kernels and GCMCSampler. Simple optimisations, none of which involve reduced-precision maths operations, have improved performance by roughly 30-40%. I have also exposed options to enable/disable compiler optimisations during kernel compilation, with the default optimisations matching those used by OpenMM for the respective platforms. Benchmarks show that the CUDA and OpenCL platforms are largely comparable in performance, with most of the discrepancy during a simulation coming from the performance differences between the corresponding OpenMM platforms.

Note that the addition of OpenCL support should enable the use of loch alongside other OpenMM platforms that we don't directly support. For example, it would be possible to run OpenMM dynamics using the Metal or HIP platforms, while using OpenCL for loch.

@lohedges lohedges added the enhancement New feature or request label Jan 21, 2026
@lohedges lohedges merged commit 0d82318 into devel Jan 21, 2026
3 checks passed
@lohedges lohedges deleted the feature_opencl branch January 21, 2026 13:53
2 participants