Merged
28 changes: 14 additions & 14 deletions README.md
@@ -10,18 +10,19 @@
[![Conda Version](https://anaconda.org/openbiosim/loch/badges/downloads.svg)](https://anaconda.org/openbiosim/loch)
[![License: GPL v3](https://img.shields.io/badge/License-GPL_v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)

CUDA accelerated Grand Canonical Monte Carlo (GCMC) water sampling code. Built
CUDA/OpenCL accelerated Grand Canonical Monte Carlo (GCMC) water sampling code. Built
on top of [Sire](https://github.com/OpenBioSim/sire),
[BioSimSpace](https://github.com/OpenBioSim/biosimspace),
[OpenMM](https://github.com/openmm/openmm), and
[PyCUDA](https://documen.tician.de/pycuda/index.html#).
[OpenMM](https://github.com/openmm/openmm),
[PyCUDA](https://documen.tician.de/pycuda/index.html#),
and [PyOpenCL](https://documen.tician.de/pyopencl/).

## Installation

First, create a conda environment with the required dependencies:

```
conda create -f environment.yaml
conda env create -f environment.yaml
conda activate loch
```

@@ -49,7 +50,7 @@ conda install -c conda-forge -c openbiosim/label/dev loch

Instead of computing the energy change for each trial insertion/deletion with
OpenMM, the calculation is performed at the reaction field (RF) level using
a custom CUDA kernel, allowing multiple candidates to be evaluated
a custom CUDA/OpenCL kernel, allowing multiple candidates to be evaluated
simultaneously. Particle mesh Ewald (PME) is handled via the method for
sampling from an approximate potential (in this case the RF potential)
introduced [here](https://doi.org/10.1063/1.1563597). Parallelisation of the
@@ -228,8 +229,9 @@ to enhance sampling.
Once finished, `mu_ex` will contain the computed excess chemical potential in units
of kcal/mol.

Note that the simulation requires a system with CUDA support. Please set the
`CUDA_VISIBLE_DEVICES` environment variable accordingly.
Note that the simulation requires a system with CUDA or OpenCL support. Please
set the `CUDA_VISIBLE_DEVICES` or `OPENCL_VISIBLE_DEVICES` environment variable
accordingly.
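
As a minimal sketch of setting these variables from Python rather than the shell, the helper below picks the variable name for a platform and writes the device index into the process environment. The `select_gpu` function is hypothetical and not part of `loch`; it assumes the variable is set before any GPU context is created in the process.

```python
import os


def select_gpu(device_index: int, platform: str = "cuda") -> str:
    """Expose a single GPU to this process (hypothetical helper, not a loch API).

    Call this before anything initialises a GPU context, otherwise the
    setting may have no effect.
    """
    # Variable names as given in the note above.
    variables = {
        "cuda": "CUDA_VISIBLE_DEVICES",
        "opencl": "OPENCL_VISIBLE_DEVICES",
    }
    name = variables[platform]
    os.environ[name] = str(device_index)
    return name
```

For example, `select_gpu(0, platform="opencl")` would set `OPENCL_VISIBLE_DEVICES=0` for the current process and any child processes it starts.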

The standard volume can be computed as follows:

@@ -263,13 +265,11 @@ Free Energy Perturbation (FEP) with GCMC using `loch` is supported via the

## Notes

* Make sure that `nvcc` is in your `PATH`. If you require a different `nvcc` to that
provided by conda, you can set the `PYCUDA_NVCC` environment variable to point
to the desired `nvcc` binary, or use the `nvcc` kwarg in the `GCMCSampler` constructor.
Depending on your setup, you may also need to install the `cuda-nvvm` package from
`conda-forge`.

* A future version supporting AMD GPUs via PyOpenCL is planned.
* When using the CUDA platform, make sure that `nvcc` is in your `PATH`. If you require
a different `nvcc` to that provided by conda, you can set the `PYCUDA_NVCC` environment
variable to point to the desired `nvcc` binary, or use the `nvcc` kwarg in the
`GCMCSampler` constructor. Depending on your setup, you may also need to install the
`cuda-nvvm` package from `conda-forge`.

* OpenMM-to-Sire roundtrip example:

39 changes: 24 additions & 15 deletions WHITEPAPER.md
@@ -1,22 +1,23 @@
# Loch: CUDA accelerated Grand Canonical Monte Carlo (GCMC) water sampling
# Loch: GPU accelerated Grand Canonical Monte Carlo (GCMC) water sampling

## Introduction

We present `loch`, a high-performance CUDA-accelerated Python package designed
We present `loch`, a high-performance GPU-accelerated Python package designed
for Grand Canonical Monte Carlo (GCMC) water sampling in molecular simulations
via [OpenMM](https://openmm.org/). To enable parallelisation of insertion and
deletion attempts, `loch` leverages GPU capabilities using a custom CUDA kernel
for nonbonded interactions. This allows thousands of GCMC trials to be attempted
in parallel, significantly enhancing sampling efficiency compared to traditional
CPU-based implementations that perform sequential attempts via the OpenMM Python
API. Additionally, electrostatics for GCMC attempts are computed using the
reaction field (RF) method, with accepted candidates being re-evaluated with a
correction step based on the difference between reaction field and Particle Mesh
Ewald (PME) potential energies. The use of an approximate potential for trial
moves leads to a substantial speed-up in GCMC move evaluation. `loch` has been
designed to be modular, allowing standalone GCMC sampling, or integration with
OpenMM-based molecular dynamics simulation code, e.g. as has been done in the
[SOMD2](https://github.com/openbiosim/somd2) free-energy perturbation engine.
deletion attempts, `loch` leverages GPU capabilities using a custom CUDA/OpenCL
kernel for nonbonded interactions. This allows thousands of GCMC trials to be
attempted in parallel, significantly enhancing sampling efficiency compared to
traditional CPU-based implementations that perform sequential attempts via the
OpenMM Python API. Additionally, electrostatics for GCMC attempts are computed
using the reaction field (RF) method, with accepted candidates being
re-evaluated with a correction step based on the difference between reaction
field and Particle Mesh Ewald (PME) potential energies. The use of an
approximate potential for trial moves leads to a substantial speed-up in GCMC
move evaluation. `loch` has been designed to be modular, allowing standalone
GCMC sampling, or integration with OpenMM-based molecular dynamics simulation
code, e.g. as has been done in the [SOMD2](https://github.com/openbiosim/somd2)
free-energy perturbation engine.
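
The correction step can be illustrated with a short sketch. It assumes the correction takes the usual approximate-potential form, in which a move already accepted at the RF level is kept with probability `min(1, exp(-beta * (dE_pme - dE_rf)))`. The function names, the energy units (kcal/mol), and this exact expression are assumptions made for illustration, not the `loch` implementation.

```python
import math
import random

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def correction_probability(delta_rf: float, delta_pme: float, temperature: float) -> float:
    """Probability of keeping a move that was already accepted at the RF level.

    delta_rf and delta_pme are the energy changes of the same move evaluated
    with the reaction field and PME potentials respectively.
    """
    beta = 1.0 / (K_B * temperature)
    exponent = -beta * (delta_pme - delta_rf)
    # Avoid evaluating exp() for non-negative exponents: the probability is capped at 1.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def accept_correction(delta_rf: float, delta_pme: float, temperature: float, rng=random) -> bool:
    """Draw a uniform number and apply the correction test."""
    return rng.random() < correction_probability(delta_rf, delta_pme, temperature)
```

When the two potentials agree on the energy change, the correction always succeeds, so the cost of the PME evaluation is only paid for moves that the cheaper RF test has already let through.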

## Parallelisation strategy

@@ -52,6 +53,14 @@ each iteration, as more trials need to be evaluated in parallel, and more data
needs to be transferred to and from the GPU, in which case it might be more
efficient to simply perform more iterations with a smaller batch size.

To enable reproducibility across GPU platforms, we generate random
numbers on the host using NumPy's random number generator, then transfer these
to the GPU kernels where required. This avoids differences in random number
generation across different GPU architectures and drivers, making testing
and validation of the implementation significantly easier. In benchmarks we
have found the NumPy approach to be as performant as using GPU-based random
numbers for the typical batch sizes employed in `loch`.
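
A minimal sketch of the host-side approach is shown below, assuming a seeded NumPy `Generator` and a hypothetical layout of three uniform numbers per trial for positioning plus one for the acceptance test. The function name and array layout are illustrative and are not taken from `loch`.

```python
import numpy as np


def draw_trial_randoms(seed: int, batch_size: int):
    """Generate the random numbers for one batch of trials on the host.

    The layout (three numbers per trial plus one acceptance draw) is a
    hypothetical choice for illustration.
    """
    rng = np.random.default_rng(seed)
    # Single precision keeps the host-to-device transfer small.
    positions = rng.random((batch_size, 3), dtype=np.float32)
    acceptance = rng.random(batch_size, dtype=np.float32)
    return positions, acceptance
```

Because the arrays depend only on the seed, the same values can be copied to a CUDA or an OpenCL device, so both backends see identical inputs for a given batch.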

## Sampling from an approximate potential

In order to further accelerate the evaluation of GCMC insertion and deletion
@@ -91,7 +100,7 @@ Other than the cost of evaluating GCMC trials using PME, performance is also
impacted by the cost of updating nonbonded parameters and atomic positions
in the OpenMM context after each accepted insertion or deletion. (No updates
are required for trial moves, since these are all evaluated via the custom
CUDA kernel.) [Recent updates](https://github.com/openmm/openmm/pull/4610)
CUDA/OpenCL kernel.) [Recent updates](https://github.com/openmm/openmm/pull/4610)
to OpenMM have helped mitigate the cost of modifying force field parameters,
allowing updates for only the subset of parameters that have changed within
a particular force. However, updating atomic positions still requires
1 change: 1 addition & 0 deletions environment.yaml
@@ -8,3 +8,4 @@ dependencies:
- biosimspace
- loguru
- pycuda
- pyopencl
10 changes: 10 additions & 0 deletions examples/bpti/bpti.py
@@ -53,6 +53,14 @@
    choices=["info", "debug", "error"],
    required=False,
)
parser.add_argument(
    "--platform",
    help="The GPU platform to use",
    type=str,
    default="auto",
    choices=["auto", "cuda", "opencl"],
    required=False,
)

args = parser.parse_args()

@@ -78,6 +86,7 @@
    num_ghost_waters=100,
    bulk_sampling_probability=0,
    log_level=args.log_level,
    platform=args.platform,
    overwrite=True,
)

@@ -92,6 +101,7 @@
    pressure=None,
    constraint="h_bonds",
    timestep="2 fs",
    platform=args.platform,
)
d.randomise_velocities()

10 changes: 10 additions & 0 deletions examples/scytalone/sd.py
@@ -64,6 +64,14 @@
    choices=["info", "debug", "error"],
    required=False,
)
parser.add_argument(
    "--platform",
    help="The GPU platform to use",
    type=str,
    default="auto",
    choices=["auto", "cuda", "opencl"],
    required=False,
)
args = parser.parse_args()

# Store the ligand index.
@@ -90,6 +98,7 @@
    ghost_file=f"ghosts_{lig}.txt",
    log_file=f"gcmc_{lig}.txt",
    log_level=args.log_level,
    platform=args.platform,
    overwrite=True,
)

@@ -104,6 +113,7 @@
    pressure=None,
    constraint="h_bonds",
    timestep="2 fs",
    platform=args.platform,
)
d.randomise_velocities()

11 changes: 11 additions & 0 deletions examples/water/water.py
@@ -66,6 +66,14 @@
    choices=["info", "debug", "error"],
    required=False,
)
parser.add_argument(
    "--platform",
    help="The GPU platform to use",
    type=str,
    default="auto",
    choices=["auto", "cuda", "opencl"],
    required=False,
)
args = parser.parse_args()

# Load the water box.
Expand All @@ -91,6 +99,8 @@
temperature=args.temperature,
num_ghost_waters=100,
log_level=args.log_level,
platform=args.platform,
overwrite=True,
)

# Create a dynamics object using the modified GCMC system.
@@ -104,6 +114,7 @@
    pressure=None,
    constraint="h_bonds",
    timestep="2 fs",
    platform=args.platform,
)
d.randomise_velocities()

1 change: 1 addition & 0 deletions recipes/loch/template.yaml
@@ -18,6 +18,7 @@ requirements:
- loguru
- pip
- pycuda # [not macos]
- pyopencl
- python
- setuptools
- sire
2 changes: 1 addition & 1 deletion src/loch/__init__.py
@@ -1,7 +1,7 @@
######################################################################
# Loch: GPU accelerated GCMC water sampling engine.
#
# Copyright: 2025
# Copyright: 2025-2026
#
# Authors: The OpenBioSim Team <team@openbiosim.org>
#