From c1ba704dfd3c1bee328839e96be0c6e1ad0eeedc Mon Sep 17 00:00:00 2001
From: Vimal Kumar
Date: Wed, 25 Feb 2026 19:25:13 +0530
Subject: [PATCH] docs: add build instructions for NVIDIA Jetson (aarch64 + CUDA)

The prebuilt aarch64 wheels on PyPI are CPU-only. Add instructions for
building from source with CUDA on Jetson devices (JetPack 6.x), including
the required cmake flags and a note in hardware_support.md.

Tested on Jetson Orin Nano with JetPack 6.x, CUDA 12.x, Python 3.10.

Relates to #1908.
---
 docs/hardware_support.md |  4 ++++
 docs/installation.md     | 47 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/docs/hardware_support.md b/docs/hardware_support.md
index 88506b547..60f76c4b2 100644
--- a/docs/hardware_support.md
+++ b/docs/hardware_support.md
@@ -20,3 +20,7 @@ See the [environment variables](environment_variables.md) `CT2_USE_MKL` and `CT2
 * NVIDIA GPUs with a Compute Capability greater or equal to 3.5
 
 The driver requirement depends on the CUDA version. See the [CUDA Compatibility guide](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) for more information.
+
+```{note}
+**NVIDIA Jetson (aarch64):** Jetson devices are supported when building from source with `-DWITH_CUDA=ON`. The prebuilt aarch64 wheels on PyPI are CPU-only. See {ref}`installation:nvidia jetson (aarch64 + cuda)` for build instructions.
+```
diff --git a/docs/installation.md b/docs/installation.md
index 0fc4263b7..9874adf23 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -91,6 +91,53 @@ If you installed the C++ library in a custom directory, you should configure add
 * When running your Python application, add the CTranslate2 library path to `LD_LIBRARY_PATH`.
 ```
 
+### NVIDIA Jetson (aarch64 + CUDA)
+
+The prebuilt aarch64 wheels on PyPI are CPU-only. To use the GPU on NVIDIA Jetson devices (Orin Nano, Orin NX, AGX Orin) running JetPack 6.x, build from source with CUDA enabled. First install the build dependencies:
+
+```bash
+sudo apt-get install -y build-essential cmake git python3-pip python3-venv
+```
+
+CUDA and cuDNN are provided by JetPack, so no separate installation is needed.
+
+Compile the C++ library with CUDA:
+
+```bash
+git clone --recursive https://github.com/OpenNMT/CTranslate2.git
+cd CTranslate2
+mkdir build && cd build
+cmake .. -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_MKL=OFF -DOPENMP_RUNTIME=COMP
+make -j$(nproc)
+sudo make install
+sudo ldconfig
+```
+
+```{note}
+`-DWITH_MKL=OFF` is required because Intel MKL is not available on ARM. `-DOPENMP_RUNTIME=COMP` uses the compiler's OpenMP runtime instead of Intel's.
+```
+
+Build and install the Python wheel:
+
+```bash
+cd ../python
+pip install -r install_requirements.txt
+python setup.py bdist_wheel
+pip install dist/ctranslate2*.whl
+```
+
+Verify CUDA support:
+
+```python
+import ctranslate2
+print(ctranslate2.get_supported_compute_types("cuda"))
+# {'float16', 'float32', 'int8', 'int8_float16', 'int8_float32', 'bfloat16', 'int8_bfloat16'}
+```
+
+```{tip}
+If you use a virtual environment, you may need to set `LD_LIBRARY_PATH=/usr/local/cuda/lib64` at runtime.
+```
+
 ### Build options
 
 The following options can be set with `-DOPTION=VALUE` during the CMake configuration: