From c1ba704dfd3c1bee328839e96be0c6e1ad0eeedc Mon Sep 17 00:00:00 2001
From: Vimal Kumar <vimal78@gmail.com>
Date: Wed, 25 Feb 2026 19:25:13 +0530
Subject: [PATCH] docs: add build instructions for NVIDIA Jetson (aarch64 +
 CUDA)

The prebuilt aarch64 wheels on PyPI are CPU-only. Add instructions for
building from source with CUDA on Jetson devices (JetPack 6.x), including
the required cmake flags and a note in hardware_support.md.

Tested on Jetson Orin Nano with JetPack 6.x, CUDA 12.x, Python 3.10.

Relates to #1908.
---
 docs/hardware_support.md |  4 ++++
 docs/installation.md     | 47 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/docs/hardware_support.md b/docs/hardware_support.md
index 88506b547..60f76c4b2 100644
--- a/docs/hardware_support.md
+++ b/docs/hardware_support.md
@@ -20,3 +20,7 @@ See the [environment variables](environment_variables.md) `CT2_USE_MKL` and `CT2
 * NVIDIA GPUs with a Compute Capability greater or equal to 3.5
 
 The driver requirement depends on the CUDA version. See the [CUDA Compatibility guide](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) for more information.
+
+```{note}
+**NVIDIA Jetson (aarch64):** Jetson devices are supported when building from source with `-DWITH_CUDA=ON`. The prebuilt aarch64 wheels on PyPI are CPU-only. See {ref}`installation:nvidia jetson (aarch64 + cuda)` for build instructions.
+```
diff --git a/docs/installation.md b/docs/installation.md
index 0fc4263b7..9874adf23 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -91,6 +91,53 @@ If you installed the C++ library in a custom directory, you should configure add
 * When running your Python application, add the CTranslate2 library path to `LD_LIBRARY_PATH`.
 ```
 
+### NVIDIA Jetson (aarch64 + CUDA)
+
+The prebuilt aarch64 wheels on PyPI are CPU-only. On NVIDIA Jetson devices (Orin Nano, Orin NX, AGX Orin) running JetPack 6.x, build from source with CUDA enabled. First, install the build dependencies:
+
+```bash
+sudo apt-get install -y build-essential cmake git python3-pip python3-venv
+```
+
+CUDA and cuDNN are provided by JetPack, so no separate installation is needed.
+
+Compile the C++ library with CUDA:
+
+```bash
+git clone --recursive https://github.com/OpenNMT/CTranslate2.git
+cd CTranslate2
+mkdir build && cd build
+cmake .. -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_MKL=OFF -DOPENMP_RUNTIME=COMP
+make -j$(nproc)
+sudo make install
+sudo ldconfig
+```
+
+```{note}
+`-DWITH_MKL=OFF` is required because Intel MKL is not available on ARM. `-DOPENMP_RUNTIME=COMP` selects the OpenMP runtime provided by the compiler instead of Intel's.
+```
+
+Build and install the Python wheel:
+
+```bash
+cd ../python
+pip install -r install_requirements.txt
+python setup.py bdist_wheel
+pip install dist/ctranslate2*.whl
+```
+
+Verify CUDA support:
+
+```python
+import ctranslate2
+print(ctranslate2.get_supported_compute_types("cuda"))
+# Example output (varies by device; set order is arbitrary): {'float16', 'float32', 'int8', 'int8_float16', 'int8_float32', 'bfloat16', 'int8_bfloat16'}
+```
+
+```{tip}
+If you use a virtual environment and the CUDA libraries are not found at runtime, you may need to add `/usr/local/cuda/lib64` to `LD_LIBRARY_PATH`.
+```
+
 ### Build options
 
 The following options can be set with `-DOPTION=VALUE` during the CMake configuration: