4 changes: 4 additions & 0 deletions docs/hardware_support.md
@@ -20,3 +20,7 @@ See the [environment variables](environment_variables.md) `CT2_USE_MKL` and `CT2
* NVIDIA GPUs with a Compute Capability greater than or equal to 3.5

The driver requirement depends on the CUDA version. See the [CUDA Compatibility guide](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) for more information.

```{note}
**NVIDIA Jetson (aarch64):** Jetson devices are supported when building from source with `-DWITH_CUDA=ON`. The prebuilt aarch64 wheels on PyPI are CPU-only. See {ref}`installation:nvidia jetson (aarch64 + cuda)` for build instructions.
```
47 changes: 47 additions & 0 deletions docs/installation.md
@@ -91,6 +91,53 @@ If you installed the C++ library in a custom directory, you should configure add
* When running your Python application, add the CTranslate2 library path to `LD_LIBRARY_PATH`.
```

### NVIDIA Jetson (aarch64 + CUDA)

The prebuilt aarch64 wheels on PyPI are CPU-only. On NVIDIA Jetson devices (Orin Nano, Orin NX, AGX Orin) running JetPack 6.x, build from source with CUDA enabled. First install the build tools:

```bash
sudo apt-get install -y build-essential cmake git python3-pip python3-venv
```

CUDA and cuDNN are provided by JetPack, so no separate installation is needed.
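A quick way to confirm the JetPack CUDA toolkit is visible is to look for `nvcc` (a sketch; `/usr/local/cuda` is the usual JetPack install location, but not guaranteed):

```python
import shutil

# nvcc is often not on PATH by default; JetPack installs it under /usr/local/cuda
nvcc = shutil.which("nvcc") or shutil.which("/usr/local/cuda/bin/nvcc")
print(nvcc or "nvcc not found; check the JetPack installation")
```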

Compile the C++ library with CUDA:

```bash
git clone --recursive https://github.com/OpenNMT/CTranslate2.git
cd CTranslate2
mkdir build && cd build
cmake .. -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_MKL=OFF -DOPENMP_RUNTIME=COMP
make -j$(nproc)
sudo make install
sudo ldconfig
```
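To confirm the library was installed and registered with the dynamic linker, a minimal check from Python (returns `None` if the `make install` or `ldconfig` step did not take effect):

```python
import ctypes.util

# On Linux, find_library consults the ldconfig cache refreshed above
lib = ctypes.util.find_library("ctranslate2")
print(lib)  # prints the library's soname on success, None otherwise
```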

```{note}
`-DWITH_MKL=OFF` is required because Intel MKL is not available on ARM. `-DOPENMP_RUNTIME=COMP` uses the compiler's OpenMP instead of Intel's.
```

Build and install the Python wheel:

```bash
cd ../python
pip install -r install_requirements.txt
python setup.py bdist_wheel
pip install dist/ctranslate2*.whl
```

Verify CUDA support:

```python
import ctranslate2
print(ctranslate2.get_supported_compute_types("cuda"))
# {'float16', 'float32', 'int8', 'int8_float16', 'int8_float32', 'bfloat16', 'int8_bfloat16'}
```
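With CUDA compiled in, a script can select the device at runtime and fall back to CPU on machines without a usable GPU. A minimal sketch (the `try`/`except` only keeps the snippet runnable where the wheel is absent; the model path is hypothetical):

```python
try:
    import ctranslate2
    # get_cuda_device_count() returns 0 when the build has no CUDA support
    device = "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
except ImportError:
    device = "cpu"  # wheel not installed in this environment

print(device)
# translator = ctranslate2.Translator("/path/to/ct2_model", device=device)
```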

```{tip}
If using a virtual environment, you may need to set `LD_LIBRARY_PATH=/usr/local/cuda/lib64` at runtime.
```
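Note that the variable must be exported before the Python process starts; setting it from inside a running interpreter is too late for the dynamic linker. A sketch that prints the value the environment should contain (the CUDA path is assumed from the tip above):

```python
import os

cuda_lib = "/usr/local/cuda/lib64"
current = os.environ.get("LD_LIBRARY_PATH", "")
if cuda_lib not in current.split(":"):
    # Export this in the shell (or the venv's activate script), not from Python
    print(f"export LD_LIBRARY_PATH={cuda_lib}:{current}".rstrip(":"))
```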

### Build options

The following options can be set with `-DOPTION=VALUE` during the CMake configuration: