Commit 59a5fa2

Deploy doc preview for PR 2186 (3694e06)

Author: cuda-python-bot
Parent: 7cacab5

267 files changed

Lines changed: 3211 additions & 1244 deletions

docs/pr-preview/pr-2186/cuda-bindings/latest/_sources/install.rst.txt

Lines changed: 35 additions & 3 deletions
@@ -80,11 +80,43 @@ For example:

 Tegra users can install the cuDLA conda package from conda-forge through ``conda install -c conda-forge libcudla cuda-version=13``, if it does not already exist on the system.

+Development environment
+-----------------------
+
+The sections above cover end-user installation. The section below focuses on
+a repeatable *development* workflow (editable installs and running tests).
+
+Installing the latest nightly (top-of-tree builds)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These are useful for users looking to test new features or bug fixes prior to
+their inclusion in a release.
+
+CI publishes wheels as GitHub Actions artifacts on every push to ``main``. To
+obtain the most recent build, use the following commands:
+
+.. code-block:: console
+
+   $ # Find the latest successful CI run on main:
+   $ RUN_ID=$(gh run list -R NVIDIA/cuda-python -w ci.yml -b main -s success -L1 --json databaseId -q '.[0].databaseId')
+
+   $ # Download the wheel (pick your Python version and platform):
+   $ gh run download "$RUN_ID" -R NVIDIA/cuda-python -p "cuda-bindings-python312-cuda13*-linux-64-*"
+
+   $ # Install the downloaded wheel:
+   $ pip install cuda-bindings-python312-cuda13*-linux-64-*/cuda_bindings*.whl[all]
+
+Replace ``python312`` with your Python version (e.g. ``python310``, ``python311``,
+``python313``, ``python314``, ``python314t``). For aarch64, replace ``linux-64``
+with ``linux-aarch64``; for Windows, use ``win-64``. Only the current CUDA
+major version is built on ``main``; wheels for the prior CUDA major are
+published from the corresponding backport branch.
+
 Installing from Source
-----------------------
+~~~~~~~~~~~~~~~~~~~~~~

 Requirements
-~~~~~~~~~~~~
+^^^^^^^^^^^^

 * CUDA Toolkit headers[^1]
 * CUDA Runtime static library[^2]
@@ -106,7 +138,7 @@ See :doc:`Environment Variables <environment_variables>` for a description of ot
 Only ``cydriver``, ``cyruntime`` and ``cynvrtc`` are impacted by the header requirement.

 Editable Install
-~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^

 You can use:

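The nightly-install steps added in this hunk build an artifact name from a Python tag, a CUDA major version, and a platform, then install the wheel found inside the downloaded directory. The sketch below illustrates only that naming and lookup logic. The helper names `artifact_pattern` and `wheel_requirement` are hypothetical and not part of cuda-python; the naming scheme is assumed from the commands in the diff, and the sketch assumes the artifact has already been downloaded with `gh run download`.

```python
import glob
import os


def artifact_pattern(python_tag="python312", cuda_major=13, platform="linux-64"):
    """Return the artifact glob passed to `gh run download -p` (scheme assumed from the diff)."""
    return f"cuda-bindings-{python_tag}-cuda{cuda_major}*-{platform}-*"


def wheel_requirement(search_dir=".", python_tag="python312", cuda_major=13,
                      platform="linux-64", extras=("all",)):
    """Locate a downloaded cuda_bindings wheel and return a pip requirement with extras."""
    pattern = os.path.join(
        glob.escape(search_dir),  # treat the search directory itself literally
        artifact_pattern(python_tag, cuda_major, platform),
        "cuda_bindings*.whl",
    )
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no wheel matches {pattern!r}")
    # Append the extras only after the path is resolved, so the brackets
    # are never interpreted as part of a glob pattern.
    return f"{matches[-1]}[{','.join(extras)}]"


if __name__ == "__main__":
    print(artifact_pattern("python313", platform="win-64"))
```

The string returned by `wheel_requirement()` can be passed to `pip install` as a single quoted argument. Resolving the path first matters because many shells treat a trailing `[all]` as a bracket expression in the glob, so the unquoted form shown in the hunk may not expand to the wheel path.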
docs/pr-preview/pr-2186/cuda-bindings/latest/_sources/module/driver.rst.txt

Lines changed: 205 additions & 81 deletions
Large diffs are not rendered by default.

docs/pr-preview/pr-2186/cuda-bindings/latest/_sources/module/runtime.rst.txt

Lines changed: 194 additions & 120 deletions
Large diffs are not rendered by default.

docs/pr-preview/pr-2186/cuda-bindings/latest/api.html

Lines changed: 4 additions & 4 deletions
@@ -2264,10 +2264,10 @@ <h1>CUDA Python API Reference<a class="headerlink" href="#cuda-python-api-refere
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.CUdevResourceType"><code class="docutils literal notranslate"><span class="pre">CUdevResourceType</span></code></a></li>
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.CUdevWorkqueueConfigScope"><code class="docutils literal notranslate"><span class="pre">CUdevWorkqueueConfigScope</span></code></a></li>
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.CUdevResourceDesc"><code class="docutils literal notranslate"><span class="pre">CUdevResourceDesc</span></code></a></li>
-<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id7"><code class="docutils literal notranslate"><span class="pre">CUdevSmResource</span></code></a></li>
-<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id13"><code class="docutils literal notranslate"><span class="pre">CUdevWorkqueueConfigResource</span></code></a></li>
-<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id18"><code class="docutils literal notranslate"><span class="pre">CUdevWorkqueueResource</span></code></a></li>
-<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id21"><code class="docutils literal notranslate"><span class="pre">CU_DEV_SM_RESOURCE_GROUP_PARAMS</span></code></a></li>
+<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id73"><code class="docutils literal notranslate"><span class="pre">CUdevSmResource</span></code></a></li>
+<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id79"><code class="docutils literal notranslate"><span class="pre">CUdevWorkqueueConfigResource</span></code></a></li>
+<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id84"><code class="docutils literal notranslate"><span class="pre">CUdevWorkqueueResource</span></code></a></li>
+<li class="toctree-l3"><a class="reference internal" href="module/driver.html#id87"><code class="docutils literal notranslate"><span class="pre">CU_DEV_SM_RESOURCE_GROUP_PARAMS</span></code></a></li>
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.cuGreenCtxCreate"><code class="docutils literal notranslate"><span class="pre">cuGreenCtxCreate()</span></code></a></li>
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.cuGreenCtxDestroy"><code class="docutils literal notranslate"><span class="pre">cuGreenCtxDestroy()</span></code></a></li>
 <li class="toctree-l3"><a class="reference internal" href="module/driver.html#cuda.bindings.driver.cuCtxFromGreenCtx"><code class="docutils literal notranslate"><span class="pre">cuCtxFromGreenCtx()</span></code></a></li>

docs/pr-preview/pr-2186/cuda-bindings/latest/examples.html

Lines changed: 15 additions & 15 deletions
@@ -1301,65 +1301,65 @@
 <section id="examples">
 <h1>Examples<a class="headerlink" href="#examples" title="Link to this heading">#</a></h1>
 <p>This page links to the <code class="docutils literal notranslate"><span class="pre">cuda.bindings</span></code> examples shipped in the
-<a class="extlink-cuda-bindings-examples reference external" href="https://github.com/NVIDIA/cuda-python/tree/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/">cuda-python repository</a>.
+<a class="extlink-cuda-bindings-examples reference external" href="https://github.com/NVIDIA/cuda-python/tree/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/">cuda-python repository</a>.
 Use it as a quick index when you want a runnable sample for a specific API area
 or CUDA feature.</p>
 <section id="introduction">
 <h2>Introduction<a class="headerlink" href="#introduction" title="Link to this heading">#</a></h2>
 <ul class="simple">
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/clock_nvrtc.py">clock_nvrtc.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/clock_nvrtc.py">clock_nvrtc.py</a>
 uses NVRTC-compiled CUDA code and the device clock to time a reduction
 kernel.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/simple_cubemap_texture.py">simple_cubemap_texture.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/simple_cubemap_texture.py">simple_cubemap_texture.py</a>
 demonstrates cubemap texture sampling and transformation.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/simple_p2p.py">simple_p2p.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/simple_p2p.py">simple_p2p.py</a>
 shows peer-to-peer memory access and transfers between multiple GPUs.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/simple_zero_copy.py">simple_zero_copy.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/simple_zero_copy.py">simple_zero_copy.py</a>
 uses zero-copy mapped host memory for vector addition.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/system_wide_atomics.py">system_wide_atomics.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/system_wide_atomics.py">system_wide_atomics.py</a>
 demonstrates system-wide atomic operations on managed memory.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/vector_add_drv.py">vector_add_drv.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/vector_add_drv.py">vector_add_drv.py</a>
 uses the CUDA Driver API and unified virtual addressing for vector addition.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/0_Introduction/vector_add_mmap.py">vector_add_mmap.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/0_Introduction/vector_add_mmap.py">vector_add_mmap.py</a>
 uses virtual memory management APIs such as <code class="docutils literal notranslate"><span class="pre">cuMemCreate</span></code> and
 <code class="docutils literal notranslate"><span class="pre">cuMemMap</span></code> for vector addition.</p></li>
 </ul>
 </section>
 <section id="concepts-and-techniques">
 <h2>Concepts and techniques<a class="headerlink" href="#concepts-and-techniques" title="Link to this heading">#</a></h2>
 <ul class="simple">
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/2_Concepts_and_Techniques/stream_ordered_allocation.py">stream_ordered_allocation.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/2_Concepts_and_Techniques/stream_ordered_allocation.py">stream_ordered_allocation.py</a>
 demonstrates <code class="docutils literal notranslate"><span class="pre">cudaMallocAsync</span></code> and <code class="docutils literal notranslate"><span class="pre">cudaFreeAsync</span></code> together with
 memory-pool release thresholds.</p></li>
 </ul>
 </section>
 <section id="cuda-features">
 <h2>CUDA features<a class="headerlink" href="#cuda-features" title="Link to this heading">#</a></h2>
 <ul class="simple">
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/3_CUDA_Features/global_to_shmem_async_copy.py">global_to_shmem_async_copy.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/3_CUDA_Features/global_to_shmem_async_copy.py">global_to_shmem_async_copy.py</a>
 compares asynchronous global-to-shared-memory copy strategies in matrix
 multiplication kernels.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/3_CUDA_Features/simple_cuda_graphs.py">simple_cuda_graphs.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/3_CUDA_Features/simple_cuda_graphs.py">simple_cuda_graphs.py</a>
 shows both manual CUDA graph construction and stream-capture-based replay.</p></li>
 </ul>
 </section>
 <section id="libraries-and-tools">
 <h2>Libraries and tools<a class="headerlink" href="#libraries-and-tools" title="Link to this heading">#</a></h2>
 <ul class="simple">
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/4_CUDA_Libraries/conjugate_gradient_multi_block_cg.py">conjugate_gradient_multi_block_cg.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/4_CUDA_Libraries/conjugate_gradient_multi_block_cg.py">conjugate_gradient_multi_block_cg.py</a>
 implements a conjugate-gradient solver with cooperative groups and
 multi-block synchronization.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/4_CUDA_Libraries/nvidia_smi.py">nvidia_smi.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/4_CUDA_Libraries/nvidia_smi.py">nvidia_smi.py</a>
 uses NVML to implement a Python subset of <code class="docutils literal notranslate"><span class="pre">nvidia-smi</span></code>.</p></li>
 </ul>
 </section>
 <section id="advanced-and-interoperability">
 <h2>Advanced and interoperability<a class="headerlink" href="#advanced-and-interoperability" title="Link to this heading">#</a></h2>
 <ul class="simple">
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/extra/iso_fd_modelling.py">iso_fd_modelling.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/extra/iso_fd_modelling.py">iso_fd_modelling.py</a>
 runs isotropic finite-difference wave propagation across multiple GPUs with
 peer-to-peer halo exchange.</p></li>
-<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/d98094fefa33d0e6629204cf0665a5bce3f66a39/cuda_bindings/examples/extra/jit_program.py">jit_program.py</a>
+<li><p><a class="extlink-cuda-bindings-example reference external" href="https://github.com/NVIDIA/cuda-python/blob/3694e06171e9b2316396377103e31ba605eaef6e/cuda_bindings/examples/extra/jit_program.py">jit_program.py</a>
 JIT-compiles a SAXPY kernel with NVRTC and launches it through the Driver
 API.</p></li>
 </ul>
