Update cuda.bindings to 13.0.0 #792

Merged
leofang merged 91 commits into NVIDIA:main from leofang:unreleased-13.0
Aug 6, 2025

Conversation

@leofang

@leofang leofang commented Aug 4, 2025

Member

Description

Close #791.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

vzhurba01 and others added 30 commits May 27, 2025 14:48
…ersion_13

Make previously overlooked 12.9.0 → 13.0.0 changes
* Update SUPPORTED_WINDOWS_DLLS: kitpicks/cuda-r13-0/13.0.0/013/local_installers/cuda_13.0.0_windows.exe

* Update SUPPORTED_LINUX_SONAMES: kitpicks/cuda-r13-0/13.0.0/013/local_installers/cuda_13.0.0_580.31_linux.run

* 013 → 014: SUPPORTED_LINUX_SONAMES unchanged

* 013 → 014: SUPPORTED_WINDOWS_DLLS unchanged

* cybind update with 13.0.0 headers (014)

* Bump cuda/bindings/_version.py → 13.0.0

* test_nvjitlink.py: remove sm_60, add sm_100

* Updates from cybind after removing all 11.x headers (affects "automatically generated" comments only).

* Add new toolshed/reformat_cuda_enums_as_py.py (reads cuda.h, driver_types.h headers directly).

* Use new toolshed/reformat_cuda_enums_as_py.py to regenerate driver_cu_result_explanations.py, runtime_cuda_error_explanations.py

* Use `driver.cuDeviceGetUuid()` instead of `driver.cuDeviceGetUuid_v2()` with CTK 13.

* Adjustments for locating nvvm directory in CTK 13 installations.
* Add missing error handling (tests/test_nvjitlink.py)

* Add missing `const` in cudaMemcpyBatchAsync call (cuda/bindings/runtime.pyx.in)

* Add qa/13.0.0/01_linux.sh

* Remove qa/13.0.0/01_linux.sh after it was moved to a new upstream qa branch.

* Strictly correct casts for cudaMemcpyBatchAsync (generated by cython_gen).

* Pragmatic minimal fix for cudaMemcpyBatchAsync casts (works with Linux and
Windows). (generated with cython-gen)
Fix accident from updating `SUPPORTED_WINDOWS_DLLS` for CTK 13
)

* Linux update from cuda_13.0.0_580.46_kitpicks025_linux.run: no-op because of NVIDIA/cuda-python-private#95

* Windows update from cuda_13.0.0_kitpicks025_windows.exe
…s overlooked. Direct commit for simplicity.
…VIDIA#94)

* CCCL_INCLUDE_PATH fixes in test_event.py, test_launcher.py

* Add new file (accidentally missing in a prior commit).

* Fix pre-commit errors in new tests/helpers.py

* 12→13 compatibility fixes in cuda/core/experimental/_graph.py

* CTK 12 compatibility (tests/test_cuda_utils.py)

* Make the cuda/core/experimental/_graph.py changes backwards compatible.

* Do not try to hide `13` in cuda_core/tests/test_cuda_utils.py

* More elegant handling of `CCCL_INCLUDE_PATHS` in cuda_core/tests/helpers.py

* Remove stray empty line (cuda_core/tests/conftest.py).

* Fix logic error computing CCCL_INCLUDE_PATHS in cuda_core/tests/helpers.py
@rwgk

rwgk commented Aug 6, 2025

Contributor

/ok to test

@copy-pr-bot

copy-pr-bot Bot commented Aug 6, 2025

Contributor

/ok to test

@rwgk, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@rwgk

rwgk commented Aug 6, 2025

Contributor

/ok to test de1e83a

@rwgk

rwgk commented Aug 6, 2025

Contributor

There is still a problem with the 13.0.0 wheels on both Linux and Windows, but I think it affects nvvm only.

I'm puzzled, because it works interactively (I tried both Linux and Windows). I'll pick this up again in about an hour, and there's a good chance I can then stay on it until it is resolved for good.

@rwgk

rwgk commented Aug 6, 2025

Contributor

/ok to test e18a5e8

@rwgk

rwgk commented Aug 6, 2025

Contributor

I started the testing to see where we stand with the CI (mainly with Windows).

I know from local testing that we're still up against this error (Linux):

tests/test_kernelParams.py::test_kernelParams_empty nvrtc: error: failed to open libnvrtc-builtins.so.13.0.
  Make sure that libnvrtc-builtins.so.13.0 is installed correctly.^@
FAILED

I need to understand why this happens, because the directory listing looks as expected:

rwgk-win11.localdomain:~/forked/cuda-python $ ll Cp13WslVenv/lib/python3.12/site-packages/nvidia/cu13/lib/
total 375M
-rw-r--r-- 1 rgrossekunst rgrossekunst 3.1M Aug  5 21:39 libcufile.so.0
-rw-r--r-- 1 rgrossekunst rgrossekunst  43K Aug  5 21:39 libcufile_rdma.so.1
-rw-r--r-- 1 rgrossekunst rgrossekunst  95M Aug  5 21:39 libnvJitLink.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst 4.2M Aug  5 21:39 libnvrtc-builtins.alt.so.13.0
-rw-r--r-- 1 rgrossekunst rgrossekunst 4.2M Aug  5 21:39 libnvrtc-builtins.so.13.0
-rw-r--r-- 1 rgrossekunst rgrossekunst 105M Aug  5 21:39 libnvrtc.alt.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst 105M Aug  5 21:39 libnvrtc.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst  61M Aug  5 21:42 libnvvm.so.4
rwgk-win11.localdomain:~/forked/cuda-python $

But why then the error?

@rwgk

rwgk commented Aug 6, 2025

Contributor

Windows passes!

But for Linux, I think there is a bug/oversight in the nvidia_cuda_nvrtc wheel.

For comparison, with CTK 12:

(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~ $ pip install "nvidia-cuda-nvrtc-cu12"
Collecting nvidia-cuda-nvrtc-cu12
  Using cached nvidia_cuda_nvrtc_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Using cached nvidia_cuda_nvrtc_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (89.6 MB)
Installing collected packages: nvidia-cuda-nvrtc-cu12
Successfully installed nvidia-cuda-nvrtc-cu12-12.9.86
(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk12NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib $ readelf -d libnvrtc.so.12 | grep -E 'RPATH|RUNPATH'
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]
(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk12NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib

Now with CTK 13:

(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~ $ pip install "nvidia-cuda-nvrtc~=13.0"
Collecting nvidia-cuda-nvrtc~=13.0
  Using cached nvidia_cuda_nvrtc-13.0.48-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Using cached nvidia_cuda_nvrtc-13.0.48-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (90.2 MB)
Installing collected packages: nvidia-cuda-nvrtc
Successfully installed nvidia-cuda-nvrtc-13.0.48
(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk13NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cu13/lib $ readelf -d libnvrtc.so.13 | grep -E 'RPATH|RUNPATH'
(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk13NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cu13/lib $
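
The two `readelf` checks above could be scripted so the comparison is repeatable. The following is only a minimal sketch: `find_run_paths` is a hypothetical helper (not part of cuda-python) that scans text already captured from `readelf -d`, and the two sample strings are taken from the terminal output above.

```python
import re

# Matches dynamic-section lines printed by `readelf -d`, for example:
#  0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]
_ENTRY = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def find_run_paths(readelf_output: str) -> dict[str, str]:
    """Return the RPATH/RUNPATH entries found in `readelf -d` output (hypothetical helper)."""
    found = {}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


# Sample inputs copied from the two sessions above.
ctk12_output = " 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]"
ctk13_output = ""  # grep printed nothing for libnvrtc.so.13

print(find_run_paths(ctk12_output))  # {'RUNPATH': '$ORIGIN:'}
print(find_run_paths(ctk13_output))  # {}
```

An empty result for the CTK 13 library means it carries neither tag, so the dynamic loader has no hint to search the library's own directory.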

I'm afraid we have to work around that somehow.
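
One possible shape for such a workaround, sketched here under assumptions and not necessarily what this PR ended up doing, is to load the sibling library by absolute path before `libnvrtc` asks for it by name. This assumes the dynamic loader reuses an already loaded library whose soname matches a later request. `sibling_path` and `preload_sibling` are hypothetical helpers, not cuda-python APIs.

```python
import ctypes
import os


def sibling_path(lib_path: str, sibling_name: str) -> str:
    """Return the path of `sibling_name` in the same directory as `lib_path`."""
    return os.path.join(os.path.dirname(os.path.abspath(lib_path)), sibling_name)


def preload_sibling(lib_path: str, sibling_name: str) -> ctypes.CDLL:
    """Load a sibling shared library by absolute path (hypothetical workaround).

    The returned handle should be kept alive by the caller so the library
    stays loaded for the lifetime of the process.
    """
    path = sibling_path(lib_path, sibling_name)
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # RTLD_GLOBAL makes the library's symbols visible to libraries loaded later.
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
```

With the layout shown earlier, a call might look like `preload_sibling(".../site-packages/nvidia/cu13/lib/libnvrtc.so.13", "libnvrtc-builtins.so.13.0")`, made once before the first NVRTC compilation.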

@rwgk

rwgk commented Aug 6, 2025

Contributor

/ok to test ff339b6

@rwgk

rwgk commented Aug 6, 2025

Contributor

For completeness, this ChatGPT conversation explains how I arrived at commit ff339b6:

https://chatgpt.com/share/6892fd31-aafc-8008-a461-58ff0c602a0a

@kkraus14

kkraus14 commented Aug 6, 2025

Collaborator

/ok to test 37ef8e0

kkraus14 previously approved these changes Aug 6, 2025
@github-project-automation github-project-automation Bot moved this from Todo to In Review in CCCL Aug 6, 2025
@leofang leofang marked this pull request as ready for review August 6, 2025 15:04
@leofang

leofang commented Aug 6, 2025

Member Author

Since the CI was already green and the latest changes were doc-only, let me admin-merge this and run the CI in #795.

@leofang leofang merged commit c016d65 into NVIDIA:main Aug 6, 2025
1 check passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in CCCL Aug 6, 2025
@leofang leofang deleted the unreleased-13.0 branch August 6, 2025 15:27
@github-actions

github-actions Bot commented Aug 6, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels

CI/CD: CI/CD infrastructure
cuda.bindings: Everything related to the cuda.bindings module
feature: New feature or request
P0: High priority - Must do!

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Support CUDA 13.0
[BUG]: cuda_bindings/examples globalToShmemAsyncCopy_test.py "catastrophic error" is masked by pytest skipped

5 participants