Add a compile-only HIP CI job (and fix the documented HIP build) #53
Open
jeffdaily wants to merge 4 commits into
Adds a build-hip job to the CI workflow that compiles the HIP device code in the official rocm/dev-ubuntu-24.04 container. Like the existing CUDA matrix entry, this runs on a GitHub-hosted runner that has no AMD GPU, so it is a compilation gate rather than a runtime test: it builds the HIP targets (hip_test, hip_random_test) offline for gfx90a, gfx1100, and gfx1201 but does not execute them. This catches HIP build regressions (header drift, hip* API renames) across ROCm releases. Running the device tests would require a runner with an AMD GPU.

Test Plan: ran the job's exact configure and build against a local ROCm 7.2.1 toolchain on a gfx90a host; both HIP targets compiled for all three architectures.

```
cmake -S . -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_HIP_COMPILER=/opt/rocm/llvm/bin/clang++ \
  -DCMAKE_HIP_ARCHITECTURES="gfx90a;gfx1100;gfx1201" \
  -DVSNRAY_ENABLE_HIP=ON -DVSNRAY_ENABLE_CUDA=OFF \
  -DVSNRAY_ENABLE_COMMON=ON -DVSNRAY_ENABLE_VIEWER=OFF \
  -DVSNRAY_ENABLE_EXAMPLES=OFF
cmake --build build -j$(nproc)
```

Authored with the assistance of an AI coding agent.
The rocm/dev-ubuntu-24.04 container ships without git on PATH, so
actions/checkout fell back to the REST API download and could not fetch
submodules ("Input 'submodules' not supported when falling back to download
using the GitHub REST API"). Install git in a step before checkout so the
submodule fetch works; the remaining build dependencies stay in the
post-checkout step.
Authored with the assistance of an AI coding agent.
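The pre-checkout step described in the commit above could look roughly like the following. This is a minimal sketch, not the workflow's actual step: `need_tool` is a hypothetical helper name, and the install command assumes a Debian/Ubuntu image running as root. It only prints the command it would run, so it is safe to try outside a container.

```shell
#!/bin/sh
# Sketch only: need_tool is a hypothetical helper, not part of the workflow.

# Succeeds (returns 0) when the named command is NOT found on PATH.
need_tool() {
    ! command -v "$1" >/dev/null 2>&1
}

if need_tool git; then
    # Assumption: Debian/Ubuntu base image with root access.
    echo "git missing; would run: apt-get update && apt-get install -y git"
fi
```

Running this check before checkout matters because, per the commit message, actions/checkout falls back to a REST API download when git is absent, and that fallback cannot fetch submodules.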
find_package(hip) failed in the rocm/dev-ubuntu-24.04 container with hip_DIR-NOTFOUND because /opt/rocm is not on CMake's default search path there (no ROCM_PATH env or /opt/rocm/bin on PATH at configure time). Point CMAKE_PREFIX_PATH at /opt/rocm so the hip, hipcub, and rocthrust config packages resolve. Reproduced the failure and verified the fix with a restricted-PATH configure and build locally. Authored with the assistance of an AI coding agent.
The HIP build block told the reader to set the HIP compiler by absolute path but then relied on find_package(hip/hipCUB/rocThrust) discovering ROCm via /opt/rocm/bin being on PATH. On a clean ROCm install (or a container) where ROCm is not on PATH, the documented command fails with hip_DIR-NOTFOUND. Pass CMAKE_PREFIX_PATH=/opt/rocm so the command works as written regardless of the user's PATH, and note when to override it. Authored with the assistance of an AI coding agent.
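For readers whose ROCm install is not under /opt/rocm, the override mentioned above might be handled as in this sketch. `ROCM_PREFIX` and `hip_configure_cmd` are hypothetical names introduced for illustration, the flag set is abbreviated from the CI configure command rather than copied from the README, and the assumption that the HIP compiler lives at `<prefix>/llvm/bin/clang++` mirrors the /opt/rocm layout. The script prints the configure command rather than running it.

```shell
#!/bin/sh
# Sketch only. ROCM_PREFIX and hip_configure_cmd are hypothetical names,
# not variables or helpers defined by the project.

# Default to /opt/rocm; export ROCM_PREFIX beforehand to point elsewhere.
ROCM_PREFIX="${ROCM_PREFIX:-/opt/rocm}"

# Print the configure command for a given ROCm prefix instead of running it.
hip_configure_cmd() {
    printf 'cmake -S . -B build -DCMAKE_BUILD_TYPE=Release'
    printf ' -DCMAKE_PREFIX_PATH=%s' "$1"
    printf ' -DCMAKE_HIP_COMPILER=%s/llvm/bin/clang++' "$1"
    printf ' -DVSNRAY_ENABLE_HIP=ON -DVSNRAY_ENABLE_CUDA=OFF\n'
}

hip_configure_cmd "$ROCM_PREFIX"
```

Deriving both CMAKE_PREFIX_PATH and the compiler path from one prefix keeps the two from drifting apart, which is the mismatch the commit describes: an absolute compiler path paired with package discovery that depended on PATH.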
Follow-up to #51, in response to your question about adding HIP to CI. Two related changes, both addressing what breaks on a clean ROCm environment (a fresh container with no ROCm on PATH):
- A compile-only build-hip CI job. It compiles the HIP device code in the official rocm/dev-ubuntu-24.04 container. Like the existing CUDA matrix entry, it runs on a GitHub-hosted runner that has no AMD GPU, so it is a compilation gate rather than a runtime test: it builds the HIP targets (hip_test, hip_random_test) offline for gfx90a, gfx1100, and gfx1201 but does not execute them. This catches HIP build regressions (header drift, hip* API renames, CMake wiring) across ROCm releases. Running the device tests would require a runner with an AMD GPU (a self-hosted runner, or one of the GPU-backed hosted-runner programs), which is the part that depends on hardware access rather than a workflow edit.
- A one-line README fix. The HIP build block set the HIP compiler by absolute path but then relied on find_package(hip/hipCUB/rocThrust) discovering ROCm via /opt/rocm/bin being on PATH. On a clean install where ROCm is not on PATH, the documented command fails with hip_DIR-NOTFOUND. Passing CMAKE_PREFIX_PATH=/opt/rocm makes the command work as written. Standing up the CI job is what surfaced this: the container reproduced exactly the clean-environment case the README did not cover.
The build-hip job is green on this branch (both Release and Debug).