#67 - Add bare-metal CMake build tutorial #95
| Filename | Overview |
|---|---|
| docs/tutorials/bare-metal-cmake-build.md | New 446-line tutorial covering DPDK patch + DAQIRI bare-metal build end-to-end; technically sound with prior-round fixes absorbed; no code changes. |
| docs/index.html | Converts the "Building from Source with CMake" Coming Soon tile to a live link pointing to the new tutorial; title updated to "Bare-Metal CMake Build". |
| mkdocs.yml | Adds the new tutorial entry to the Tutorials nav between System Configuration and Benchmarking Examples; consistent with docs/index.html landing page order. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Prerequisite Verification\nnvidia-smi / nvcc / ibv_devinfo / lspci] --> B[Step 1: Configure DOCA APT Repo]
B --> C[Step 2: Install Build Tooling\nbuild-essential / ninja / meson / CMake 3.20+\nlibibverbs-dev / librdmacm-dev / libmlx5-1]
C --> D[Step 3: Build Patched DPDK 25.11\ngit clone daqiri to download tarball\ngit apply dmabuf.patch + dpdk.nvidia.patch\nmeson setup + ninja + sudo meson install]
D --> E[Step 4: Configure DAQIRI with CMake\ncmake -S . -B build\nDDAQIRI_MGR / DCMAKE_CUDA_ARCHITECTURES]
E --> F[Step 5.1: Build\ncmake --build build -j nproc]
F --> G[Step 5.2: Install\ncmake --install build --prefix /opt/daqiri]
G --> H{ldd resolves\nlibrte_* / libibverbs\nlibcudart?}
H -- Yes --> I[Step 5.3: Smoke Test\ndaqiri_bench_raw_gpudirect\ndaqiri_bench_raw_tx_rx.yaml]
H -- No --> J[Step 6: Troubleshooting\nPKG_CONFIG_PATH / hugepages\nlibverbs / compute_121]
I --> K[Next Steps\nSystem Configuration\nBenchmarking Examples]
Reviews (5): Last reviewed commit: "#67 - Scrub em-dashes and hedges from ba..."
```bash
git apply \
  --exclude=.mailmap \
  --exclude=doc/guides/rel_notes/release_26_03.rst \
  "$OLDPWD/dpdk_patches/dmabuf.patch"
git apply "$OLDPWD/dpdk_patches/dpdk.nvidia.patch"
```
git apply fails on a non-git directory
/tmp/dpdk-${DPDK_VERSION} is an extracted tarball, not a git repository. git apply requires the working directory to be inside a git repo (it walks up looking for .git) and will fail with fatal: not a git repository on a bare extract. The conventional tool for applying patches to tarball-extracted source trees is patch -p1:
patch -p1 --exclude=.mailmap --exclude=doc/guides/rel_notes/release_26_03.rst \
< "$OLDPWD/dpdk_patches/dmabuf.patch"
patch -p1 < "$OLDPWD/dpdk_patches/dpdk.nvidia.patch"
Alternatively, add git init && git add -A && git commit -m "baseline" before the git apply calls, but patch -p1 is the idiomatic approach for tarballs and removes the git dependency from the DPDK build step.
Switched the DPDK patch commands from git apply to patch -p1 (the extracted tarball is not a git repository).
The CUDA architectures DAQIRI compiles for are currently hardcoded to `80;90;121` (A100, H100, GB10) at [`src/CMakeLists.txt:25`](https://github.com/NVIDIA/daqiri/blob/main/src/CMakeLists.txt). If your GPU is a different generation, or your installed CUDA Toolkit does not understand one of those architectures (most commonly `121` on older toolkits), override it on the CMake command line:

```bash
# Example: only build for Ada (sm_89, e.g. RTX 6000 Ada)
cmake -S . -B build ... -DCMAKE_CUDA_ARCHITECTURES=89

# Example: A100 + Ada + Hopper, skip GB10
cmake -S . -B build ... -DCMAKE_CUDA_ARCHITECTURES="80;89;90"
```

The override works because `CMAKE_CUDA_ARCHITECTURES` is a standard CMake variable; the `set()` in `src/CMakeLists.txt` runs after the command-line cache value is established. Common values:

| GPU family | Architecture |
-DCMAKE_CUDA_ARCHITECTURES override is silently ignored
The tutorial tells IGX Orin (sm_89) and RTX Pro Server (sm_100) users that passing -DCMAKE_CUDA_ARCHITECTURES=89 on the CMake command line overrides the hardcoded value. It does not. src/CMakeLists.txt line 25 uses a plain set(CMAKE_CUDA_ARCHITECTURES "80;90;121"), which creates a normal CMake variable that shadows the cache variable set by -D. Within src/'s scope — where all CUDA sources are compiled — CMAKE_CUDA_ARCHITECTURES is unconditionally "80;90;121", regardless of the command-line cache value.
The PR description's own follow-up captures this: "Consider promoting CMAKE_CUDA_ARCHITECTURES from a hardcoded set() in src/CMakeLists.txt to a proper cache variable so the override is discoverable." Until that change lands in src/CMakeLists.txt, the override instructions in this section are incorrect and will mislead users on platforms that need a different architecture — especially RTX Pro 6000 Blackwell (sm_100), which is not in the default list and won't have pre-compiled code.
Corrected the CMAKE_CUDA_ARCHITECTURES guidance: src/CMakeLists.txt:25 uses a plain set() that shadows the cache variable, so -DCMAKE_CUDA_ARCHITECTURES=... on the CMake command line is silently ignored. The tutorial now tells users to edit that line directly and adds a warning that the default list does not cover sm_86 / sm_89 / sm_100.
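One rough way to check whether a given checkout still contains the shadowing `set()` is to grep for it. The helper below is a heuristic sketch, not part of DAQIRI: the function name and the demo file are hypothetical, and the check ignores surrounding `if()` guards and multi-line `set()` calls, so a match only means the line deserves a closer look.

```bash
# Hypothetical helper: succeeds when FILE has a set(CMAKE_CUDA_ARCHITECTURES ...)
# line without the CACHE keyword, i.e. a plain set() that would shadow -D.
# Heuristic only: it does not understand if() guards or multi-line set() calls.
has_plain_cuda_arch_set() {
  grep -E '^[[:space:]]*set\([[:space:]]*CMAKE_CUDA_ARCHITECTURES[[:space:]]' "$1" \
    | grep -v 'CACHE' >/dev/null
}

# Self-contained demo on a throwaway file that mimics the line under review.
demo_file="$(mktemp)"
printf 'set(CMAKE_CUDA_ARCHITECTURES "80;90;121")\n' > "$demo_file"

if has_plain_cuda_arch_set "$demo_file"; then
  echo "plain set() found: -DCMAKE_CUDA_ARCHITECTURES would be shadowed"
else
  echo "no plain set() found"
fi
```

Against a real tree the call would be `has_plain_cuda_arch_set src/CMakeLists.txt`, run from the repository root.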
`ldd` should show `librte_*` (DPDK), `libibverbs.so`, `librdmacm.so`, `libcudart.so`, and `libyaml-cpp.so` resolving. Unresolved entries usually mean either DPDK was installed to a prefix not on the dynamic loader path (add it to `/etc/ld.so.conf.d/` and re-run `sudo ldconfig`), or the RDMA libraries from [Step 2](#step-2-install-build-tooling-and-rdma-libraries) were not actually installed.
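The `ldd` check quoted above can be narrowed to just the unresolved entries. This is a minimal sketch: the helper name is hypothetical, and the sample output is canned (the library names and paths in it are made up for illustration) so the snippet runs without a DAQIRI install.

```bash
# Hypothetical helper: read `ldd` output on stdin and print each library
# that the dynamic loader could not resolve.
unresolved_libs() {
  awk '/=> not found/ { print $1 }'
}

# Canned sample so the sketch runs anywhere; these lines are illustrative only.
sample_ldd_output='    librte_eal.so.26 => not found
    libibverbs.so.1 => /lib/aarch64-linux-gnu/libibverbs.so.1 (0x0000ffff9a000000)
    libcudart.so.12 => not found'

printf '%s\n' "$sample_ldd_output" | unresolved_libs
```

On a real host, pipe the output of `ldd` on the installed binary into `unresolved_libs`; an empty result means every dependency resolved.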
### 5.3 Smoke test
Instead of the SW loopback, we should use the standard daqiri_bench_raw_tx_rx.yaml for the smoke test.
Thank you, I updated that!
dleshchev left a comment
This can now be applied easily, since PR #107 has landed!
I walked through the bare-metal tutorial on the IGX devkit using PR #95.
Environment:
Ubuntu 22.04.5 LTS, aarch64, IGX/Tegra kernel 5.15.0-1039-nvidia-tegra-igx
RTX 6000 Ada
CUDA Toolkit 12.6.68
ConnectX-7, mlx5_0 / mlx5_1, both active
DPDK 25.11 built and installed successfully after patch-command workaround
Integration note:
PR #95 currently needs a rebase onto current main. GitHub reports it as conflicting, and the conflicts are in the docs restructure area (docs/concepts.md, docs/index.html, mkdocs.yml). Rebasing should make the new tutorial compatible with the latest docs structure.
Blocking items found during the walkthrough:
DPDK patch command does not work as written.
The tutorial uses:
patch -p1 \
--exclude=.mailmap \
--exclude='doc/guides/rel_notes/release_26_03.rst' \
< "$OLDPWD/dpdk_patches/dmabuf.patch"
On this host, GNU patch 2.7.6 rejects --exclude:
patch: unrecognized option '--exclude=.mailmap'
The workaround that succeeded was:
git apply \
--exclude=.mailmap \
--exclude='doc/guides/rel_notes/release_26_03.rst' \
/tmp/daqiri-pr95-baremetal/dpdk_patches/dmabuf.patch
git apply /tmp/daqiri-pr95-baremetal/dpdk_patches/dpdk.nvidia.patch
This worked even in the extracted DPDK tarball directory.
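The behaviour reported here can be reproduced in isolation. The sketch below is not taken from the tutorial: the directory, file names, and patch contents are hypothetical stand-ins for the DPDK tree, and it assumes `git` is installed. It builds a plain directory with no `.git`, then applies a two-file patch while excluding the file that does not exist in the tree.

```bash
# Sketch: `git apply --exclude=...` in a directory that is not a git repository.
# All file names below are hypothetical stand-ins for the DPDK tarball.
set -eu

work="$(mktemp -d)"
cd "$work"

# Stand-in for the extracted tarball: one source file, no .git directory.
mkdir tree
printf 'old\n' > tree/src.txt

# Two-file patch: src.txt exists in the tree, notes.rst does not.
cat > demo.patch <<'EOF'
diff --git a/src.txt b/src.txt
--- a/src.txt
+++ b/src.txt
@@ -1 +1 @@
-old
+new
diff --git a/notes.rst b/notes.rst
--- a/notes.rst
+++ b/notes.rst
@@ -1 +1 @@
-missing
+still missing
EOF

cd tree
# Skip the hunk for the absent file; the remaining hunk applies cleanly.
git apply --exclude=notes.rst ../demo.patch
result="$(cat src.txt)"

# Clean up the scratch directory before reporting.
cd /
rm -rf "$work"
echo "src.txt now contains: $result"
```

Dropping `--exclude=notes.rst` from the sketch makes `git apply` fail on the missing file, which is the role the two `--exclude` flags play for `.mailmap` and the release notes.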
CUDA 12.6 / RTX 6000 Ada issues blocked the tutorial smoke-test path.
During the same walkthrough, the tutorial could not reach the raw TX/RX smoke test because DAQIRI_BUILD_EXAMPLES=ON failed on CUDA Toolkit 12.6.68, and RTX 6000 Ada also needs CUDA arch handling different from the tutorial's default path.
Those CUDA-related findings have been resolved now that PR #107 has landed.
Usability issues found during the walkthrough:
Long copy/paste commands are fragile.
The long Kitware apt-source echo command and the long Meson -Ddisable_drivers=... command both broke when line-wrapped during terminal copy/paste. The Meson configure still ran, but with a truncated disable list until we rewrote the value into a shell variable.
Suggested doc style:
DISABLE_DRIVERS='baseband/*,bus/ifpga/*,...'
meson setup ${DPDK_BUILD_DIR} \
--prefix=${DPDK_INSTALL_PREFIX} \
-Dtests=false \
-Dplatform=generic \
-Denable_docs=false \
-Ddisable_drivers="${DISABLE_DRIVERS}"
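A further variation on the suggestion above is to list the drivers one per line and join them, so no single line is long enough to wrap. This is a sketch only: `join_by_comma` is a hypothetical helper, and the two entries shown are just the ones visible in the truncated list above, not the full set.

```bash
# Hypothetical helper: join all arguments with commas.
join_by_comma() {
  out=""
  for item in "$@"; do
    out="${out:+$out,}$item"
  done
  printf '%s\n' "$out"
}

# Only the entries visible in the review are listed; the real list is longer.
# Single quotes keep the shell from expanding the * wildcards.
DISABLE_DRIVERS="$(join_by_comma \
  'baseband/*' \
  'bus/ifpga/*')"

printf '%s\n' "$DISABLE_DRIVERS"
```

The resulting value is passed to Meson exactly as in the suggested style, as `-Ddisable_drivers="${DISABLE_DRIVERS}"`.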
Minor clarity improvements:
ibv_devinfo printed the libvmw_pvrdma-rdmav34.so warning, but still reported both mlx5 devices. System Configuration already calls this harmless; this tutorial could mention it too.
/proc/config.gz was absent, but the documented /boot/config-$(uname -r) fallback worked and confirmed CONFIG_DMA_SHARED_BUFFER=y.
apt selected ibverbs-providers instead of libmlx5-1; the install still worked, but the package wording may be less portable than intended.
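The kernel-config fallback mentioned in the second item can be wrapped so the same check works whether or not `/proc/config.gz` exists. A minimal sketch, with hypothetical function names; the demo at the end uses canned input so it runs on any machine.

```bash
# Hypothetical helper: print the running kernel's config from whichever
# location exists, trying /proc/config.gz first and /boot second.
kernel_config() {
  if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz
  elif [ -r "/boot/config-$(uname -r)" ]; then
    cat "/boot/config-$(uname -r)"
  else
    echo "no kernel config found" >&2
    return 1
  fi
}

# Succeeds when the config text on stdin has CONFIG_DMA_SHARED_BUFFER built in.
dma_shared_buffer_enabled() {
  grep -q '^CONFIG_DMA_SHARED_BUFFER=y$'
}

# Canned input so the sketch runs anywhere; on a real host use:
#   kernel_config | dma_shared_buffer_enabled
if printf 'CONFIG_DMA_SHARED_BUFFER=y\n' | dma_shared_buffer_enabled; then
  echo "CONFIG_DMA_SHARED_BUFFER=y"
fi
```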
Successful parts:
Step 2 package install worked.
DPDK 25.11 configured, built, installed, and pkg-config --modversion libdpdk returned 25.11.0.
DAQIRI core library built and installed to /opt/daqiri.
A minimal consumer compile/link test against /opt/daqiri/lib/pkgconfig/daqiri.pc succeeded.
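The DPDK version check above lends itself to a small guard before the DAQIRI configure step. This is a sketch under assumptions: `in_release_series` is a hypothetical helper, and the version string is hard-coded from the walkthrough so the snippet runs without DPDK installed.

```bash
# Hypothetical helper: succeeds when VERSION ($1) belongs to the release
# SERIES ($2), e.g. version 25.11.0 and series 25.11.
in_release_series() {
  case "$1" in
    "$2" | "$2".*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a real host: dpdk_version="$(pkg-config --modversion libdpdk)"
dpdk_version="25.11.0"   # value reported in the walkthrough above

if in_release_series "$dpdk_version" "25.11"; then
  echo "libdpdk $dpdk_version is in the 25.11 series"
else
  echo "unexpected libdpdk version: $dpdk_version" >&2
fi
```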
Signed-off-by: Chloe Crozier <chloecrozier@gmail.com>
500e36a to 4a8edef
DGX Spark Testing
Validated on a 2-Spark pair (GB10, ConnectX-7 fw 28.45.4028, Ubuntu 24.04 ARM, CUDA 13.0, driver 580.95.05), p0↔p0 cross-cabled, 2048 hugepages, peermem not loaded. Followed the tutorial verbatim on both hosts (apt deps → DPDK 25.11 patch+build → …
Passed: items 1, 2, 3, 5 of the DGX-Spark follow-up checklist. Default arch list resolved to …
Needs fixing:
Signed-off-by: Chloe Crozier <chloecrozier@gmail.com>
Closes #67
Summary
docs/tutorials/bare-metal-cmake-build.md walking through a full bare-metal build: prerequisite checks, DOCA repo, dependency install, patched DPDK 25.11 from source, CMake configuration, install, smoke test, troubleshooting, and per-platform notes for DGX Spark, IGX Orin, and x86_64 RTX Pro Server.
… mkdocs.yml directly after "System Configuration".
… docs/index.html) to a live link.
The current Getting Started page gives a 5-line CMake snippet and says "see the Dockerfile" for dependency details. Users on DGX Spark, IGX Orin, or an x86_64 RTX Pro server who can't or don't want to use the container had no guided path. This is the initial version of that guided path, scoped so it can land now and grow over time as we exercise it on more hardware/platforms.
Rebased onto current main: the previous branch carried the #69 API-reference restructure commits, which have since landed via PR #94, plus PR #107 made -DCMAKE_CUDA_ARCHITECTURES=... actually honored. The branch is now a clean delta on top of main containing only the bare-metal tutorial work.
UPDATED Test plan
python scripts/check_doc_refs.py: clean (8 binaries, 20 YAML configs, 11 doc files)
mkdocs build --strict: clean (no broken links or anchors)
python scripts/check_html_links.py site/: all landing-page links resolve
… 5.15.0-1039-nvidia-tegra-igx, RTX 6000 Ada, CUDA 12.6.68, ConnectX-7): Step 2 apt install, DPDK 25.11 configure/build/install (pkg-config --modversion libdpdk → 25.11.0), DAQIRI core build and install to /opt/daqiri, minimal consumer compile/link against /opt/daqiri/lib/pkgconfig/daqiri.pc. Documented in @dleshchev's review on PR #95.
Review feedback addressed
From @greptile-apps
… patch -p1 for the initial fix, then back to git apply after the IGX walkthrough showed GNU patch 2.7.x does not accept --exclude=. The tutorial now explains both reasons: (a) --exclude= is required to skip .mailmap / release-note hunks that don't exist in a stock tarball, and (b) git apply works against an extracted tarball even though it is not a git repo.
… src/CMakeLists.txt with "search for DAQIRI_HAS_SOCKET_IDX / CMAKE_CUDA_ARCHITECTURES" pointers that survive source-file edits.
CMAKE_CUDA_ARCHITECTURES override now actually works (PR #107, "#106 - Fix CUDA 12.6 raw benchmark build"). The tutorial uses plain cmake -S . -B build ... -DCMAKE_CUDA_ARCHITECTURES=89 on the command line; the sed workaround and the "silently ignored" warning are gone, so the related "sed silently succeeds if pattern already changed" concern is moot.
From @RamyaGuru
… daqiri_bench_raw_tx_rx.yaml loopback rather than the SW loopback.
From @dleshchev (IGX walkthrough on PR #95)
… git apply --exclude=....
… echo and the Meson -Ddisable_drivers=... are now built from shell variables (KITWARE_LINE, DISABLE_DRIVERS) so terminal line-wrap can't silently truncate the argument. Added a Copy/paste safety tip explaining why.
ibv_devinfo libvmw_pvrdma-rdmav34.so warning: added a new failure tile in the Prerequisite Verification section calling it harmless and cross-linking to the System Configuration page that already covers it.
/proc/config.gz absent on IGX/Tegra: the existing failure tile now explicitly calls out the IGX/Tegra kernel, and the IGX Orin platform tab also points back to the /boot/config-$(uname -r) fallback.
apt selects ibverbs-providers instead of libmlx5-1: added a note explaining apt may resolve libmlx5-1 to the ibverbs-providers virtual package, that this works fine, and what to do if you specifically want the DOCA-provided build.
… main. This branch now picks up the new default arch list (80;90, with 121 appended only on CUDA Toolkit ≥ 13.0); the IGX Orin tab tells users to pass -DCMAKE_CUDA_ARCHITECTURES=89 for RTX 6000 Ada; a new compute_121 troubleshooting tile explains when the error can still appear and how to drop the 121 entry.
Follow-ups (in addition to this PR)
… CMAKE_CUDA_ARCHITECTURES table as new GPUs (e.g. additional Blackwell SKUs) get added to the lab.
Platform-specific things to test / add
DGX Spark (GB10)
… 80;90 plus 121 appended automatically on CUDA Toolkit ≥ 13.0) compiles against the installed CUDA Toolkit.
… daqiri_bench_raw_tx_rx_spark.yaml (post-#15 #77 - Fix DGX Spark example to a true over-the-wire loopback (TX/… #108 cross-port BDFs).
… daqiri_bench_rdma_tx_rx_spark.yaml.
… host_pinned is the right buffer kind on current firmware.
IGX
… DAQIRI_BUILD_EXAMPLES=ON on CUDA 12.6 at the time, now unblocked by PR #107.
… -DCMAKE_CUDA_ARCHITECTURES=89 now that PR #107 has landed.
… aarch64-linux-gnu PKG_CONFIG_PATH hint is correct.
… nvidia-peermem still needs to be loaded when not using the patched DPDK.
x86_64 RTX Pro Server
… 100).
… 100).
… 89).
… meson builds against the x86_64 DOCA libs.
… pkg-config picks up /usr/local/lib/x86_64-linux-gnu/pkgconfig without extra env tweaks.
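For reference when exercising the checklists above, the default architecture rule stated in this description (80;90, with 121 appended only on CUDA Toolkit ≥ 13.0) can be restated in a few lines of shell. This is a hypothetical restatement for illustration only; the real logic lives in DAQIRI's CMake files and may differ in detail.

```bash
# Hypothetical shell restatement of the default-arch rule described above.
# Takes the CUDA Toolkit major version and prints the expected arch list.
default_cuda_archs() {
  cuda_major="$1"
  if [ "$cuda_major" -ge 13 ]; then
    printf '80;90;121\n'
  else
    printf '80;90\n'
  fi
}

default_cuda_archs 12   # CUDA 12.x, e.g. the IGX walkthrough host
default_cuda_archs 13   # CUDA 13.x, e.g. the DGX Spark pair
```

A platform that needs a different list, such as RTX 6000 Ada, still passes `-DCMAKE_CUDA_ARCHITECTURES=89` explicitly.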