#15 - Add DGX Spark sweep tooling and RDMA loopback prereqs by dleshchev · Pull Request #116 · NVIDIA/daqiri

dleshchev · 2026-06-03T17:08:15Z

Rebuilds the Spark tooling stack item as a clean follow-up after PR #96 landed on main.

Changes:
- Adds the Spark benchmark sweep wrapper and data-fill driver.
- Adds the Spark RDMA loopback setup helper with current p0-to-p1 defaults.
- Raises RDMA bench outstanding depth to match the YAML buffer depth.
- Keeps failed/malformed bench cells from being emitted as successful zero rows.

Validation:
- bash -n examples/run_spark_bench.sh
- bash -n scripts/spark_data_fill.sh
- bash -n scripts/setup_spark_rdma_loopback.sh
- python3 scripts/check_doc_refs.py
- git diff --check origin/main...HEAD

Local hardware benchmark/build not run in this shell; the repo requires compile/run inside the project container with the Spark NIC/GPU environment.

greptile-apps · 2026-06-03T17:14:27Z

Greptile Summary

Adds the DGX Spark benchmark sweep stack on top of the base infra merged in PR #96: a sweep wrapper (run_spark_bench.sh), a one-shot data-fill driver (spark_data_fill.sh), and an idempotent RDMA loopback setup script (setup_spark_rdma_loopback.sh), plus a kMaxOutstanding bump in rdma_bench.cpp to match the YAML buffer depth.

- examples/run_spark_bench.sh: Sweeps payload × batch × target-gbps across DPDK, RDMA, and socket backends; emits one CSV row per successful cell and propagates failures without producing false zero rows.
- scripts/spark_data_fill.sh: Drives the DPDK/socket sweep and drop-curve modes with hugepage pre-flight, inter-run cleanup, and live log streaming via tee.
- scripts/setup_spark_rdma_loopback.sh: Sets up the p0↔p1 RDMA loopback with per-port routing tables and static ARP entries; safe to re-run.
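
The re-runnable loopback setup summarized above can be illustrated with a small sketch. It is not the code in setup_spark_rdma_loopback.sh: the function names, the `SYSFS_NET` variable, the table number, and the addresses are all assumptions made for illustration. The sketch resolves a MAC from an explicit override or from sysfs, then prints (rather than executes) a flush-then-add command sequence, which is one common way to make such a setup safe to re-run.

```bash
#!/usr/bin/env bash
# Sketch only: every name, address and table number here is hypothetical.

# Return the override MAC if one is given, otherwise read it from sysfs.
resolve_mac() {
  local iface=$1 override=${2:-} sysfs=${SYSFS_NET:-/sys/class/net}
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
  else
    cat "$sysfs/$iface/address"
  fi
}

# Print the commands for one port. Flushing and deleting before adding means
# a second run ends in the same state as the first. A real script would have
# to tolerate "ip rule del" failing when the rule does not exist yet.
loopback_cmds() {
  local iface=$1 local_ip=$2 peer_ip=$3 peer_mac=$4 table=$5
  cat <<EOF
ip route flush table $table
ip rule del from $local_ip table $table
ip rule add from $local_ip table $table
ip route add $peer_ip/32 dev $iface table $table
ip neigh replace $peer_ip lladdr $peer_mac dev $iface nud permanent
EOF
}
```

Printing the commands keeps the sketch runnable without root; piping the output to a root shell would apply them.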

Confidence Score: 5/5

The changes are additive tooling scripts and a one-line constant bump; they do not touch the library core, the Manager vtable, or the BurstParams contract.

All five changed files are new shell scripts or a trivial constant change in an example binary. The kMaxOutstanding bump is well-documented with the pool-drain constraint inline. The scripts handle failure isolation correctly and do not emit false zero rows. DCO sign-offs and commit format are both present.
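
As an illustration of the "no false zero rows" behaviour described above, here is a minimal sketch. It is an assumption-laden example rather than the logic in run_spark_bench.sh: `record_cell`, the `gbps=` stats line, and the three-column CSV layout are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: record_cell, the "gbps=" stats line and the CSV columns are
# hypothetical and do not come from run_spark_bench.sh.
FAILURES=0

# record_cell CSV PAYLOAD BATCH EXIT_CODE LOG_FILE
# Appends a row only if the bench exited 0 AND its log holds a parseable
# stats line; otherwise it counts a failure and writes nothing to the CSV.
record_cell() {
  local csv=$1 payload=$2 batch=$3 rc=$4 log=$5 gbps
  gbps=$(sed -n 's/^gbps=\([0-9.][0-9.]*\)$/\1/p' "$log" | tail -n 1)
  if [ "$rc" -ne 0 ] || [ -z "$gbps" ]; then
    FAILURES=$((FAILURES + 1))   # the log is kept; no row is emitted
    return 1
  fi
  printf '%s,%s,%s\n' "$payload" "$batch" "$gbps" >> "$csv"
}
```

A sweep loop built on this would exit nonzero at the end whenever `FAILURES` is greater than zero, so a partially failed sweep is never mistaken for a clean one.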

No files require special attention; the only finding is a misleading header comment in examples/run_spark_bench.sh.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .gitignore | Adds bench-results/ output directory and pcie_schematic.png artifact to gitignore; clean and correct. |
| examples/rdma_bench.cpp | Raises kMaxOutstanding from 5 to 20 to match the YAML num_bufs, with an inline comment documenting the pool-drain constraint and the known structural limitation (interleaving drain/post is deferred). |
| examples/run_spark_bench.sh | New Spark sweep wrapper; failure cells correctly return nonzero instead of emitting false zero rows. Header comment for RX_IFACE is misleading: the variable is unused in this script and /proc/net/udp provides no per-interface filtering. |
| scripts/setup_spark_rdma_loopback.sh | New idempotent RDMA loopback setup script; correctly flushes routes/rules before re-adding, reads MACs from sysfs with env overrides, and uses per-port routing tables. |
| scripts/spark_data_fill.sh | One-shot data-fill driver with pre-flight hugepage checks, orphan-hugepage cleanup between runs, and correct PIPESTATUS capture for pipeline exit-code propagation. |
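
The PIPESTATUS capture mentioned for spark_data_fill.sh matters because a pipeline's exit status is normally that of its last command, here `tee`. A minimal sketch of the pattern follows, assuming bash; `run_logged` is a hypothetical name, not a function from the script.

```bash
#!/usr/bin/env bash
# Sketch only: run_logged is a hypothetical helper name.
# Streams a command's combined output to a log file through tee while
# returning the command's own exit status rather than tee's.
run_logged() {
  local log=$1
  shift
  "$@" 2>&1 | tee "$log"
  # PIPESTATUS is overwritten by the next command, so read it immediately.
  return "${PIPESTATUS[0]}"
}
```

For example, `run_logged run.log false` returns 1 even though `tee` itself succeeded.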

Sequence Diagram

sequenceDiagram
    participant User
    participant DataFill as spark_data_fill.sh
    participant Wrapper as run_spark_bench.sh
    participant EnvCapture as bench_capture_environment.sh
    participant Bench as BenchBinary
    participant CSV

    User->>DataFill: run with backends
    DataFill->>DataFill: preflight hugepages, MAC, carrier
    loop each backend x mode
        DataFill->>DataFill: clean_orphan_hugepages
        DataFill->>Wrapper: backend mode
        Wrapper->>EnvCapture: capture env state
        loop each cell payload x batch x target_gbps
            Wrapper->>Wrapper: snapshot udp/cpu/dmon
            Wrapper->>Bench: execute with generated YAML
            Bench-->>Wrapper: stdout stats + stderr drops
            alt cell succeeded
                Wrapper->>CSV: append row
            else cell failed
                Wrapper->>Wrapper: FAILURES++
            end
        end
        Wrapper-->>DataFill: exit status
        DataFill->>DataFill: clean_orphan_hugepages
    end
    DataFill-->>User: summary and result dirs

Reviews (2): Last reviewed commit: "#15 - Add DGX Spark sweep tooling and RD..."

greptile-apps · 2026-06-03T17:14:31Z

+    BASE_YAML="$SCRIPT_DIR/daqiri_bench_socket_tcp_tx_rx.yaml"
+    BENCH_BIN="$BUILD_DIR/examples/daqiri_bench_socket"
+    CPU_MASTER=8; CPU_TX=17; CPU_RX=18
+    ;;
+  *) echo "Unknown backend: $BACKEND" >&2; exit 1 ;;
+esac


Socket-udp BATCHES_SWEEP never varies the config

BATCHES_SWEEP=(256 32 1) is declared for socket-udp, but generate_yaml for both socket-udp and socket-tcp only substitutes message_size — the queue-level batch_size: 1 in daqiri_bench_socket_udp_tx_rx.yaml is never updated. Every element of BATCHES_SWEEP therefore produces an identical YAML, so the CSV will contain three rows per payload with different batch values but indistinguishable bench configs and throughput numbers. Either add a batch_size substitution leg to generate_yaml for socket backends (matching the DPDK -e "s|^( *batch_size: ).*|\1$batch|" pattern), or collapse BATCHES_SWEEP=(1) for socket-udp to make the sweep intent explicit.
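
The first option could look roughly like the sketch below. It is only an illustration: `generate_socket_yaml` is a hypothetical stand-in for the socket leg of generate_yaml, the `message_size` expression is an assumed form, and only the `batch_size` expression mirrors the DPDK pattern quoted above. It also assumes `batch_size:` sits on its own line preceded only by spaces.

```bash
#!/usr/bin/env bash
# Sketch only: generate_socket_yaml is a hypothetical helper, not the
# generate_yaml function from run_spark_bench.sh.
# Rewrites both message_size and batch_size so that each BATCHES_SWEEP
# element produces a distinct YAML.
generate_socket_yaml() {
  local base=$1 out=$2 payload=$3 batch=$4
  sed -E \
    -e "s|^( *message_size: ).*|\1$payload|" \
    -e "s|^( *batch_size: ).*|\1$batch|" \
    "$base" > "$out"
}
```

With this in place, diffing the generated YAML for two batch values should show a change on the `batch_size` line, which is a quick check that the sweep dimension is real.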

greptile-apps · 2026-06-03T17:14:32Z

+  // Matches the per-MR num_bufs in the YAML configs. Higher values deadlock
+  // the bench: post_req blocks in get_tx_packet_burst when the pool is empty,
+  // but free_tx_burst (which refills it) only runs later in the same loop
+  // iteration via get_rx_burst. Until the loop is refactored to interleave
+  // drain with post, this constant must stay <= num_bufs.
+  static constexpr int kMaxOutstanding = 20;


Doc-sync gap: examples/rdma_bench.cpp change not reflected in docs

Per the project doc-sync rule, any change to examples/*.cpp requires updating docs/tutorials/benchmarking_examples.md, docs/tutorials/configuration-walkthrough.md, and the benchmark table in AGENTS.md in the same PR. The new kMaxOutstanding value and its deadlock constraint are meaningful to anyone tuning RDMA buffer depth, and run_spark_bench.sh adds a new benchmark entry-point that both docs should mention. None of those three files are touched in this PR.
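
Because the constant must stay at or below num_bufs, a small guard could catch drift between the source and a YAML config. The sketch below is an assumption, not part of this PR: `check_outstanding` is a hypothetical helper, and it assumes the constexpr form shown in the diff above and `num_bufs: <int>` lines in the YAML.

```bash
#!/usr/bin/env bash
# Sketch only: check_outstanding is hypothetical and not part of the repo.
# Succeeds when kMaxOutstanding in SRC is <= the smallest num_bufs in YAML.
check_outstanding() {
  local src=$1 yaml=$2 depth min_bufs
  depth=$(sed -n 's/.*kMaxOutstanding *= *\([0-9][0-9]*\);.*/\1/p' "$src" | head -n 1)
  min_bufs=$(sed -n 's/^[ -]*num_bufs: *\([0-9][0-9]*\).*/\1/p' "$yaml" | sort -n | head -n 1)
  if [ -z "$depth" ] || [ -z "$min_bufs" ]; then
    echo "could not parse kMaxOutstanding or num_bufs" >&2
    return 2
  fi
  [ "$depth" -le "$min_bufs" ]
}
```

Comparing against the smallest num_bufs is the conservative choice when a config declares several memory regions.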

Rule Used: DAQIRI has no automated doc-sync gate beyond mkdoc... (source)


Adds Spark-focused benchmark sweep wrappers and host RDMA loopback setup now that the reusable bench infrastructure is on main.

- examples/run_spark_bench.sh: Spark-tuned sweep driver with per-backend payload/batch matrices, CPU pins, drop-source dispatch, and one CSV row per successful cell into bench-results/. Bench failures or missing completion stats now keep artifacts but return nonzero instead of producing false zero rows.
- scripts/spark_data_fill.sh: one-shot driver for the DPDK / socket-UDP / socket-TCP bench matrix, with hugepage pre-flight, orphan-hugepage cleanup between runs, and aggregate failure propagation.
- scripts/setup_spark_rdma_loopback.sh: idempotent Spark host prereq for the p0-to-p1 RoCE loopback. Defaults match the Spark profile and MACs are read from sysfs unless explicitly overridden.
- examples/rdma_bench.cpp: raise kMaxOutstanding from 5 to 20 to match the Spark RDMA YAML buffer depth and improve small-payload throughput without exceeding num_bufs.
- .gitignore: ignore generated Spark bench artifacts.

Includes fixes for two parsing bugs Greptile flagged on the original draft of run_spark_bench.sh: /proc/net/udp drops are decimal, and socket bench uses sent_packets / sent_bytes rather than RDMA send_completions / send_bytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: rgurunathan <rgurunathan@nvidia.com>
Signed-off-by: Denis Leshchev <dleshchev@nvidia.com>
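
The first parsing fix named in the commit message can be illustrated with a short sketch. In /proc/net/udp the drops counter is the last column and is printed in decimal, while the address and queue columns are hex, so it should be summed as plain integers. `sum_udp_drops` is a hypothetical helper, not the parser in run_spark_bench.sh.

```bash
#!/usr/bin/env bash
# Sketch only: sum_udp_drops is a hypothetical helper name.
# Sums the last column ("drops") of a /proc/net/udp-style table, treating
# it as decimal. The first line of the table is a header and is skipped.
sum_udp_drops() {
  local table=${1:-/proc/net/udp}
  awk 'NR > 1 { total += $NF } END { print total + 0 }' "$table"
}
```

Taking the difference of two such sums, one before and one after a bench cell, gives the drops attributable to that cell across all UDP sockets; as the review notes, the file offers no per-interface filtering.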

greptile-apps Bot reviewed Jun 3, 2026


dleshchev force-pushed the review/pr-97-spark-tooling branch from ec36695 to 9837f99 Compare June 3, 2026 17:19

dleshchev merged commit 4cd7514 into main Jun 3, 2026
1 check passed

dleshchev deleted the review/pr-97-spark-tooling branch June 3, 2026 17:28

This was referenced Jun 3, 2026

#15 - Reapply RDMA benchmark profiling updates #118

Merged

#15 - Add DGX Spark sweep tooling and RDMA loopback prereqs #97

Merged

dleshchev merged 1 commit into main from review/pr-97-spark-tooling
