#15 - Add DGX Spark sweep tooling and RDMA loopback prereqs #116
| Filename | Overview |
|---|---|
| .gitignore | Adds bench-results/ output directory and pcie_schematic.png artifact to gitignore; clean and correct. |
| examples/rdma_bench.cpp | Raises kMaxOutstanding from 5 to 20 to match the YAML num_bufs, with an inline comment documenting the pool-drain constraint and the known structural limitation (interleaving drain/post is deferred). |
| examples/run_spark_bench.sh | New Spark sweep wrapper; failure cells correctly return nonzero instead of emitting false zero rows. Header comment for RX_IFACE is misleading — the variable is unused in this script and /proc/net/udp provides no per-interface filtering. |
| scripts/setup_spark_rdma_loopback.sh | New idempotent RDMA loopback setup script; correctly flushes routes/rules before re-adding, reads MACs from sysfs with env overrides, and uses per-port routing tables. |
| scripts/spark_data_fill.sh | One-shot data-fill driver with pre-flight hugepage checks, orphan-hugepage cleanup between runs, and correct PIPESTATUS capture for pipeline exit-code propagation. |
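The PIPESTATUS capture noted for spark_data_fill.sh matters because a pipeline's exit status is normally that of its last command, so piping a bench through `tee` would hide a failing run. The following is a minimal sketch of the idea, not the script's actual code; `run_and_log` and its arguments are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command, mirror its output into a log file,
# and return the command's own exit status rather than tee's.
run_and_log() {
  local log_file=$1 status
  shift
  "$@" 2>&1 | tee "$log_file"
  # PIPESTATUS must be read immediately; the next command overwrites it.
  status=${PIPESTATUS[0]}
  return "$status"
}
```

A caller can then aggregate failures with something like `run_and_log run.log ./bench || failures=$((failures + 1))`. `set -o pipefail` is an alternative, but it reports only that some stage failed, not which one.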
Sequence Diagram
sequenceDiagram
    participant User
    participant DataFill as spark_data_fill.sh
    participant Wrapper as run_spark_bench.sh
    participant EnvCapture as bench_capture_environment.sh
    participant Bench as BenchBinary
    participant CSV
    User->>DataFill: run with backends
    DataFill->>DataFill: preflight hugepages, MAC, carrier
    loop each backend x mode
        DataFill->>DataFill: clean_orphan_hugepages
        DataFill->>Wrapper: backend mode
        Wrapper->>EnvCapture: capture env state
        loop each cell payload x batch x target_gbps
            Wrapper->>Wrapper: snapshot udp/cpu/dmon
            Wrapper->>Bench: execute with generated YAML
            Bench-->>Wrapper: stdout stats + stderr drops
            alt cell succeeded
                Wrapper->>CSV: append row
            else cell failed
                Wrapper->>Wrapper: FAILURES++
            end
        end
        Wrapper-->>DataFill: exit status
        DataFill->>DataFill: clean_orphan_hugepages
    end
    DataFill-->>User: summary and result dirs
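The inner loop of the diagram (append a row on success, count a failure otherwise, exit nonzero if anything failed) can be sketched roughly as below. This is an illustration under stated assumptions, not the wrapper's real code: `run_cell`, `sweep`, the `sent_packets=<n>` output format, and the fixed payload/batch lists are all hypothetical, and the `target_gbps` dimension is left out for brevity.

```bash
#!/usr/bin/env bash
# Hypothetical per-cell runner: a CSV row is appended only when the bench
# exits 0 AND prints a completion stat; otherwise the cell is a failure.
run_cell() {
  local csv=$1 payload=$2 batch=$3
  shift 3
  local out packets
  if ! out=$("$@" 2>/dev/null); then
    return 1    # the bench itself failed
  fi
  packets=$(printf '%s\n' "$out" |
    sed -n 's/^sent_packets=\([0-9][0-9]*\)$/\1/p' | head -n 1)
  if [ -z "$packets" ]; then
    return 1    # missing completion stat: do not emit a false zero row
  fi
  printf '%s,%s,%s\n' "$payload" "$batch" "$packets" >> "$csv"
}

# Hypothetical sweep driver: the remaining arguments are the bench command,
# which receives the payload and batch values as its last two arguments.
sweep() {
  local csv=$1
  shift
  local failures=0 payload batch
  for payload in 64 1024; do
    for batch in 1 32; do
      run_cell "$csv" "$payload" "$batch" "$@" "$payload" "$batch" ||
        failures=$((failures + 1))
    done
  done
  # Nonzero exit when any cell failed, so the caller can propagate it.
  [ "$failures" -eq 0 ]
}
```

Treating a missing stat the same as a nonzero exit is what keeps a crashed or truncated run from showing up in the CSV as a legitimate zero-throughput measurement.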
Reviews (2): Last reviewed commit: "#15 - Add DGX Spark sweep tooling and RD..."
        BASE_YAML="$SCRIPT_DIR/daqiri_bench_socket_tcp_tx_rx.yaml"
        BENCH_BIN="$BUILD_DIR/examples/daqiri_bench_socket"
        CPU_MASTER=8; CPU_TX=17; CPU_RX=18
        ;;
      *) echo "Unknown backend: $BACKEND" >&2; exit 1 ;;
    esac
Socket-udp BATCHES_SWEEP never varies the config
BATCHES_SWEEP=(256 32 1) is declared for socket-udp, but generate_yaml for both socket-udp and socket-tcp only substitutes message_size — the queue-level batch_size: 1 in daqiri_bench_socket_udp_tx_rx.yaml is never updated. Every element of BATCHES_SWEEP therefore produces an identical YAML, so the CSV will contain three rows per payload with different batch values but indistinguishable bench configs and throughput numbers. Either add a batch_size substitution leg to generate_yaml for socket backends (matching the DPDK -e "s|^( *batch_size: ).*|\1$batch|" pattern), or collapse BATCHES_SWEEP=(1) for socket-udp to make the sweep intent explicit.
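The first suggested fix could look roughly like the sketch below, which applies the quoted DPDK-style `batch_size` substitution alongside the existing `message_size` one. `generate_socket_yaml` is a hypothetical stand-in, since the real `generate_yaml` body is not shown here, and the YAML layout (each key on its own indented line) is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the socket leg of generate_yaml: rewrite both
# message_size and batch_size so each sweep cell yields a distinct config.
generate_socket_yaml() {
  local base_yaml=$1 out_yaml=$2 payload=$3 batch=$4
  sed -E \
    -e "s|^( *message_size: ).*|\1$payload|" \
    -e "s|^( *batch_size: ).*|\1$batch|" \
    "$base_yaml" > "$out_yaml"
}
```

For example, `generate_socket_yaml base.yaml cell.yaml 1024 32` would leave every other line of the base config untouched. Note that the `^( *batch_size: )` pattern only matches keys preceded by spaces alone; a key written as the first entry of a list item (`- batch_size: 1`) would be skipped silently, so it is worth diffing the generated YAML against the base once to confirm the substitution took effect.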
    // Matches the per-MR num_bufs in the YAML configs. Higher values deadlock
    // the bench: post_req blocks in get_tx_packet_burst when the pool is empty,
    // but free_tx_burst (which refills it) only runs later in the same loop
    // iteration via get_rx_burst. Until the loop is refactored to interleave
    // drain with post, this constant must stay <= num_bufs.
    static constexpr int kMaxOutstanding = 20;
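Because the `kMaxOutstanding <= num_bufs` constraint spans a C++ constant and a YAML value, one way to keep it from drifting is a small pre-flight check in the sweep tooling. The sketch below is purely illustrative: `check_outstanding_depth` is a hypothetical helper, and it assumes the constant is declared on a single line and that `num_bufs` appears as a plain `num_bufs: <n>` key.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: fail if kMaxOutstanding in the bench source
# exceeds the smallest num_bufs found in a YAML config.
check_outstanding_depth() {
  local cpp_file=$1 yaml_file=$2 max_outstanding min_bufs
  max_outstanding=$(sed -n -E \
    's/.*kMaxOutstanding *= *([0-9]+);.*/\1/p' "$cpp_file" | head -n 1)
  min_bufs=$(sed -n -E \
    's/^ *(- )?num_bufs: *([0-9]+).*/\2/p' "$yaml_file" | sort -n | head -n 1)
  if [ -z "$max_outstanding" ] || [ -z "$min_bufs" ]; then
    echo "could not parse kMaxOutstanding or num_bufs" >&2
    return 2
  fi
  if [ "$max_outstanding" -gt "$min_bufs" ]; then
    echo "kMaxOutstanding=$max_outstanding exceeds num_bufs=$min_bufs" >&2
    return 1
  fi
}
```

Comparing against the smallest `num_bufs` is the conservative choice when a config declares several memory regions, since the comment ties the deadlock to the pool running empty.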
Doc-sync gap: examples/rdma_bench.cpp change not reflected in docs
Per the project doc-sync rule, any change to examples/*.cpp requires updating docs/tutorials/benchmarking_examples.md, docs/tutorials/configuration-walkthrough.md, and the benchmark table in AGENTS.md in the same PR. The new kMaxOutstanding value and its deadlock constraint are meaningful to anyone tuning RDMA buffer depth, and run_spark_bench.sh adds a new benchmark entry-point that both docs should mention. None of those three files are touched in this PR.
Rule Used: DAQIRI has no automated doc-sync gate beyond mkdoc... (source)
Adds Spark-focused benchmark sweep wrappers and host RDMA loopback setup now that the reusable bench infrastructure is on main.

- examples/run_spark_bench.sh: Spark-tuned sweep driver with per-backend payload/batch matrices, CPU pins, drop-source dispatch, and one CSV row per successful cell into bench-results/. Bench failures or missing completion stats now keep artifacts but return nonzero instead of producing false zero rows.
- scripts/spark_data_fill.sh: one-shot driver for the DPDK / socket-UDP / socket-TCP bench matrix, with hugepage pre-flight, orphan-hugepage cleanup between runs, and aggregate failure propagation.
- scripts/setup_spark_rdma_loopback.sh: idempotent Spark host prereq for the p0-to-p1 RoCE loopback. Defaults match the Spark profile and MACs are read from sysfs unless explicitly overridden.
- examples/rdma_bench.cpp: raise kMaxOutstanding from 5 to 20 to match the Spark RDMA YAML buffer depth and improve small-payload throughput without exceeding num_bufs.
- .gitignore: ignore generated Spark bench artifacts.

Includes fixes for two parsing bugs Greptile flagged on the original draft of run_spark_bench.sh: /proc/net/udp drops are decimal, and socket bench uses sent_packets / sent_bytes rather than RDMA send_completions / send_bytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: rgurunathan <rgurunathan@nvidia.com>
Signed-off-by: Denis Leshchev <dleshchev@nvidia.com>
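The first parsing fix concerns a mixed-radix detail of /proc/net/udp: the addresses and ports are hexadecimal, while the trailing drops column is decimal. A minimal sketch of reading it that way follows; `udp_drops_for_port` is a hypothetical helper for illustration, not the function used in run_spark_bench.sh.

```bash
#!/usr/bin/env bash
# Sum the drops column of a /proc/net/udp-style table for one local port.
# The port inside local_address is hexadecimal; the drops field is decimal.
udp_drops_for_port() {
  local port_dec=$1 table=${2:-/proc/net/udp} port_hex
  port_hex=$(printf '%04X' "$port_dec")
  awk -v port="$port_hex" '
    NR > 1 {
      split($2, addr, ":")    # local_address is HEXIP:HEXPORT
      if (toupper(addr[2]) == port) total += $NF    # last field, base 10
    }
    END { print total + 0 }
  ' "$table"
}
```

Calling `udp_drops_for_port 8080` sums drops across every socket bound to that port. Since the table carries no interface column, this cannot be narrowed to a single NIC, which is consistent with the review note about RX_IFACE.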
ec36695 to 9837f99
Rebuilds the Spark tooling stack item as a clean follow-up after PR #96 landed on main.

Changes:
- Adds the Spark benchmark sweep wrapper and data-fill driver.
- Adds the Spark RDMA loopback setup helper with current p0-to-p1 defaults.
- Raises RDMA bench outstanding depth to match the YAML buffer depth.
- Keeps failed/malformed bench cells from being emitted as successful zero rows.

Validation:
- bash -n examples/run_spark_bench.sh
- bash -n scripts/spark_data_fill.sh
- bash -n scripts/setup_spark_rdma_loopback.sh
- python3 scripts/check_doc_refs.py
- git diff --check origin/main...HEAD

Local hardware benchmark/build not run in this shell; the repo requires compile/run inside the project container with the Spark NIC/GPU environment.