Docs updates #114

Merged
dleshchev merged 10 commits into main from cburdick/daqiri-docs-graphics on Jun 4, 2026
Conversation

@cliffburdick
Collaborator

  • Reworked landing page
  • Added PCIe as coming soon
  • Added new landing picture
  • Added decision tree

@greptile-apps

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR restructures the DAQIRI documentation site: benchmarking content is reorganized under a new top-level docs/benchmarks/ section (overview, socket/RDMA, and raw Ethernet pages), the landing page is reworked with a new hero layout and graphic overlay, and PCIe is surfaced as a coming-soon stream type. A small C++ change in the socket manager also ships with the PR.

  • Docs restructure: docs/tutorials/benchmarking_examples.md is split into benchmarks.md, socket_benchmarking.md, and raw_benchmarking.md; mkdocs.yml, docs/index.html, README.md, AGENTS.md, and all internal cross-links are updated to match.
  • Landing page: Hero section reworked with a two-column grid, new daqiri-landing-graphic.svg, and a CSS/JS lightbox overlay; active-link scrollspy added to the navbar.
  • Socket manager: Adds a UDP payload-size guard (kMaxUdpPayloadBytes = 65507) and an early-return reuse check in socket_connect_to_server to avoid recreating an already-running TCP connection.
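For context on the 65507 figure: it is the largest payload that fits in a single UDP datagram over IPv4 without IP options (65535 bytes total length, minus a 20-byte IPv4 header and an 8-byte UDP header). The sketch below shows one way such a guard could look. Only the constant name `kMaxUdpPayloadBytes` and its value come from the summary; the helper `udp_payload_fits` and the other constants are hypothetical and are not taken from `daqiri_socket_mgr.cpp`.

```cpp
#include <cstddef>

// Derivation of the limit quoted in the summary (IPv4, no IP options).
constexpr std::size_t kMaxIpv4TotalLength = 65535;  // 16-bit total-length field
constexpr std::size_t kIpv4HeaderBytes = 20;        // minimum IPv4 header
constexpr std::size_t kUdpHeaderBytes = 8;          // fixed UDP header
constexpr std::size_t kMaxUdpPayloadBytes =
    kMaxIpv4TotalLength - kIpv4HeaderBytes - kUdpHeaderBytes;
static_assert(kMaxUdpPayloadBytes == 65507, "matches the value in the summary");

// Hypothetical helper: true if the payload fits in one UDP datagram.
// A send path could reject the burst early when this returns false
// instead of letting the send call fail later.
constexpr bool udp_payload_fits(std::size_t payload_bytes) {
  return payload_bytes <= kMaxUdpPayloadBytes;
}
```

Checking the size before the send call lets the caller report a clear error for an oversized payload rather than relying on the operating system's error code.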

Confidence Score: 5/5

Safe to merge; all changes are documentation and a small defensive socket manager guard with no user-visible regression risk.

The code change is a narrow defensive guard (UDP payload size check and TCP connection reuse early-return) that improves error handling without altering the happy path. The docs reorganization is consistent across mkdocs.yml, index.html, README, AGENTS.md, and all internal cross-links. The only issues found are cosmetic: a skipped tutorial number (08→10) on the landing page and one commit title missing the required issue-number prefix.

docs/index.html (tutorial number gap 08→10) and docs/benchmarks/raw_benchmarking.md (RDMA tuning section belongs in socket_benchmarking.md).

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/index.html | Reworked hero section with new graphic, overlay lightbox, and active-link scrollspy; tutorial grid renumbered but skips 09 (jumps 08→10). |
| docs/benchmarks/benchmarks.md | New overview page with backend decision table and common workflow guide; PCIe noted as coming soon. |
| docs/benchmarks/socket_benchmarking.md | New page covering TCP/UDP and RoCE/RDMA benchmarks with namespace isolation setup; content is accurate and well-structured. |
| docs/benchmarks/raw_benchmarking.md | Moved and expanded from benchmarking_examples.md; contains an RDMA-specific tuning section that conceptually belongs in socket_benchmarking.md. |
| mkdocs.yml | Nav restructured with Benchmarking as a top-level section containing three sub-pages; docs/index.html updated in sync. |
| src/managers/socket/daqiri_socket_mgr.cpp | Adds UDP payload-size guard (kMaxUdpPayloadBytes), TCP running-state check in send_tx_burst, and early-return reuse guard in socket_connect_to_server; stale-connection cleanup deferred per known design decision. |
| README.md | Added Benchmarking section with namespace-based socket and RDMA examples; updated documentation table links to new paths. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[benchmarks.md<br/>Overview + Decision Tree] --> B[socket_benchmarking.md<br/>TCP / UDP / RoCE]
    A --> C[raw_benchmarking.md<br/>DPDK / Raw Ethernet]
    B --> C
    C --> D[configuration-walkthrough.md]
    E[index.html Landing Page] --> A
    E --> B
    E --> C
    F[mkdocs.yml nav] --> A
    F --> B
    F --> C

Reviews (6): Last reviewed commit: "#15 - Remove generated PCIe schematic ar..." | Re-trigger Greptile

Comment on lines 1286 to 1289

conn->running.store(false);
close_fd(conn->fd);

std::lock_guard<std::mutex> lock(state_mutex_);
connections_.erase(conn->conn_id);
}

P2 Stale connections accumulate in connections_ map

Removing the connections_.erase(conn->conn_id) call from tcp_rx_loop means closed connections are never removed from the map. The new early-return guard in socket_connect_to_server correctly detects a live vs. stale entry via running.load(), but the stale shared_ptr<ConnectionState> object and its associated resources stay allocated for the lifetime of SocketMgr. In a benchmark or production process that cycles connections (network drops, repeated runs), each reconnect leaves a dead entry behind. Cleanup could be deferred to the point where the stale entry is detected in socket_connect_to_server, rather than on the loop-exit thread where the original race existed.
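The deferred cleanup suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `SocketMgr` code: `ConnectionState` is reduced to the two fields named in this comment, and `ConnectionTable` with its `connect` method is a hypothetical stand-in for the lookup done in `socket_connect_to_server`.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical, simplified stand-in for the real ConnectionState.
struct ConnectionState {
  std::uint32_t conn_id{0};
  std::atomic<bool> running{false};
};

// Hypothetical table modeling the connections_ map and state_mutex_.
class ConnectionTable {
 public:
  // Returns the live connection for conn_id, or creates a fresh one.
  // A stale entry (running == false) is erased here, on the connect path,
  // rather than on the receive-loop exit thread.
  std::shared_ptr<ConnectionState> connect(std::uint32_t conn_id) {
    std::lock_guard<std::mutex> lock(state_mutex_);
    auto it = connections_.find(conn_id);
    if (it != connections_.end()) {
      if (it->second->running.load()) {
        return it->second;  // early return: reuse the running connection
      }
      connections_.erase(it);  // lazy cleanup of the stale entry
    }
    auto conn = std::make_shared<ConnectionState>();
    conn->conn_id = conn_id;
    conn->running.store(true);
    connections_.emplace(conn_id, conn);
    return conn;
  }

  // Number of entries currently held, live or stale.
  std::size_t size() const {
    std::lock_guard<std::mutex> lock(state_mutex_);
    return connections_.size();
  }

 private:
  mutable std::mutex state_mutex_;
  std::unordered_map<std::uint32_t, std::shared_ptr<ConnectionState>> connections_;
};
```

With this shape, a reconnect on the same ID replaces the dead entry, so the map does not grow across connection cycles. Entries whose ID is never reused would still linger, and a real implementation would also need to make sure the old receive thread has exited and its file descriptor is closed before the entry is dropped.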

@cliffburdick cliffburdick force-pushed the cburdick/daqiri-docs-graphics branch from c3453f2 to 95b78f9 Compare June 3, 2026 15:16
@RamyaGuru
Collaborator

Looks like this PR does not depend on PR #98. I favor merging this in before #98 because I can put the Spark performance report in the new Benchmarking section. I'm making some slight updates to the nav tree because it currently is hard to get back to the Benchmarks nav entry point if you click on the wrong thing. Will try to push those changes soon.

@RamyaGuru
Collaborator

I made a new docs/benchmarks folder to organize all the related .md files there and fix minor issues with the nav tree. Also merged in the latest from main. Checked everything with `mkdocs serve`. I'm good to merge this into main if there are no concerns!

cliffburdick and others added 8 commits June 3, 2026 12:37
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
top-level "Benchmarking" nav section instead of being split between a
single top-level Benchmarks link and a Tutorials > Benchmarking submenu.

- Move docs/tutorials/{benchmarking,socket_benchmarking,benchmarking_examples}.md
  into docs/benchmarks/, renaming benchmarking.md to benchmarks.md and
  benchmarking_examples.md to raw_benchmarking.md.
- Restructure mkdocs.yml nav so Benchmarking is a top-level section with
  Overview, Socket and RDMA Benchmarking, and Raw Ethernet Benchmarking
  entries; drop the duplicate Tutorials > Benchmarking submenu.
- Drop the hide: navigation frontmatter from the raw Ethernet page so it
  inherits the new section sidebar.
- Update cross-references and link paths in docs/index.html, README.md,
  AGENTS.md, getting-started.md, configuration-walkthrough.md,
  system_configuration.md, .claude/rules/docs-sync.md, and
  .greptile/rules.md to the new locations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ramya Gurunathan <rgurunathan@nvidia.com>
The bare-metal CMake build tutorial and Greptile doc-sync rule both
reference the old docs/tutorials/benchmarking_examples.md path. Update
to the renamed docs/benchmarks/raw_benchmarking.md and adjust link text
to match the new "Raw Ethernet Benchmarking" page title.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ramya Gurunathan <rgurunathan@nvidia.com>
The landing-page tutorials grid was overcrowded after the docs reorg;
the Benchmarking Overview tile largely duplicates the new top-level
Benchmarking nav entry. Renumber subsequent tiles 06-09 down to 05-08.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ramya Gurunathan <rgurunathan@nvidia.com>
Add the bare-metal tutorial (introduced by #95, brought into this
branch by merging origin/main) to the two hand-mirrored nav lists that
are not generated from mkdocs.yml:

- docs/index.html — landing-page Tutorials hover dropdown
- docs/javascripts/tab-dropdowns.js — top-tab dropdown rendered on
  every docs page

Without this the entry only appears in the mkdocs Material sidebar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ramya Gurunathan <rgurunathan@nvidia.com>
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
@cliffburdick cliffburdick force-pushed the cburdick/daqiri-docs-graphics branch from 2d23793 to 4ed1f5a Compare June 3, 2026 19:39
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
@dleshchev
Collaborator

Can we make the top links consistent between the index page and the "docs"?
The main/index page has the following options:
Features
Quick Start
Concepts
Benchmarking
Tutorials
API Reference

When clicking on "benchmarks", which leads to the docs (.../daqiri/tutorials/ and others), the options are:
Getting Started
Concepts
Benchmarks
API Reference
Tutorials

@dleshchev
Collaborator

Pages in the docs (.../daqiri/tutorials/benchmarking_examples/) do not have a side menu/navigation, but .../daqiri/tutorials/benchmarking/ does. I like the side menu; should we put it everywhere?

@dleshchev
Collaborator

.../daqiri/tutorials/benchmarking/ is not accessible/visible via the top links from any of the other pages, e.g. .../daqiri/tutorials/benchmarking_examples/

Collaborator

@dleshchev left a comment

Some navigation issues to be addressed.

Comment thread docs/benchmarks/benchmarks.md
Comment thread docs/benchmarks/socket_benchmarking.md
Comment thread README.md
- [Benchmarking Examples](https://nvidia.github.io/daqiri/tutorials/benchmarking_examples/) — run `daqiri_bench_raw_gpudirect` with a loopback test
- [Benchmarking Overview](https://nvidia.github.io/daqiri/benchmarks/benchmarks/) — choose between Linux sockets, RoCE/RDMA, and raw Ethernet benchmarks
- [Socket and RDMA Benchmarking](https://nvidia.github.io/daqiri/benchmarks/socket_benchmarking/) — run TCP/UDP sockets and RoCE/RDMA with matching namespace isolation
- [Raw Ethernet Benchmarking](https://nvidia.github.io/daqiri/benchmarks/raw_benchmarking/) — run `daqiri_bench_raw_gpudirect` with a physical loopback test
Collaborator

I am not sure I can reach that page

Collaborator Author

I can get it from my local copy

Comment thread mkdocs.yml
- Overview: benchmarks/benchmarks.md
- Socket and RDMA Benchmarking: benchmarks/socket_benchmarking.md
- Raw Ethernet Benchmarking: benchmarks/raw_benchmarking.md
- API Reference:
Collaborator

Should these pages generate a side menu for the API Guide, configuration YAML reference, and C++ API usage? Also, the Python API usage link is not visible from the top dropdown menu and only exists in the C++ API usage page.

Collaborator Author

Let's talk on Slack since I'm not really sure what you mean.

Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
@dleshchev dleshchev merged commit d217271 into main Jun 4, 2026
3 checks passed
dleshchev added a commit that referenced this pull request Jun 4, 2026
Resolve conflicts from the docs PR #114 restructure: keep the #113 cross-host
DGX-Spark bullets and benchmarking section, but repoint their links to the
relocated docs/benchmarks/raw_benchmarking.md (single-host RDMA now lives in
socket_benchmarking.md). Also fix the cross-host section's system_configuration.md
link for its new docs/benchmarks/ location.

check_doc_refs.py, mkdocs build --strict, and check_html_links.py all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Denis Leshchev <dleshchev@nvidia.com>
3 participants