Use UID-based graph JSON v2 for repro extraction#280
Conversation
4e4d199 to
83fcc7b
Compare
|
@cudnn-ci-bot run |
|
🚀 Running mirror pipeline Branch: cudnn-gh/pr-280-2347b21 |
|
@CodeRabbit review |
✅ Action performedReview finished.
|
📝 WalkthroughWalkthroughThis PR refactors the cuDNN frontend library and repro tool to replace graph_uid with gid identifiers and migrate tensor representation from dict-keyed to tid-indexed lists. Core library changes introduce per-tensor IDs, update JSON serialization format, and modify graph key/logging. The repro tool is comprehensively refactored to support the new tensor list API, ragged detection, and gid-based log parsing with tensor dump injection. ChangesCore Graph and Tensor ID Infrastructure
Repro Tool Tensor Lookup and Payload Validation
Test Infrastructure and Payload Updates
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
tools/cudnn_repro/tests/test_cudnn_repro_fp8.py (1)
179-186: ⚡ Quick winReplace lambda assignment with a
defstatement.Lambda assignments reduce readability and are flagged by Ruff E731. Use a named function instead.
♻️ Refactor to use def
- payload = lambda tag: {"json_version": "2.0", "gid": 1, "nodes": [{"tag": tag}], "tensors": []} - - assert operations.detect_operation_key(payload("SDPA_FWD")) == "sdpa_fwd" - assert operations.detect_operation_key(payload("SDPA_BWD")) == "sdpa_bwd" - assert operations.detect_operation_key(payload("SDPA_FP8_FWD")) == "sdpa_fp8_fwd" - assert operations.detect_operation_key(payload("SDPA_FP8_BWD")) == "sdpa_fp8_bwd" - assert operations.detect_operation_key(payload("SDPA_MXFP8_FWD")) == "sdpa_fp8_fwd" - assert operations.detect_operation_key(payload("SDPA_MXFP8_BWD")) == "sdpa_fp8_bwd" + def payload(tag): + return {"json_version": "2.0", "gid": 1, "nodes": [{"tag": tag}], "tensors": []} + + assert operations.detect_operation_key(payload("SDPA_FWD")) == "sdpa_fwd" + assert operations.detect_operation_key(payload("SDPA_BWD")) == "sdpa_bwd" + assert operations.detect_operation_key(payload("SDPA_FP8_FWD")) == "sdpa_fp8_fwd" + assert operations.detect_operation_key(payload("SDPA_FP8_BWD")) == "sdpa_fp8_bwd" + assert operations.detect_operation_key(payload("SDPA_MXFP8_FWD")) == "sdpa_fp8_fwd" + assert operations.detect_operation_key(payload("SDPA_MXFP8_BWD")) == "sdpa_fp8_bwd"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/cudnn_repro/tests/test_cudnn_repro_fp8.py` around lines 179 - 186, The test defines payload as a lambda (payload = lambda tag: {...}) which triggers Ruff E731; replace it with a normal named function (def payload(tag): return {...}) keeping the same body and usages so the subsequent asserts calling payload("SDPA_FWD") etc. and operations.detect_operation_key remain unchanged; update only the payload definition in tests/test_cudnn_repro_fp8.py to remove the lambda and use def to satisfy the linter.tools/cudnn_repro/cudnn_repro/log_parser.py (1)
36-41: 💤 Low valueMinor type hint style inconsistency.
This function uses lowercase
tupleandlistfor generics, while_parse_context_entryat line 25 and_apply_tensor_dumpsat line 43 useTuplefrom typing. Consider using a consistent style throughout the module—either the importedTuple/Listor the built-in lowercase forms.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/cudnn_repro/cudnn_repro/log_parser.py` around lines 36 - 41, The return type annotation in _parse_tensor_dump currently mixes built-in generics (tuple[int, list[int]] | None) while other functions like _parse_context_entry and _apply_tensor_dumps use typing.Tuple/typing.List; make the module consistent by changing _parse_tensor_dump's signature to use the same typing forms (e.g., Tuple[int, List[int]] | None) or conversely update the other functions to built-in generics—also update/ensure the appropriate imports (Tuple, List) are present or remove them if switching all to built-ins so all functions (_parse_tensor_dump, _parse_context_entry, _apply_tensor_dumps) use the same style.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/cudnn_repro/cudnn_repro/log_parser.py`:
- Around line 79-86: The tensor-dump accumulator current_dumps should be reset
whenever an execution marker is seen to avoid misattributing dumps; in the
EXECUTE_GRAPH_PATTERN handling block (where gid = int(match.group(1)),
current_entry = graph_entries_by_gid.get(gid) and
execution_entries.append(_apply_tensor_dumps(current_entry, current_dumps)) is
called) move or add the statement current_dumps = {} so it executes
unconditionally when a new execution marker is encountered (i.e., outside the
inner if current_entry is not None), ensuring dumps are cleared regardless of
whether graph_entries_by_gid returned None.
In `@tools/cudnn_repro/cudnn_repro/sdpa_fp8_bwd.py`:
- Around line 144-152: The code returns ragged offsets using
inputs.get("RAGGED_OFFSET_Q") and inputs.get("RAGGED_OFFSET_KV") but misses the
plural key fallback; update the ragged_offset_q and ragged_offset_kv expressions
to try the singular then fall back to the plural keys (e.g. use
inputs.get("RAGGED_OFFSET_Q") or inputs.get("RAGGED_OFFSETS_Q") and similarly
for KV) before passing to utils.tensor_entry/utils.seq_len so behavior matches
sdpa_fp8_fwd.py and sdpa_fwd.extract_seq_and_ragged; keep the other symbols
(payload, tensors, inputs, utils.seq_len, utils.tensor_entry, seed) unchanged.
In `@tools/cudnn_repro/cudnn_repro/sdpa_fp8_fwd.py`:
- Around line 146-154: The function returning seq_len/ragged fields currently
only looks up "RAGGED_OFFSET_Q" and "RAGGED_OFFSET_KV" in the inputs dict and
misses plural keys; update the lookups to fall back to "RAGGED_OFFSETS_Q" and
"RAGGED_OFFSETS_KV" (like sdpa_fwd.extract_seq_and_ragged does) so
ragged_offset_q and ragged_offset_kv use inputs.get("RAGGED_OFFSET_Q") or
inputs.get("RAGGED_OFFSETS_Q") and inputs.get("RAGGED_OFFSET_KV") or
inputs.get("RAGGED_OFFSETS_KV") respectively when calling
utils.tensor_entry/tensor_seq_len, leaving other fields (seq_len_q, seq_len_kv,
rng_data_seed) unchanged.
---
Nitpick comments:
In `@tools/cudnn_repro/cudnn_repro/log_parser.py`:
- Around line 36-41: The return type annotation in _parse_tensor_dump currently
mixes built-in generics (tuple[int, list[int]] | None) while other functions
like _parse_context_entry and _apply_tensor_dumps use typing.Tuple/typing.List;
make the module consistent by changing _parse_tensor_dump's signature to use the
same typing forms (e.g., Tuple[int, List[int]] | None) or conversely update the
other functions to built-in generics—also update/ensure the appropriate imports
(Tuple, List) are present or remove them if switching all to built-ins so all
functions (_parse_tensor_dump, _parse_context_entry, _apply_tensor_dumps) use
the same style.
In `@tools/cudnn_repro/tests/test_cudnn_repro_fp8.py`:
- Around line 179-186: The test defines payload as a lambda (payload = lambda
tag: {...}) which triggers Ruff E731; replace it with a normal named function
(def payload(tag): return {...}) keeping the same body and usages so the
subsequent asserts calling payload("SDPA_FWD") etc. and
operations.detect_operation_key remain unchanged; update only the payload
definition in tests/test_cudnn_repro_fp8.py to remove the lambda and use def to
satisfy the linter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 24bba5a3-7814-4537-8eb9-f6ea51621510
📒 Files selected for processing (20)
include/cudnn_frontend/graph_interface.hinclude/cudnn_frontend/graph_properties.hinclude/cudnn_frontend/node/scaled_dot_product_flash_attention.hinclude/cudnn_frontend/node_interface.hinclude/cudnn_frontend/utils/serialize.htools/cudnn_repro/cudnn_repro/log_parser.pytools/cudnn_repro/cudnn_repro/operations.pytools/cudnn_repro/cudnn_repro/sdpa_bwd.pytools/cudnn_repro/cudnn_repro/sdpa_fp8_bwd.pytools/cudnn_repro/cudnn_repro/sdpa_fp8_fwd.pytools/cudnn_repro/cudnn_repro/sdpa_fwd.pytools/cudnn_repro/cudnn_repro/utils.pytools/cudnn_repro/tests/helpers.pytools/cudnn_repro/tests/test_cudnn_repro_bwd.pytools/cudnn_repro/tests/test_cudnn_repro_cli.pytools/cudnn_repro/tests/test_cudnn_repro_closed_loop.pytools/cudnn_repro/tests/test_cudnn_repro_fp8.pytools/cudnn_repro/tests/test_cudnn_repro_graph_uid.pytools/cudnn_repro/tests/test_cudnn_repro_log_parser.pytools/cudnn_repro/tests/test_cudnn_repro_schema.py
💤 Files with no reviewable changes (1)
- tools/cudnn_repro/tests/test_cudnn_repro_graph_uid.py
8a18b07 to
4d9191c
Compare
d2128f3 to
f399013
Compare
|
@cudnn-ci-bot run |
|
🚀 Running mirror pipeline Branch: cudnn-gh/pr-280-baa373e |
e92f270 to
b409d1d
Compare
|
@cudnn-ci-bot run |
|
🚀 Running mirror pipeline Branch: cudnn-gh/pr-280-b409d1d |
b409d1d to
2c58507
Compare
2c58507 to
dfaf2ec
Compare
@coderabbitai ignore
Summary
2.0giduid; materialize missing graph-generated UIDs before graph dump/serialization so graph JSON and execution variant packs can be joined without tensor namestensorsas a list; nodeinputs/outputsreference tensors byuidtools/cudnn_reproparsingExecuting gid ...and tensor dumps asTensor Dump uid: ...tools/cudnn_reproto usegid+uidonly, including ragged seq-len/ragged-offset tensor dumps for SDPA fwd/bwd/FP8 casestest_repro --perfskip reference construction, matching benchmark behavior and avoiding the 256 GiB score/bias allocationGraph JSON shape
Verification
28 passed, 2 skipped3 passed32 passed--perf: passes; no 256 GiB OOMtest_mhas_v2family sweep extracts repro commands for L0/L1, fwd/bwd, ragged, paged, bias, FP8 cases