Add LRC microbenchmarks for L2 request coalescing analysis #82
Open
William-An wants to merge 2 commits into dev
Summary
- Add `lrc_max_merged` microbenchmark to measure the maximum number of L2 sector requests that can be merged per LRC (L2 Request Coalescer) entry on NVIDIA GPUs
- Add `lrc_queue_size` stub with documented TODO for future work on discovering LRC queue depth per L2 sub-partition
- Update `.gitignore` to exclude IDE/AI tool directories

Details
lrc_max_merged
Measures how many concurrent sector requests from multiple warps/threadblocks can be coalesced into a single L2 lookup by the LRC. Supports three launch modes:
Designed to be profiled with `ncu`; includes a run script (`run_lrc_merged.sh`) with relevant L2 sector metrics.

lrc_queue_size (stub)
Documents the open challenges for measuring LRC queue depth, including the need to reverse-engineer L2 address-to-sub-partition mapping.
Corresponds to Accel-Sim config parameter: `-gpgpu_lrc_max_entries`

Test plan
- Build `lrc_max_merged` with `make` on SM_90+ target
- Run `run_lrc_merged.sh` with ncu and verify L2 sector counts
- Verify `lrc_queue_size` builds and prints stub message

🤖 Generated with Claude Code
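
For the "verify L2 sector counts" step, one possible way to read the profiler numbers is sketched below. It rests on an idealized model that this PR does not state: if N requesters hit the same sector concurrently and the LRC merges at most M of them per entry, L2 would see ceil(N / M) lookups. The function names and the sample data are hypothetical, and real hardware may deviate from this model (for example, when requests do not arrive close enough in time to be merged).

```python
import math
from typing import Dict, Optional


def expected_l2_lookups(num_requesters: int, max_merged: int) -> int:
    """Idealized model (an assumption, not taken from the PR): each LRC
    entry absorbs up to max_merged requests to the same sector."""
    if num_requesters < 0 or max_merged <= 0:
        raise ValueError("num_requesters must be >= 0 and max_merged > 0")
    return math.ceil(num_requesters / max_merged)


def estimate_max_merged(samples: Dict[int, int]) -> Optional[int]:
    """Estimate the merge limit from a sweep.

    samples maps the number of concurrent requesters to the measured
    number of L2 lookups for a single sector. The result is the largest
    requester count that still produced exactly one lookup, so it is a
    lower bound limited by the granularity of the sweep. Returns None if
    no sample was fully merged.
    """
    fully_merged = [n for n, lookups in samples.items() if lookups == 1]
    return max(fully_merged) if fully_merged else None


if __name__ == "__main__":
    # Synthetic sweep generated from the model with a hypothetical limit of 8.
    synthetic = {n: expected_l2_lookups(n, 8) for n in (1, 2, 4, 8, 16, 32)}
    print(estimate_max_merged(synthetic))
```

Sweeping the requester count in finer steps around the point where the lookup count first exceeds one would tighten the estimate.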