loaded-latency: add mode-2 multipass random chain (v1) #2
Open
realmzstevenmiao wants to merge 1 commit into
Merge the 8-pass random Hamiltonian chain randomization (mode 2) to
defeat the Neoverse V2 L2 prefetchers that learn the single-chain
pattern used by mode 1.
Changes:
- args.c: add -R/--lat-randomize-mode <n> to select randomization mode
(1 = pair-swap shuffle, 2 = 8-pass random chain). -r remains
shorthand for mode 1.
- memlatency.c:
* Add make_multipass_chain() building LAT_CHAIN_PASSES (=8)
independent Hamiltonian cycles via ptr_t slots per cacheline,
concatenated head-to-tail into one closed loop. Each pass uses an
independent Fisher-Yates random visitation order and a different
slot offset within each cacheline, defeating stride, next-line,
and short-history temporal prefetchers.
* Refactor make_pairswap_chain() to use an external order[] array
(mirroring mode 2). Drop per-node order/index bookkeeping fields;
nodes now carry only ->next.
* Shrink local node_t in lat_initialize() accordingly; union is now
{ void *next; ptr_t ptrs[LAT_CHAIN_PASSES]; } + cacheline pad.
* Replace dead #if 0 debug block with #ifdef LAT_DEBUG_CHAIN that
derives cacheline index/slot from the pointer and buffer base.
Signed-off-by: Stefan Andersson DAG <stefan.dag.andersson@ericsson.com>
Signed-off-by: Steven Miao <Steven.Miao@arm.com>
Author
cat smoke-mode1-vs-mode2.sh
./smoke-mode1-vs-mode2.sh
Using CPU 31 on NUMA node 0
size    mode1_ns   mode2_ns  ratio
32KiB   1.213600   1.213517  1.00x
1MiB    4.149233   6.914998  1.67x
2MiB    6.859965  23.385340  3.41x
4MiB    7.384905  31.402270  4.25x
8MiB    4.850773  34.441810  7.10x
16MiB   5.888085  35.799870  6.08x
64MiB  32.145800  36.532270  1.14x