Commit 0bb129e
Fix peer access test synchronization issue
Add missing device synchronization calls to ensure resident device
operations are complete before peer device accesses memory.
The test was failing because, when dev0 accesses peer memory resident
on dev1, PatternGen syncs only dev0 (the accessing device) and not dev1
(the resident device). As a result, dev0 can read the peer memory
before dev1 has completed its pending operations on it.
Changes:
- Sync dev1 after IPC import (Test 1) to ensure import operations complete
- Sync dev1 after granting peer access (Test 3) before dev0 accesses
  peer memory
This follows CUDA best practices: when accessing peer memory, sync the
resident device to ensure its operations are complete before the peer
device reads the memory.
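
A minimal sketch of the ordering described above, written against the CUDA runtime API. It is not the code from this commit: the kernels `fill` and `copy_from_peer`, the helper `peer_read_after_resident_sync`, the `CHECK` macro, and the buffer size are all hypothetical names chosen for illustration, and it assumes devices 0 and 1 support peer access (it skips the check otherwise).

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message on any CUDA runtime error.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Hypothetical kernel: runs on the resident device and writes the buffer.
__global__ void fill(int *buf, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = value;
}

// Hypothetical kernel: runs on the accessing device and reads peer memory.
__global__ void copy_from_peer(const int *peer, int *local, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) local[i] = peer[i];
}

// Returns true if the peer read sees the expected data, or if the check
// is skipped because two peer-capable devices are not available.
bool peer_read_after_resident_sync() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 2) return true;

    const int dev0 = 0, dev1 = 1;  // dev0 accesses, dev1 owns the memory
    int can_access = 0;
    CHECK(cudaDeviceCanAccessPeer(&can_access, dev0, dev1));
    if (!can_access) return true;

    const int n = 1 << 20;
    const int blocks = (n + 255) / 256;
    int *resident = nullptr, *local = nullptr;

    // dev1 allocates the buffer and launches an asynchronous fill.
    CHECK(cudaSetDevice(dev1));
    CHECK(cudaMalloc(&resident, n * sizeof(int)));
    fill<<<blocks, 256>>>(resident, n, 42);
    CHECK(cudaGetLastError());

    // dev0 is granted access to dev1's memory.
    CHECK(cudaSetDevice(dev0));
    CHECK(cudaDeviceEnablePeerAccess(dev1, 0));
    CHECK(cudaMalloc(&local, n * sizeof(int)));

    // Sync the resident device so its pending work on the buffer is
    // complete before the accessing device touches it.
    CHECK(cudaSetDevice(dev1));
    CHECK(cudaDeviceSynchronize());

    // Only now does dev0 read the peer memory.
    CHECK(cudaSetDevice(dev0));
    copy_from_peer<<<blocks, 256>>>(resident, local, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    std::vector<int> host(n);
    CHECK(cudaMemcpy(host.data(), local, n * sizeof(int),
                     cudaMemcpyDeviceToHost));
    bool ok = true;
    for (int v : host) ok = ok && (v == 42);

    CHECK(cudaFree(local));
    CHECK(cudaDeviceDisablePeerAccess(dev1));
    CHECK(cudaSetDevice(dev1));
    CHECK(cudaFree(resident));
    return ok;
}

int main() {
    return peer_read_after_resident_sync() ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Without the `cudaDeviceSynchronize()` on dev1, nothing in this sketch orders the `fill` launch on dev1 before the read on dev0, which is the kind of race the commit message describes. A finer-grained alternative would be to record an event on dev1 and have dev0's stream wait on it, but a device-wide sync mirrors what the message says was added.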
Fixes test failures on ARM64 with CUDA 13.2 RC025.
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent 6bdcda0 commit 0bb129e
1 file changed
Lines changed: 5 additions & 0 deletions
[Diff content not preserved. Hunk positions: 2 lines added at new lines 97-98; 3 lines added at new lines 111-113.]