Add device libs from ROCm 7.0.2 for Julia 1.13 #927

Draft
luraess wants to merge 2 commits into main from lr/devlibs

luraess commented Jun 2, 2026

Allow using the device libs JLL from ROCm 7.0.2 with Julia 1.13 (both use LLVM 20).

github-actions (bot) left a comment

AMDGPU.jl Benchmarks

| Benchmark suite | Current: 0f2cf51 | Previous: 4acab23 | Ratio |
|---|---|---|---|
| amdgpu/synchronization/context/device | 590 ns | 610 ns | 0.97 |
| amdgpu/synchronization/stream/blocking | 240 ns | 250 ns | 0.96 |
| amdgpu/synchronization/stream/nonblocking | 330 ns | 340 ns | 0.97 |
| array/accumulate/Float32/1d | 87541 ns | 87531 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 291384 ns | 284324 ns | 1.02 |
| array/accumulate/Float32/dims=1L | 135202 ns | 136202 ns | 0.99 |
| array/accumulate/Float32/dims=2 | 125842 ns | 130732 ns | 0.96 |
| array/accumulate/Float32/dims=2L | 2829419 ns | 2828569 ns | 1.00 |
| array/accumulate/Int64/1d | 94831 ns | 97161 ns | 0.98 |
| array/accumulate/Int64/dims=1 | 285454 ns | 288994 ns | 0.99 |
| array/accumulate/Int64/dims=1L | 167902 ns | 168202 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 123692 ns | 122011 ns | 1.01 |
| array/accumulate/Int64/dims=2L | 3006432 ns | 3011391 ns | 1.00 |
| array/broadcast | 97071 ns | 90771 ns | 1.07 |
| array/construct | 1750 ns | 1720 ns | 1.02 |
| array/copy | 40061 ns | 40620 ns | 0.99 |
| array/copyto!/cpu_to_gpu | 183123 ns | 183343 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 114612 ns | 110042 ns | 1.04 |
| array/copyto!/gpu_to_gpu | 60211 ns | 62360 ns | 0.97 |
| array/iteration/findall/bool | 185313 ns | 182132 ns | 1.02 |
| array/iteration/findall/int | 195263 ns | 197473 ns | 0.99 |
| array/iteration/findfirst/bool | 120652 ns | 119891 ns | 1.01 |
| array/iteration/findfirst/int | 116661 ns | 116792 ns | 1.00 |
| array/iteration/findmin/1d | 170582 ns | 171032 ns | 1.00 |
| array/iteration/findmin/2d | 156143 ns | 156302 ns | 1.00 |
| array/iteration/logical | 357045 ns | 355325 ns | 1.00 |
| array/iteration/scalar | 289754 ns | 298004 ns | 0.97 |
| array/permutedims/2d | 74761 ns | 75601 ns | 0.99 |
| array/permutedims/3d | 75111 ns | 74561 ns | 1.01 |
| array/permutedims/4d | 77801 ns | 77811 ns | 1.00 |
| array/random/rand/Float32 | 52731 ns | 51371 ns | 1.03 |
| array/random/rand/Int64 | 56971 ns | 58680 ns | 0.97 |
| array/random/rand!/Float32 | 125951 ns | 89801 ns | 1.40 |
| array/random/rand!/Int64 | 95351 ns | 92831 ns | 1.03 |
| array/random/randn/Float32 | 88011 ns | 89121 ns | 0.99 |
| array/random/randn!/Float32 | 105941 ns | 106151 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 134312 ns | 134252 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1 | 95831 ns | 95481 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 777671 ns | 777531 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 97802 ns | 97652 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 299864 ns | 297084 ns | 1.01 |
| array/reductions/mapreduce/Int64/1d | 134332 ns | 134682 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1 | 96061 ns | 95482 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 788990 ns | 780741 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2 | 96991 ns | 96681 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 300624 ns | 301144 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 134202 ns | 133982 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1 | 95031 ns | 95211 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 775321 ns | 779121 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 97802 ns | 97351 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 295894 ns | 300314 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 134601 ns | 134572 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1 | 95531 ns | 95201 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 783070 ns | 781051 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 96481 ns | 96791 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 298954 ns | 309514 ns | 0.97 |
| array/reverse/1d | 43401 ns | 43141 ns | 1.01 |
| array/reverse/1dL | 76111 ns | 75821 ns | 1.00 |
| array/reverse/1dL_inplace | 104501 ns | 109662 ns | 0.95 |
| array/reverse/1d_inplace | 135322 ns | 136772 ns | 0.99 |
| array/reverse/2d | 52451 ns | 52490 ns | 1.00 |
| array/reverse/2dL | 102461 ns | 102021 ns | 1.00 |
| array/reverse/2dL_inplace | 83991 ns | 99182 ns | 0.85 |
| array/reverse/2d_inplace | 82481 ns | 80741 ns | 1.02 |
| array/sorting/1d | 360615 ns | 342015 ns | 1.05 |
| integration/byval/reference | 39201 ns | 39100 ns | 1.00 |
| integration/byval/slices=1 | 40221 ns | 40241 ns | 1.00 |
| integration/byval/slices=2 | 144032 ns | 136322 ns | 1.06 |
| integration/byval/slices=3 | 239443 ns | 239984 ns | 1.00 |
| integration/volumerhs | 5036499 ns | 5019219 ns | 1.00 |
| kernel/indexing | 109092 ns | 66381 ns | 1.64 |
| kernel/indexing_checked | 52251 ns | 72832 ns | 0.72 |
| kernel/launch | 1250 ns | 1300 ns | 0.96 |
| kernel/rand | 196533 ns | 197612 ns | 0.99 |
| latency/import | 1497961572 ns | 1499177203 ns | 1.00 |
| latency/precompile | 11979260449 ns | 11953233850 ns | 1.00 |
| latency/ttfp | 10409294487 ns | 10369770427 ns | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.
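
For readers skimming the report, the Ratio column appears to be the Current time divided by the Previous time, so values above 1.00 mean the current commit was slower. The following is a minimal Python sketch of that arithmetic on a handful of rows copied from the table. The `THRESHOLD` cutoff and the helper names `ratio`, `classify`, and `summarize` are hypothetical, chosen only for illustration; they are not part of this PR or of github-action-benchmark.

```python
# Rows copied from the benchmark report: (name, current_ns, previous_ns).
ROWS = [
    ("array/random/rand!/Float32", 125951, 89801),
    ("array/reverse/2dL_inplace", 83991, 99182),
    ("kernel/indexing", 109092, 66381),
    ("kernel/indexing_checked", 52251, 72832),
    ("kernel/launch", 1250, 1300),
]

# Hypothetical cutoff for illustration only; not taken from the PR or the bot.
THRESHOLD = 1.2


def ratio(current_ns, previous_ns):
    """Current time divided by previous time; above 1.0 means slower."""
    return current_ns / previous_ns


def classify(r, threshold=THRESHOLD):
    """Label a ratio relative to a symmetric threshold around 1.0."""
    if r >= threshold:
        return "slower"
    if r <= 1 / threshold:
        return "faster"
    return "unchanged"


def summarize(rows, threshold=THRESHOLD):
    """Return (name, ratio rounded to 2 decimals, label) for each row."""
    out = []
    for name, current_ns, previous_ns in rows:
        r = ratio(current_ns, previous_ns)
        out.append((name, round(r, 2), classify(r, threshold)))
    return out


if __name__ == "__main__":
    for name, r, label in summarize(ROWS):
        print(f"{name:32s} {r:.2f} {label}")
```

With this assumed cutoff, `array/random/rand!/Float32` and `kernel/indexing` come out as slower and `kernel/indexing_checked` as faster. Single-run differences of this size can also be measurement noise, so the labels are a starting point for investigation rather than a verdict.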

luraess marked this pull request as draft June 3, 2026 07:41

luraess commented Jun 3, 2026

Still working on this. It turns out that there were issues with device libs 6.2.1 and another test failure.