Fix atomic cmpexchg intrinsic signature #794

Merged: maleadt merged 1 commit into JuliaGPU:main from christiangnrd:cmpexchg on May 30, 2026

@christiangnrd (Member)

The macOS 26 shader validation layer gets upset otherwise.

Co-Authored-By: Tim Besard <383068+maleadt@users.noreply.github.com>
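
For readers unfamiliar with the operation the intrinsic wraps, here is a minimal host-side sketch of compare-exchange semantics using standard C++ `std::atomic`. The diff itself is not shown in this conversation, so this is not the PR's change, nor the Metal intrinsic's actual signature; the helper name `atomic_max` is hypothetical and chosen only for illustration.

```cpp
#include <atomic>

// Hypothetical helper (not from the PR): atomically raise `target` to at
// least `value` and return the value that was observed before the update.
int atomic_max(std::atomic<int>& target, int value) {
    int observed = target.load(std::memory_order_relaxed);
    // compare_exchange_weak stores `value` only if `target` still equals
    // `observed`. On failure it writes the current value back into
    // `observed`, so the loop condition is re-evaluated with fresh data.
    while (observed < value &&
           !target.compare_exchange_weak(observed, value,
                                         std::memory_order_relaxed,
                                         std::memory_order_relaxed)) {
        // Retry: another thread changed `target`, or the weak form failed
        // spuriously.
    }
    return observed;
}
```

The point of the sketch is that a compare-exchange takes the expected value by reference and the desired value by value, and reports success separately, so a caller and the underlying operation must agree exactly on that shape.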

codecov Bot commented May 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.56%. Comparing base (ae7ee24) to head (7ec5fbf).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #794   +/-   ##
=======================================
  Coverage   81.56%   81.56%           
=======================================
  Files          66       66           
  Lines        3141     3141           
=======================================
  Hits         2562     2562           
  Misses        579      579           


github-actions Bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: 7ec5fbf | Previous: ae7ee24 | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 816458 ns | 812000 ns | 1.01 |
| array/accumulate/Float32/dims=1 | 997667 ns | 981104 ns | 1.02 |
| array/accumulate/Float32/dims=1L | 10515187.5 ns | 10016291 ns | 1.05 |
| array/accumulate/Float32/dims=2 | 1280542 ns | 1244500 ns | 1.03 |
| array/accumulate/Float32/dims=2L | 4841417 ns | 6573125.5 ns | 0.74 |
| array/accumulate/Int64/1d | 982083 ns | 961959 ns | 1.02 |
| array/accumulate/Int64/dims=1 | 1113417 ns | 1142458 ns | 0.97 |
| array/accumulate/Int64/dims=1L | 12516417 ns | 11897834 ns | 1.05 |
| array/accumulate/Int64/dims=2 | 1473479.5 ns | 1458500 ns | 1.01 |
| array/accumulate/Int64/dims=2L | 9146792 ns | 9470625 ns | 0.97 |
| array/broadcast | 334416.5 ns | 363750 ns | 0.92 |
| array/construct | 5625 ns | 5791 ns | 0.97 |
| array/permutedims/2d | 627875 ns | 614166.5 ns | 1.02 |
| array/permutedims/3d | 1129625 ns | 1128000 ns | 1.00 |
| array/permutedims/4d | 1358750 ns | 1977209 ns | 0.69 |
| array/private/copy | 358041.5 ns | 422833.5 ns | 0.85 |
| array/private/copyto!/cpu_to_gpu | 239667 ns | 367208 ns | 0.65 |
| array/private/copyto!/gpu_to_cpu | 235875 ns | 364333.5 ns | 0.65 |
| array/private/copyto!/gpu_to_gpu | 254479.5 ns | 332458 ns | 0.77 |
| array/private/iteration/findall/bool | 1129792 ns | 1071333 ns | 1.05 |
| array/private/iteration/findall/int | 1292000 ns | 1244125 ns | 1.04 |
| array/private/iteration/findfirst/bool | 1260625 ns | 1326875 ns | 0.95 |
| array/private/iteration/findfirst/int | 1305792 ns | 1360792 ns | 0.96 |
| array/private/iteration/findmin/1d | 1430250 ns | 1453000 ns | 0.98 |
| array/private/iteration/findmin/2d | 1183584 ns | 1209959 ns | 0.98 |
| array/private/iteration/logical | 1770417 ns | 1627666 ns | 1.09 |
| array/private/iteration/scalar | 1693500 ns | 2603000 ns | 0.65 |
| array/random/rand/Float32 | 599792 ns | 618416.5 ns | 0.97 |
| array/random/rand/Int64 | 654833 ns | 679750 ns | 0.96 |
| array/random/rand!/Float32 | 526750 ns | 552084 ns | 0.95 |
| array/random/rand!/Int64 | 482125 ns | 504416 ns | 0.96 |
| array/random/randn/Float32 | 572166.5 ns | 600000 ns | 0.95 |
| array/random/randn!/Float32 | 487000 ns | 534333 ns | 0.91 |
| array/reductions/mapreduce/Float32/1d | 496292 ns | 734458.5 ns | 0.68 |
| array/reductions/mapreduce/Float32/dims=1 | 459458 ns | 499667 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=1L | 717937 ns | 765000 ns | 0.94 |
| array/reductions/mapreduce/Float32/dims=2 | 461854.5 ns | 500750 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=2L | 1033104 ns | 1333208 ns | 0.77 |
| array/reductions/mapreduce/Int64/1d | 792375 ns | 924083.5 ns | 0.86 |
| array/reductions/mapreduce/Int64/dims=1 | 771021 ns | 789334 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=1L | 1281209 ns | 1667542 ns | 0.77 |
| array/reductions/mapreduce/Int64/dims=2 | 955416.5 ns | 989083 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=2L | 2261292 ns | 2259083 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 492083 ns | 727333 ns | 0.68 |
| array/reductions/reduce/Float32/dims=1 | 461000 ns | 495416 ns | 0.93 |
| array/reductions/reduce/Float32/dims=1L | 715770.5 ns | 844021 ns | 0.85 |
| array/reductions/reduce/Float32/dims=2 | 461416.5 ns | 494041 ns | 0.93 |
| array/reductions/reduce/Float32/dims=2L | 1039167 ns | 1347166.5 ns | 0.77 |
| array/reductions/reduce/Int64/1d | 791750 ns | 913917 ns | 0.87 |
| array/reductions/reduce/Int64/dims=1 | 766125 ns | 775500 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1L | 1218833 ns | 1723125 ns | 0.71 |
| array/reductions/reduce/Int64/dims=2 | 961041 ns | 953708 ns | 1.01 |
| array/reductions/reduce/Int64/dims=2L | 2261875 ns | 2260834 ns | 1.00 |
| array/shared/copy | 162750 ns | 237020.5 ns | 0.69 |
| array/shared/copyto!/cpu_to_gpu | 41542 ns | 41125 ns | 1.01 |
| array/shared/copyto!/gpu_to_cpu | 47833 ns | 42042 ns | 1.14 |
| array/shared/copyto!/gpu_to_gpu | 40209 ns | 41000 ns | 0.98 |
| array/shared/iteration/findall/bool | 1135166 ns | 1075541 ns | 1.06 |
| array/shared/iteration/findall/int | 1291708 ns | 1246917 ns | 1.04 |
| array/shared/iteration/findfirst/bool | 1040479.5 ns | 1078458.5 ns | 0.96 |
| array/shared/iteration/findfirst/int | 1087709 ns | 1080625 ns | 1.01 |
| array/shared/iteration/findmin/1d | 1207687.5 ns | 1210417 ns | 1.00 |
| array/shared/iteration/findmin/2d | 1184083 ns | 1220500 ns | 0.97 |
| array/shared/iteration/logical | 1624542 ns | 1470292 ns | 1.10 |
| array/shared/iteration/scalar | 5826.5 ns | 5937.5 ns | 0.98 |
| integration/byval/reference | 1173729.5 ns | 1157812.5 ns | 1.01 |
| integration/byval/slices=1 | 1174667 ns | 1157250 ns | 1.02 |
| integration/byval/slices=2 | 2118166.5 ns | 2087500 ns | 1.01 |
| integration/byval/slices=3 | 20220250 ns | 7967250 ns | 2.54 |
| integration/metaldevrt | 457792 ns | 463083 ns | 0.99 |
| kernel/indexing | 316916.5 ns | 354291 ns | 0.89 |
| kernel/indexing_checked | 329583 ns | 356375 ns | 0.92 |
| kernel/launch | 13542 ns | 13458 ns | 1.01 |
| kernel/rand | 345875 ns | 372625 ns | 0.93 |
| latency/import | 1408823896 ns | 1387370125 ns | 1.02 |
| latency/precompile | 30185175187.5 ns | 29863472292 ns | 1.01 |
| latency/ttfp | 1682623583.5 ns | 1656507125 ns | 1.02 |
| metal/synchronization/context | 814.3203883495146 ns | 835.6741573033707 ns | 0.97 |
| metal/synchronization/stream | 429.02010050251255 ns | 439.8131313131313 ns | 0.98 |

This comment was automatically generated by workflow using github-action-benchmark.

@maleadt merged commit 45c44ae into JuliaGPU:main on May 30, 2026 (15 checks passed)
@christiangnrd deleted the cmpexchg branch on May 30, 2026 at 11:28