Fix atomic cmpexchg intrinsic signature #794
Merged
Co-Authored-By: Tim Besard <383068+maleadt@users.noreply.github.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage Diff | main | #794 |
|---|---|---|
| Coverage | 81.56% | 81.56% |
| Files | 66 | 66 |
| Lines | 3141 | 3141 |
| Hits | 2562 | 2562 |
| Misses | 579 | 579 |
Metal Benchmarks
| Benchmark suite | Current: 7ec5fbf | Previous: ae7ee24 | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 816458 ns | 812000 ns | 1.01 |
| array/accumulate/Float32/dims=1 | 997667 ns | 981104 ns | 1.02 |
| array/accumulate/Float32/dims=1L | 10515187.5 ns | 10016291 ns | 1.05 |
| array/accumulate/Float32/dims=2 | 1280542 ns | 1244500 ns | 1.03 |
| array/accumulate/Float32/dims=2L | 4841417 ns | 6573125.5 ns | 0.74 |
| array/accumulate/Int64/1d | 982083 ns | 961959 ns | 1.02 |
| array/accumulate/Int64/dims=1 | 1113417 ns | 1142458 ns | 0.97 |
| array/accumulate/Int64/dims=1L | 12516417 ns | 11897834 ns | 1.05 |
| array/accumulate/Int64/dims=2 | 1473479.5 ns | 1458500 ns | 1.01 |
| array/accumulate/Int64/dims=2L | 9146792 ns | 9470625 ns | 0.97 |
| array/broadcast | 334416.5 ns | 363750 ns | 0.92 |
| array/construct | 5625 ns | 5791 ns | 0.97 |
| array/permutedims/2d | 627875 ns | 614166.5 ns | 1.02 |
| array/permutedims/3d | 1129625 ns | 1128000 ns | 1.00 |
| array/permutedims/4d | 1358750 ns | 1977209 ns | 0.69 |
| array/private/copy | 358041.5 ns | 422833.5 ns | 0.85 |
| array/private/copyto!/cpu_to_gpu | 239667 ns | 367208 ns | 0.65 |
| array/private/copyto!/gpu_to_cpu | 235875 ns | 364333.5 ns | 0.65 |
| array/private/copyto!/gpu_to_gpu | 254479.5 ns | 332458 ns | 0.77 |
| array/private/iteration/findall/bool | 1129792 ns | 1071333 ns | 1.05 |
| array/private/iteration/findall/int | 1292000 ns | 1244125 ns | 1.04 |
| array/private/iteration/findfirst/bool | 1260625 ns | 1326875 ns | 0.95 |
| array/private/iteration/findfirst/int | 1305792 ns | 1360792 ns | 0.96 |
| array/private/iteration/findmin/1d | 1430250 ns | 1453000 ns | 0.98 |
| array/private/iteration/findmin/2d | 1183584 ns | 1209959 ns | 0.98 |
| array/private/iteration/logical | 1770417 ns | 1627666 ns | 1.09 |
| array/private/iteration/scalar | 1693500 ns | 2603000 ns | 0.65 |
| array/random/rand/Float32 | 599792 ns | 618416.5 ns | 0.97 |
| array/random/rand/Int64 | 654833 ns | 679750 ns | 0.96 |
| array/random/rand!/Float32 | 526750 ns | 552084 ns | 0.95 |
| array/random/rand!/Int64 | 482125 ns | 504416 ns | 0.96 |
| array/random/randn/Float32 | 572166.5 ns | 600000 ns | 0.95 |
| array/random/randn!/Float32 | 487000 ns | 534333 ns | 0.91 |
| array/reductions/mapreduce/Float32/1d | 496292 ns | 734458.5 ns | 0.68 |
| array/reductions/mapreduce/Float32/dims=1 | 459458 ns | 499667 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=1L | 717937 ns | 765000 ns | 0.94 |
| array/reductions/mapreduce/Float32/dims=2 | 461854.5 ns | 500750 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=2L | 1033104 ns | 1333208 ns | 0.77 |
| array/reductions/mapreduce/Int64/1d | 792375 ns | 924083.5 ns | 0.86 |
| array/reductions/mapreduce/Int64/dims=1 | 771021 ns | 789334 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=1L | 1281209 ns | 1667542 ns | 0.77 |
| array/reductions/mapreduce/Int64/dims=2 | 955416.5 ns | 989083 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=2L | 2261292 ns | 2259083 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 492083 ns | 727333 ns | 0.68 |
| array/reductions/reduce/Float32/dims=1 | 461000 ns | 495416 ns | 0.93 |
| array/reductions/reduce/Float32/dims=1L | 715770.5 ns | 844021 ns | 0.85 |
| array/reductions/reduce/Float32/dims=2 | 461416.5 ns | 494041 ns | 0.93 |
| array/reductions/reduce/Float32/dims=2L | 1039167 ns | 1347166.5 ns | 0.77 |
| array/reductions/reduce/Int64/1d | 791750 ns | 913917 ns | 0.87 |
| array/reductions/reduce/Int64/dims=1 | 766125 ns | 775500 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1L | 1218833 ns | 1723125 ns | 0.71 |
| array/reductions/reduce/Int64/dims=2 | 961041 ns | 953708 ns | 1.01 |
| array/reductions/reduce/Int64/dims=2L | 2261875 ns | 2260834 ns | 1.00 |
| array/shared/copy | 162750 ns | 237020.5 ns | 0.69 |
| array/shared/copyto!/cpu_to_gpu | 41542 ns | 41125 ns | 1.01 |
| array/shared/copyto!/gpu_to_cpu | 47833 ns | 42042 ns | 1.14 |
| array/shared/copyto!/gpu_to_gpu | 40209 ns | 41000 ns | 0.98 |
| array/shared/iteration/findall/bool | 1135166 ns | 1075541 ns | 1.06 |
| array/shared/iteration/findall/int | 1291708 ns | 1246917 ns | 1.04 |
| array/shared/iteration/findfirst/bool | 1040479.5 ns | 1078458.5 ns | 0.96 |
| array/shared/iteration/findfirst/int | 1087709 ns | 1080625 ns | 1.01 |
| array/shared/iteration/findmin/1d | 1207687.5 ns | 1210417 ns | 1.00 |
| array/shared/iteration/findmin/2d | 1184083 ns | 1220500 ns | 0.97 |
| array/shared/iteration/logical | 1624542 ns | 1470292 ns | 1.10 |
| array/shared/iteration/scalar | 5826.5 ns | 5937.5 ns | 0.98 |
| integration/byval/reference | 1173729.5 ns | 1157812.5 ns | 1.01 |
| integration/byval/slices=1 | 1174667 ns | 1157250 ns | 1.02 |
| integration/byval/slices=2 | 2118166.5 ns | 2087500 ns | 1.01 |
| integration/byval/slices=3 | 20220250 ns | 7967250 ns | 2.54 |
| integration/metaldevrt | 457792 ns | 463083 ns | 0.99 |
| kernel/indexing | 316916.5 ns | 354291 ns | 0.89 |
| kernel/indexing_checked | 329583 ns | 356375 ns | 0.92 |
| kernel/launch | 13542 ns | 13458 ns | 1.01 |
| kernel/rand | 345875 ns | 372625 ns | 0.93 |
| latency/import | 1408823896 ns | 1387370125 ns | 1.02 |
| latency/precompile | 30185175187.5 ns | 29863472292 ns | 1.01 |
| latency/ttfp | 1682623583.5 ns | 1656507125 ns | 1.02 |
| metal/synchronization/context | 814.3203883495146 ns | 835.6741573033707 ns | 0.97 |
| metal/synchronization/stream | 429.02010050251255 ns | 439.8131313131313 ns | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
The macOS 26 shader validation layer gets upset otherwise.