Conversation
This comment was marked as resolved.
This comment was marked as resolved.
|
r? @thomcc (rust-highfive has picked a reviewer for you, use r? to override) |
|
This regresses the When looking into this I noticed that |
|
Thanks, @timvermeulen, I'll look at that one. I've sent PR #100220 to fix @rustbot author Update 2022-08-24: I've rebased atop the other PR, but still not investigated that perf test, so it's not really review-ready yet. |
…riplett Properly forward `ByRefSized::fold` to the inner iterator cc `@timvermeulen,` who noticed this mistake in rust-lang#100214 (comment)
…riplett Properly forward `ByRefSized::fold` to the inner iterator cc ``@timvermeulen,`` who noticed this mistake in rust-lang#100214 (comment)
…riplett Properly forward `ByRefSized::fold` to the inner iterator cc ```@timvermeulen,``` who noticed this mistake in rust-lang#100214 (comment)
…riplett Properly forward `ByRefSized::fold` to the inner iterator cc ``@timvermeulen,`` who noticed this mistake in rust-lang#100214 (comment)
84f0eaf to
14e0e7b
Compare
Properly forward `ByRefSized::fold` to the inner iterator cc ``@timvermeulen,`` who noticed this mistake in rust-lang/rust#100214 (comment)
14e0e7b to
c08ec39
Compare
This comment has been minimized.
This comment has been minimized.
4104932 to
c43b960
Compare
|
@timvermeulen My mistake turned out to be an obvious one: I'd forgotten to inline some methods, and because it's a non-generic type that meant the benches ended up needing to I now get exactly the same inner loop for that .LBB0_5:
mov edx, dword ptr [rbx + rcx]
rol edx, 8
bswap edx
mov dword ptr [rax + rcx], edx
add rcx, 4
cmp rdi, rcx
jne .LBB0_5@rustbot ready |
|
@bors try @rust-timer queue |
|
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
|
@rustbot author |
`.into_iter()` on arrays was slower than it needed to be (especially compared to slice iterator) since it uses `Range<usize>`, which needs to handle degenerate ranges like `10..4`. This PR adds an internal `IndexRange` type that's like `Range<usize>` but with a safety invariant that means it doesn't need to worry about those cases -- it only handles `start <= end` -- and thus can give LLVM more information to optimize better. I added one simple demonstration of the improvement as a codegen test.
fb3a7d8 to
6dbd9a2
Compare
|
Oh, it's failing on the UB trap check. I've ignored the codegen test for debug, since it's a @bors r=thomcc |
|
⌛ Testing commit 6dbd9a2 with merge da70c0543ba1f77142453c5c82e0a1c1e4e327d3... |
|
⌛ Testing commit 6dbd9a2 with merge ebbb1c6a0f71f7a95483d2fef3bed8c804b21cbe... |
|
💔 Test failed - checks-actions |
|
Looks like the retry didn't put it back in the queue right? I'll try it again. @bors retry |
|
☀️ Test successful - checks-actions |
|
Finished benchmarking commit (4ecfdfa): comparison URL. Overall result: ✅ improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Footnotes |
|
Huh, interesting that the post-merge perf run is so different (thankfully better!) from the original: #100214 (comment) |
.into_iter()on arrays was slower than it needed to be (especially compared to slice iterator) since it usesRange<usize>, which needs to handle degenerate ranges like10..4.This PR adds an internal
IndexRangetype that's likeRange<usize>but with a safety invariant that means it doesn't need to worry about those cases -- it only handlesstart <= end-- and thus can give LLVM more information to optimize better.I added one simple demonstration of the improvement as a codegen test.
(
vec::IntoIteruses pointers instead of indexes, so doesn't have this problem, but that only works because its elements are boxed.array::IntoItercan't use pointers because that would keep it from being movable.)