Fix perf regression caused by tracing#143520
Conversation
|
@bors2 try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Fix perf regression caused by tracing See #143334, this is another alternative that may be worth benchmarking as suggested in #143334 (comment). r? `@RalfJung` <!-- homu-ignore:start --> <!-- If this PR is related to an unstable feature or an otherwise tracked effort, please link to the relevant tracking issue here. If you don't know of a related tracking issue or there are none, feel free to ignore this. This PR will get automatically assigned to a reviewer. In case you would like a specific user to review your work, you can assign it to them by using r? <reviewer name> --> <!-- homu-ignore:end -->
|
No, this is not the closure I meant. I don't think this one will help. What I meant is an API of the form fn with_trace_span<R>(_span: tracing::Span, f: impl FnOnce() -> R) -> R { f() }I.e., the span ends when the closure returns. |
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (3ab9e25): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -1.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 460.756s -> 460.961s (0.04%) |
|
Ah, nice find! My prediction was wrong. :) |
|
☔ The latest upstream changes (presumably #143582) made this pull request unmergeable. Please resolve the merge conflicts. |
Hopefully this will make tracing calls be optimized out properly when tracing is disabled
57aa88e to
e5f7d4d
Compare
|
The Miri subtree was changed cc @rust-lang/miri |
|
Yeah I also think it's not worth it, since it takes the span by value I would expect it to behave like point 4 in my analysis. I rebased and implemented @rustbot ready |
|
I meant a variant of that that takes the span in a closure, since you demonstrated that that is indeed necessary. But, I doubt it'll help so let's just |
…=RalfJung Fix perf regression caused by tracing See rust-lang#143334, this is another alternative that may be worth benchmarking as suggested in rust-lang#143334 (comment). r? `@RalfJung`
Rollup of 11 pull requests Successful merges: - #143177 (Remove false label when `self` resolve failure does not relate to macro) - #143339 (Respect endianness correctly in CheckEnums test suite) - #143426 (clippy fix: indentation) - #143499 (Don't call `predicates_of` on a dummy obligation cause's body id) - #143520 (Fix perf regression caused by tracing) - #143532 (More carefully consider span context when suggesting remove `&mut`) - #143606 (configure.py: Write last key in each section) - #143632 (fix: correct parameter names in LLVMRustBuildMinNum and LLVMRustBuildMaxNum FFI declarations) - #143644 (Add triagebot stdarch mention ping) - #143651 (Win: Use exceptions with empty data for SEH panic exception copies instead of a new panic) - #143660 (Disable docs for `compiler-builtins` and `sysroot`) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 9 pull requests Successful merges: - #142357 (Simplify LLVM bitcode linker in bootstrap and add tests for it) - #143177 (Remove false label when `self` resolve failure does not relate to macro) - #143339 (Respect endianness correctly in CheckEnums test suite) - #143426 (clippy fix: indentation) - #143475 (tests: Use `cfg_target_has_reliable_f16_f128` in `conv-bits-runtime-const`) - #143499 (Don't call `predicates_of` on a dummy obligation cause's body id) - #143520 (Fix perf regression caused by tracing) - #143532 (More carefully consider span context when suggesting remove `&mut`) - #143606 (configure.py: Write last key in each section) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #143520 - Stypox:enter_trace_span-closure, r=RalfJung Fix perf regression caused by tracing See #143334, this is another alternative that may be worth benchmarking as suggested in #143334 (comment). r? ``@RalfJung``
|
@rust-timer build 45ab0f5 Checking that the small regression from #143667 is caused by this. |
This comment has been minimized.
This comment has been minimized.
|
Ah damn I shouldn't have marked this as rollup, sorry for that. |
|
Finished benchmarking commit (45ab0f5): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -1.0%, secondary 3.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -2.6%, secondary -3.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 465.667s -> 464.844s (-0.18%) |
|
Hm, that looks somewhat different from the last perf run we did. I also have no idea how anything can be slowed down by this, so it might just be a bit of random inliner noise... |
|
The regression isn't that bad, but since this PR calimed to "fix perf regression", and it seemingly is either the same or slightly worse, maybe we should revert it? |
|
It helps with const-stress, and arguably it makes a lot more sense to put this span logic into a closure. Previous measurements came back slightly green in the median, this one is slightly red. I am inclined to keep it. |
|
Sorry, I missed the CTFE benchmark. Ok, let's keep it, marking the rollup as triaged. |
|
@rustbot label: +perf-regression-triaged |
Rollup of 9 pull requests Successful merges: - rust-lang/rust#142357 (Simplify LLVM bitcode linker in bootstrap and add tests for it) - rust-lang/rust#143177 (Remove false label when `self` resolve failure does not relate to macro) - rust-lang/rust#143339 (Respect endianness correctly in CheckEnums test suite) - rust-lang/rust#143426 (clippy fix: indentation) - rust-lang/rust#143475 (tests: Use `cfg_target_has_reliable_f16_f128` in `conv-bits-runtime-const`) - rust-lang/rust#143499 (Don't call `predicates_of` on a dummy obligation cause's body id) - rust-lang/rust#143520 (Fix perf regression caused by tracing) - rust-lang/rust#143532 (More carefully consider span context when suggesting remove `&mut`) - rust-lang/rust#143606 (configure.py: Write last key in each section) r? `@ghost` `@rustbot` modify labels: rollup
See #143334, this is another alternative that may be worth benchmarking as suggested in #143334 (comment).
r? @RalfJung