Fix perf regression caused by tracing by Stypox · Pull Request #143520 · rust-lang/rust

Stypox · 2025-07-06T07:05:46Z

See #143334, this is another alternative that may be worth benchmarking as suggested in #143334 (comment).

r? @RalfJung

rustbot · 2025-07-06T07:05:53Z

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Kobzol · 2025-07-06T08:13:08Z

@bors2 try @rust-timer queue

rust-bors · 2025-07-06T08:13:11Z

⌛ Trying commit 57aa88e with merge 3ab9e25…

To cancel the try build, run the command @bors2 try cancel.

RalfJung · 2025-07-06T08:36:11Z

No, this is not the closure I meant. I don't think this one will help.

What I meant is an API of the form

fn with_trace_span<R>(_span: tracing::Span, f: impl FnOnce() -> R) -> R { f() }

I.e., the span ends when the closure returns.
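
A minimal sketch of the semantics described here, assuming a hypothetical stand-in `Span` type instead of the real `tracing::Span` so that it builds without external crates. The stand-in records an "exit" event when it is dropped, which makes it visible that the span lives exactly as long as the closure call. The event log, `Span::new`, and `demo` are assumptions made for illustration only.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for `tracing::Span`: logs when it is created and dropped.
struct Span {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Span {
    fn new(name: &'static str, log: Rc<RefCell<Vec<String>>>) -> Span {
        log.borrow_mut().push(format!("enter {name}"));
        Span { name, log }
    }
}

impl Drop for Span {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("exit {}", self.name));
    }
}

/// Same shape as the API proposed above: the span is owned by the call,
/// so it is dropped (and therefore ends) right after `f` returns.
fn with_trace_span<R>(_span: Span, f: impl FnOnce() -> R) -> R {
    f()
}

/// Runs one traced closure and returns the recorded events in order.
fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let result = with_trace_span(Span::new("step", log.clone()), || {
        log.borrow_mut().push("body".to_string());
        21 * 2
    });
    log.borrow_mut().push(format!("result {result}"));
    let events = log.borrow().clone();
    events
}
```

Calling `demo()` yields the events `enter step`, `body`, `exit step`, `result 42`, in that order. Note that in this by-value form the span is constructed before the call, whether or not anything ends up using it.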

rust-bors · 2025-07-06T10:28:55Z

☀️ Try build successful (CI)
Build commit: 3ab9e25 (3ab9e2528f3d5f3f77694dc747bed001f44bf378, parent: febb10d0a2d29278135676783f6a22eb83295981)

rust-timer · 2025-07-06T20:30:34Z

Finished benchmarking commit (3ab9e25): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.4%	[0.1%, 0.8%]	16
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-1.1%, -0.2%]	15
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.9%	[0.9%, 0.9%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.6%	[-1.6%, -1.6%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.4%	[-1.6%, 0.9%]	2

Cycles

Results (primary -1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.3%	[-1.3%, -1.3%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.3%	[-1.3%, -1.3%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 460.756s -> 460.961s (0.04%)
Artifact size: 372.14 MiB -> 372.24 MiB (0.03%)

RalfJung · 2025-07-06T20:42:44Z

Ah, nice find! My prediction was wrong. :)
This does seem quite a bit better than the other approach.

bors · 2025-07-07T18:30:09Z

☔ The latest upstream changes (presumably #143582) made this pull request unmergeable. Please resolve the merge conflicts.

RalfJung · 2025-07-08T13:09:40Z

@Stypox Do you think it is worth trying this API? I assume not, so I'd be fine landing this as-is once the conflicts are resolved.

rustbot · 2025-07-08T13:37:27Z

The Miri subtree was changed

cc @rust-lang/miri

Stypox · 2025-07-08T13:44:36Z

Yeah, I also think it's not worth it: since it takes the span by value, I would expect it to behave like point 4 in my analysis. I rebased and implemented enter_trace_span for MiriMachine, because the Miri tree was updated in the meantime.

@rustbot ready

RalfJung · 2025-07-08T13:55:44Z

I meant a variant of that that takes the span in a closure, since you demonstrated that that is indeed necessary.

But, I doubt it'll help so let's just
@bors r+ rollup

bors · 2025-07-08T13:55:48Z

📌 Commit e5f7d4d has been approved by RalfJung

It is now in the queue for this repository.

Rollup of 11 pull requests

Successful merges:

- #143177 (Remove false label when `self` resolve failure does not relate to macro)
- #143339 (Respect endianness correctly in CheckEnums test suite)
- #143426 (clippy fix: indentation)
- #143499 (Don't call `predicates_of` on a dummy obligation cause's body id)
- #143520 (Fix perf regression caused by tracing)
- #143532 (More carefully consider span context when suggesting remove `&mut`)
- #143606 (configure.py: Write last key in each section)
- #143632 (fix: correct parameter names in LLVMRustBuildMinNum and LLVMRustBuildMaxNum FFI declarations)
- #143644 (Add triagebot stdarch mention ping)
- #143651 (Win: Use exceptions with empty data for SEH panic exception copies instead of a new panic)
- #143660 (Disable docs for `compiler-builtins` and `sysroot`)

r? `@ghost`
`@rustbot` modify labels: rollup

Rollup of 9 pull requests

Successful merges:

- #142357 (Simplify LLVM bitcode linker in bootstrap and add tests for it)
- #143177 (Remove false label when `self` resolve failure does not relate to macro)
- #143339 (Respect endianness correctly in CheckEnums test suite)
- #143426 (clippy fix: indentation)
- #143475 (tests: Use `cfg_target_has_reliable_f16_f128` in `conv-bits-runtime-const`)
- #143499 (Don't call `predicates_of` on a dummy obligation cause's body id)
- #143520 (Fix perf regression caused by tracing)
- #143532 (More carefully consider span context when suggesting remove `&mut`)
- #143606 (configure.py: Write last key in each section)

r? `@ghost`
`@rustbot` modify labels: rollup

Kobzol · 2025-07-09T10:57:46Z

@rust-timer build 45ab0f5

Checking that the small regression from #143667 is caused by this.

RalfJung · 2025-07-09T11:22:26Z

Ah damn I shouldn't have marked this as rollup, sorry for that.

rust-timer · 2025-07-09T13:15:59Z

Finished benchmarking commit (45ab0f5): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.1%	[0.0%, 0.2%]	20
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.9%, -0.0%]	5
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -1.0%, secondary 3.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.4%	[1.4%, 1.4%]	1
Regressions ❌ (secondary)	4.2%	[1.2%, 7.0%]	14
Improvements ✅ (primary)	-3.4%	[-3.4%, -3.4%]	1
Improvements ✅ (secondary)	-2.5%	[-2.5%, -2.4%]	2
All ❌✅ (primary)	-1.0%	[-3.4%, 1.4%]	2

Cycles

Results (primary -2.6%, secondary -3.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.8%	[2.8%, 2.8%]	1
Improvements ✅ (primary)	-2.6%	[-3.0%, -2.1%]	4
Improvements ✅ (secondary)	-4.3%	[-10.3%, -2.0%]	7
All ❌✅ (primary)	-2.6%	[-3.0%, -2.1%]	4

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 465.667s -> 464.844s (-0.18%)
Artifact size: 372.26 MiB -> 374.50 MiB (0.60%)

RalfJung · 2025-07-09T13:18:39Z

Hm, that looks somewhat different from the last perf run we did.

I also have no idea how anything can be slowed down by this, so it might just be a bit of random inliner noise...

Kobzol · 2025-07-09T14:34:05Z

The regression isn't that bad, but since this PR claimed to "fix perf regression", and it seemingly is either the same or slightly worse, maybe we should revert it?

RalfJung · 2025-07-09T14:39:09Z

It helps with const-stress, and arguably it makes a lot more sense to put this span logic into a closure. Previous measurements came back slightly green in the median, this one is slightly red. I am inclined to keep it.

Kobzol · 2025-07-09T14:40:24Z

Sorry, I missed the CTFE benchmark. Ok, let's keep it, marking the rollup as triaged.

Kobzol · 2025-07-09T14:40:42Z

@rustbot label: +perf-regression-triaged

rustbot assigned RalfJung Jul 6, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 6, 2025

Stypox mentioned this pull request Jul 6, 2025

Always inline InterpCx::layout_of after perf regression #143334

Closed

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 6, 2025

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 6, 2025

Stypox added 5 commits July 8, 2025 15:23

Always inline InterpCx::layout_of after perf regression

c4bf37d

Replace TRACING_ENABLED with enter_trace_span()

07143af

Hopefully this will make tracing calls be optimized out properly when tracing is disabled

Add inline(always) to Machine::enter_trace_span

3cacaa7

Make enter_trace_span take a closure for better optimization

e8c8330

Implement enter_trace_span() in MiriMachine

e5f7d4d
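
The commit sequence above moves from a `TRACING_ENABLED` constant to a machine hook that receives a closure producing the span. A rough sketch of why the closure form may optimize better, using hypothetical stand-ins (`Span`, a heavily simplified `Machine` trait, `NoTracing`, `WithTracing`, `step`, and the `Option<Span>` return type) rather than the real rustc or `tracing` types: the default hook never calls the closure, so a machine without tracing never constructs the span, and with `#[inline(always)]` the whole call can plausibly be optimized away.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for `tracing::Span`; building one is assumed to be costly.
struct Span {
    _name: String,
}

/// Hypothetical, heavily simplified machine trait (not the real rustc `Machine`).
trait Machine {
    /// Takes a closure instead of a ready-made span. The default never calls it,
    /// so no span is constructed unless an implementation opts in.
    #[must_use]
    #[inline(always)]
    fn enter_trace_span(_span: impl FnOnce() -> Span) -> Option<Span> {
        None
    }
}

/// A machine that keeps the default, tracing-free hook.
struct NoTracing;
impl Machine for NoTracing {}

/// A machine that opts in and actually builds the span.
struct WithTracing;
impl Machine for WithTracing {
    #[inline(always)]
    fn enter_trace_span(span: impl FnOnce() -> Span) -> Option<Span> {
        Some(span())
    }
}

/// Runs one interpreter "step" and reports how many spans were constructed.
fn step<M: Machine>() -> u32 {
    let built = Cell::new(0);
    // The guard keeps the span (if any) alive until the end of this function.
    let _guard = M::enter_trace_span(|| {
        built.set(built.get() + 1);
        Span { _name: format!("step {}", built.get()) }
    });
    built.get()
}
```

In this sketch `step::<NoTracing>()` constructs no span, while `step::<WithTracing>()` constructs exactly one. With a by-value parameter, the `format!` call would run in both cases unless the optimizer can prove it has no observable effect.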

Stypox force-pushed the enter_trace_span-closure branch from 57aa88e to e5f7d4d Compare July 8, 2025 13:37

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 8, 2025

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jul 8, 2025

jhpratt mentioned this pull request Jul 9, 2025

Rollup of 11 pull requests #143664

Closed

tgross35 mentioned this pull request Jul 9, 2025

Rollup of 9 pull requests #143667

Merged

bors merged commit 00aa4e1 into rust-lang:master Jul 9, 2025
11 checks passed

rustbot added this to the 1.90.0 milestone Jul 9, 2025

rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 9, 2025
