Manually implement PartialEq for Option<T> and specialize non-nullable types by clubby789 · Pull Request #103556 · rust-lang/rust

clubby789 · 2022-10-26T01:10:24Z

This PR manually implements PartialEq and StructuralPartialEq for Option, which seems to produce slightly better codegen than the automatically derived implementation.
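For readers unfamiliar with what a hand-written implementation looks like, here is a minimal sketch on a stand-in type. `MyOption`, `Nothing` and `Just` are hypothetical names invented for illustration, and this is not the code from the PR. `StructuralPartialEq` is left out because it is an unstable marker trait that cannot be implemented on stable Rust.

```rust
/// Hypothetical stand-in for `Option<T>`, used only for illustration.
#[derive(Debug, Clone, Copy)]
pub enum MyOption<T> {
    Nothing,
    Just(T),
}

// Hand-written equality instead of `#[derive(PartialEq)]`.
impl<T: PartialEq> PartialEq for MyOption<T> {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            // Both sides hold a value: defer to the payload's own `==`.
            (MyOption::Just(l), MyOption::Just(r)) => l == r,
            // Both sides are empty: equal.
            (MyOption::Nothing, MyOption::Nothing) => true,
            // One side empty, the other not: never equal.
            _ => false,
        }
    }
}
```

Whether a match written this way compiles to better machine code than the derive depends on the compiler version, so it is worth checking the output rather than assuming.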

It also allows specializing on the core::num::NonZero* and core::ptr::NonNull types, taking advantage of the niche optimization by transmuting the Option<T> to T to be compared directly, which can be done in just two instructions.

A comparison of the original, new and specialized code generation is available here (https://godbolt.org/z/dE4jxdYsa).
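The niche idea can be sketched on stable Rust for one concrete type. The helper name `eq_via_bits` is hypothetical, and unlike the description above it transmutes to the plain integer `u32` rather than to `T`, on the assumption that this sidesteps any question about `None` not being a valid `NonZeroU32`. It is an illustration of the layout trick, not the PR's implementation.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU32;

// `Option<NonZeroU32>` stores `None` in the one bit pattern that
// `NonZeroU32` forbids, so it is exactly as large as a `u32`.
const _: () = assert!(size_of::<Option<NonZeroU32>>() == size_of::<u32>());

/// Hypothetical helper: compares two options by their raw bits.
pub fn eq_via_bits(a: Option<NonZeroU32>, b: Option<NonZeroU32>) -> bool {
    // SAFETY: source and destination have the same size (checked above),
    // and every 4-byte bit pattern is a valid `u32`.
    let a_bits = unsafe { transmute::<Option<NonZeroU32>, u32>(a) };
    let b_bits = unsafe { transmute::<Option<NonZeroU32>, u32>(b) };
    // `None` and each `Some(n)` map to distinct integers, so integer
    // equality agrees with `Option` equality.
    a_bits == b_bits
}
```

For any pair of inputs, `eq_via_bits(a, b)` should agree with `a == b`; the point of the sketch is only that the comparison needs no branch on the discriminant.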

rustbot · 2022-10-26T01:10:27Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

rust-highfive · 2022-10-26T01:10:28Z

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

compiler-errors · 2022-10-26T01:18:40Z

perf run was requested @bors try @rust-timer queue

rust-timer · 2022-10-26T01:18:42Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-26T01:18:53Z

⌛ Trying commit d0a58800564093f4b6db3ae0b5f76547c7564f4b with merge 05058ddce07865ec73eedb7c8d2cddd50d96d959...

thomcc

This is a pretty significant codegen win, although I'm surprised we don't get this already. Needs a fix for some UB, tho (fixing the UB in godbolt still produces the codegen win).

library/core/src/option.rs

compiler-errors · 2022-10-26T01:29:33Z

@bors try @rust-timer queue

rust-timer · 2022-10-26T01:29:35Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-26T01:29:45Z

⌛ Trying commit a8140e59b80fae0c3b27e3a1d6e0b176dd5fb757 with merge 8032e517bc4d1e2309051ba99ef9c8beaff83a82...

clubby789 · 2022-10-26T01:58:10Z

Made the NonNull test support ptr or i8* since CI was producing different results to my local build.

library/core/src/option.rs

compiler-errors · 2022-10-26T03:34:18Z

@bors try @rust-timer queue

rust-timer · 2022-10-26T03:34:20Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-26T03:34:30Z

⌛ Trying commit 5ed28fed2a0b927a267da146d51ec99bce8bc92f with merge a041a05c3184bb8c38b8940422e8951e99b6d3f1...

library/core/src/option.rs

Noratrieb · 2022-10-26T05:31:35Z

rustc internally uses rustc_scalar_valid attributes in its index macro. Would it make sense to apply this specialization for rustc indices as well? I don't think options are compared very often but it could be a win nevertheless.

bors · 2022-10-26T05:39:07Z

☀️ Try build successful - checks-actions
Build commit: a041a05c3184bb8c38b8940422e8951e99b6d3f1 (a041a05c3184bb8c38b8940422e8951e99b6d3f1)

rust-timer · 2022-10-26T05:39:09Z

Queued a041a05c3184bb8c38b8940422e8951e99b6d3f1 with parent 6365e5a, future comparison URL.

scottmcm · 2022-10-26T05:54:06Z

I went to try to see if there's anything we could do to make LLVM understand this, and realized that right now we're shooting ourselves in the foot: https://rust.godbolt.org/z/Ye5xr8P8x

What's PartialEq for NonZero doing right now? Well, apparently it's derived, and whatever's going on with the derive, it has no range information:

use std::num::NonZeroU32;

pub fn demo_std(x: &NonZeroU32, y: &NonZeroU32) -> bool {
    x == y
}

define noundef zeroext i1 @_ZN7example8demo_std17hcce6db1e74f1c1d4E(ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %0, ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %1) unnamed_addr #0 {
  %_9 = load i32, ptr %0, align 4
  %_10 = load i32, ptr %1, align 4
  %2 = icmp eq i32 %_9, %_10
  ret i1 %2
}

Whereas if you write the obvious implementation yourself

pub fn demo_obvious(x: &NonZeroU32, y: &NonZeroU32) -> bool {
    x.get() == y.get()
}

Then the loads get the !range metadata saying that it's nonzero

define noundef zeroext i1 @_ZN7example12demo_obvious17haee70b6eb73f133dE(ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %x, ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %y) unnamed_addr #0 {
  %self = load i32, ptr %x, align 4, !range !2, !noundef !3
  %self1 = load i32, ptr %y, align 4, !range !2, !noundef !3
  %0 = icmp eq i32 %self, %self1
  ret i1 %0
}

!2 = !{i32 1, i32 0}

It's possible that LLVM still might not be able to optimize this even with that for other reasons (#49572 (comment)), but I think we should at least find out whether giving LLVM the obvious information would be enough to let it make this transform -- it would be great if we could solve this in the NonZero code or in the rustc_layout_scalar_valid_range_start code and thus not need to specialize every use.

EDIT: Oh, if nikic already looked then there's probably no easy fix.
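As a rough illustration of the "obvious implementation" idea applied at the type level, the sketch below gives a wrapper type a hand-written `PartialEq` that goes through `get()`. The names `Id` and `demo_option` are hypothetical, and nothing here is taken from the standard library's source. Whether this changes the generated code for `Option` comparisons depends on the compiler and LLVM version, so treat it as something to try in a compiler explorer rather than a known fix.

```rust
use std::num::NonZeroU32;

/// Hypothetical newtype around `NonZeroU32`, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct Id(NonZeroU32);

// Hand-written equality that mirrors `demo_obvious`: compare the
// integers returned by `get()` instead of relying on the derive.
impl PartialEq for Id {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        self.0.get() == other.0.get()
    }
}

/// Compares two optional ids through `Option`'s own `PartialEq`,
/// which in turn calls the implementation above for `Some` values.
pub fn demo_option(x: &Option<Id>, y: &Option<Id>) -> bool {
    x == y
}
```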

lukas-code · 2022-10-26T07:54:41Z

cc #49892

clubby789 · 2022-10-29T01:35:28Z

@rustbot ready

compiler/rustc_ast/src/lib.rs

bors · 2022-10-31T16:26:26Z

☔ The latest upstream changes (presumably #103797) made this pull request unmergeable. Please resolve the merge conflicts.

scottmcm · 2022-11-25T07:50:38Z

Thanks! It's great that this worked out without adding any unsafe!

@bors r+

scottmcm · 2022-11-26T06:56:51Z

Weird, let's try that again

@bors r+

bors · 2022-11-26T06:56:53Z

📌 Commit b9a95d8 has been approved by scottmcm

It is now in the queue for this repository.

bors · 2022-11-26T08:56:23Z

⌛ Testing commit b9a95d8 with merge 8841bee...

bors · 2022-11-26T12:11:26Z

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing 8841bee to master...

rust-timer · 2022-11-26T13:30:25Z

Finished benchmarking commit (8841bee): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.1%	[2.1%, 2.1%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.4%, -0.2%]	2
Improvements ✅ (secondary)	-0.3%	[-0.4%, -0.3%]	2
All ❌✅ (primary)	0.5%	[-0.4%, 2.1%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.0%	[3.0%, 3.0%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-5.1%	[-5.1%, -5.1%]	1
All ❌✅ (primary)	3.0%	[3.0%, 3.0%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[1.8%, 1.8%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.8%	[1.8%, 1.8%]	1

nnethercote · 2022-11-27T22:13:12Z

Perf changes are few, tiny, and not a concern.

@rustbot label: +perf-regression-triaged

…eq, r=scottmcm Manually implement PartialEq for Option<T> and specialize non-nullable types This PR manually implements `PartialEq` and `StructuralPartialEq` for `Option`, which seems to produce slightly better codegen than the automatically derived implementation. It also allows specializing on the `core::num::NonZero*` and `core::ptr::NonNull` types, taking advantage of the niche optimization by transmuting the `Option<T>` to `T` to be compared directly, which can be done in just two instructions. A comparison of the original, new and specialized code generation is available [here](https://godbolt.org/z/dE4jxdYsa).

rust-highfive assigned m-ou-se Oct 26, 2022

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 26, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 26, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 26, 2022

thomcc requested changes Oct 26, 2022

library/core/src/option.rs (outdated, resolved)

Rageking8 reviewed Oct 26, 2022

library/core/src/option.rs (outdated, resolved)

scottmcm reviewed Oct 26, 2022

library/core/src/option.rs (outdated, resolved)

scottmcm mentioned this pull request Oct 26, 2022

Niches rust-lang/rfcs#3334

Closed

clubby789 force-pushed the specialize-option-partial-eq branch from b6b33c2 to a1b650c Compare October 27, 2022 12:46

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 29, 2022

compiler-errors reviewed Oct 29, 2022

compiler/rustc_ast/src/lib.rs (outdated, resolved)

clubby789 added 2 commits October 31, 2022 16:43

Specialize PartialEq for Option<num::NonZero*> and Option<ptr::NonNull>

8e8fd02

Specialize PartialEq for Option<newtype>

20f2d8b

clubby789 force-pushed the specialize-option-partial-eq branch from a1b650c to 20f2d8b Compare October 31, 2022 16:44

Use allow_internal_unstable and add unstable reason

b9a95d8

scottmcm approved these changes Nov 25, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 26, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 26, 2022

bors merged commit 8841bee into rust-lang:master Nov 26, 2022

rustbot added this to the 1.67.0 milestone Nov 26, 2022

rustbot added the perf-regression Performance regression. label Nov 26, 2022

rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 27, 2022

clubby789 deleted the specialize-option-partial-eq branch February 11, 2023 14:43

matthiaskrgr mentioned this pull request Jul 27, 2023

ICE: const evaluatable failed for non-unevaluated const #114151

Closed

matthiaskrgr mentioned this pull request Oct 11, 2023

ICE with generic_const_exprs: expected bits of usize, got (Sub: 2_usize, 1_usize): usize #116637

Open

matthiaskrgr mentioned this pull request Sep 1, 2024

ICE: Invalid 'Const' during codegen: UnevaluatedConst {..} #129857

Open

16 participants