Use 128-bit Widening Multiply on More Platforms #62
Merged
WaffleLapkin merged 1 commit into rust-lang:master on Aug 6, 2025
scottmcm
reviewed
Jul 7, 2025
The 128-bit widening multiplication was previously gated by simply checking the target pointer width. This works as a simple heuristic, but a better heuristic can be used:

1. Most 64-bit architectures except SPARC64 and Wasm64 support the 128-bit widening multiplication, so it shouldn't be used on those two architectures.
2. The target pointer width doesn't always indicate that we are dealing with a 64-bit architecture, as there are ABIs that reduce the pointer width, especially on AArch64 and x86-64.
3. WebAssembly (regardless of pointer width) supports 64-bit to 128-bit widening multiplication with the `wide-arithmetic` proposal.

The `wide-arithmetic` proposal is available since the LLVM 20 update and works perfectly for this use case as can be seen here: https://rust.godbolt.org/z/9jY7fxqxK

Using `wasmtime explore`, we can see it compiles down to the ideal instructions on x86-64:

```nasm
mulx rax, rdx, r10
xor rax, rdx
```

Based on the same change in [`foldhash`](orlp/foldhash#17).
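The three points above can be sketched as a single `cfg!` condition. This is only an illustration of the heuristic as described, not necessarily the exact condition merged in this PR. The function names `folded_multiply` and `folded_multiply_portable` are hypothetical, and the portable fallback is an assumption chosen so that both paths return the same value; a real hasher may prefer a cheaper fallback that does not match bit for bit.

```rust
/// Hypothetical helper: multiplies two 64-bit values and XORs the low and
/// high halves of the 128-bit product together.
#[inline]
fn folded_multiply(x: u64, y: u64) -> u64 {
    if cfg!(any(
        // Point 1: 64-bit targets, except the two without a fast widening multiply.
        all(
            target_pointer_width = "64",
            not(any(target_arch = "sparc64", target_arch = "wasm64")),
        ),
        // Point 2: ABIs with reduced pointer width still run on 64-bit hardware.
        target_arch = "aarch64",
        target_arch = "x86_64",
        // Point 3: WebAssembly with the wide-arithmetic proposal enabled.
        all(target_family = "wasm", target_feature = "wide-arithmetic"),
    )) {
        // u64 x u64 always fits in u128, so this never actually wraps.
        let full = (x as u128).wrapping_mul(y as u128);
        (full as u64) ^ ((full >> 64) as u64)
    } else {
        folded_multiply_portable(x, y)
    }
}

/// Hypothetical fallback: builds the same 128-bit product from four
/// 32 x 32 -> 64 bit partial products, then folds it the same way.
fn folded_multiply_portable(x: u64, y: u64) -> u64 {
    let (lx, hx) = (x & 0xFFFF_FFFF, x >> 32);
    let (ly, hy) = (y & 0xFFFF_FFFF, y >> 32);
    let (ll, lh, hl, hh) = (lx * ly, lx * hy, hx * ly, hx * hy);
    // Sum of the three terms that land in bits 32..64, including carries.
    let mid = (ll >> 32) + (lh & 0xFFFF_FFFF) + (hl & 0xFFFF_FFFF);
    let lo = (ll & 0xFFFF_FFFF) | (mid << 32);
    let hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    lo ^ hi
}
```

Because `cfg!` expands to a compile-time boolean, the branch that is not taken should be removed by the optimizer, so the check itself is not expected to cost anything at run time.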
146ff74 to 6849c16
WaffleLapkin
reviewed
Aug 6, 2025
```diff
  // We compute the full u64 x u64 -> u128 product, this is a single mul
  // instruction on x86-64, one mul plus one mulhi on ARM64.
- let full = (x as u128) * (y as u128);
+ let full = (x as u128).wrapping_mul(y as u128);
```
See orlp/foldhash#16 for why this change was applied
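The rationale for the switch is in the linked issue. Independently of that, a small sketch can show that the two forms compute the same value here: the product of two `u64` values always fits in a `u128`, so `wrapping_mul` never actually wraps, while the plain `*` operator carries an overflow check when overflow checks are enabled. The helper names `widening_mul` and `products_agree` are hypothetical.

```rust
/// Hypothetical helper: returns the (low, high) 64-bit halves of x * y.
fn widening_mul(x: u64, y: u64) -> (u64, u64) {
    let full = (x as u128).wrapping_mul(y as u128);
    (full as u64, (full >> 64) as u64)
}

/// Hypothetical helper: true if the checked product exists and equals the
/// wrapping product, i.e. no wrap-around occurred for this pair of inputs.
fn products_agree(x: u64, y: u64) -> bool {
    let wrapping = (x as u128).wrapping_mul(y as u128);
    (x as u128).checked_mul(y as u128) == Some(wrapping)
}
```

Even the largest inputs stay in range, since `u64::MAX * u64::MAX` is still below `2^128`.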
WaffleLapkin
approved these changes
Aug 6, 2025