Rollup of 4 pull requests#155793
Closed
JonathanBrouwer wants to merge 9 commits intorust-lang:mainfrom
Closed
Conversation
Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferencable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferencable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu an nvptx, so users have to pass this out-of-band or rely on target specific ways for now.
…useZ4,Sa4dus,workingjubilee,RalfJung,nikic,kjetilkjeka,kulst Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferencable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferencable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu an nvptx, so users have to pass this out-of-band or rely on target specific ways for now. Tracking issue: rust-lang#135516
…, r=JonathanBrouwer Error on invalid macho section specifier The macho section specifier used by `#[link_section = "..."]` is more strict than e.g. the one for elf. LLVM will error when you get it wrong, which is easy to do if you're used to elf. So, provide some guidance for the simplest mistakes, based on the LLVM validation. Currently compilation fails with an LLVM error, see https://godbolt.org/z/WoE8EdK1K. The LLVM validation logic is at https://github.com/llvm/llvm-project/blob/a0f0d6342e0cd75b7f41e0e6aae0944393b68a62/llvm/lib/MC/MCSectionMachO.cpp#L199-L203 LLVM validates the other components of the section specifier too, but it feels a bit fragile to duplicate those checks. If you get that far, hopefully the LLVM errors will be sufficient to get unstuck. --- sidequest from rust-lang#147811 r? JonathanBrouwer specifically, is this the right place for this sort of validation? `rustc_attr_parsing` also does some validation.
…uct, r=fee1-dead Reject implementing const Drop for types that are not const `Destruct` already fixes rust-lang#155618 While there is no soundness or otherwise issue currently, this PR ensures that people get what they expect. It seems wrong to allow implementing `const Drop`, but then the type still can't be dropped at compile-time. r? @fee1-dead
…ggestions, r=JonathanBrouwer Do not suggest internal cfg trace attributes Fixes rust-lang#150566.
Contributor
Author
|
@bors r+ rollup=never p=5 |
Contributor
Contributor
|
⌛ Testing commit 0258ca6 with merge fd0384b... Workflow: https://github.com/rust-lang/rust/actions/runs/24940589241 |
rust-bors Bot
pushed a commit
that referenced
this pull request
Apr 25, 2026
…uwer Rollup of 4 pull requests Successful merges: - #146181 (Add intrinsic for launch-sized workgroup memory on GPUs) - #155065 (Error on invalid macho section specifier) - #155676 ( Reject implementing const Drop for types that are not const `Destruct` already) - #155783 (Do not suggest internal cfg trace attributes)
Contributor
Author
|
@bors yield |
Contributor
|
Auto build was cancelled. Cancelled workflows: The next pull request likely to be tested is #155793. |
Contributor
|
This pull request was unapproved due to being closed. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Successful merges:
Destructalready #155676 ( Reject implementing const Drop for types that are not constDestructalready)r? @ghost
Create a similar rollup