cuda::is_trivially_copyable#8265
Conversation
|
I wanted to start a discussion for some time to add such a trait, but I would really keep this feature internally for now until it is sufficiently baked. I would really like this feature to make |
davebayer
left a comment
There was a problem hiding this comment.
I am not a fan of this trait. We bend C++ rules to fix poorly designed nvfp types. We would have to basically change every use of is_trivially_copyable and is_trivially_copy_constructible to this trait to make it work consistently in libcu++.
I don't think this is a good idea, we should rather insist of the nvfp types being fixed.
| Users may specialize ``cuda::is_trivially_copyable_relaxed`` for their own types whose memory representation is safe to copy | ||
| with ``memcpy`` but that the compiler does not consider trivially copyable. |
There was a problem hiding this comment.
not for the types that we care about. Said that, the user could provide an object that triggers UB. I can highlight this point in the documentation but we cannot do anything to explicitly prevent it.
Integrated and extended the tests that you point out. Everything works |
the actual issue is that the user can extend the trait to custom types. It needs to be public |
there are no plan for that. nvfp types are not trivially copyable for optimization purposes. |
This isn't going to happen and we shouldn't delude ourselves into thinking it ever will. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
| @@ -0,0 +1,323 @@ | |||
| //===----------------------------------------------------------------------===// | |||
There was a problem hiding this comment.
I'm inclined to remove this file because it is very similar with bit_cast.pass.cpp
This comment has been minimized.
This comment has been minimized.
| } // namespace cuda | ||
|
|
||
| ``cuda::is_trivially_copyable_v`` trait evaluates if a type can be copied by copying its underlying bytes. | ||
| It extends ``cuda::std::is_trivially_copyable`` to also recognize CUDA extended floating-point vector types as trivially copyable. |
There was a problem hiding this comment.
Should also mention whether it is legal for users to specialize the trait, and give an example of how to do that if so. Given the definition above it implies that users should specialize is_trivially_copyable_v instead of is_trivially_copyable, usually it is the other way around.
There was a problem hiding this comment.
specializing is_trivially_copyable_v could be too far. The only reason we are proving it is to overcome CUDA data type limitations. I don't see a good motivation for user to specialize it
There was a problem hiding this comment.
@miscco @davebayer let me know if you have a different opinion
There was a problem hiding this comment.
I don't see a good motivation for user to specialize it
Even so, I still think the definition snippet should show the "usual" way these traits are defined, i.e.
template <typename T>
struct my_trait : bool_constant<...> { };
template <typename T>
inline constexper bool my_trait_v = my_trait<T>::value;Even if that is not technically true, we don't want to give the impression that specializing the _v form is what actually matters.
| ``cuda::is_trivially_copyable_v`` trait evaluates if a type can be copied by copying its underlying bytes. | ||
| It extends ``cuda::std::is_trivially_copyable`` to also recognize CUDA extended floating-point vector types as trivially copyable. | ||
|
|
||
| ``cuda::is_trivially_copyable_v<T>`` relies on ``cuda::std::is_trivially_copyable`` but adds support for CUDA-specific types. |
There was a problem hiding this comment.
If it is legal to specialize, should elaborate whether the user should specialize cuda::std::is_trivially_copyable or cuda::is_trivially_copyable
There was a problem hiding this comment.
Important: can you add tests that the trait works even when it is user-specialized?
| ``cuda::is_trivially_copyable_v`` trait evaluates if a type can be copied by copying its underlying bytes. | ||
| It extends ``cuda::std::is_trivially_copyable`` to also recognize CUDA extended floating-point vector types as trivially copyable. | ||
|
|
||
| ``cuda::is_trivially_copyable_v<T>`` relies on ``cuda::std::is_trivially_copyable`` but adds support for CUDA-specific types. |
There was a problem hiding this comment.
Note: This sentence and the previous one seem a bit redundant
😬 CI Workflow Results🟥 Finished in 3h 26m: Pass: 97%/437 | Total: 19d 02h | Max: 3h 25m | Hits: 44%/599270See results here. |
Description
Followup of the discussion in
warp_shuffleoriginal behavior (revert #8210) #8254The PR introduces
cuda::is_trivially_copyable_relaxedto support types that are actually trivially copyable but not recognized by the C++std::is_trivially_copyabletype trait.The new trait supports:
cuda::std::array.cuda::std::pair.cuda::std::tuple.Potentially affected paths by
std::is_trivially_copyableunsupported cases:
cuda/std/__atomic/types/base.h:39cuda/std/__atomic/types/reference.h:39cuda/__memcpy_async/memcpy_async_barrier.h:61cuda/__memcpy_async/memcpy_async_tx.h:58cuda/__container/buffer.h:111cuda/__algorithm/copy.h:73cuda/__algorithm/fill.h:42cuda/std/string_view:147fallback to slower path:
cuda/std/__algorithm/copy.h:102cuda/std/__algorithm/copy_backward.h:50cuda/std/__algorithm/move.h:54cuda/std/__algorithm/move_backward.h:53cuda/std/__bit/bit_cast.h:55cub/device/dispatch/kernels/kernel_histogram.cuh:81cub/detail/uninitialized_copy.cuh:33thrust/type_traits/is_trivially_relocatable.h:213cudax/include/cuda/experimental/__kernel/kernel_ref.cuh:54Require #8347