[QST] Memory Ordering of atomicAdd and tma_reduce_add #3282

@kindwyf

What is your question?
I've been studying the CUTLASS source code and have a question. When multiple threads in a single block call atomicAdd on the same shared memory address, is the order in which the additions are applied deterministic? This matters because floating-point addition is not associative, so a different order can produce a different result. Additionally, does tma_reduce_add guarantee a fixed accumulation order? Thank you.
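
To make the concern concrete, here is a minimal host-side C++ sketch of why the order matters. It is not CUTLASS or device code, and `sum_in_order` is a hypothetical helper written only for this illustration. It says nothing about how atomicAdd or tma_reduce_add actually order their updates; it only shows that the same three floats can sum to different values depending on the order.

```cpp
#include <cstddef>

// Hypothetical helper (not part of CUTLASS): adds the values strictly
// left to right, the way a sequence of single-address accumulations would
// if they happened to be applied in this order.
float sum_in_order(const float* values, std::size_t n) {
    // volatile forces each partial sum to be rounded to float, so the
    // result does not depend on extended-precision registers.
    volatile float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc = acc + values[i];
    }
    return acc;
}

// Same three values, two different arrival orders.
// Near 1.0e8 the spacing between adjacent floats is 8, so adding 1.0f
// to 1.0e8f rounds back to 1.0e8f and the small term is lost.
static const float order_a[3] = {1.0e8f, -1.0e8f, 1.0f};  // (big - big) + 1
static const float order_b[3] = {1.0e8f, 1.0f, -1.0e8f};  // (big + 1) - big
```

Under IEEE 754 single precision with default rounding (and without flags such as `-ffast-math`), `sum_in_order(order_a, 3)` gives 1 while `sum_in_order(order_b, 3)` gives 0. If the hardware applies concurrent additions in an unspecified order, the same kind of run-to-run difference could appear in the accumulated result.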