Problem Description
Example:
https://github.com/ROCm/aiter/blob/2fad199eebc07179de0b3e2d661122baf241400b/aiter/ops/flydsl/kernels/attn_reduce.py#L434-L440
if work_idx < total_wg_cnt:
    continue_flag = main_loop(work_idx)
    if continue_flag:
        work_idx += gpu.grid_dim("x")
        while work_idx < total_wg_cnt:
            # ...
The outer statement "if work_idx < total_wg_cnt:" sets ReplaceIfWithDispatch to True, so the nested loop "while work_idx < total_wg_cnt" does not get rewritten.
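To illustrate the suspected mechanism, here is a minimal sketch built only on Python's standard ast module. It is not the FlyDSL rewriter: the class names, the replace_if_with_dispatch attribute (a stand-in for ReplaceIfWithDispatch), the step variable, and the counting logic are all hypothetical. It assumes the flag is set when an if is visited and is still set when a nested while is reached.

```python
import ast

# Same shape as the kernel snippet; `step` replaces gpu.grid_dim("x") for brevity.
SOURCE = """
if work_idx < total_wg_cnt:
    continue_flag = main_loop(work_idx)
    if continue_flag:
        work_idx += step
        while work_idx < total_wg_cnt:
            work_idx += step
"""


class LeakyRewriter(ast.NodeTransformer):
    """Hypothetical pass: a flag set while visiting an If suppresses While rewriting."""

    def __init__(self):
        self.replace_if_with_dispatch = False  # stand-in for ReplaceIfWithDispatch
        self.rewritten_whiles = 0

    def visit_If(self, node):
        # The flag is set on entry and never scoped or reset.
        self.replace_if_with_dispatch = True
        self.generic_visit(node)
        return node

    def visit_While(self, node):
        self.generic_visit(node)
        if not self.replace_if_with_dispatch:
            # A real pass would replace the loop here; this sketch only counts it.
            self.rewritten_whiles += 1
        return node


class ScopedRewriter(LeakyRewriter):
    """Variant that ignores the enclosing If's flag while handling a While."""

    def visit_While(self, node):
        saved = self.replace_if_with_dispatch
        self.replace_if_with_dispatch = False
        try:
            return super().visit_While(node)
        finally:
            self.replace_if_with_dispatch = saved


def count_rewritten(rewriter_cls):
    """Run the given rewriter over SOURCE and return how many loops it handled."""
    rewriter = rewriter_cls()
    rewriter.visit(ast.parse(SOURCE))
    return rewriter.rewritten_whiles


if __name__ == "__main__":
    print("leaky :", count_rewritten(LeakyRewriter))
    print("scoped:", count_rewritten(ScopedRewriter))
```

In this sketch the leaky variant skips the nested loop and the scoped variant handles it. Saving and restoring the flag around the while is only one possible direction; whether it matches the intended semantics of ReplaceIfWithDispatch needs to be confirmed against the actual rewriter.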
Operating System
all
CPU
all
GPU
all
ROCm Version
all
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response