Fix the ScatterD issue in predicated_tile_iterator #3278

Open
pengpeng-yu wants to merge 1 commit into NVIDIA:main from pengpeng-yu:fix-scatterd

@pengpeng-yu

Summary

This PR fixes the remaining unguarded ScatterD pointer updates in PredicatedTileIterator, in load_with_byte_offset() and operator+=(). This change:

  • Guards increment_group and increment_cluster in load_with_byte_offset() with !ScatterD
  • Updates operator+=() to follow the same pointer-advance rules as operator++()
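To illustrate why an unguarded group advance matters when ScatterD is enabled, here is a minimal host-side sketch. It is a toy model, not the CUTLASS implementation: `run_toy_epilogue`, `count_mismatch`, the `GuardGroupAdvance` flag, and the row-granular "pointer" are all hypothetical names chosen for illustration. The assumption is that in scatter mode the target row comes entirely from the index array, so any extra advance of the base pointer shifts later writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, NOT CUTLASS code): writes the value (logical_row + 1)
// to the physical row selected by a row-granular "pointer".
// ScatterD          : target row = pointer + indices[logical_row]
// GuardGroupAdvance : when true, the group advance is skipped in scatter mode
template <bool ScatterD, bool GuardGroupAdvance>
std::vector<int> run_toy_epilogue(int groups, int rows_per_group,
                                  const std::vector<int>& indices, int out_rows) {
  assert(!ScatterD ||
         indices.size() >= static_cast<std::size_t>(groups) * rows_per_group);
  std::vector<int> out(static_cast<std::size_t>(out_rows), 0);
  std::ptrdiff_t pointer_row = 0;  // stands in for the iterator's byte pointer

  for (int g = 0; g < groups; ++g) {
    for (int r = 0; r < rows_per_group; ++r) {
      const int logical = g * rows_per_group + r;
      const std::ptrdiff_t target =
          ScatterD ? pointer_row + indices[static_cast<std::size_t>(logical)]
                   : pointer_row;
      if (target >= 0 && target < out_rows) {
        out[static_cast<std::size_t>(target)] = logical + 1;
      }
      // Row advance: only meaningful when rows are contiguous (no scatter).
      if (r + 1 < rows_per_group && !ScatterD) {
        pointer_row += 1;
      }
    }
    // Group advance: the step this sketch either guards or leaves unguarded.
    if (g + 1 < groups) {
      if (!ScatterD || !GuardGroupAdvance) {
        pointer_row += 1;
      }
    }
  }
  return out;
}

// Counts element-wise differences; a size difference counts as mismatches too.
std::size_t count_mismatch(const std::vector<int>& got, const std::vector<int>& want) {
  const std::size_t common = got.size() < want.size() ? got.size() : want.size();
  std::size_t n = (got.size() > want.size() ? got.size() : want.size()) - common;
  for (std::size_t i = 0; i < common; ++i) {
    if (got[i] != want[i]) ++n;
  }
  return n;
}
```

In this toy, `run_toy_epilogue<true, true>` places every row exactly where the index array says, while `run_toy_epilogue<true, false>` offsets every row after the first group. The same reasoning would apply to any other path that advances the pointer, such as a multi-step advance.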

Fixes #3101.

Validation

I validated the fix with a minimal local repro (bug_repro.zip) that uses:

  • CUTLASS GEMM 2.x API
  • SIMT core
  • gather A + GEMM + scatter D
  • int8 x int8 -> int32

Before the fix:

num_mismatch: 8

After the fix:

num_mismatch: 0
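For readers without the attachment, a mismatch count like the one above could be produced by comparing the device output against a host reference. The sketch below is an assumption about how such a check might look, not the contents of bug_repro.zip: the function names, the row-major layouts, and the semantics `D[scatter_idx[i], :] = A[gather_idx[i], :] * B` are all hypothetical choices for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host reference for gather A + GEMM + scatter D (int8 x int8 -> int32).
// A is (rows_a x K), B is (K x N), D is (rows_d x N), all row-major.
// Row i of the product uses A row gather_idx[i] and lands in D row scatter_idx[i].
std::vector<int32_t> reference_gather_gemm_scatter(
    const std::vector<int8_t>& A, const std::vector<int8_t>& B,
    const std::vector<int>& gather_idx, const std::vector<int>& scatter_idx,
    int K, int N, int rows_d) {
  std::vector<int32_t> D(static_cast<std::size_t>(rows_d) * N, 0);
  for (std::size_t i = 0; i < gather_idx.size(); ++i) {
    const std::size_t a_row = static_cast<std::size_t>(gather_idx[i]);
    const std::size_t d_row = static_cast<std::size_t>(scatter_idx[i]);
    for (int n = 0; n < N; ++n) {
      int32_t acc = 0;  // accumulate in 32 bits, as in the int8 -> int32 setup
      for (int k = 0; k < K; ++k) {
        acc += static_cast<int32_t>(A[a_row * K + k]) *
               static_cast<int32_t>(B[static_cast<std::size_t>(k) * N + n]);
      }
      D[d_row * N + n] = acc;
    }
  }
  return D;
}

// Counts positions where the two outputs differ; extra elements count as mismatches.
std::size_t count_mismatches(const std::vector<int32_t>& got,
                             const std::vector<int32_t>& want) {
  const std::size_t common = got.size() < want.size() ? got.size() : want.size();
  std::size_t n = (got.size() > want.size() ? got.size() : want.size()) - common;
  for (std::size_t i = 0; i < common; ++i) {
    if (got[i] != want[i]) ++n;
  }
  return n;
}
```

A check of this kind would copy the kernel's D back to the host and report `count_mismatches(device_D, reference_D)`.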

Development

Successfully merging this pull request may close these issues.

[BUG] ScatterD issue in predicated_tile_iterator with unguarded pointer updates in load_with_byte_offset and operator+=