fix: handle ComposedLayout slicing with dynamic strides (fixes #3255) #3261

Open
zhils wants to merge 1 commit into NVIDIA:main from zhils:fix/dynamic-composed-layout-slice-3255

@zhils zhils commented May 22, 2026

When slicing a Tensor whose layout is a ComposedLayout (e.g., Swizzle composed with a dynamic outer Layout), _cute_ir.slice fails because the Swizzle composition requires static stride analysis that cannot be performed with runtime dynamic strides (represented as '?' / unknown values).

The fix intercepts this case in _Tensor.__getitem__ and decomposes the ComposedLayout manually in the Python DSL layer:

  1. Extract inner (Swizzle), offset, and outer (Layout) from the ComposedLayout
  2. Slice only the outer Layout using slice_ (pure Layout slicing works with dynamic strides)
  3. Compute the offset delta via crd2idx(coord_with_none_as_zero, outer)
  4. Rebuild the ComposedLayout with the new offset and sliced outer
  5. Construct a new Tensor with the original iterator

This avoids passing the composed layout with dynamic strides to the MLIR slice operation, which triggers an MLIR verifier error and segfault.
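
The decomposition steps can be sketched in plain Python, independent of the CuTe DSL. This is a toy model and not the code in this PR: `Layout`, `ComposedLayout`, `swizzle`, `crd2idx`, `slice_`, and `slice_composed` below are simplified stand-ins written for illustration, restricted to flat (non-hierarchical) shapes, with ordinary integers standing in for dynamic strides. It only aims to show that slicing the outer layout and folding the fixed coordinates into the offset should address the same elements as the unsliced composed layout.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Coord = Tuple[Optional[int], ...]


@dataclass(frozen=True)
class Layout:
    """Toy flat layout: one stride per mode (hypothetical stand-in)."""
    shape: Tuple[int, ...]
    stride: Tuple[int, ...]


def crd2idx(coord: Tuple[int, ...], layout: Layout) -> int:
    """Inner product of a fully specified coordinate with the strides."""
    return sum(c * s for c, s in zip(coord, layout.stride))


def slice_(layout: Layout, coord: Coord) -> Layout:
    """Keep only the modes whose coordinate is None (the free modes)."""
    kept = [(n, s) for c, n, s in zip(coord, layout.shape, layout.stride)
            if c is None]
    return Layout(tuple(n for n, _ in kept), tuple(s for _, s in kept))


def swizzle(bits: int, base: int, shift: int) -> Callable[[int], int]:
    """XOR-based swizzle for a positive shift (toy version)."""
    mask = ((1 << bits) - 1) << (base + shift)
    return lambda x: x ^ ((x & mask) >> shift)


@dataclass(frozen=True)
class ComposedLayout:
    """Toy composed layout: inner(offset + outer(coord))."""
    inner: Callable[[int], int]
    offset: int
    outer: Layout

    def __call__(self, coord: Tuple[int, ...]) -> int:
        return self.inner(self.offset + crd2idx(coord, self.outer))


def slice_composed(layout: ComposedLayout, coord: Coord) -> ComposedLayout:
    # Step 1: take the composed layout apart.
    inner, offset, outer = layout.inner, layout.offset, layout.outer
    # Step 2: slice only the outer layout.
    sliced_outer = slice_(outer, coord)
    # Step 3: offset delta from the fixed coordinates, with None treated as 0.
    zeroed = tuple(0 if c is None else c for c in coord)
    delta = crd2idx(zeroed, outer)
    # Step 4: rebuild with the new offset; the inner function is untouched.
    return ComposedLayout(inner, offset + delta, sliced_outer)


# Check: fixing row 3 of an 8x64 tile addresses the same elements either way.
full = ComposedLayout(swizzle(3, 3, 3), 0, Layout((8, 64), (64, 1)))
row = slice_composed(full, (3, None))
assert all(row((j,)) == full((3, j)) for j in range(64))
```

In this model the base address never changes, which mirrors step 5: the sliced tensor can keep the original iterator because the shift is carried entirely by the layout offset, inside the swizzle.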
