
Fix: Scaled Matmul at rocm 9.0 #739

Open
shurale-nkn wants to merge 1 commit into rocm-jaxlib-v0.9.0 from fix_scaled_matmul_dot_test_0.9.0

Conversation


@shurale-nkn shurale-nkn commented Mar 17, 2026

This PR fixes the ScaledMatmul and ScaledDot tests failing on ROCm with:
MLIR translation rule for primitive 'scaled_matmul' not found for platform rocm

On ROCm, we now register a dedicated lowering for scaled_matmul that takes a composite approach: instead of relying on a direct fused kernel, it lowers through lax.scaled_dot.
The resulting HLO can then be fused later by XLA, which matches the intended backend behavior and avoids the missing primitive translation path on ROCm.
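As background (this is not the PR's code), a scaled matmul contracts low-precision operands together with per-block scale factors. A minimal NumPy sketch of the reference semantics that the composite lowering preserves, with an illustrative block size and illustrative names:

```python
import numpy as np

def scaled_matmul_reference(a, b, a_scales, b_scales, block=32):
    """Reference semantics of a block-scaled matmul (illustrative).

    a: (M, K) values with a_scales: (M, K // block),
    b: (K, N) values with b_scales: (K // block, N).
    Each scale applies to a contiguous block of `block` elements
    along the contraction axis K.
    """
    # Dequantize: broadcast each scale over its K-block.
    a_deq = a * np.repeat(a_scales, block, axis=1)
    b_deq = b * np.repeat(b_scales, block, axis=0)
    # Plain matmul on the dequantized operands; a fused backend kernel
    # would produce the same result without materializing a_deq/b_deq.
    return a_deq @ b_deq

M, K, N, block = 4, 64, 8, 32
rng = np.random.default_rng(0)
a = rng.standard_normal((M, K)).astype(np.float32)
b = rng.standard_normal((K, N)).astype(np.float32)
a_scales = np.ones((M, K // block), dtype=np.float32)
b_scales = np.ones((K // block, N), dtype=np.float32)
out = scaled_matmul_reference(a, b, a_scales, b_scales, block)
# With unit scales this reduces to a plain matmul.
assert np.allclose(out, a @ b)
```

Whether the contraction is computed by a fused kernel (the CUDA path) or by composite HLO that XLA fuses afterwards (the ROCm path here), the numerics above are what both must implement.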

What changed

  • Implemented the ROCm lowering by delegating to lax.scaled_dot.
  • Kept the existing CUDA lowering unchanged.
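The registration pattern behind these two bullets can be sketched in plain Python. This is an illustrative stand-in, not JAX's actual machinery (JAX's real hook is `jax.interpreters.mlir.register_lowering`, which accepts a `platform` argument); the point is that one primitive can carry a direct rule on one platform and a composite rule on another:

```python
# Illustrative per-platform lowering registry (hypothetical names).
_lowerings = {}  # (primitive_name, platform) -> lowering rule

def register_lowering(prim, rule, platform):
    _lowerings[(prim, platform)] = rule

def lower(prim, platform, *args):
    rule = _lowerings.get((prim, platform))
    if rule is None:
        # Mirrors the error the failing tests reported.
        raise NotImplementedError(
            f"MLIR translation rule for primitive '{prim}' "
            f"not found for platform {platform}")
    return rule(*args)

# CUDA keeps its direct (here: stubbed) fused lowering.
register_lowering("scaled_matmul",
                  lambda a, b: ("fused_kernel", a, b), "cuda")
# ROCm gets a dedicated rule delegating to a composite built from
# existing ops, mirroring the PR's lowering through lax.scaled_dot.
register_lowering("scaled_matmul",
                  lambda a, b: ("composite_dot", a, b), "rocm")

print(lower("scaled_matmul", "rocm", 1, 2))   # ('composite_dot', 1, 2)
```

A platform with no registered rule still raises the "translation rule not found" error, which is exactly the failure mode this PR closes for ROCm.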

