GFX1250 Gluon MoE A4W4 Kernel #2337

Draft
farlukas wants to merge 1 commit into main from farlukas/moe-a4w4-gfx1250-main

@farlukas
Contributor

Motivation

Port the MoE A4W4 kernel from Triton to Gluon for GFX1250.

Technical Details

The Gluon kernel follows the same function signature as the Triton version but uses the TDM features available in Gluon.

Test Plan

Added a backend parameter to the Triton unit tests to switch between the Triton and Gluon implementations.

pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py # runs both triton and gluon tests
pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py -k "gluon" # runs just gluon tests
pytest op_tests/triton_tests/moe/test_moe_gemm_a4w4.py -k "triton" # runs just triton tests
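For reviewers unfamiliar with the pattern, here is a minimal sketch of how a backend parameter can drive both implementations from one test. It is not the code in this PR: the kernel functions, the `BACKENDS` table, and the test body are hypothetical stand-ins. The only real API used is `pytest.mark.parametrize`, which puts the parameter value into the test id (for example `test_moe_gemm_a4w4[gluon]`), so a `-k "gluon"` filter can match that variant.

```python
import pytest


# Hypothetical stand-ins for the two kernel entry points. The real kernels
# run on the GPU; these only mimic "same signature, same result".
def moe_gemm_a4w4_triton(x):
    return [2 * v for v in x]


def moe_gemm_a4w4_gluon(x):
    return [2 * v for v in x]


# Assumed dispatch table: maps the backend name used in the test id
# to the implementation under test.
BACKENDS = {
    "triton": moe_gemm_a4w4_triton,
    "gluon": moe_gemm_a4w4_gluon,
}


@pytest.mark.parametrize("backend", ["triton", "gluon"])
def test_moe_gemm_a4w4(backend):
    # Both backends share one signature, so the test body is identical
    # apart from the kernel lookup.
    kernel = BACKENDS[backend]
    assert kernel([1, 2, 3]) == [2, 4, 6]
```

Because the two backends share a signature, the existing reference checks can be reused unchanged for the Gluon path.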

Test Result

The following unit tests have passed:

  • Base (no HBM swizzling, no gather, no scatter, no gammas, no activations, no fused quant)
  • With HBM swizzling
  • With gather
  • With scatter
  • With gammas
  • With swiglu activation
  • With fused quant

Submission Checklist

@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label        Tests
ci:sglang    SGLang integration tests
ci:atom      ATOM benchmark (DeepSeek-R1 + GPT-OSS)
ci:vllm      vLLM benchmark
ci:all       All of the above

Add labels via the sidebar or gh pr edit 2337 --add-label <label>
