
Add MoE ASM ctypes migration on top of #2255 (#2341)

Open
zufayu wants to merge 1 commit into refactor_bind_kl from refactor_bind_ctypes_all

Conversation


zufayu commented Mar 19, 2026

Summary

  • Migrate 8 MoE ASM kernel functions from pybind11 (torch::Tensor&) to C ABI (AiterTensor* + hipStream_t) called via ctypes
  • Functions migrated: fmoe, fmoe_int8_g1u0, fmoe_g1u1, fmoe_g1u1_tkw1, fmoe_int8_g1u0_a16, fmoe_g1u1_a16, fmoe_fp8_blockscale_g1u1, moe_stage1_g1u1
  • Split ctypes sources into separate module_moe_fmoe_asm build module to avoid torch dependency in the .so
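The summary above can be sketched from the Python side. This is a minimal illustration of what a ctypes binding for the new C ABI might look like; the actual `AiterTensor` field layout and the exact `fmoe` argument list are assumptions here, not the real aiter definitions:

```python
import ctypes

# Hypothetical layout of AiterTensor; the real struct lives in aiter's C
# headers and its field names/order may differ (assumption for illustration).
class AiterTensor(ctypes.Structure):
    _fields_ = [
        ("data_ptr", ctypes.c_void_p),    # device pointer
        ("dtype",    ctypes.c_int),       # an AITER_DTYPE_* enum value
        ("ndim",     ctypes.c_int),
        ("sizes",    ctypes.c_int64 * 8),
        ("strides",  ctypes.c_int64 * 8),
    ]

def bind_fmoe(lib):
    """Declare the C-ABI signature of a migrated entry point.

    Assumes an export roughly like:
        extern "C" void fmoe(AiterTensor* out, AiterTensor* input, ...,
                             hipStream_t stream);
    The argument list here is a placeholder; hipStream_t is an opaque
    pointer on the Python side, so c_void_p is enough.
    """
    lib.fmoe.restype = None
    lib.fmoe.argtypes = [
        ctypes.POINTER(AiterTensor),  # output
        ctypes.POINTER(AiterTensor),  # input
        ctypes.c_void_p,              # hipStream_t
    ]
    return lib.fmoe

# The struct can be exercised without loading any .so:
t = AiterTensor(dtype=1, ndim=2)
t.sizes[0], t.sizes[1] = 16, 128
print(t.sizes[1])  # 128
```

Because the struct carries only a raw device pointer plus shape metadata, the built `.so` never needs torch/ATen symbols, which is what lets the ctypes module drop the torch dependency.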

Changes

  • csrc/py_itfs_cu/asm_fmoe.cu: Remove torch/ATen includes; use AiterTensor*, AITER_DTYPE_*, AITER_CHECK; convert template <typename T, typename T_O> to <int I_elemSize, int O_elemSize>; add extern "C" exports
  • csrc/py_itfs_cu/asm_moe_2stage.cu: Same conversion for moe_stage1_g1u1
  • csrc/include/moe_op.h: Remove fmoe pybind declarations (now extern "C" in .cu files)
  • csrc/include/rocm_ops.hpp: Remove fmoe entries from MOE_OP_PYBIND macro
  • aiter/ops/moe_op.py: Use ffi_type="ctypes" with new module_moe_fmoe_asm for migrated functions
  • aiter/jit/optCompilerConfig.json: Split ctypes sources into module_moe_fmoe_asm; keep pybind sources in module_moe_asm
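The `<typename T, typename T_O>` to `<int I_elemSize, int O_elemSize>` conversion means the C++ side dispatches on byte widths instead of torch types. A Python analogue of that mapping, using made-up `AITER_DTYPE_*` codes (the real enum values are in aiter's headers and may differ), looks like:

```python
# Hypothetical AITER_DTYPE_* codes -- the actual values come from aiter's
# C headers; these are assumptions for illustration only.
AITER_DTYPE_FP16 = 0
AITER_DTYPE_BF16 = 1
AITER_DTYPE_FP8  = 2
AITER_DTYPE_INT8 = 3

ELEM_SIZE = {
    AITER_DTYPE_FP16: 2,
    AITER_DTYPE_BF16: 2,
    AITER_DTYPE_FP8:  1,
    AITER_DTYPE_INT8: 1,
}

def dispatch_key(in_dtype: int, out_dtype: int) -> tuple:
    """Mirror the <int I_elemSize, int O_elemSize> template parameters:
    after the migration the kernel templates see only byte widths,
    never torch dtypes."""
    return (ELEM_SIZE[in_dtype], ELEM_SIZE[out_dtype])

print(dispatch_key(AITER_DTYPE_INT8, AITER_DTYPE_BF16))  # (1, 2)
```

Keying templates on element size rather than type is what removes the last torch/ATen type dependency from the `.cu` sources.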

Test plan

  • Verify module_moe_fmoe_asm builds without torch dependency
  • Run python op_tests/test_moe.py -t fmoe to validate fmoe kernel
  • Run python op_tests/test_moe.py -t fmoe_int8 to validate int8 path
  • Run python op_tests/test_moe.py -t fmoe_g1u1 to validate g1u1 path
  • Verify remaining pybind ops (topk_softmax, moe_align_block_size, moe_sum) still work
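The first test-plan item ("builds without torch dependency") can be checked mechanically by inspecting the built shared object's dynamic dependencies. A small sketch that parses `ldd` output (library names below are illustrative; the real build output path depends on aiter's JIT cache):

```python
def links_against_torch(ldd_output: str) -> bool:
    """Return True if any dynamic dependency looks like a torch library.

    ldd_output is the text produced by `ldd <path-to-.so>` on Linux;
    libtorch/libc10 are the usual torch runtime library names checked here.
    """
    return any(
        "libtorch" in line or "libc10" in line
        for line in ldd_output.splitlines()
    )

# Illustrative ldd output for a torch-free module (paths are made up):
sample = (
    "\tlibamdhip64.so.6 => /opt/rocm/lib/libamdhip64.so.6\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6\n"
)
print(links_against_torch(sample))  # False
```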

🤖 Generated with Claude Code

Convert fmoe, fmoe_int8_g1u0, fmoe_g1u1, fmoe_g1u1_tkw1,
fmoe_int8_g1u0_a16, fmoe_g1u1_a16, fmoe_fp8_blockscale_g1u1,
and moe_stage1_g1u1 from torch::Tensor& (pybind11) to
AiterTensor* + hipStream_t (C ABI called via ctypes).

- asm_fmoe.cu: Remove torch/ATen includes, use AiterTensor*,
  AITER_DTYPE_*, AITER_CHECK; template <int I_elemSize, int O_elemSize>
- asm_moe_2stage.cu: Same conversion for moe_stage1_g1u1
- moe_op.h: Remove fmoe pybind declarations (now extern "C")
- rocm_ops.hpp: Remove fmoe entries from MOE_OP_PYBIND macro
- moe_op.py: Use ffi_type="ctypes" with new module_moe_fmoe_asm
- optCompilerConfig.json: Split ctypes sources into module_moe_fmoe_asm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label        Tests
ci:sglang    SGLang integration tests
ci:atom      ATOM benchmark (DeepSeek-R1 + GPT-OSS)
ci:vllm      vLLM benchmark
ci:all       All of the above

Add labels via the sidebar or gh pr edit 2341 --add-label <label>
