Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) by functionstackx · Pull Request #827 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-01T04:14:09Z

Add MiniMax M2.5 MXFP4 benchmark config for MI355x with vLLM v0.17.1, now that AMD's MXFP4 checkpoint is out: https://huggingface.co/amd/MiniMax-M2.5-MXFP4

Model: amd/MiniMax-M2.5-MXFP4
Image: vllm/vllm-openai-rocm:v0.17.1
TP=2 and TP=4 (matching MiniMax M2.5 FP8 pattern)
VLLM_ROCM_USE_AITER=1, with AITER MoE fallback for TP>=4
Seq lengths: 1k1k, 1k8k, 8k1k (conc 4-64)
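As a rough illustration of the sweep this config describes, the sketch below enumerates the benchmark points. It is not the repository's actual sweep generator. It assumes that "1k" means 1024 tokens, that each label reads input length then output length (so 1k8k is 1k in, 8k out), and that "conc 4-64" steps by doubling. All names (`SEQ_LENS`, `concurrency_levels`, `benchmark_matrix`) are hypothetical.

```python
from itertools import product

# Assumption: "1k" = 1024 tokens; each label is (input length, output length).
SEQ_LENS = {"1k1k": (1024, 1024), "1k8k": (1024, 8192), "8k1k": (8192, 1024)}
TP_SIZES = (2, 4)


def concurrency_levels(start=4, end=64):
    """Concurrency values from start to end, doubling each step (assumed)."""
    levels = []
    conc = start
    while conc <= end:
        levels.append(conc)
        conc *= 2
    return levels


def benchmark_matrix():
    """One dict per (TP size, sequence-length pair, concurrency) point."""
    return [
        {"tp": tp, "seq": name, "isl": isl, "osl": osl, "conc": conc}
        for tp, (name, (isl, osl)), conc in product(
            TP_SIZES, SEQ_LENS.items(), concurrency_levels()
        )
    ]
```

Under these assumptions the sweep has 2 TP sizes × 3 sequence-length pairs × 5 concurrency levels.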

Closes #826

Generated with Claude Code

functionstackx · 2026-03-01T04:43:02Z

@claude can u comment out TP=4 and only do TP=2 for now since TP=4 is bugged vllm-project/vllm#35637

Klaud-Cold · 2026-03-01T04:43:24Z

Claude finished @functionstackx's task in 53s

Commented out TP=4 for MiniMax M2.1 MXFP4

Due to vllm-project/vllm#35637, TP=4 has been disabled. Only TP=2 will run.

Read current benchmark files
Comment out TP=4 in .github/configs/amd-master.yaml (all 3 seq-len configs)
Update perf-changelog.yaml with PR link and description
Commit and push changes

Changes made:

.github/configs/amd-master.yaml: Commented out all 3 tp: 4 search-space entries under minimaxm2.1-fp4-mi355x-vllm with a reference to the vLLM issue
perf-changelog.yaml: Updated entry to reflect TP=2 only and added a link to PR #827
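The edit described above (commenting out the `tp: 4` entries with a pointer to the upstream issue) could be done mechanically along these lines. This is a minimal sketch and not how the change was actually made. The one-line flow-style entry layout and the `conc-start`/`conc-end` keys in the sample are assumptions, since the real schema of `amd-master.yaml` is not shown here, and `comment_out_tp` is a hypothetical helper.

```python
import re


def comment_out_tp(yaml_text: str, tp: int, reason: str) -> str:
    """Comment out list items that set the given tp value, appending a reason.

    Assumes each search-space entry sits on a single line starting with "- ".
    """
    pattern = re.compile(rf"\btp:\s*{tp}\b")
    result = []
    for line in yaml_text.splitlines():
        body = line.lstrip()
        if body.startswith("- ") and pattern.search(body):
            indent = line[: len(line) - len(body)]
            result.append(f"{indent}# {body}  # {reason}")
        else:
            result.append(line)
    return "\n".join(result)


# Hypothetical entry layout, for illustration only.
SAMPLE = """search-space:
  - { tp: 2, conc-start: 4, conc-end: 64 }
  - { tp: 4, conc-start: 4, conc-end: 64 }"""

print(comment_out_tp(SAMPLE, 4, "see vllm-project/vllm#35637"))
```

Keeping the entries as comments rather than deleting them makes it easy to re-enable TP=4 once the upstream issue is resolved.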

functionstackx · 2026-03-01T07:11:04Z

@chunfangamd I am seeing good MXFP4 performance compared to FP8. Just waiting for AMD to release an MXFP4 version of MiniMax M2.5 now; currently AMD only has an MXFP4 version of MiniMax M2.1.

chunfangamd

Looks good to me

Add MiniMax M2.1 MXFP4 benchmark config for MI355x with vLLM v0.16.0.

- Model: amd/MiniMax-M2.1-MXFP4
- TP=2 and TP=4 (matching MiniMax M2.5 FP8 pattern)
- Only VLLM_ROCM_USE_AITER=1 env var (per Andy Luo recipe)
- Seq lengths: 1k1k, 1k8k, 8k1k (conc 4-64)

Closes #826

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

TP=4 is bugged for this model per vllm-project/vllm#35637. Comment out TP=4 search-space entries, keeping only TP=2.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

Author: hongxiayang

- Keep AITER for attention but disable it specifically for MoE, so the fused MoE falls back to Triton kernels that can handle N=384 when TP=4 and N=192 when TP=8.
- Install the amd-quark library to fix the crash when TP=4 with VLLM_ROCM_USE_AITER_MOE=0.
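To make the TP-dependent fallback concrete, here is a small sketch of how the environment could be chosen per TP size and where the N figures come from. It assumes the MoE fallback applies for TP>=4, as the PR description states. The full dimension of 1536 is inferred only from the two figures in the commit message (384 × 4 = 192 × 8 = 1536) and was not read from the model config. `rocm_env_for_tp` and `per_rank_n` are hypothetical helpers, not part of vLLM or this repository.

```python
# Inferred from the commit message only: 384 * 4 == 192 * 8 == 1536.
ASSUMED_FULL_N = 1536


def rocm_env_for_tp(tp: int) -> dict:
    """Environment variables for a given tensor-parallel size (sketch)."""
    env = {"VLLM_ROCM_USE_AITER": "1"}  # keep AITER enabled, e.g. for attention
    if tp >= 4:
        # Disable only the AITER MoE path so fused MoE falls back to Triton.
        env["VLLM_ROCM_USE_AITER_MOE"] = "0"
    return env


def per_rank_n(tp: int) -> int:
    """Per-rank N when the assumed full dimension is split evenly across ranks."""
    if ASSUMED_FULL_N % tp != 0:
        raise ValueError(f"{ASSUMED_FULL_N} is not divisible by tp={tp}")
    return ASSUMED_FULL_N // tp
```

With these assumptions, `per_rank_n(4)` gives 384 and `per_rank_n(8)` gives 192, matching the shapes named in the commit message, while `rocm_env_for_tp(2)` leaves the AITER MoE path at its default.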

- Model: amd/MiniMax-M2.1-MXFP4 → amd/MiniMax-M2.5-MXFP4
- Image: vllm/vllm-openai-rocm v0.16.0 → v0.17.1
- Rename config key and script from m2.1 to m2.5
- Update perf-changelog entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

functionstackx requested a review from a team March 1, 2026 04:14

functionstackx requested review from billishyahao and chunfangamd as code owners March 1, 2026 04:14

github-project-automation bot added this to InferenceMAX Board Mar 1, 2026

functionstackx added AMD sweep-enabled labels Mar 1, 2026

functionstackx changed the title ~~[AMD] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx added the sweep-enabled label Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx marked this pull request as draft March 1, 2026 23:23

chunfangamd marked this pull request as ready for review March 4, 2026 09:09

chunfangamd approved these changes Mar 4, 2026

chunfangamd changed the title ~~[WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

chunfangamd enabled auto-merge (squash) March 4, 2026 09:11

functionstackx changed the title ~~Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

functionstackx changed the title ~~[Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) Mar 20, 2026

functionstackx added sweep-enabled and removed sweep-enabled labels Mar 20, 2026

github-actions bot and others added 4 commits March 19, 2026 21:46

Disable TP=4 for MiniMax M2.1 MXFP4 due to vLLM bug

e22ed53

Activate TP4 for MiniMax2.1 FP4 MI355 vLLM

42bb501

functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from bd10495 to e849d65 Compare March 20, 2026 01:50

functionstackx added the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from e849d65 to 86cc700 Compare March 20, 2026 01:57

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from 86cc700 to b82116b Compare March 20, 2026 01:57

functionstackx removed the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from b82116b to 7dd6063 Compare March 20, 2026 01:59

functionstackx added the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from 7dd6063 to 5bc40e6 Compare March 20, 2026 02:22

functionstackx wants to merge 5 commits into main from claude/issue-826-20260301-0409

3 participants
