Merged
Force-pushed from 636bc40 to e5a8c5b
Force-pushed from da74b83 to 73ebda1
Force-pushed from 8317872 to 35a60da
Force-pushed from 7c8d16f to 1d3325e
Force-pushed from bbadd6e to 2f77108
jinminxi104 reviewed on Feb 6, 2026
Force-pushed from 996688e to 700db7d
Contributor
Pull request overview
This PR adds expert parallelism (EP) support for Ascend NPU devices by integrating with the dlinfer library. The changes enable distributed MoE (Mixture of Experts) computation across multiple Ascend devices with optimized communication strategies based on the hardware generation (A2/A3).
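The hardware-dependent communication choice mentioned above can be sketched as a simple dispatch. This is an illustrative stand-in only: the enum members and the A2/A3 mapping below are assumptions, not the actual `DlinferMoECommType` values from dlinfer.

```python
from enum import Enum


class MoECommType(Enum):
    """Illustrative stand-in for dlinfer's DlinferMoECommType.

    Member names and the A2/A3 mapping are assumptions for this
    sketch, not the real dlinfer definitions.
    """
    ALL_GATHER = 'all_gather'  # coarser collective, assumed for A2
    ALL_TO_ALL = 'all_to_all'  # token-level dispatch, assumed for A3


def select_comm_type(soc_generation: str) -> MoECommType:
    """Pick a MoE communication strategy from the Ascend generation."""
    if soc_generation == 'A3':
        return MoECommType.ALL_TO_ALL
    return MoECommType.ALL_GATHER
```

The point of the pattern is that the backend decides the collective once, at metadata-creation time, rather than branching inside every MoE kernel call.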
Changes:
- Added EP support to dlinfer backend for Ascend devices with MoE metadata tracking and communication type selection
- Updated PyTorch/torch-npu version constraints to support newer versions (up to 2.10.0/2.25.0)
- Refactored kernel imports to use torch.Tensor directly instead of dlinfer type annotations for better compatibility
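Version gating of this kind is usually enforced with a small runtime check. Below is a stdlib-only sketch using the 2.10.0 torch ceiling from the bullet above; `parse_version` and its handling of `+` local-version suffixes are illustrative, not the check the PR actually ships.

```python
def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' (ignoring any '+local' suffix) into an int tuple."""
    release = version.split('+')[0]
    return tuple(int(part) for part in release.split('.') if part.isdigit())


def torch_in_supported_range(torch_version: str) -> bool:
    """Check the relaxed constraint from this PR: torch <= 2.10.0."""
    return parse_version(torch_version) <= (2, 10, 0)
```

Real requirement files express the same bound declaratively (e.g. `torch<=2.10.0`); a runtime guard like this is only needed where behavior differs across installed versions, as with the flash_mla availability check.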
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| requirements/runtime_ascend.txt | Updated torch, torch-npu, and torchvision version constraints to support newer releases |
| docker/Dockerfile_ascend_a3 | Updated base CANN image and PyTorch versions to match runtime requirements |
| lmdeploy/pytorch/kernels/dlinfer/pagedattention.py | Changed imports to use typing.Optional/Sequence and torch.Tensor instead of dlinfer annotations |
| lmdeploy/pytorch/kernels/dlinfer/flash_attention.py | Changed import to use torch.Tensor instead of dlinfer type annotation |
| lmdeploy/pytorch/kernels/dlinfer/moe_gating_topk_softmax.py | Added moe_metadata parameter and DlinferMoeMetadata type import for EP support |
| lmdeploy/pytorch/kernels/dlinfer/fused_moe.py | Added moe_metadata parameter and MoE type imports for EP functionality |
| `lmdeploy/pytorch/kernels/dlinfer/__init__.py` | Exported DlinferMoECommType and DlinferMoeMetadata for use in backend |
| lmdeploy/pytorch/backends/dlinfer/moe.py | Extended MoE implementation with EP support including expert partitioning and metadata handling |
| lmdeploy/pytorch/backends/dlinfer/ascend/op_backend.py | Added comprehensive EP infrastructure including DistMeta, communication type selection, and MoE metadata creation |
| lmdeploy/pytorch/configurations/utils.py | Improved flash_mla availability check to handle torch_npu version differences |
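The expert-partitioning change noted for `moe.py` (splitting the expert set across EP ranks) can be illustrated with a minimal helper. `partition_experts` is a hypothetical name, and the real backend may not use contiguous ranges; this only shows the even-split idea.

```python
def partition_experts(num_experts: int, ep_size: int, ep_rank: int):
    """Return the [start, end) range of experts owned by one EP rank.

    Distributes num_experts as evenly as possible: the first
    (num_experts % ep_size) ranks each own one extra expert.
    """
    base, rem = divmod(num_experts, ep_size)
    start = ep_rank * base + min(ep_rank, rem)
    end = start + base + (1 if ep_rank < rem else 0)
    return start, end
```

Under this scheme every expert is owned by exactly one rank, so the router can map a global expert id to its EP rank without any communication.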
jinminxi104 approved these changes on Feb 9, 2026
Collaborator: waiting for the CI result on the dlinfer side.

Collaborator: CI passed.
Related PR: DeepLink-org/dlinfer#237