[CI] Re-enable deep_ep and deep_gemm unit tests #8027
/skip-ci ci_iluvatar
PaddlePaddle-bot
left a comment
🤖 Paddle-CI-Agent | pr_review |
2026-06-09 13:38:37
📋 Review Summary
PR overview: Restores collection and execution of the DeepEP/DeepGEMM unit tests under cov pytest.
Change scope: the coverage-test ignore list in tests/cov_pytest.ini.
Impact tag: [CI]
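Issues
| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | tests/cov_pytest.ini:17 | Once test_fusedmoe_ep_entry.py is restored, the pytest collection/import phase launches the distributed test directly, so the coverage job runs it repeatedly and bypasses the scheduling and log wrapping in coverage_run.sh |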
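📝 PR Convention Check
Conforms to the conventions.
Overall Assessment
Restoring coverage is the right direction, but at least this distributed entry test does not yet meet the conditions for full collection. Fix the import side effect in the test entry first, then remove it from the ignore list.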
--ignore=tests/graph_optimization/test_cuda_graph_dynamic_subgraph.py
--ignore=tests/e2e/test_ernie_03b_pd_decode_unified_attention.py
--ignore=tests/e2e/test_ernie_03b_pd_router_v1_ipc.py
--ignore=tests/distributed/test_fusedmoe_ep_entry.py
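🔴 Bug Once this ignore entry is removed, the collection phase of coverage_run.sh imports and actually executes tests/distributed/test_fusedmoe_ep_entry.py.
coverage_run.sh first runs python -m pytest --collect-only -q -c ${PYTEST_INI} tests to collect files. The restored test file ends with a call to test_fused_moe_launch() that is not guarded by if __name__ == "__main__", so importing the module launches paddle.distributed.launch --gpus 0,1. When run_test_with_logging later runs the file, the pytest import executes the call again, and then the test function itself runs once more. The coverage job therefore starts the distributed subprocesses repeatedly, and the first launch happens during collection, bypassing classification, retries, and log-directory isolation.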
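The import-time side effect described above can be reproduced in isolation. The following is a minimal sketch, not the actual test file: the module names `unguarded_entry` and `guarded_entry` and the `calls` list are hypothetical stand-ins, and appending to a list stands in for the real distributed launch. It imports two tiny modules roughly the way pytest does during collection and shows that only the unguarded one runs its module-level call.

```python
import importlib.util
import pathlib
import tempfile
import textwrap

# Hypothetical stand-in for a test file with a module-level call.
UNGUARDED = textwrap.dedent('''
    calls = []

    def test_launch():
        calls.append("launched")  # stand-in for the distributed launch

    test_launch()  # module-level call: runs on every import
''')

# Same file, with the call moved under the __main__ guard.
GUARDED = textwrap.dedent('''
    calls = []

    def test_launch():
        calls.append("launched")

    if __name__ == "__main__":
        test_launch()  # runs only when executed as a script
''')


def import_from_source(name, source):
    """Write source to a temporary file and import it under the given name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / f"{name}.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes all module-level code
        return module


unguarded = import_from_source("unguarded_entry", UNGUARDED)
guarded = import_from_source("guarded_entry", GUARDED)
print(unguarded.calls)  # ['launched']: the import alone triggered the call
print(guarded.calls)    # []: the import only defined the function
```

Because the imported module's `__name__` is its module name rather than `"__main__"`, the guarded variant stays inert during collection.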
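Suggested fix:
Keep this ignore entry for now, or, before restoring the test, move the module-level call in tests/distributed/test_fusedmoe_ep_entry.py under if __name__ == "__main__": and add an explicit multi-GPU availability skip/gate to the pytest entry, so that collect-only only collects and never runs the distributed job.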
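One way the multi-GPU gate could look is sketched below. It is an illustration under stated assumptions, not the repository's code: `available_gpu_count`, `requires_gpus`, `REQUIRED_GPUS`, and the worker script name are hypothetical, and the GPU count is read via `paddle.device.cuda.device_count()` with a fallback to 0 when Paddle cannot be imported. The sketch uses the standard library's `unittest.skipUnless`, which raises `unittest.SkipTest`; `pytest.mark.skipif` would be the more common choice in a pytest-only file.

```python
import subprocess
import sys
import unittest

REQUIRED_GPUS = 2  # assumption: matches the --gpus 0,1 launch in the review
WORKER_SCRIPT = "fused_moe_ep_worker.py"  # hypothetical worker script name


def available_gpu_count() -> int:
    """Return the visible CUDA device count, or 0 if Paddle is unavailable."""
    try:
        import paddle

        return int(paddle.device.cuda.device_count())
    except Exception:
        return 0


def requires_gpus(n):
    """Build a decorator that skips the test unless at least n GPUs are visible."""
    return unittest.skipUnless(
        available_gpu_count() >= n, f"needs at least {n} GPUs"
    )


@requires_gpus(REQUIRED_GPUS)
def test_fused_moe_launch():
    # The launch happens only when the test function is actually invoked,
    # never as a side effect of importing or collecting this module.
    subprocess.run(
        [sys.executable, "-m", "paddle.distributed.launch",
         "--gpus", "0,1", WORKER_SCRIPT],
        check=True,
    )
```

With this layout, collection only evaluates the GPU count; the subprocess starts solely when a runner calls the test on a machine that passes the gate.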
CI report generated from the following code (updated every 30 minutes):
1. Required tasks: 2/10 passed
2. Failure details: none
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #8027 +/- ##
==========================================
Coverage ? 78.32%
==========================================
Files ? 404
Lines ? 57430
Branches ? 9032
==========================================
Hits ? 44984
Misses ? 9572
Partials ? 2874
Motivation
deep_ep and deep_gemm unit tests were previously disabled due to compatibility issues introduced by upstream Paddle changes. Since the related changes have been reverted in PaddlePaddle/Paddle#79249, the original incompatibility no longer exists and the affected test cases can be restored.
Re-enabling these tests helps recover validation coverage and ensures continued regression protection for deep_ep and deep_gemm functionality.
Modifications
deep_ep related unit tests.
deep_gemm related unit tests.
Usage or Command
N/A
Accuracy Tests
N/A
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.