Releases: modelscope/mcore-bridge
v1.3.0
New Features
- Added model_type support: kimi_k25, hy_v3, llava_onevision.
- mlp_padding_free is now compatible with Sequence Parallelism.
- Dropped support for megatron-core versions 0.12 through 0.14.
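Because support for megatron-core 0.12 through 0.14 was dropped, it can help to check the installed version before upgrading. The following is a minimal sketch, not part of mcore-bridge: the helper names are hypothetical, and the minimum of 0.15 is an assumption inferred from the removed range rather than a documented requirement.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption: 0.15 is the oldest megatron-core minor version still supported,
# inferred from the note that 0.12 through 0.14 were dropped.
MIN_SUPPORTED = (0, 15)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '0.15.3' or '0.16.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: str, minimum: Tuple[int, int] = MIN_SUPPORTED) -> bool:
    """Return True if the version is at or above the assumed minimum."""
    return parse_major_minor(version) >= minimum


def check_installed(dist_name: str = "megatron-core") -> Optional[bool]:
    """Check an installed distribution; return None if it is not installed."""
    try:
        return is_supported(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None
```

Calling `check_installed()` returns `None` when megatron-core is absent, so the caller can distinguish "not installed" from "installed but too old".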
What's Changed
- [docs] update readme by @Jintao-Huang in #49
- update requirements by @Jintao-Huang in #51
- npu qwen3.5 megatron padding_free fix by @addsubmuldiv in #50
- [model] support kimi_k25 by @Jintao-Huang in #52
- [model] support hy_v3 by @Jintao-Huang in #53
- Add support for LLaVA-OneVision-1.5 model by @randydl in #54
- [bugfix] fix torch_dtype by @Jintao-Huang in #57
- fix qwen3_next by @Jintao-Huang in #58
- remove mcore0.12-mcore0.14 by @Jintao-Huang in #59
- fix kwargs by @Jintao-Huang in #61
- [megatron] support mlp_padding_free & sp; refactor TransformerLayer by @Jintao-Huang in #62
- [bugfix] fix gather_from_sp by @Jintao-Huang in #63
- update transformers by @Jintao-Huang in #65
- update requirements by @Jintao-Huang in #66
New Contributors
Full Changelog: v1.2.0...v1.3.0
Patch release v1.2.3
Full Changelog: v1.2.2...v1.2.3
Patch release v1.2.2
Full Changelog: v1.2.1...v1.2.2
Patch release v1.2.1
Full Changelog: v1.2.0...v1.2.1
v1.2.0
New Features
- Added support for GLM-5 shared-weight MTP, which can be enabled via the mtp_shared_weights argument.
- Added support for Qwen3.5 FP8 training and FP8 weight import/export.
- Added support for controlling whether gradients are stopped at decoder_input in the MTP branch, i.e., whether the MTP loss can back-propagate through decoder_input to Embedding/ViT. This is configured via the mtp_decoder_input_detach argument.
- Added compatibility with Megatron-Core 0.15.3 for training on Huawei Ascend NPU.
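To illustrate what stopping gradients at decoder_input means, here is a toy scalar autograd sketch in plain Python. It is not mcore-bridge or Megatron code: the `Value` class, the `embedding_grad` function, and all numbers are hypothetical, and the sketch only assumes that detaching cuts the path from the MTP loss back to the embedding.

```python
class Value:
    """Minimal scalar autograd node supporting + and * (illustration only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def detach(self):
        # Same value, but no link to the graph: gradients stop here.
        return Value(self.data)

    def backward(self):
        # Build a topological order, then push gradients from output to inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def embedding_grad(detach_mtp_input: bool) -> float:
    """Gradient reaching the embedding, with or without the detach."""
    embedding = Value(2.0)                  # stands in for Embedding/ViT
    decoder_input = embedding * Value(3.0)  # shared input to both branches
    main_loss = decoder_input * Value(0.5)
    mtp_input = decoder_input.detach() if detach_mtp_input else decoder_input
    mtp_loss = mtp_input * Value(4.0)
    (main_loss + mtp_loss).backward()
    return embedding.grad


print(embedding_grad(detach_mtp_input=False))  # 13.5: both losses contribute
print(embedding_grad(detach_mtp_input=True))   # 1.5: only the main loss does
```

With the detach, the MTP loss still trains the MTP branch itself, but it no longer influences the embedding in this toy setup.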
What's Changed
- [docs] update readme by @Jintao-Huang in #17
- [qwen3.5] compat transformers 5.4.0 by @Jintao-Huang in #18
- [bugfix] fix gptq_bridge by @Jintao-Huang in #19
- Revert qwen3.5 save weight by @Jintao-Huang in #20
- [bugfix] fix multimodal mtp by @Jintao-Huang in #21
- update get_parameter_local_cp by @Jintao-Huang in #22
- [bugfix] Fix the multi-LoRA issue in Twinkle by @Jintao-Huang in #24
- Adapt Mindspeed/Megatron 0.15.3 by @addsubmuldiv in #25
- [bugfix] fix qwen3.5 gpt_bridge lora by @Jintao-Huang in #28
- [bugfix] fix gdn sharded_state_dict lora by @Jintao-Huang in #23
- support Qwen3.5 FP8 by @Jintao-Huang in #30
- [bugfix] fix fp8 by @Jintao-Huang in #32
- [bugfix] fix set_module lora by @Jintao-Huang in #33
- [compat] gdn compat mcore main by @Jintao-Huang in #34
- [bugfix] Fix mtp fp8 by @Jintao-Huang in #35
- support mtp_decoder_input_detach by @Jintao-Huang in #37
- [bugfix] fix gate_up_proj by @Jintao-Huang in #39
- fix mtp_num_layer >= 2 multimodal by @Jintao-Huang in #40
- support mtp_shared_weights by @Jintao-Huang in #41
- compat peft 0.19 by @Jintao-Huang in #42
- [bugfix] fix peft_format qwen3_5_moe by @Jintao-Huang in #43
- fix: Add is_mtp parameter to _set_moe_state avoid type error by @0hujun in #45
- [bugfix] fix grpo qwen3_5_moe full by @Jintao-Huang in #46
- [bugfix] fix safe_ddp_context hang by @Jintao-Huang in #47
New Contributors
- @addsubmuldiv made their first contribution in #25
- @0hujun made their first contribution in #45
Full Changelog: v1.1.0...v1.2.0
Patch release v1.1.2
Full Changelog: v1.1.1...v1.1.2
Patch release v1.1.1
Full Changelog: v1.1.0...v1.1.1
v1.1.0
What's Changed
- [feat] Support multimodal mtp by @Jintao-Huang in #14
- [gdn] support GDN CP by @Jintao-Huang in #16
Full Changelog: v1.0.2...v1.1.0
v1.0.2
What's Changed
- [module] clean mtp code by @Jintao-Huang in #11
- [bugfix] fix mtp by @Jintao-Huang in #12
- update import by @Jintao-Huang in #13
Full Changelog: v1.0.1...v1.0.2
v1.0.1
What's Changed
- [readme] fix readme by @Jintao-Huang in #3
- [bugfix] fix vit_lora tp by @Jintao-Huang in #5
- [bugfix] fix modules_to_save deepcopy by @Jintao-Huang in #6
- [docs] update readme by @Jintao-Huang in #7
- [bridge] Support GPTBridge callback by @Jintao-Huang in #8
- [bugfix] Fix internvl by @Jintao-Huang in #9
Full Changelog: v1.0.0...v1.0.1