
[draft] Multi lora #60

Closed

rcano-baseten wants to merge 1 commit into modelscope:main from basetenlabs:multi-lora

Conversation

@rcano-baseten

No description provided.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a Multi-LoRA registry and routing mechanism to support per-sequence and per-token adapter dispatching in Megatron-Core models, including MoE layers. My review highlights potential issues with registry initialization, the safety of direct dictionary manipulation for cleanup, and performance considerations regarding per-token linear operations and buffer allocation in the routing path.

Comment thread src/mcore_bridge/bridge/gpt_bridge.py Outdated
num_slots: Number of concurrent adapters to support.
lora_config: ``peft.LoraConfig`` describing rank, target modules, etc.
"""
self._adapter_registry: dict = {} # logical_name → slot_name


Severity: medium

The _adapter_registry dictionary is initialized inside preallocate_adapters. If this method is called multiple times, it will overwrite the existing registry, potentially losing previously registered adapters. Consider checking if it already exists or initializing it in __init__.
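The guard the reviewer suggests can be sketched as follows. This is a hypothetical illustration, not code from the PR: the class name, the `_slots` attribute, and the method body are assumptions; only `_adapter_registry` and `preallocate_adapters` come from the quoted snippet.

```python
# Hypothetical sketch: initialize the registry once in __init__ so that
# repeated calls to preallocate_adapters() cannot wipe prior registrations.

class MultiLoraMixin:
    def __init__(self):
        # Registry lives here, created exactly once per instance.
        self._adapter_registry: dict = {}  # logical_name -> slot_name

    def preallocate_adapters(self, num_slots, lora_config=None):
        # Defensive fallback for instances created before this change;
        # never rebinds an existing registry.
        if not hasattr(self, "_adapter_registry"):
            self._adapter_registry = {}
        # Illustrative slot allocation (assumed, not from the PR):
        self._slots = [f"slot{i}" for i in range(num_slots)]
```

With this shape, calling `preallocate_adapters` a second time resizes the slots but leaves any previously registered adapters in place.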

Comment thread src/mcore_bridge/patcher.py Outdated
Comment on lines +883 to +887
if root is not None:
root.__dict__.pop('_expert_lora_indices', None)
if post is not None:
self.__dict__.pop('_lora_post_dispatch', None)
return result


Severity: medium

The cleanup logic in routed_experts_compute uses root.__dict__.pop and self.__dict__.pop. While this works, it is safer to check for the attribute with getattr and then remove it with delattr (or reset it with setattr(obj, '_expert_lora_indices', None)): manipulating __dict__ directly bypasses __delattr__ hooks and descriptors and can leave the object in an inconsistent state.

Comment thread src/mcore_bridge/tuners/lora.py Outdated
Comment on lines +376 to +387
for idx, slot in _lpl._idx_to_name.items():
if slot not in _lpl.lora_A:
continue
mask = token_idx == idx
if not mask.any():
continue
rows = mask.nonzero(as_tuple=True)[0]
x_sub = x_flat.index_select(0, rows).to(result.dtype)
delta = F.linear(_lpl.lora_dropout[slot](x_sub),
_lpl.lora_A[slot].weight.to(result.dtype))
delta = F.linear(delta, _lpl.lora_B[slot].weight.to(result.dtype))
result_flat.index_add_(0, rows, (delta * _lpl.scaling[slot]).to(result.dtype))


Severity: medium

The loop iterates over _lpl._idx_to_name.items() and issues separate F.linear calls per slot, so each forward pass launches O(num_adapters) small kernels. If the slot count grows, consider vectorizing across adapters (e.g. a batched/grouped matmul when all slots share the same rank); at minimum, keep the x_sub gather and the scatter-add on the hot path as tight as they are here.

Comment thread src/mcore_bridge/tuners/lora.py Outdated
Comment on lines +514 to +517
delta = torch.zeros(total_tokens, out_feat, dtype=dtype, device=result.device)
apply_routed_lora(delta, x_flat, token_indices,
lora_A_by_idx, lora_B_by_idx, scaling_by_idx, dropout_by_idx)
result = result + delta.view_as(result).to(result.dtype)


Severity: medium

Creating a new torch.zeros buffer for delta on every forward pass can be memory-intensive. Consider reusing a buffer if possible, or ensure that this allocation does not become a performance bottleneck during training.
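A buffer-reuse pattern along the lines the comment suggests is sketched below, again with numpy standing in for torch. The cache keying and function name are assumptions; note that in torch, reusing and zeroing a buffer in place interacts with autograd (a tensor still referenced by the graph must not be overwritten), so this applies most cleanly to inference or to buffers created outside the graph.

```python
import numpy as np

# Hypothetical delta-buffer cache: allocate once per (shape, dtype) and
# zero in place on reuse, instead of a fresh zeros() every forward pass.

_delta_cache = {}

def get_delta_buffer(total_tokens, out_feat, dtype=np.float32):
    key = (total_tokens, out_feat, np.dtype(dtype))
    buf = _delta_cache.get(key)
    if buf is None:
        buf = np.zeros((total_tokens, out_feat), dtype=dtype)
        _delta_cache[key] = buf
    else:
        buf.fill(0)  # reuse the allocation; only reset its contents
    return buf
```

Since total_tokens varies with batch shape, a real implementation might cap the cache size or allocate for the maximum expected token count and slice.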

@rcano-baseten rcano-baseten deleted the multi-lora branch May 4, 2026 19:13