add glora by not-lain · Pull Request #3098 · huggingface/peft

not-lain · 2026-03-13T01:57:20Z

follow up on #780 and #2568
this pr adds GLoRA to the library
i also made a minor colab notebook to test this remotely in which you can find here

…_adapter_config signature (for linting reasons)

BenjaminBossan

Thanks for reviving the GLoRA implementation to PEFT. There is still a lot missing (docs, examples, more tests) but let's focus on the integration for now and work on the rest later.

Unfortunately, this PR seems to be based on the state that PEFT was in when the PR was first suggested. We made several refactors since then, which require different patterns to implement. The good news is that the final result should be much simpler with less code required. I marked the corresponding parts, please check. Maybe this is even something that a coding agent can do if asked to update the implementation and pointed to the most recent PEFT code for reference.

BenjaminBossan · 2026-03-16T16:12:15Z

+
+
+# Refactored GLoraLinear for PEFT compatibility
+class GLoraLinear(GLoraLayer, nn.Linear):


You probably based this PR on the very old original PR. Since then, we had several refactors in PEFT, some of which affect how we designed the adapter layers. Could you please check the latest implementation in tuners/lora/layer or PRs of other recent PEFT additions (e.g. #2851)? Most notably, we now pass the base_layer (i.e. the original layer) and wrap it inside the PEFT layer. To get the results of the base layer, we can then call self.base_layer(x) in the forward call (no need for F.linear(x, self.weight)).

BenjaminBossan · 2026-03-16T16:18:20Z

+                m.bias.requires_grad = True
+
+
+class GLoraModel(BaseTuner):


Similar argument as for GLoraLayer: We have substantially refactored this part of PEFT. The good news is that it should greatly simplify the overall implementation: You only need to define _create_and_replace and _create_new_module, the remaining methods should all be fine when inherited from the parent class. Moreover, we need these class attributes:

prefix: str = "glora_"

tuner_layer_cls = GloraLayer

target_module_mapping = TRANSFORMERS_MODELS_TO_GLORA_TARGET_MODULES_MAPPING

BenjaminBossan · 2026-03-16T16:24:27Z

+        self.glora_Cu: nn.ParameterDict = nn.ParameterDict()
+        self.glora_D: nn.ParameterDict = nn.ParameterDict()
+        self.glora_E: nn.ParameterDict = nn.ParameterDict()
+        self.eval_config: dict[str, dict[str, object]] = {}


Why do we need this? IIUC, this is supposed to be an object that unifies the settings for each parameter, as we have distinct options like "vector" or "constant". I haven't checked all the details, but ideally we would just define during the layer initialization something like:

if config_A_B == "lora": self.glora_Ad[adapter_name] = nn.Linear(...) elif config_A_B == "vector": self.glora_Ad[adapter_name] = ...

The arguments config_A_B etc. should be passed to the __init__ and update_layer methods, directly coming from the GloraConfig.

This change may require some custom nn.Modules but that would be fine. I would really like to avoid the whole prepare_path call during forward and frontload the whole resolution to the initialization. This make the forward/merge/unmerge call simpler to understand and should also be slightly more performant.

This comment is still relevant.

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

BenjaminBossan · 2026-03-23T13:04:16Z

Note that due to another PEFT method being merged, there is now a merge conflict, but it should be straightforward to resolve. Once you're finished, please ping me for another review.

- Updated GloraLayer to inherit from BaseTunerLayer - Enhanced adapter management with new methods and improved type checks. - Refined initialization to ensure compatibility with nn.Linear layers only. - Adjusted merge logic to handle weight and bias more robustly. - Updated tests to skip unsupported quantized layers.

not-lain · 2026-04-20T00:38:34Z

@BenjaminBossan
made the necessary changes mentioned above and made a couple extra changes to match that of test_custom_models.
would appreciate it if you can take a look at this pr.

BenjaminBossan

Thanks for the latest updates. I found that there is still some work left to make GLoRA consistent with other PEFT methods. Please check my comments and don't hesitate to ask if something is unclear.

Also note that in the meantime, we have updated the contribution guidelines for adding new PEFT methods: https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/contributing.md#add-a-new-peft-fine-tuning-method. You already tick many boxes, but please refer to this when it comes to what's still missing.

BenjaminBossan · 2026-04-21T13:18:58Z

+        self.out_features: int
+        base = self.get_base_layer()
+        # Use exact type check: bitsandbytes.Linear4bit subclasses nn.Linear but is not compatible with GLORA math.
+        if type(base) is not nn.Linear:


Let's remove this. The check for the correct type should happen in _create_new_module.

BenjaminBossan · 2026-04-21T13:36:12Z

+            "config_D_E": peft_config.config_D_E,
+        }
+        new_module = GloraLinear(target, **kwargs_glora)
+        new_module.add_adapter(


Again, don't call add_adapter here, new_module = GloraLinear(target, **kwargs_glora) should be enough, as GloraLinear should call self.update_layer.

Also note that, as mentioned earlier, we did a small refactor here, so the peft_config should be passed to GloraLinear's __init__.

BenjaminBossan · 2026-04-21T13:53:18Z

        _skip_if_merging_not_supported(test_name, config_cls, config_kwargs_1)
        _skip_tests_with_multiple_adapters_with_target_parameters(config_cls, config_kwargs_2)

+        config_1 = config_cls(**config_kwargs_1)


This shouldn't be necessary. Why do the results differ?

The issue is that GLoRA's merge is multiplicative, not purely additive, during multi-active forward, each adapter computes its delta against the original frozen W0 thus the formula :

result = W0@x + (W0*A1 + B1)@x + (W0*A2 + B2)@x

But sequential merging mutates W0 in place between adapters:

after adapter1: W0' = W0 + W0*A1 + B1

after adapter2: W0'' = W0' + W0'*A2 + B2

The second merge uses the already-modified W0', meaning W0'*A2 expands to (W0 + W0*A1 + B1)*A2 instead of the original W0*A2

LoRA doesn't have this issue because its merge is purely additive (W0 + delta), so i chose to skip the test for GLoRA for now.

So you're saying that if I have a GLoRA model with two active adapters, it behaves differently when merged vs unmerged? I think this should be treated like a bug. IIUC, it should be possible to implement the merge so that it gives the same result by not merging sequentially but combining the adapters before merging.

That said, if a user wants to merge adapter 1 first and then adapter 2 in a separate call, that would indeed require extra logic, as the results of naively merging the 2nd adapter would not be correct. I see two solutions here:

First unmerge adapter 1, then merge 1 and 2 together.

When merging, check if there is already a merged adapter and raise an error, telling users they need to merge all adapters in one step.

WDYT?

chose to unmerge adapters before merging, didn't want to introduce an edge case for glora
you can review the changes related to this change in 43ceaee

BenjaminBossan

Thanks for the updates. I still found a couple of issues, but the integration is close to being ready. Please take a look.

BenjaminBossan · 2026-06-01T12:20:10Z

+        self._active_adapter = adapter_name
+        self.update_layer(adapter_name, config.r, config=config)
+
+    def _mixed_batch_forward(


If you want GLoRA to support mixed adapter batches, you have to also implement _enable_peft_forward_hook on GloraModel, see e.g. here. Also, please ensure that they're also being tested. Please check TestMixedAdapterBatches in test_custom_models.py. It's also fine not to support it, in which case you can just remove _mixed_batch_forward and the adapter_names = kwargs.pop("adapter_names", None) part in forward.

did implement this in 66c57fa
IMO this would look better if we can turn it into a mixin, maybe even open an issue and try to integrate mixed adapter batching into the rest of the peft methods. But i digress, let me know what you feel about this and feel free to resolve this thread since this is already done

I agree that would be a nicer solution, though I think not quite trivial to implement. But that's for a separate PR. I'll resolve the discussion with the next review.

BenjaminBossan · 2026-06-01T12:36:47Z

        _skip_if_merging_not_supported(test_name, config_cls, config_kwargs_1)
        _skip_tests_with_multiple_adapters_with_target_parameters(config_cls, config_kwargs_2)

+        config_1 = config_cls(**config_kwargs_1)


So you're saying that if I have a GLoRA model with two active adapters, it behaves differently when merged vs unmerged? I think this should be treated like a bug. IIUC, it should be possible to implement the merge so that it gives the same result by not merging sequentially but combining the adapters before merging.

That said, if a user wants to merge adapter 1 first and then adapter 2 in a separate call, that would indeed require extra logic, as the results of naively merging the 2nd adapter would not be correct. I see two solutions here:

First unmerge adapter 1, then merge 1 and 2 together.

When merging, check if there is already a merged adapter and raise an error, telling users they need to merge all adapters in one step.

WDYT?

not-lain added 7 commits March 6, 2026 04:22

add glora

2b76146

update glora layer to match paper implementation, and update tests

6108568

add missing inference_mode parameter in set_adapter method

ff1ded8

add missing safe_merge parameter in merge_and_unload method

b0dd263

fix parameter names in _create_new_module

7a7a820

update _replace_module, _mark_only_adapters_as_trainable and _prepare…

14a2400

…_adapter_config signature (for linting reasons)

fix circular import

c1ed1c3

BenjaminBossan requested changes Mar 16, 2026

View reviewed changes

not-lain and others added 4 commits March 17, 2026 07:55

Update docs/source/package_reference/glora.md

f3272cb

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

GLora -> Glora

a1ecaff

r = 8

b1c0ff7

lower case config params "LoRA" -> "lora"

324173a

not-lain added 10 commits March 23, 2026 15:08

Merge branch 'main' of github.com:huggingface/peft into glora2

894a095

update metadata

912efe4

use literal instead of str in config

a2601a3

remove redundant allow_lora parameter

05f120d

add a reference to the paper

3e4b8c1

add target_module_mapping attribute

a6c9f57

use testing_common and fix all issues from failed tests

45009b0

initialize glora params directly and update readme

9f8352e

lint

93793b1

BenjaminBossan requested changes Apr 21, 2026

View reviewed changes

not-lain added 4 commits May 3, 2026 15:53

Merge upstream/main into glora2

55c04fa

better

0e9b580

reset testing_comming

8dfdbc6

Merge remote-tracking branch 'upstream/main' into glora2

b13e440

BenjaminBossan requested changes Jun 1, 2026

View reviewed changes

not-lain added 10 commits June 1, 2026 20:31

unformatt utils file

7b3201d

GloraLoRAPath -> GloraLoraPath

7388d9c

add reset_parameters method

b03ce91

update imports

695cbb3

better lint

26174e6

avoid unnecessary clones

bd5bba4

add mixed adapter support

66c57fa

less gpu overhead

27f8042

remove unecessary testing function

236f673

fix sequential merge

43ceaee

not-lain requested a review from BenjaminBossan June 3, 2026 07:55



		# Refactored GLoraLinear for PEFT compatibility
		class GLoraLinear(GLoraLayer, nn.Linear):

Conversation

not-lain commented Mar 13, 2026

Uh oh!

BenjaminBossan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

BenjaminBossan commented Mar 23, 2026

Uh oh!

not-lain commented Apr 20, 2026

Uh oh!

BenjaminBossan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

not-lain May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

BenjaminBossan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

not-lain May 31, 2026 •

edited

Loading