Optimize 2D tensor gathering to skip sensitive layers early #17

Open
gacty wants to merge 1 commit into silveroxides:main from gacty:patch-1

gacty commented Jan 30, 2026

Summary

Adds early filtering for normalization and modulation layers at the 2D tensor gathering stage (line 261).

Problem

Currently, sensitive layers (norms, modulations, embeddings) are gathered into weight_keys and only filtered out later by the exclusion logic (around line 324 onward). As a result, they are loaded into memory unnecessarily.

Solution

Move common exclusion patterns up to the gathering stage using the existing AVOID_KEY_NAMES constant:

and not any(avoid in key for avoid in AVOID_KEY_NAMES)

This acts as a "first line of defense" before the MODEL_FILTERS logic runs.
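To illustrate where the added condition sits, here is a minimal sketch of a gathering step with the early filter applied. It is not the repository's code: the function name `gather_weight_keys`, the `.weight` suffix check, the shape-based 2D test, and the contents of `AVOID_KEY_NAMES` shown here are all assumptions made for the example.

```python
# Hypothetical patterns; the real AVOID_KEY_NAMES is defined in the repository.
AVOID_KEY_NAMES = ["norm", "modulation", "embed"]


def gather_weight_keys(shapes, avoid_names=AVOID_KEY_NAMES):
    """Return sorted keys of 2D weights whose names match no avoid pattern.

    `shapes` maps a tensor key to its shape tuple (an assumed stand-in for
    whatever the script uses to inspect tensors).
    """
    return sorted(
        key
        for key, shape in shapes.items()
        if key.endswith(".weight")
        and len(shape) == 2
        # The condition added by this PR: skip sensitive layers up front.
        and not any(avoid in key for avoid in avoid_names)
    )


# Made-up keys, for illustration only.
shapes = {
    "blocks.0.attn.qkv.weight": (3072, 1024),
    "blocks.0.norm1.weight": (1024,),
    "blocks.0.modulation.lin.weight": (6144, 1024),
    "token_embedder.weight": (32000, 1024),
    "blocks.0.mlp.fc1.bias": (4096,),
}
weight_keys = gather_weight_keys(shapes)
print(weight_keys)
```

With these made-up keys, only the attention weight survives: the modulation and embedding weights are 2D but match an avoid pattern, and the norm weight and the bias fail the other checks.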

Benefits

Reduces the memory footprint of the weight_keys list
Avoids unnecessary downstream filtering for layers that are already excluded
Complements the existing MODEL_FILTERS system
No behavioral changes
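The "no behavioral changes" point depends on the later exclusion logic already dropping every key that matches AVOID_KEY_NAMES. The sketch below models that assumption with two hypothetical helpers (`gather_then_filter` and `filter_while_gathering`, neither taken from the repository) and checks that both orderings keep the same keys while the early filter holds a shorter intermediate list.

```python
# Hypothetical patterns; the real AVOID_KEY_NAMES is defined in the repository.
AVOID_KEY_NAMES = ["norm", "modulation", "embed"]


def matches_avoid(key, avoid_names):
    """True if any avoid pattern occurs as a substring of the key."""
    return any(avoid in key for avoid in avoid_names)


def gather_then_filter(keys, avoid_names):
    """Old ordering: gather every key, then exclude sensitive ones later."""
    gathered = list(keys)
    kept = [k for k in gathered if not matches_avoid(k, avoid_names)]
    return gathered, kept


def filter_while_gathering(keys, avoid_names):
    """New ordering: sensitive keys never enter the gathered list."""
    gathered = [k for k in keys if not matches_avoid(k, avoid_names)]
    return gathered, list(gathered)


# Made-up keys, for illustration only.
keys = [
    "blocks.0.attn.qkv.weight",
    "blocks.0.modulation.lin.weight",
    "token_embedder.weight",
    "blocks.0.mlp.fc1.weight",
]
old_gathered, old_kept = gather_then_filter(keys, AVOID_KEY_NAMES)
new_gathered, new_kept = filter_while_gathering(keys, AVOID_KEY_NAMES)

print(len(old_gathered), len(new_gathered))
print(old_kept == new_kept)
```

If the downstream filters were ever meant to keep a key that matches an avoid pattern, the two orderings would diverge, so a comparison like this on a real checkpoint's key list is a cheap way to confirm the claim.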

Uses existing AVOID_KEY_NAMES to skip sensitive layers before loading into weight_keys