Code for "Matryoshka Plasticity: Exploiting Nested Transformer Structure for Zero‑Overhead Continual Learning"
machine-learning deep-learning edge-computing continual-learning catastrophic-forgetting edge-ai pretraining llm matformer nested-transformers gradient-masking
Updated May 3, 2026 - Python