Pinned
Awni00/algorithmic-generalization-transformer-architectures (Public)
MS-Attn-Simulation (Public): Simulation code for the paper "Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality"
gszfwsb/Unveiling-Induction-Heads (Public): PyTorch implementation for "Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers", NeurIPS 2024
Jupyter Notebook 2