FFishy-git

Pinned

  1. Awni00/algorithmic-generalization-transformer-architectures Public

    Jupyter Notebook 6 1

  2. MS-Attn-Simulation Public

    Simulation code for paper "Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality"

    Python 4 1

  3. data_parallel Public

    A useful tool for running data_parallelism

    Python 1 1

  4. TamingSAE_GBA Public

    Jupyter Notebook 1

  5. gszfwsb/Unveiling-Induction-Heads Public

    PyTorch implementation for "Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers", NeurIPS 2024

    Jupyter Notebook 2