Hybrid Mamba-2 + Transformer 2.94B LLM (Nemotron-H style) — Korean 3B model pretrained from scratch on 7× NVIDIA B200 GPUs with SFT + DPO alignment
Updated Mar 26, 2026 - Python
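This entry's alignment stage uses DPO after SFT. As a reference point, here is a minimal sketch of the standard DPO objective (Rafailov et al., 2023); the `beta` value and the use of per-sequence summed token log-probabilities are common defaults assumed for illustration, not settings taken from the repo.

```python
# Minimal sketch of the DPO objective (Rafailov et al., 2023).
# Assumption: each tensor holds per-sequence summed token log-probs;
# beta=0.1 is a common default, not the repo's actual setting.
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log pi(y_w | x), shape (batch,)
    policy_rejected_logps: torch.Tensor,  # log pi(y_l | x)
    ref_chosen_logps: torch.Tensor,       # log pi_ref(y_w | x)
    ref_rejected_logps: torch.Tensor,     # log pi_ref(y_l | x)
    beta: float = 0.1,
) -> torch.Tensor:
    # Implicit reward margins relative to the frozen reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin): push chosen sequences above rejected ones.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```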
Hybrid SSM-Attention language model on Apple Silicon with MLX — interleaving Mamba-2 and Transformer for efficient inference
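Both entries follow the same Nemotron-H style recipe: a stack that is mostly Mamba-2 mixer blocks with a small number of interleaved self-attention blocks. Below is a minimal PyTorch sketch of that interleaving (the second repo implements the same pattern in MLX); the layer pattern, dimensions, and the `Mamba2` layer from the `mamba-ssm` package are illustrative assumptions, not the repos' actual configs.

```python
# Minimal sketch of a hybrid Mamba-2 / attention stack (Nemotron-H style).
# Assumptions: `Mamba2` from the `mamba-ssm` package (CUDA only), the
# "MMMAMMMA" layer pattern, and all dimensions are illustrative.
import torch
import torch.nn as nn
from mamba_ssm import Mamba2  # pip install mamba-ssm

class AttentionBlock(nn.Module):
    """Pre-norm causal self-attention with a residual connection."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        L = x.shape[1]
        # Causal mask: additive -inf above the diagonal blocks future tokens.
        mask = torch.triu(
            torch.full((L, L), float("-inf"), device=x.device), diagonal=1
        )
        h = self.norm(x)
        out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        return x + out

class Mamba2Block(nn.Module):
    """Pre-norm Mamba-2 mixer with a residual connection."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mixer = Mamba2(d_model=d_model, d_state=64, d_conv=4, expand=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mixer(self.norm(x))

def build_hybrid(d_model: int = 2048, n_heads: int = 16,
                 pattern: str = "MMMAMMMA") -> nn.Sequential:
    """'M' = Mamba-2 block, 'A' = attention block. Nemotron-H keeps
    attention layers to a small fraction of the total depth."""
    blocks = [Mamba2Block(d_model) if c == "M" else AttentionBlock(d_model, n_heads)
              for c in pattern]
    return nn.Sequential(*blocks)

model = build_hybrid().cuda()
x = torch.randn(2, 128, 2048, device="cuda")
y = model(x)  # -> (2, 128, 2048)
```

Keeping attention sparse in the pattern is what makes the hybrid efficient at inference: the Mamba-2 blocks carry constant-size recurrent state, so only the few attention layers need a growing KV cache.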