Hybrid Mamba-2 + Transformer 2.94B LLM (Nemotron-H style) — Korean 3B model pretrained from scratch on 7× NVIDIA B200 GPUs with SFT + DPO alignment
Updated Mar 26, 2026 - Python
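For context, a Nemotron-H-style hybrid stack interleaves many Mamba-2 (state-space) blocks with a small number of self-attention blocks. Below is a minimal sketch of such a layer pattern; the depth and interleaving ratio are assumptions for illustration, not this repository's actual configuration.

```python
# Illustrative sketch only: a hypothetical layer pattern for a Nemotron-H-style
# hybrid stack, where most blocks are Mamba-2 (SSM) and attention blocks are
# interleaved at fixed positions. Layer count and ratio below are assumptions.

from dataclasses import dataclass


@dataclass
class HybridStackConfig:
    n_layers: int = 52          # total blocks (assumed)
    attention_every: int = 8    # one attention block every N layers (assumed)


def build_layer_pattern(cfg: HybridStackConfig) -> list[str]:
    """Return a per-layer block-type list, e.g. ['mamba2', ..., 'attention', ...]."""
    pattern = []
    for i in range(cfg.n_layers):
        if (i + 1) % cfg.attention_every == 0:
            pattern.append("attention")   # full self-attention block
        else:
            pattern.append("mamba2")      # Mamba-2 state-space block
    return pattern


if __name__ == "__main__":
    pattern = build_layer_pattern(HybridStackConfig())
    print(f"{pattern.count('mamba2')} Mamba-2 blocks, "
          f"{pattern.count('attention')} attention blocks")
```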
Korean 3B LLM (pure Transformer) pretrained from scratch on 8× NVIDIA B200 GPUs with SFT + ORPO alignment
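ORPO folds the preference signal into the SFT objective by adding a log-odds-ratio penalty between chosen and rejected responses, so no separate reference model is needed. Below is a minimal sketch of the per-pair loss, assuming length-averaged token log-probs as inputs; it illustrates the ORPO objective (Hong et al., 2024), not this repository's training code.

```python
# Illustrative sketch of the ORPO objective, not the repository's implementation.
# Inputs are length-averaged token log-probs of the chosen and rejected responses
# under the current policy; `lam` is the odds-ratio weight (assumed value).

import math


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Return L_SFT + lam * L_OR for a single preference pair."""
    def log_odds(avg_logp: float) -> float:
        p = math.exp(avg_logp)                  # length-normalized sequence probability
        return math.log(p) - math.log(1.0 - p)  # log(p / (1 - p))

    l_sft = -chosen_avg_logp                    # NLL on the chosen response
    ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    l_or = -math.log(1.0 / (1.0 + math.exp(-ratio)))   # -log sigmoid(ratio)
    return l_sft + lam * l_or


if __name__ == "__main__":
    # Example: chosen response slightly more likely than the rejected one.
    print(orpo_loss(chosen_avg_logp=-1.2, rejected_avg_logp=-1.6))
```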