tensor-parallelism

Here are 16 public repositories matching this topic...

bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Updated Sep 7, 2024
Python

InternLM / InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

pytorch multi-modal gemma pipeline-parallelism transformers-models tensor-parallelism llava llm-training internlm flash-attention zero3 llm-framework sequence-parallelism internlm2 ring-attention deepspeed-ulysses llama3 910b

Updated Aug 21, 2025
Python

kaiyuyue / torchshard

Slicing a PyTorch Tensor Into Parallel Shards

pytorch model-parallelism tensor-parallelism

Updated Jun 7, 2025
Python

ai-decentralized / BloomBee

Decentralized LLMs fine-tuning and inference with offloading

distributed-systems machine-learning deep-learning pytorch llama pipeline-parallelism tensor-parallelism

Updated Jan 14, 2026
Python

xrsrke / pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

transformers moe data-parallelism distributed-optimizers model-parallelism megatron mixture-of-experts pipeline-parallelism huggingface-transformers megatron-lm tensor-parallelism large-scale-language-modeling 3d-parallelism zero-1 sequence-parallelism

Updated Dec 14, 2023
Python

gty111 / gLLM

gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling

pipeline-parallelism tensor-parallelism llm-serving llm-inference pagedattention continuous-batching qwen3 token-throttling chunked-prefill

Updated Jan 12, 2026
Python

aniquetahir / JORA

JORA: JAX Tensor-Parallel LoRA Library (ACL 2024)

machine-learning lora jax tensor-parallelism large-language-models

Updated Apr 25, 2024
Python

ShinoharaHare / LLM-Training

A distributed training framework for large language models powered by Lightning.

transformer llama distributed-training fine-tuning pre-training tensor-parallelism llm instruction-tuning llm-training llm-finetuning phi-3

Updated Jul 31, 2025
Python

AlibabaPAI / FlashModels

Fast and easy distributed model training examples.

deep-learning pytorch zero data-parallelism model-parallelism distributed-training xla tensor-parallelism llm fsdp sequence-parallelism

Updated Nov 26, 2024
Python

fattorib / transformer_shmap

Tensor Parallelism with JAX + Shard Map

transformers gpt tpu jax tensor-parallelism pjit shmap

Updated Sep 29, 2023
Python

NiuHuangxiaozi / Deep-Learning-Parallelism

This repository outlines a comprehensive guide for training a distributed deep learning model.

pytorch ps ddp allreduce pipline deepspeed tensor-parallelism

Updated Jul 2, 2024
Python

eduardburlacu / NanoTransformer

Communication-efficient Tensor Parallelism for GPT-2

distributed-computing tensor-parallelism llm-training

Updated Oct 17, 2025
Python

nshkrdotcom / vllm

vLLM - High-throughput, memory-efficient LLM inference engine with PagedAttention, continuous batching, CUDA/HIP optimization, quantization (GPTQ/AWQ/INT4/INT8/FP8), tensor/pipeline parallelism, OpenAI-compatible API, multi-GPU/TPU/Neuron support, prefix caching, and multi-LoRA capabilities

Updated Jan 26, 2026
Elixir

CoffeeVampir3 / Hyper-AMX

Repo for AMX + FAST

inference amx tensor quantization avx512 inference-engine matmul numa-aware tensor-parallelism

Updated Nov 1, 2025
C++

shreyansh26 / wordle-solver

Training Qwen3 to solve Wordle using SFT and GRPO

rl wordle sft rft tensor-parallelism wordle-solver llm fsdp grpo qwen3

Updated Sep 22, 2025
Python

George614 / gpu-mem-calculator

GPU Memory Calculator for LLM Training - Calculate GPU memory requirements for training Large Language Models with support for multiple training engines including PyTorch DDP, DeepSpeed ZeRO, Megatron-LM, and FSDP.

Updated Jan 26, 2026
Python
