Atandra Bharati atandra2000

Deep learning research engineer rebuilding frontier AI architectures from scratch in PyTorch: LLMs, latent diffusion, multimodal models, and video understanding.

11 from-scratch projects spanning LLMs (LLaMA-3, DeepSeek-V3, FusionLLM, GPT-2, TranslationLM), generative vision (Stable Diffusion, DCGAN, VAE, CycleGAN), multimodal (PaliGemma-style VLM), and video understanding (ST-GCN action recognition).
Single-GPU feasibility: engineered to run on a single A100, RTX 5090, RTX 6000 Ada, RTX 3090, P100, or T4, using BF16, Flash Attention 2, gradient checkpointing, and torch.compile where the hardware supports them.
Faithful paper reproductions: Multi-head Latent Attention (MLA) with the weight-absorption trick, auxiliary-loss-free MoE load balancing, Multi-Token Prediction (MTP), Min-SNR loss weighting, KL annealing, and AdaIN conditioning.
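
Two of the training-time techniques named above, Min-SNR loss weighting and KL annealing, are small enough to sketch in plain Python. This is an illustrative sketch only, not code taken from the repositories: the function names `min_snr_weight` and `kl_beta`, the default `gamma=5.0`, the epsilon-prediction form of the weight, and the linear warm-up shape are all assumptions made for the example.

```python
def min_snr_weight(alpha_bar: float, gamma: float = 5.0) -> float:
    """Min-SNR-gamma loss weight, assuming an epsilon-prediction objective.

    alpha_bar is the cumulative signal coefficient at the sampled timestep
    (0 < alpha_bar < 1), so the signal-to-noise ratio is
    alpha_bar / (1 - alpha_bar). The weight clamps the SNR at gamma and
    divides by the SNR, which down-weights low-noise timesteps.
    """
    snr = alpha_bar / (1.0 - alpha_bar)
    return min(snr, gamma) / snr


def kl_beta(step: int, warmup_steps: int, beta_max: float = 1.0) -> float:
    """Linear KL annealing (hypothetical schedule): ramp the KL coefficient
    from 0 to beta_max over warmup_steps, then hold it at beta_max."""
    if warmup_steps <= 0:
        return beta_max
    return beta_max * min(1.0, step / warmup_steps)


if __name__ == "__main__":
    # High-noise timestep (SNR = 1): the weight stays at 1.
    print(min_snr_weight(0.5))
    # Low-noise timestep (SNR near 99): the weight shrinks toward gamma / SNR.
    print(min_snr_weight(0.99))
    # KL coefficient halfway through a 100-step warm-up.
    print(kl_beta(50, 100))
```

In a training loop, the Min-SNR weight would multiply the per-sample diffusion loss at each sampled timestep, and `kl_beta(step, ...)` would scale the KL term of a VAE loss so that reconstruction dominates early in training.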

Project	Status	Link
Stable Diffusion 1.x (860M UNet)	Trained 42 epochs on 2x RTX 5090	repo
VisionLanguageModel (PaliGemma-style)	Trained end-to-end on COCO captions	repo
FaceAgingCycleGAN (AdaIN)	31 epochs on IMDB-WIKI	repo
FaceGenerationVAE (beta-VAE)	50 epochs on CelebA	repo
DCGAN-Face-Generation	50 epochs on CelebA	repo
GPT-From-Scratch	Trained on Tiny Shakespeare	repo
TranslationLM (EN->IT)	20 epochs on OPUS Books	repo

Project	Status	Link
DeepSeek-V3-Lite (MLA + MoE + MTP)	Architecture and pipeline ready; pre-training not started	repo
LLaMA-3-Lite (515M)	Architecture and pipeline ready; pre-training not started	repo
FusionLLM (MLA + GDN + MoE + MTP)	Framework and tests ready; pre-training not started	repo
ActionRecognition (ST-GCN)	Pipeline and tests ready; NTU benchmark not started	repo