atandra2000/README.md

Atandra Bharati

Deep learning research engineer rebuilding frontier AI architectures from scratch in PyTorch — LLMs, latent diffusion, multimodal, video.

What I do

  • 11 from-scratch projects spanning LLMs (LLaMA-3, DeepSeek-V3, FusionLLM, GPT-2, TranslationLM), generative vision (Stable Diffusion, DCGAN, VAE, CycleGAN), multimodal (PaliGemma-style VLM), and video understanding (ST-GCN action recognition).
  • Single-GPU feasibility: engineered to run on A100, RTX 5090, RTX 6000 Ada, RTX 3090, P100, and T4, using BF16, Flash Attention 2, gradient checkpointing, and torch.compile where the hardware supports them.
  • Faithful paper reproductions: MLA with absorption trick, aux-loss-free MoE, Multi-Token Prediction, Min-SNR, KL annealing, AdaIN conditioning.
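Two of the techniques named above, Min-SNR loss weighting and KL annealing, reduce to short per-step weight formulas. The sketch below is a minimal pure-Python illustration and is not taken from the repositories: the function names, the linear beta schedule (1e-4 to 0.02 over 1000 steps), the default gamma of 5, and the linear warm-up shape are all assumptions chosen for illustration. In a real training loop these would typically be tensor operations.

```python
def linear_beta_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def min_snr_weight(alpha_bar, gamma=5.0):
    """Min-SNR-gamma loss weight for epsilon-prediction: min(SNR, gamma) / SNR."""
    snr = alpha_bar / (1.0 - alpha_bar)  # signal-to-noise ratio at this timestep
    return min(snr, gamma) / snr


def kl_weight(step, warmup_steps, max_weight=1.0):
    """Linear KL annealing: ramp from 0 to max_weight over warmup_steps, then hold."""
    return max_weight * min(1.0, step / warmup_steps)


if __name__ == "__main__":
    alpha_bars = linear_beta_alpha_bars()
    # Low-noise (early) timesteps are down-weighted; high-noise ones keep weight 1.
    for t in (0, 250, 500, 999):
        print(t, round(min_snr_weight(alpha_bars[t]), 4))
    # KL term is switched on gradually during the first 1000 optimizer steps.
    for step in (0, 500, 1000, 5000):
        print(step, kl_weight(step, warmup_steps=1000))
```

In use, the diffusion loss for a sample at timestep t would be multiplied by `min_snr_weight(alpha_bars[t])`, and a VAE objective would be formed as reconstruction loss plus `kl_weight(step, warmup_steps)` times the KL term.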

Verified builds

| Project | Status | Link |
|---|---|---|
| Stable Diffusion 1.x (860M UNet) | Trained 42 epochs on 2x RTX 5090 | repo |
| VisionLanguageModel (PaliGemma-style) | Trained end-to-end on COCO captions | repo |
| FaceAgingCycleGAN (AdaIN) | 31 epochs on IMDB-WIKI | repo |
| FaceGenerationVAE (beta-VAE) | 50 epochs on CelebA | repo |
| DCGAN-Face-Generation | 50 epochs on CelebA | repo |
| GPT-From-Scratch | Trained on Tiny Shakespeare | repo |
| TranslationLM (EN->IT) | 20 epochs on OPUS Books | repo |

In-progress architectures

| Project | Status | Link |
|---|---|---|
| DeepSeek-V3-Lite (MLA + MoE + MTP) | Architecture and pipeline ready; pre-training not started | repo |
| LLaMA-3-Lite (515M) | Architecture and pipeline ready; pre-training not started | repo |
| FusionLLM (MLA + GDN + MoE + MTP) | Framework and tests ready; pre-training not started | repo |
| ActionRecognition (ST-GCN) | Pipeline and tests ready; NTU benchmark not started | repo |

Links

Pinned

  1. StableDiffusion (Public)

    A Stable Diffusion 1.x-class latent diffusion model trained from scratch on 2× RTX 5090 (Blackwell) GPUs. Full UNet (~860M params), DDPM/DDIM, LAION pipeline, DDP+BF16.

    Python

  2. DeepSeek-v3-Lite (Public)

    A faithful from-scratch reimplementation of DeepSeek-V3 (MLA + MoE + MTP), scaled down for Chinchilla-optimal 422M training on a single A100 80GB.

    Python 1
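
The "Chinchilla-optimal" sizing in the DeepSeek-v3-Lite description can be made concrete with two common rules of thumb: roughly 20 training tokens per parameter, and training compute of about 6 × parameters × tokens FLOPs. The sketch below is an illustration only and assumes that 422M refers to the parameter count; the function name and the 20:1 ratio are assumptions, not values from the repository. For an MoE model the compute estimate would more properly use active rather than total parameters.

```python
def chinchilla_budget(n_params, tokens_per_param=20.0):
    """Rough token and FLOP budget under the ~20 tokens/parameter heuristic."""
    tokens = tokens_per_param * n_params
    flops = 6.0 * n_params * tokens  # standard C ~= 6 * N * D approximation
    return tokens, flops


if __name__ == "__main__":
    tokens, flops = chinchilla_budget(422e6)  # assumes 422M is a parameter count
    print(f"{tokens / 1e9:.2f}B tokens, {flops:.2e} FLOPs")
    # prints: 8.44B tokens, 2.14e+19 FLOPs
```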