AI & ML Engineer · Open Source Contributor · Scientific ML & Genomics NLP
Building ML systems for science and production: transformers, diffusion models, NLP, and RAG pipelines.
Active contributor to Open Climate Fix (3 merged PRs) and MalariaGEN (2 merged PRs in malaria vector genomics).
Currently exploring natural-language interfaces for genomic data: NLP Interface PoC
- MalariaGEN NLP Interface PoC: Natural language → `malariagen_data` API translation for querying malaria vector genomic data (10/10 queries resolved, 7 API methods covered)
- Weather Transformer: Physics-aware Vision Transformer for weather forecasting, built from scratch in PyTorch (74 tests, beats persistence baseline by 27%)
- Biodiversity Publication Analyzer: NLP pipeline (SciBERT + TF-IDF) to classify biodiversity genomics articles from Europe PMC (81 tests, 99.5% F1)
- LLaMA Task Agent: Fine-tuned LLaMA-3.1-8B with LoRA for structured tool execution (100% format compliance)
- Complaint Intelligence: RAG pipeline with FAISS + Gemini over consumer complaints (live app)
- Thermalizer (OCF): Diffusion-based denoising layer for autoregressive weather forecasting (merged)
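For readers unfamiliar with the term, a persistence baseline simply forecasts that the future value equals the most recent observation. The sketch below shows one common way to express "beats persistence by X%", assuming the comparison is a relative reduction in RMSE. The list above does not say which metric the 27% figure uses, so the metric, the function names, and the toy numbers here are all illustrative assumptions rather than the project's actual evaluation code.

```python
from math import sqrt


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def persistence_forecast(series, lead=1):
    """Persistence baseline: the value `lead` steps ahead is predicted to equal the current value."""
    return series[:-lead]


def improvement_over_persistence(series, model_pred, lead=1):
    """Fractional RMSE reduction relative to persistence (0.0 means no better than the baseline)."""
    target = series[lead:]
    baseline_rmse = rmse(persistence_forecast(series, lead), target)
    return 1.0 - rmse(model_pred, target) / baseline_rmse


# Toy data (hypothetical, not taken from the Weather Transformer project).
series = [10.0, 12.0, 11.0, 13.0, 12.0]
model_pred = [11.5, 11.5, 12.5, 12.5]  # hypothetical model forecasts for series[1:]

skill = improvement_over_persistence(series, model_pred)
print(f"RMSE improvement over persistence: {skill:.1%}")
```

A positive score means the model has lower error than repeating the last observation; a score at or below zero means it adds nothing over the baseline.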
| Org | PR | Status |
|---|---|---|
| MalariaGEN | Fixed `cnv_discordant_read_calls` indentation error | Merged |
| MalariaGEN | Added "lower triangle" annotation to Fst heatmap | Merged |
| Open Climate Fix | ThermalizerLayer implementation | Merged |
| Open Climate Fix | NNJA-AI V1 dataset loader | Merged |
| Open Climate Fix | Thermalizer diffusion fix | Merged |
PyTorch · Transformers · Diffusion Models · LoRA/PEFT · SciBERT · FAISS · RAG · FastAPI · xarray · zarr · plotly · bokeh



