vatsalsangani/README.md

Hi, I'm Vatsal Sangani 👋

AI Engineer | 4 Years in ML • 3 Years in GenAI/LLM | AWS Certified ML Engineer | MSc Warwick

Portfolio LinkedIn Email


🚀 About Me

I build production AI systems that work in the real world, not just in notebooks and demos.

  • 🎓 MSc Computer Science, University of Warwick (2024)
  • 🏆 AWS Certified Machine Learning Engineer – Associate (MLA-C01)
  • 🔭 Specialised in LangGraph multi-agent orchestration, RAG pipelines, and FastAPI backend development
  • ☁️ All projects deployed on AWS EC2 with full observability (Prometheus/Grafana) and CI/CD
  • 🌍 Based in London, UK | Open to AI Engineer & ML Engineer roles across the UK and India

🏗️ Featured Projects

🛡️ AML Sentinel: Anti-Money Laundering Detection

Production fintech ML system processing 31.9M financial transactions via PySpark on AWS. XGBoost + LightGBM ensemble achieving AUC-ROC 0.9857 and 81.23% recall on 6.3M test transactions. SHAP explainability for compliance auditing. React dashboard. Prometheus/Grafana monitoring. GitHub Actions CI/CD.

Python XGBoost LightGBM SHAP PySpark FastAPI React AWS EC2 Prometheus Grafana
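
As a rough illustration of the ensemble and metrics mentioned above, the sketch below averages two models' predicted probabilities and computes recall and AUC-ROC in plain Python. The averaging rule, the 0.5 threshold, the function names, and the toy scores are assumptions for illustration only; the project's actual combination method and data are not described here.

```python
from __future__ import annotations

from typing import Sequence


def average_ensemble(p_a: Sequence[float], p_b: Sequence[float]) -> list[float]:
    """Average two models' probabilities (hypothetical combination rule)."""
    if len(p_a) != len(p_b):
        raise ValueError("score lists must have the same length")
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


def recall(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of positive-labelled items scoring at or above the threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not positives:
        raise ValueError("recall is undefined without positive labels")
    return sum(s >= threshold for s in positives) / len(positives)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, not project results.
labels = [1, 0, 1, 0, 1]
xgb_scores = [0.9, 0.2, 0.4, 0.1, 0.8]
lgbm_scores = [0.7, 0.4, 0.5, 0.3, 0.6]
blended = average_ensemble(xgb_scores, lgbm_scores)
```

Calling `recall(labels, blended)` and `auc_roc(labels, blended)` then evaluates the blended scores. The pairwise AUC loop is quadratic, so it only suits small samples; a rank-based computation would be needed at the scale of millions of transactions.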


🏥 WiseWell: Biomedical RAG System

Production RAG system retrieving evidence from 252,316 PubMed abstracts using FAISS and BM25 retrieval, with answers generated by Claude Haiku. Faithfulness evaluation with Langfuse, with answer reliability improving from ~0.45 to ~0.71. Architecture evolved from a fine-tuned BioBART model (LoRA/PEFT) to production RAG.

Python FAISS BM25 AWS Bedrock FastAPI React LLM RAG LoRA/PEFT
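
The stack line above lists BM25 for lexical retrieval. As a minimal sketch of how BM25 scoring ranks documents, the example below assumes whitespace tokenisation, the common defaults `k1 = 1.5` and `b = 0.75`, and made-up example sentences. None of it is taken from the WiseWell code.

```python
from __future__ import annotations

import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation penalises longer-than-average documents.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical mini-corpus standing in for PubMed abstracts.
abstracts = [
    "metformin lowers blood glucose in type 2 diabetes",
    "aspirin reduces risk of heart attack",
    "exercise improves mood and sleep",
]
scores = bm25_scores("metformin diabetes", abstracts)
best = max(range(len(abstracts)), key=scores.__getitem__)
```

Here `best` is the index of the highest-scoring abstract. A dense index such as FAISS would produce a second ranking from embeddings; how the two rankings are combined in this project is not stated.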


🔍 RepoGuard: Multi-Agent Code Security Scanner

LangGraph multi-agent pipeline where 4 specialised agents (Parser, Guardrails, Processor, Aggregator) coordinate security audit workflows via MCP integrations. Human-in-the-loop approval gates. 100% success rate capturing hidden secrets in stress testing. <$0.01 per audit.

Python LangGraph LangChain MCP OpenAI Streamlit AWS EC2
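
The four-stage flow with a human approval gate can be sketched in plain Python, without LangGraph, to show the control flow only. The stage behaviour, the state fields, the size limit, the regex, and the `approve` callback are all hypothetical stand-ins; RepoGuard's actual agents and MCP integrations are not shown here.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical pattern: flags assignments such as API_KEY = "..." or password: '...'.
SECRET_PATTERN = re.compile(
    r"(?i)(api[_-]?key|secret|password|token)\s*[:=]\s*['\"][^'\"]+['\"]"
)
MAX_FILE_CHARS = 100_000  # hypothetical guardrail limit


@dataclass
class AuditState:
    files: dict[str, str]  # path -> file contents
    findings: list[str] = field(default_factory=list)
    approved: bool = False
    report: str = ""


def parser(state: AuditState) -> AuditState:
    """Drop blank files so later stages only see real content."""
    state.files = {p: t for p, t in state.files.items() if t.strip()}
    return state


def guardrails(state: AuditState) -> AuditState:
    """Skip oversized files before any scanning happens."""
    state.files = {p: t for p, t in state.files.items() if len(t) <= MAX_FILE_CHARS}
    return state


def approval_gate(state: AuditState, approve: Callable[[list[str]], bool]) -> AuditState:
    """Human-in-the-loop step: a reviewer accepts or rejects the file list."""
    state.approved = approve(sorted(state.files))
    return state


def processor(state: AuditState) -> AuditState:
    """Scan approved files line by line for secret-like assignments."""
    if not state.approved:
        return state
    for path, text in sorted(state.files.items()):
        for line_no, line in enumerate(text.splitlines(), start=1):
            if SECRET_PATTERN.search(line):
                state.findings.append(f"{path}:{line_no}")
    return state


def aggregator(state: AuditState) -> AuditState:
    """Summarise the run in a single report string."""
    if not state.approved:
        state.report = "audit skipped: approval denied"
    else:
        locations = ", ".join(state.findings) or "none"
        state.report = f"{len(state.findings)} potential secret(s): {locations}"
    return state


def run_audit(files: dict[str, str], approve: Callable[[list[str]], bool]) -> AuditState:
    state = AuditState(files=dict(files))
    state = approval_gate(guardrails(parser(state)), approve)
    return aggregator(processor(state))


demo_files = {"app.py": 'API_KEY = "abc123"\nprint("hi")', "empty.txt": "  "}
result = run_audit(demo_files, approve=lambda paths: True)
```

Passing state through plain functions keeps each stage testable on its own; a graph framework adds routing, retries, and persistence on top of the same idea.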


🚕 NYC Taxi Serverless Analytics: AWS Data Pipeline

Serverless analytics pipeline over NYC taxi trip data. S3 partitioned data lake, AWS Glue schema discovery, Athena SQL queries, and Lambda automation, with no always-on infrastructure.

Python AWS S3 AWS Glue Athena Lambda PySpark Parquet
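
A partitioned S3 layout is typically expressed through Hive-style `key=value` prefixes, which Glue crawlers can register as partition columns and Athena can use to skip data a query does not need. The sketch below builds such an object key; the `trips` prefix, the `year`/`month` partition columns, and the file name are hypothetical, not taken from the project.

```python
from datetime import date


def partition_key(pickup_date: date, filename: str, prefix: str = "trips") -> str:
    """Build a Hive-style S3 object key partitioned by pickup year and month."""
    return (
        f"{prefix}/year={pickup_date.year:04d}/"
        f"month={pickup_date.month:02d}/{filename}"
    )


# Hypothetical example file for a trip picked up in January 2023.
key = partition_key(date(2023, 1, 15), "part-0000.parquet")
```

With this layout, a query that filters on `year` and `month` only needs to read objects under the matching prefixes.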


🔍 API Gateway Monitoring: Production Observability

Production API gateway with token-based rate limiting via Redis, Prometheus metrics, Grafana dashboards, and Streamlit admin dashboard. Full stack containerised with Docker Compose.

Python FastAPI Redis Prometheus Grafana Docker Streamlit
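
One common way to implement token-based rate limiting is a token bucket: each client holds up to `capacity` tokens, tokens refill at a fixed rate, and a request is allowed only when a token is available. The sketch below assumes that reading and keeps state in a local dict so it runs standalone; in the gateway this state would live in Redis. The class name, parameters, and injected clock are illustrative assumptions.

```python
from __future__ import annotations

import time
from typing import Callable


class TokenBucketLimiter:
    """In-memory token bucket per client (stand-in for Redis-backed state)."""

    def __init__(
        self,
        capacity: int,
        refill_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if capacity <= 0 or refill_per_sec <= 0:
            raise ValueError("capacity and refill_per_sec must be positive")
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # client_id -> (tokens remaining, time of last request)
        self._state: dict[str, tuple[float, float]] = {}

    def allow(self, client_id: str) -> bool:
        """Return True and spend one token if the client has any left."""
        now = self.clock()
        tokens, last_seen = self._state.get(client_id, (float(self.capacity), now))
        # Refill for the elapsed time, never exceeding the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._state[client_id] = (tokens, now)
        return allowed


# A fake clock makes the behaviour deterministic for the demo.
fake_now = [0.0]
limiter = TokenBucketLimiter(capacity=2, refill_per_sec=1.0, clock=lambda: fake_now[0])
burst = [limiter.allow("client-a") for _ in range(3)]  # third call exceeds the burst
fake_now[0] = 1.0  # one second later, one token has refilled
after_wait = limiter.allow("client-a")
```

With a shared store such as Redis, the read-refill-write step would need to be atomic across gateway workers, for example by running it as a single server-side script.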


💳 Credit Risk Modelling: ML with CI/CD

End-to-end credit risk prediction with automated GitHub Actions CI/CD that deploys to AWS EC2 on every push to main.

Python scikit-learn XGBoost Docker GitHub Actions AWS EC2 Streamlit


🛠️ Tech Stack

AI/LLM: LangChain • LangGraph • RAG • FAISS • BM25 • Prompt Engineering • LoRA/PEFT • OpenAI API • AWS Bedrock • SHAP

ML/Data: PyTorch • TensorFlow • XGBoost • LightGBM • scikit-learn • PySpark • Pandas • NumPy

Backend: Python • FastAPI • REST APIs • asyncio • Redis • Docker • Nginx • PostgreSQL

Cloud/DevOps: AWS (EC2 • S3 • Lambda • Glue • Athena • Bedrock • CloudWatch) • Terraform • GitHub Actions • Prometheus • Grafana

Frontend: React • TypeScript • Streamlit • Power BI


📊 GitHub Stats

Vatsal's GitHub Stats

Top Languages


🀝 Let's Connect

I'm actively looking for AI Engineer and ML Engineer roles in the UK and India.

Pinned

  1. aml-sentiment (Public, Jupyter Notebook)

  2. NYC-Taxi-Serverless-Analytics (Public, Python)

  3. repoguard (Public, Python)

  4. api_gateway_monitoring (Public, Python)

  5. Wise_Well_Chatbot_Lite (Public, Python)

  6. Credit_Risk_Modelling (Public, Jupyter Notebook)