ML Engineer building production-grade AI systems with safety at the core. Currently researching Multi-Agent RL for cybersecurity at the University of Arizona and co-authoring StepShield, a safety benchmark for autonomous code agents (submitted to ICML 2026). Previously built recommendation engines at Escape LLC (30% engagement lift) and agentic RAG chatbots at Omdena (95% reduction in harmful responses).
I don't treat AI safety as a checkbox; I treat it as an engineering discipline.

First benchmark for evaluating when autonomous code agents go rogue, not just whether they do. Detects specification violations (data exfiltration, unauthorized access) in real time across 9,213 agent trajectories. Early detection cuts monitoring costs by 75% (~$108M projected savings).

Production-grade LLM evaluation + red-teaming: hybrid n8n + FastAPI architecture with 4 LLM providers, LLM-as-Judge scoring, circuit breaker, DLQ, Redis caching, and Prometheus/Grafana monitoring.

ML-infra-aware defense for model weights: protects against model-weight exfiltration using a 3-layer cascaded architecture (Rules → ML → LLM). Kubernetes-native, GPU-aware anomaly detection.

7-benchmark bias evaluation + guardrails: open-source LLM bias evaluation framework with red-teaming, guardrails, and monitoring, all running locally via Ollama. Zero API costs.

Production-grade ML pricing system: XGBoost demand forecasting + price elasticity estimation + scipy revenue optimization. FastAPI serving, Streamlit dashboard, MLflow tracking, Evidently drift monitoring.

Full-stack speech pipeline (STT → LLM → TTS): end-to-end voice assistant running entirely on your own machine, with a FastAPI backend, React frontend, and Docker. Private by design: zero cloud calls.

AI motorcycle advisor for Indian riders: RAG over motorcycle specs with vLLM serving, Qdrant vector store, and FastAPI. Personalized bike recommendations with source citations.

- chatbot-auditor: Quality auditor for AI chatbots; analyzes conversation logs to surface where bots underperform.
- credit-scoring-fairness-mlops: End-to-end MLOps with automated fairness gates, drift monitoring, EU AI Act compliance (XGBoost, Fairlearn, MLflow).
- healthcare-bias-audit: Bias audit of healthcare ML on the MEPS dataset; AIF360 mitigation, SHAP/LIME explainability.
- AI-Chief: Food science assistant with multi-agent RAG, real-time safety monitoring, dangerous-advice detection (TypeScript, Fastify, HNSW).
- Interactive-Multilingual-AI-Audiobook-Assistant: OCR extraction → neural TTS → multilingual translation → real-time Q&A audiobook pipeline.
- AI-Wildlife-Tracker: RAG identifying 500+ Indian wildlife species from text or photos; hybrid retrieval, ONNX inference, Langfuse observability.
- Multilingual-Sentiment-Emotion-Intelligence-Engine: 5 languages + Hindi-English code-switching; multi-task XLM-RoBERTa with LoRA adapters, ONNX INT8.
- Algorithmic-Trading-AI: FinBERT sentiment + spaCy NER + TimeGPT forecasting → BUY/SELL/HOLD signals from real-time financial news.
- LLaMA-Sum-Fine-Tuning: LLaMA 3.2 1B fine-tuned via QLoRA; 40%+ ROUGE-2 improvement over base on CNN/DailyMail.

| Languages | |
| ML / DL | |
| LLM & Agents | |
| MLOps / Cloud | |
| Observability | |
| AI Safety & Responsible AI | |
| Data | |

Open to ML Engineer, AI Safety, and AI Researcher roles (remote & relocation).
Let's build AI systems that are powerful AND trustworthy.




