Senior AI/ML Engineering Leader building production AI systems that scale.
I work at the intersection of LLMs, MLOps, and distributed systems, turning experimental prototypes into enterprise-grade platforms.

| Area | What I focus on |
|---|---|
| RAG & LLMs | Semantic chunking, hallucination detection, retrieval optimization |
| MLOps | CI/CD pipelines, model monitoring, drift detection, 99.7% uptime |
| Inference | Quantization (INT8/TensorRT), dynamic batching, 5M+ predictions/day |
| Agentic AI | LangChain agents, tool-calling, state machines, multi-step workflows |
| Cost Optimization | Infra attribution, auto-scaling, 42% cost reduction playbooks |

| Repo | What it is | Stars |
|---|---|---|
| production-rag-kit | Enterprise RAG with guardrails & observability | ![stars] |
| ml-monitoring-stack | 4-layer ML observability framework | ![stars] |
| inference-optimizer | Dynamic batching + INT8 + predictive scaling | ![stars] |
| agentic-workflow-kit | LangChain agents with state + approval gates | ![stars] |
| mlops-cicd-templates | Blue-green ML deployments, GitHub Actions | ![stars] |
| feature-store-lite | Redis-backed feature store, <20ms retrieval | ![stars] |

- 5M+ predictions/day
- 68% latency reduction (from 250 ms to 80 ms)
- 99.7% deployment uptime
- 42% infra cost reduction
- 85% RAG answer accuracy
- 120 eng-hours/week automated

I write about production AI engineering: the gap between demos and real systems.
- Substack: The Production ML Dispatch. Real lessons from building AI at scale, published biweekly.
- Medium / TDS: Long-form technical deep-dives on RAG, MLOps, inference optimization
- LinkedIn: linkedin.com/in/parikshitiiitb

If you're working on production AI systems, I'd love to talk.
- Email: psharma2345@gmail.com
- LinkedIn: linkedin.com/in/parikshitiiitb
- Substack: parikshitiiitb.substack.com