parikshitiiitb/README.md

Hi, I'm Parikshit Sharma

Senior AI/ML Engineering Leader building production AI systems that scale.
I work at the intersection of LLMs, MLOps, and distributed systems, turning experimental prototypes into enterprise-grade platforms.


What I build

| Area | What I focus on |
| --- | --- |
| RAG & LLMs | Semantic chunking, hallucination detection, retrieval optimization |
| MLOps | CI/CD pipelines, model monitoring, drift detection, 99.7% uptime |
| Inference | Quantization (INT8/TensorRT), dynamic batching, 5M+ predictions/day |
| Agentic AI | LangChain agents, tool-calling, state machines, multi-step workflows |
| Cost Optimization | Infra attribution, auto-scaling, 42% cost reduction playbooks |
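
As an illustration of the dynamic batching idea listed under Inference, here is a minimal offline sketch. It is not taken from any of the repositories below: the function name `form_batches` and the parameters `max_batch_size` and `max_wait_s` are hypothetical, and the sketch assumes requests arrive as time-ordered `(arrival_time, payload)` pairs. A batch is closed when it is full, or when the next request arrives after the first request's wait deadline.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def form_batches(
    requests: Iterable[Tuple[float, T]],
    max_batch_size: int,
    max_wait_s: float,
) -> List[List[T]]:
    """Group time-ordered (arrival_time, payload) pairs into batches.

    Hypothetical helper for illustration only. It simulates batching
    decisions offline and does not run a model or a server.
    """
    batches: List[List[T]] = []
    current: List[T] = []
    deadline = 0.0
    for arrival, payload in requests:
        # The open batch timed out before this request arrived: close it.
        if current and arrival > deadline:
            batches.append(current)
            current = []
        # The first request of a batch starts the wait window.
        if not current:
            deadline = arrival + max_wait_s
        current.append(payload)
        # A full batch is closed immediately.
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


requests = [(0.000, "a"), (0.002, "b"), (0.004, "c"), (0.030, "d"), (0.031, "e")]
batches = form_batches(requests, max_batch_size=2, max_wait_s=0.010)
```

The two knobs trade throughput against latency: a larger `max_batch_size` amortizes more per-call overhead, while a smaller `max_wait_s` caps how long the first request in a batch can wait.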

Featured Repositories

| Repo | What it is |
| --- | --- |
| production-rag-kit | Enterprise RAG with guardrails & observability |
| ml-monitoring-stack | 4-layer ML observability framework |
| inference-optimizer | Dynamic batching + INT8 + predictive scaling |
| agentic-workflow-kit | LangChain agents with state + approval gates |
| mlops-cicd-templates | Blue-green ML deployments, GitHub Actions |
| feature-store-lite | Redis-backed feature store, <20ms retrieval |

Numbers that matter

5M+ predictions/day | 68% latency reduction (250ms -> 80ms)
99.7% deployment uptime | 42% infra cost reduction
85% RAG answer accuracy | 120 eng-hours/week automated


Tech Stack

LLM & GenAI

Python LangChain LlamaIndex OpenAI Anthropic HuggingFace Pinecone Weaviate

MLOps & Infrastructure

MLflow Airflow Kubernetes Docker Prometheus Grafana GitHub Actions Terraform WandB

Cloud

AWS OCI GCP

Data & Serving

FastAPI Redis Kafka Spark PyTorch TensorRT

Core Languages

Python SQL JavaScript Bash


Writing

I write about production AI engineering — the gap between demos and real systems.


GitHub Stats

[Stats cards: GitHub stats, top languages, streak, activity graph, profile views, WakaTime stats]


Let's connect

If you're working on production AI systems, I'd love to talk.

Popular repositories

  1. production-rag-kit (Public)

    Enterprise-ready RAG template: semantic chunking, NLI hallucination detection, latency budgeting, Prometheus observability. Battle-tested at scale.

    Python 1

  2. parikshitiiitb (Public)

  3. system-design-primer (Public)

    Forked from donnemartin/system-design-primer

    Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

    Python

  4. ragflow (Public)

    Forked from infiniflow/ragflow

    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

    Python

  5. job-tracker (Public)

    The app is fully standalone: one HTML file, no npm, no build step, no backend, no login. Anyone can clone the repo and open index.html directly in their browser, and it just works. Data lives in localStorage …

    HTML

  6. ml-monitoring-stack- (Public)

    4-layer ML observability framework: data quality (Great Expectations), feature drift (PSI), model performance, business metric correlation. Catches 92% of drift before production impact.
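
For readers unfamiliar with PSI (Population Stability Index), the feature-drift metric named in the ml-monitoring-stack description, here is a minimal pure-Python sketch. It is not the repository's implementation: the function name `psi`, the equal-width binning over the baseline range, and the `eps` floor for empty bins are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population Stability Index between a baseline and a new sample.

    Hypothetical helper: bins are equal-width over the baseline's range,
    and values outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        n = len(values)
        # eps keeps empty bins from causing log(0) or division by zero.
        return [max(c / n, eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in baseline]
same_score = psi(baseline, baseline)
drift_score = psi(baseline, shifted)
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those thresholds are a general convention, not values confirmed by the repository.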