Staff Software Engineer | Cloud Infrastructure | Kubernetes SRE | GenAI Platforms | Platform Engineering
Currently: Co-Founder and Engineering Lead at StartupManch, building GenAI and platform infrastructure
Experience: 8+ years designing and operating large-scale cloud and developer platforms
Key Metrics:
- Architected multi-tenant cloud platforms processing 5M+ events/day and 500K+ daily queries
- Designed systems with 99.8-99.9% availability SLOs across production regions
- Reduced cloud spend by 15-25% ($220K/year) through autoscaling and FinOps
- Improved deployment time from 3 days to 6 hours through platform automation
- Reduced incident MTTD by 25% and alert noise by 40-60% through AIOps
Specializations:
- Multi-tenant cloud control planes (GCP Cloud Run, Kubernetes)
- SRE & Kubernetes fleet governance (EKS/GKE across 100+ nodes)
- GenAI/MLOps infrastructure (RAG pipelines, Vector DBs, Model serving)
- Internal developer platforms improving engineering velocity
- Distributed systems & platform-as-a-product thinking
Cloud Platforms: AWS (ECS, Fargate, Lambda), GCP (Cloud Run, Firestore, Pub/Sub), Azure, Kubernetes (EKS/GKE)
Infrastructure & DevOps: Terraform, GitHub Actions, GitOps, CI/CD, Vault, IAM, Zero Trust
Data & Streaming: Kafka, Pub/Sub, Redis, PostgreSQL, MongoDB Atlas, Firestore, Vector DBs
AI/ML: TensorFlow, MLflow, Kubeflow, RAG pipelines, LangChain, Model monitoring
Observability: Prometheus, Grafana, ELK Stack, Tracing, SLOs, AIOps
Languages: Python, Go, JavaScript, Bash, C# (.NET), Kotlin, SQL
StartupManch (Jul 2020 - Present) | Co-Founder & Engineering Lead
- Multi-tenant GenAI platform (5M events/day, 99.8% SLO)
- GitOps developer platform with environments & secrets management
- Deployment time: 3 days → 6 hours
- Observability stack with Prometheus, Grafana, tracing
Okta (Nov 2023 - Jun 2025) | Staff Software Engineer - Cloud, SRE, AIOps
- Zero-downtime EKS fleet upgrades (99.9% availability)
- AIOps platform with anomaly detection (50%+ alert noise reduction)
- TensorFlow-based capacity forecasting (15-20% cost savings)
- MLOps control plane for model training & safe rollout
- Mentored engineers on architecture & incident reviews
Lookout/CipherCloud (Jan 2022 - Oct 2023) | Senior Software Engineer
- Global Firewall-as-a-Service platform with multi-region PoPs
- Multi-tenant control plane (60% faster deployments)
- DevSecOps automation (60% vulnerability reduction)
- Infrastructure automation (40% faster provisioning)
- GitHub Contributor: Platform engineering, Kubernetes, GenAI infrastructure
- Developer Program Member: Active GitHub developer
- Published: Design docs and RFCs for architecture standards
- Teaching: Mentoring engineers on platform engineering, cloud infrastructure, SRE practices





