diff --git a/rfc/00002/k8s-helm-rfc.md b/rfc/00002/k8s-helm-rfc.md
new file mode 100644
index 00000000..7579d8af
--- /dev/null
+++ b/rfc/00002/k8s-helm-rfc.md
@@ -0,0 +1,145 @@
+
+# RFC: Kubernetes Deployment with Helm Charts for BharatMLStack
+
+## Metadata
+- **Title**: Kubernetes Deployment Strategy using Helm Charts for BharatMLStack
+- **Author**: Raju Gupta
+- **Status**: Draft
+- **Created**: 2025-07-26
+- **Last Updated**: 2025-07-26
+- **Repository**: [Meesho/BharatMLStack](https://github.com/Meesho/BharatMLStack)
+- **Target Release**: v1.0.0
+- **Discussion Channel**: #infra-dev
+
+## 🎯 Motivation
+
+BharatMLStack powers core ML infrastructure, including **Online Feature Store**, **Horizon (control plane backend)**, and **Trufflebox UI**, to support high-throughput inference, training, and real-time feature retrieval.
+
+The current deployment methods are manual and component-specific, making it hard to:
+- Standardize deployment patterns across components.
+- Onboard new contributors or operators quickly.
+- Maintain consistent security and observability standards.
+
+A **Helm-based deployment approach** is needed to:
+- **Simplify deployment** for ML engineers and data scientists.
+- **Enable consistent configuration as code** across all environments.
+- **Support production-grade scaling** (VPA, HPA, PDB, Gateway routing).
+- **Adopt cloud-native best practices** from the start.
+
+## ✅ Goals
+
+- Provide **modular Helm charts** for core components:
+  **Online Feature Store**, **Horizon**, and **Trufflebox UI**.
+- Support **Ingress** (default) and **Gateway API** (production-ready routing).
+- Embed **security & observability best practices** (RBAC, NetworkPolicy, ServiceMonitor).
+- Enable environment-specific overrides (`values-dev.yaml`, `values-prod.yaml`).
+- Provide a **contributor-friendly structure** (clear templates, tests, CI-ready).
+
+## 🚫 Non-Goals
+
+- Provisioning Kubernetes clusters or cloud infrastructure.
+- Managing third-party services (Redis, Scylla, Postgres) beyond optional values.
+- Providing GitOps or CI/CD pipelines (only chart testing and linting).
+- Combining components into a single “umbrella chart” (initial phase is modular).
+
+## 🧱 Proposed Directory Structure
+
+The Helm charts are **modularized per core component** for independent development, deployment, and scaling.
+Each component explicitly supports **Ingress** and **Gateway API** (Gateway + HTTPRoute).
+
+```
+bharatmlstack/helm/
+├── online-feature-store/         # Core real-time feature store
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── hpa.yaml
+│       ├── vpa.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
+│       ├── pdb.yaml
+│       └── tests/
+│           ├── latency-test.yaml
+│           └── api-test.yaml
+
+├── horizon/                      # Control plane backend
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── hpa.yaml
+│       ├── vpa.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
+│       ├── pdb.yaml
+│       └── tests/
+│           └── api-test.yaml
+
+├── trufflebox-ui/                # Management console
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── hpa.yaml
+│       ├── vpa.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
+│       ├── pdb.yaml
+│       └── tests/
+│           └── ui-availability.yaml
+```
+
+## 🌐 Ingress and Gateway API Support
+
+### Why Both?
+- **Ingress (NGINX/Traefik)** → Default, widely supported, ideal for dev/local.
+- **Gateway API** → Kubernetes future standard, ideal for production-grade routing, traffic-splitting, and multi-tenant use cases.
+
+### Example Values
+```yaml
+ingress:
+  enabled: true
+  className: "nginx"
+  host: bharatmlstack.local
+
+gateway:
+  enabled: false
+  className: "istio"
+  host: bharatmlstack.prod.com
+  tls:
+    enabled: true
+    secretName: bharatmlstack-tls
+```
+
+## 📝 Contribution Workflow
+
+- Fork the repository and work under `bharatmlstack/helm//`.
+- Run `helm lint` and `helm template` before raising PRs.
+- Update/add Helm test hooks in `templates/tests/`.
+- Ensure changes work with KinD or Minikube (CI will validate).
+- Update `values-*.yaml` for new configs.
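+
+The test-hook step in the workflow above can be sketched as a minimal Helm test Pod, for example in `templates/tests/api-test.yaml`. This is an illustration under stated assumptions, not the chart's actual test: the Pod name, the `busybox` image, the `/health` path, the `service.port` value, and a Service named after the release are all hypothetical.
+
+```yaml
+# Hypothetical templates/tests/api-test.yaml: a Helm test hook Pod.
+# Assumptions (not defined in this RFC): the Service is named after the
+# release, listens on .Values.service.port, and serves GET /health.
+apiVersion: v1
+kind: Pod
+metadata:
+  name: "{{ .Release.Name }}-api-test"
+  annotations:
+    # Marks the Pod as a test hook; it is created only by `helm test`.
+    "helm.sh/hook": test
+    # Remove the Pod left by a previous test run before creating a new one.
+    "helm.sh/hook-delete-policy": before-hook-creation
+spec:
+  # Do not retry: the container's exit code decides pass or fail.
+  restartPolicy: Never
+  containers:
+    - name: api-test
+      image: busybox:1.36
+      command:
+        - wget
+        - "-q"
+        - "-O"
+        - "-"
+        - "http://{{ .Release.Name }}:{{ .Values.service.port }}/health"
+```
+
+Once a release is installed, `helm test <release-name>` runs the hook and reports a failure if the container exits with a non-zero status.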