From 478bb8592923d64cf8f661913a75b8b79ddc0e5d Mon Sep 17 00:00:00 2001
From: Raju Gupta
Date: Sat, 26 Jul 2025 17:40:20 +0530
Subject: [PATCH 1/3] RFC00002 Kubernetes Deployment with Helm Charts for BharatMLStack

---
 rfc/00002/k8s-helm-rfc.md | 144 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)
 create mode 100644 rfc/00002/k8s-helm-rfc.md

diff --git a/rfc/00002/k8s-helm-rfc.md b/rfc/00002/k8s-helm-rfc.md
new file mode 100644
index 00000000..2dd4d54a
--- /dev/null
+++ b/rfc/00002/k8s-helm-rfc.md
@@ -0,0 +1,144 @@
+
+# RFC: Kubernetes Deployment with Helm Charts for BharatMLStack
+
+## Metadata
+- **Title**: Kubernetes Deployment Strategy using Helm Charts for BharatMLStack
+- **Author**: Raju Gupta
+- **Status**: Draft
+- **Created**: 2025-07-26
+- **Last Updated**: 2025-07-26
+- **Repository**: [Meesho/BharatMLStack](https://github.com/Meesho/BharatMLStack)
+
+## 🎯 Motivation
+
+BharatMLStack powers core ML infrastructure, including the **Online Feature Store**, **Horizon (control plane backend)**, and **Trufflebox UI**, to support high-throughput inference, training, and real-time feature retrieval.
+
+The current deployment methods are manual and component-specific, making it hard to:
+- Standardize deployment patterns across components.
+- Onboard new contributors or operators quickly.
+- Maintain consistent security and observability standards.
+
+A **Helm-based deployment approach** is needed to:
+- **Simplify deployment** for ML engineers and data scientists.
+- **Enable consistent configuration as code** across all environments.
+- **Support production-grade scaling** (HPA, PDB, Gateway routing).
+- **Adopt cloud-native best practices** from the start.
+
+## ✅ Goals
+
+- Provide **modular Helm charts** for core components:
+  **Online Feature Store**, **Horizon**, **Trufflebox UI**, and optional **SDK utilities**.
+- Support **Ingress** (default) and **Gateway API** (production-ready routing).
+- Embed **security & observability best practices** (RBAC, NetworkPolicy, ServiceMonitor).
+- Enable environment-specific overrides (`values-dev.yaml`, `values-prod.yaml`).
+- Provide a **contributor-friendly structure** (clear templates, tests, CI-ready).
+
+## 🚫 Non-Goals
+
+- Provisioning Kubernetes clusters or cloud infrastructure.
+- Managing third-party services (Redis, Scylla, Postgres) beyond optional values.
+- Providing GitOps or CI/CD pipelines (only chart testing and linting).
+- Combining components into a single “umbrella chart” (initial phase is modular).
+
+## 🧱 Proposed Directory Structure
+
+The Helm charts are **modularized per core component** for independent development, deployment, and scaling.
+Each component explicitly supports **Ingress** and **Gateway API** (Gateway + HTTPRoute).
+
+```
+bharatmlstack/helm/
+├── online-feature-store/   # Core real-time feature store
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── hpa.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
+│       ├── pdb.yaml
+│       └── tests/
+│           ├── latency-test.yaml
+│           └── api-test.yaml
+
+├── horizon/                # Control plane backend
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── cronjob.yaml
+│       ├── rbac.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
+│       └── tests/
+│           └── api-test.yaml
+
+├── trufflebox-ui/          # Management console
+│   ├── Chart.yaml
+│   ├── values.yaml
+│   ├── values-dev.yaml
+│   ├── values-prod.yaml
+│   └── templates/
+│       ├── deployment.yaml
+│       ├── service.yaml
+│       ├── ingress.yaml
+│       ├── gateway.yaml
+│       ├── httproute.yaml
+│       ├── configmap.yaml
+│       ├── secret.yaml
+│       ├── pdb.yaml
+│       └── tests/
+│           └── ui-availability.yaml
+
+└── sdk-common/ (optional shared utilities)
+    ├── Chart.yaml
+    ├── values.yaml
+    └── templates/
+        ├── job.yaml
+        └── cronjob.yaml
+```
+
+## 🌐 Ingress and Gateway API Support
+
+### Why Both?
+- **Ingress (NGINX/Traefik)** → Default, widely supported, ideal for dev/local.
+- **Gateway API** → The newer Kubernetes routing standard, ideal for production-grade routing, traffic splitting, and multi-tenant use cases.
+
+### Example Values
+```yaml
+ingress:
+  enabled: true
+  className: "nginx"
+  host: bharatmlstack.local
+
+gateway:
+  enabled: false
+  className: "istio"
+  host: bharatmlstack.prod.com
+  tls:
+    enabled: true
+    secretName: bharatmlstack-tls
+```
+
+## Contribution Workflow
+
+- Fork the repository and work under `bharatmlstack/helm//`.
+- Run `helm lint` and `helm template` before raising PRs.
+- Update/add Helm test hooks in `templates/tests/`.
+- Ensure changes work with KinD or Minikube (CI will validate).
+- Update `values-*.yaml` for new configs.
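
The environment-specific override files named under Goals could build on the Example Values above. A minimal sketch of a `values-prod.yaml`, assuming each chart reads the same `ingress` and `gateway` keys shown in the RFC; the file contents are illustrative and not part of this patch:

```yaml
# values-prod.yaml (hypothetical contents; keys mirror the RFC's Example Values)
ingress:
  enabled: false          # switch off the default Ingress in production

gateway:
  enabled: true           # route through Gateway + HTTPRoute instead
  className: "istio"
  host: bharatmlstack.prod.com
  tls:
    enabled: true
    secretName: bharatmlstack-tls
```

Passing `-f values-prod.yaml` to `helm lint` or `helm template` layers these keys over the chart's default `values.yaml`; any key the override file does not mention keeps its default.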
From 7d9b6abf4ec46bece976d3c91db90a1f68cf387f Mon Sep 17 00:00:00 2001
From: Raju Gupta
Date: Sun, 27 Jul 2025 09:18:06 +0530
Subject: [PATCH 2/3] Added Discussion channel and Target Release

---
 rfc/00002/k8s-helm-rfc.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/rfc/00002/k8s-helm-rfc.md b/rfc/00002/k8s-helm-rfc.md
index 2dd4d54a..073c3612 100644
--- a/rfc/00002/k8s-helm-rfc.md
+++ b/rfc/00002/k8s-helm-rfc.md
@@ -8,6 +8,8 @@
 - **Created**: 2025-07-26
 - **Last Updated**: 2025-07-26
 - **Repository**: [Meesho/BharatMLStack](https://github.com/Meesho/BharatMLStack)
+- **Target Release**: v1.0.0
+- **Discussion Channel**: #infra-dev
 
 ## 🎯 Motivation
 
From d185df6c8c5ff0bcb6dc7e35b59b7b767c5579fb Mon Sep 17 00:00:00 2001
From: Raju Gupta
Date: Sun, 27 Jul 2025 09:23:21 +0530
Subject: [PATCH 3/3] Updated proposed directory structure

---
 rfc/00002/k8s-helm-rfc.md | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/rfc/00002/k8s-helm-rfc.md b/rfc/00002/k8s-helm-rfc.md
index 073c3612..7579d8af 100644
--- a/rfc/00002/k8s-helm-rfc.md
+++ b/rfc/00002/k8s-helm-rfc.md
@@ -23,13 +23,13 @@ The current deployment methods are manual and component-specific, making it hard
 A **Helm-based deployment approach** is needed to:
 - **Simplify deployment** for ML engineers and data scientists.
 - **Enable consistent configuration as code** across all environments.
-- **Support production-grade scaling** (HPA, PDB, Gateway routing).
+- **Support production-grade scaling** (VPA, HPA, PDB, Gateway routing).
 - **Adopt cloud-native best practices** from the start.
 
 ## ✅ Goals
 
 - Provide **modular Helm charts** for core components:
-  **Online Feature Store**, **Horizon**, **Trufflebox UI**, and optional **SDK utilities**.
+  **Online Feature Store**, **Horizon** and **Trufflebox UI**.
 - Support **Ingress** (default) and **Gateway API** (production-ready routing).
 - Embed **security & observability best practices** (RBAC, NetworkPolicy, ServiceMonitor).
 - Enable environment-specific overrides (`values-dev.yaml`, `values-prod.yaml`).
@@ -63,6 +63,7 @@ bharatmlstack/helm/
 │       ├── configmap.yaml
 │       ├── secret.yaml
 │       ├── hpa.yaml
+│       ├── vpa.yaml
 │       ├── networkpolicy.yaml
 │       ├── servicemonitor.yaml
 │       ├── pdb.yaml
@@ -83,10 +84,11 @@ bharatmlstack/helm/
 │       ├── httproute.yaml
 │       ├── configmap.yaml
 │       ├── secret.yaml
-│       ├── cronjob.yaml
-│       ├── rbac.yaml
+│       ├── hpa.yaml
+│       ├── vpa.yaml
 │       ├── networkpolicy.yaml
 │       ├── servicemonitor.yaml
+│       ├── pdb.yaml
 │       └── tests/
 │           └── api-test.yaml
 
@@ -103,16 +105,13 @@ bharatmlstack/helm/
 │       ├── httproute.yaml
 │       ├── configmap.yaml
 │       ├── secret.yaml
+│       ├── hpa.yaml
+│       ├── vpa.yaml
+│       ├── networkpolicy.yaml
+│       ├── servicemonitor.yaml
 │       ├── pdb.yaml
 │       └── tests/
 │           └── ui-availability.yaml
-
-└── sdk-common/ (optional shared utilities)
-    ├── Chart.yaml
-    ├── values.yaml
-    └── templates/
-        ├── job.yaml
-        └── cronjob.yaml
 ```
 
 ## 🌐 Ingress and Gateway API Support