145 changes: 145 additions & 0 deletions rfc/00002/k8s-helm-rfc.md
@@ -0,0 +1,145 @@

# RFC: Kubernetes Deployment with Helm Charts for BharatMLStack

## Metadata
- **Title**: Kubernetes Deployment Strategy using Helm Charts for BharatMLStack
- **Author**: Raju Gupta
- **Status**: Draft
- **Created**: 2025-07-26
- **Last Updated**: 2025-07-26
- **Repository**: [Meesho/BharatMLStack](https://github.com/Meesho/BharatMLStack)
- **Target Release**: v1.0.0
- **Discussion Channel**: #infra-dev

## 🎯 Motivation

BharatMLStack powers core ML infrastructure β€” including **Online Feature Store**, **Horizon (control plane backend)**, and **Trufflebox UI** β€” to support high-throughput inference, training, and real-time feature retrieval.

The current deployment methods are manual and component-specific, making it hard to:
- Standardize deployment patterns across components.
- Onboard new contributors or operators quickly.
- Maintain consistent security and observability standards.

A **Helm-based deployment approach** is needed to:
- **Simplify deployment** for ML engineers and data scientists.
- **Enable consistent configuration as code** across all environments.
- **Support production-grade scaling** (VPA, HPA, PDB, Gateway routing).
- **Adopt cloud-native best practices** from the start.

## ✅ Goals

- Provide **modular Helm charts** for the core components:
  **Online Feature Store**, **Horizon**, and **Trufflebox UI**.
- Support **Ingress** (default) and **Gateway API** (production-ready routing).
- Embed **security & observability best practices** (RBAC, NetworkPolicy, ServiceMonitor).
- Enable environment-specific overrides (`values-dev.yaml`, `values-prod.yaml`).
- Provide a **contributor-friendly structure** (clear templates, tests, CI-ready).
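
As a rough illustration of the environment-specific overrides goal, a `values-prod.yaml` might contain only the keys that differ from the chart's default `values.yaml`. The `ingress` and `gateway` keys follow this RFC's example values; `replicaCount`, `autoscaling`, and `podDisruptionBudget` are hypothetical names chosen for this sketch, not a settled schema.

```yaml
# values-prod.yaml: production overrides layered on top of values.yaml.
# Only keys that differ from the defaults need to appear here.
replicaCount: 3                # hypothetical key

ingress:
  enabled: false               # production routes through Gateway API instead

gateway:
  enabled: true
  className: "istio"
  host: bharatmlstack.prod.com
  tls:
    enabled: true
    secretName: bharatmlstack-tls

autoscaling:                   # hypothetical block consumed by hpa.yaml
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

podDisruptionBudget:           # hypothetical block consumed by pdb.yaml
  enabled: true
  minAvailable: 2
```

Helm always loads a chart's own `values.yaml` first, and files passed with `-f` override it, with the rightmost file taking precedence. An install could therefore look like `helm upgrade --install <release> ./online-feature-store -f online-feature-store/values-prod.yaml`, where the release name is a placeholder.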

## 🚫 Non-Goals

- Provisioning Kubernetes clusters or cloud infrastructure.
- Managing third-party services (Redis, Scylla, Postgres) beyond optional values.
- Providing GitOps or CI/CD pipelines (only chart testing and linting).
- Combining components into a single "umbrella chart" (the initial phase is modular).

## 🧱 Proposed Directory Structure

The Helm charts are **modularized per core component** for independent development, deployment, and scaling.
Each component explicitly supports **Ingress** and **Gateway API** (Gateway + HTTPRoute).

```
bharatmlstack/helm/
├── online-feature-store/          # Core real-time feature store
│   ├── Chart.yaml
│   ├── values.yaml
│   ├── values-dev.yaml
│   ├── values-prod.yaml
│   └── templates/
│       ├── deployment.yaml
│       ├── service.yaml
│       ├── ingress.yaml
│       ├── gateway.yaml
│       ├── httproute.yaml
│       ├── configmap.yaml
│       ├── secret.yaml
│       ├── hpa.yaml
│       ├── vpa.yaml
│       ├── networkpolicy.yaml
│       ├── servicemonitor.yaml
│       ├── pdb.yaml
│       └── tests/
│           ├── latency-test.yaml
│           └── api-test.yaml
│
├── horizon/                       # Control plane backend
│   ├── Chart.yaml
│   ├── values.yaml
│   ├── values-dev.yaml
│   ├── values-prod.yaml
│   └── templates/
│       ├── deployment.yaml
│       ├── service.yaml
│       ├── ingress.yaml
│       ├── gateway.yaml
│       ├── httproute.yaml
│       ├── configmap.yaml
│       ├── secret.yaml
│       ├── hpa.yaml
│       ├── vpa.yaml
│       ├── networkpolicy.yaml
│       ├── servicemonitor.yaml
│       ├── pdb.yaml
│       └── tests/
│           └── api-test.yaml
│
└── trufflebox-ui/                 # Management console
    ├── Chart.yaml
    ├── values.yaml
    ├── values-dev.yaml
    ├── values-prod.yaml
    └── templates/
        ├── deployment.yaml
        ├── service.yaml
        ├── ingress.yaml
        ├── gateway.yaml
        ├── httproute.yaml
        ├── configmap.yaml
        ├── secret.yaml
        ├── hpa.yaml
        ├── vpa.yaml
        ├── networkpolicy.yaml
        ├── servicemonitor.yaml
        ├── pdb.yaml
        └── tests/
            └── ui-availability.yaml
```
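
To make the per-component layout concrete, here is a minimal sketch of what one chart's `Chart.yaml` could look like. The field names are standard Helm 3 chart metadata; the description text and both version numbers are placeholders assumed for illustration.

```yaml
# online-feature-store/Chart.yaml (illustrative; versions are placeholders)
apiVersion: v2                 # chart API version used by Helm 3
name: online-feature-store     # matches the chart directory name
description: Helm chart for the BharatMLStack Online Feature Store
type: application              # deployable chart, as opposed to a library chart
version: 0.1.0                 # chart version (SemVer), bumped on chart changes
appVersion: "1.0.0"            # version of the application the chart deploys
```

Because each component carries its own `Chart.yaml`, chart versions can move independently, which fits the modular (non-umbrella) approach described in the non-goals.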

## 🌐 Ingress and Gateway API Support

### Why Both?
- **Ingress (NGINX/Traefik)** → The default: widely supported and ideal for dev/local setups.
- **Gateway API** → The newer Kubernetes routing standard: suited to production-grade routing, traffic splitting, and multi-tenant use cases.

### Example Values
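```yaml
# Classic Ingress: on by default, suited to dev/local clusters.
ingress:
  enabled: true
  className: "nginx"
  host: bharatmlstack.local

# Gateway API (Gateway + HTTPRoute): off by default, enabled for production.
gateway:
  enabled: false
  className: "istio"
  host: bharatmlstack.prod.com
  tls:
    enabled: true
    secretName: bharatmlstack-tls
```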
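
One way the `gateway` values above could be consumed is sketched below as a possible `templates/httproute.yaml`. This is an assumption-laden illustration, not the chart's actual template: the Gateway name, the Service name, and the `.Values.service.port` key are hypothetical.

```yaml
{{- if .Values.gateway.enabled }}
# Rendered only when gateway.enabled is true; otherwise the file is empty.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: {{ .Release.Name }}-route
spec:
  parentRefs:
    # Assumed name of the Gateway rendered by gateway.yaml.
    - name: {{ .Release.Name }}-gateway
  hostnames:
    - {{ .Values.gateway.host | quote }}
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        # Hypothetical Service name and values key for the backend port.
        - name: {{ .Release.Name }}-svc
          port: {{ .Values.service.port }}
{{- end }}
```

Guarding `ingress.yaml` with the equivalent `ingress.enabled` check would let an operator switch routing modes purely through values. Note that TLS is configured on the Gateway's listeners rather than on the HTTPRoute, so `gateway.tls` would be read by `gateway.yaml`.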

## 📝 Contribution Workflow

- Fork the repository and work under `bharatmlstack/helm/<component>/`.
- Run `helm lint` and `helm template` before raising PRs.
- Update/add Helm test hooks in `templates/tests/`.
- Ensure changes work with KinD or Minikube (CI will validate).
- Update `values-*.yaml` for new configs.
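
For contributors adding test hooks, a file under `templates/tests/` might look like the sketch below. The `helm.sh/hook: test` annotation is Helm's standard mechanism for test Pods; the Service name, the `.Values.service.port` key, and the `/health` path are hypothetical and would need to match the real component.

```yaml
# templates/tests/api-test.yaml: a Pod that Helm runs during "helm test".
apiVersion: v1
kind: Pod
metadata:
  name: {{ .Release.Name }}-api-test
  annotations:
    "helm.sh/hook": test
    # Remove old test Pods before a rerun and after a successful run.
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  restartPolicy: Never
  containers:
    - name: api-test
      image: busybox:1.36
      # A non-zero exit code from wget marks the test as failed.
      # Service name, port key, and /health path are assumptions.
      command:
        - wget
        - -q
        - -O
        - "-"
        - "http://{{ .Release.Name }}-svc:{{ .Values.service.port }}/health"
```

After installing a release into KinD or Minikube, `helm test <release>` runs every Pod carrying the test hook annotation and reports success or failure based on each container's exit code.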