BharatMLStack is an open-source, end-to-end machine learning infrastructure stack built at Meesho to support real-time and batch ML workloads at Bharat scale

BharatMLStack


What is BharatMLStack?

BharatMLStack is a production-ready, cloud-agnostic ML infrastructure platform that powers real-time feature serving, model inference, and embedding search at massive scale. Built and battle-tested at Meesho, it is designed to help organizations ship ML to production faster, cheaper, and more reliably.

Our Vision

BharatMLStack is built around four core tenets:

Workflow Integration & Productivity

Ship ML to production faster than ever.

  • 3x faster experiment-to-deployment cycles
  • 95% reduction in model onboarding time

Cloud-Agnostic & Lock-In Free

Run anywhere. Own your stack.

  • Runs across public cloud, on-prem, and edge
  • Kubernetes-native with zero vendor lock-in

Economic Efficiency

Do more with less.

  • 60–70% lower infrastructure costs vs hyperscaler managed services
  • Optimized resource utilization across CPU and GPU workloads

Availability & Scalability

Enterprise-grade reliability at internet scale.

  • 99.99% uptime across clusters
  • 1M+ QPS with low latency
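
To make the 99.99% uptime figure concrete, the following minimal sketch converts an availability target into a downtime budget. The function name `downtime_budget_minutes` is illustrative only and is not part of BharatMLStack; the arithmetic is the standard (1 - availability) × period.

```python
def downtime_budget_minutes(availability: float, period_hours: float) -> float:
    """Return the minutes of downtime allowed by an availability target.

    availability is a fraction (0.9999 for "four nines"), not a percentage.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_hours * 60.0


if __name__ == "__main__":
    # Hypothetical reporting periods chosen for illustration.
    for label, hours in [("30-day month", 30 * 24), ("365-day year", 365 * 24)]:
        minutes = downtime_budget_minutes(0.9999, hours)
        print(f"{label}: {minutes:.2f} minutes of allowed downtime")
```

At four nines, the budget is roughly 4.3 minutes per 30-day month, or about 52.6 minutes per year.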

Designed Truly for Bharat Scale

Built for the demands of one of the world's largest e-commerce platforms:

| Metric | Performance |
| --- | --- |
| Feature Store | 2.4M QPS (batch of 100 id lookups) |
| Model Inference | 1M+ QPS |
| Embedding Search | 500K QPS |
| Feature Retrieval Latency | Sub-10ms |
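
As a rough reading of these numbers, the sketch below works through two back-of-the-envelope calculations. It assumes "batch of 100 id lookups" means each request fetches 100 ids, and it treats 10 ms as an upper bound on latency; both are interpretations of the table, not published figures. The in-flight estimate uses Little's law (concurrency = throughput × latency). The function names are illustrative.

```python
def id_lookups_per_second(requests_per_second: float, ids_per_request: int) -> float:
    """Total id lookups per second when every request carries a batch of ids."""
    return requests_per_second * ids_per_request


def in_flight_requests(requests_per_second: float, latency_seconds: float) -> float:
    """Little's law: average number of requests in flight at steady state."""
    return requests_per_second * latency_seconds


if __name__ == "__main__":
    feature_store_qps = 2.4e6   # from the table above
    batch_size = 100            # assumed: ids per request
    latency_bound = 0.010       # assumed: 10 ms upper bound on latency

    print(f"id lookups/s: {id_lookups_per_second(feature_store_qps, batch_size):,.0f}")
    print(f"in-flight requests (upper bound): "
          f"{in_flight_requests(feature_store_qps, latency_bound):,.0f}")
```

Under these assumptions, the Feature Store figure corresponds to about 240 million id lookups per second, with at most roughly 24,000 requests in flight at any moment.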

Core Components

| Component | Description | Version | Docs |
| --- | --- | --- | --- |
| TruffleBox UI | Web console for feature registry, cataloging, and approval workflows | v1.3.0 | Docs |
| Online Feature Store | Sub-10ms feature retrieval at millions of QPS with streaming ingestion | v1.2.0 | Docs |
| Inferflow | DAG-based real-time inference orchestration for composable ML pipelines | v1.0.0 | Docs |
| Numerix | Rust-powered math compute engine for high-performance matrix ops | v1.0.0 | Docs |
| Skye | Vector similarity search with pluggable backends | v1.0.0 | Docs |
| Go SDK | Go client for Feature Store, Interaction Store, and logging | v1.3.0 | Docs |
| Python SDK | Python client libraries for Feature Store and inference logging | v1.0.1 | Docs |
| Interaction Store | ScyllaDB-backed store for user interaction signals at sub-10ms | | |
| Horizon | Control plane that orchestrates all services and powers TruffleBox UI | v1.3.0 | |
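
To illustrate what "DAG-based orchestration" means in general, here is a minimal sketch that runs pipeline steps in dependency order using Python's standard `graphlib` module. It is not Inferflow's API or implementation: `run_dag`, the step names, and the toy scoring logic are all hypothetical and exist only to show the idea of composing steps as a directed acyclic graph.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List

# A step receives the results produced so far and returns its own result.
Step = Callable[[Dict[str, Any]], Any]


def run_dag(steps: Dict[str, Step],
            deps: Dict[str, List[str]],
            context: Dict[str, Any]) -> Dict[str, Any]:
    """Run every step after the steps it depends on (topological order)."""
    results = dict(context)
    graph = {name: deps.get(name, []) for name in steps}
    for name in TopologicalSorter(graph).static_order():
        results[name] = steps[name](results)
    return results


# Hypothetical three-step pipeline: fetch features, score, decide.
steps: Dict[str, Step] = {
    "features": lambda r: {"clicks": 4, "cart_adds": 3},
    "score": lambda r: r["features"]["clicks"] * 2 + r["features"]["cart_adds"],
    "decision": lambda r: r["score"] > 10,
}
deps = {"score": ["features"], "decision": ["score"]}

if __name__ == "__main__":
    out = run_dag(steps, deps, {"user_id": "u1"})
    print(out["score"], out["decision"])
```

A real orchestrator would also run independent branches concurrently and handle timeouts and failures; this sketch executes steps sequentially for clarity.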

Full documentation at meesho.github.io/BharatMLStack | Blogs

Quick Start

git clone https://github.com/Meesho/BharatMLStack.git
cd BharatMLStack/quick-start
# Set versions (exported so that child processes such as ./start.sh can see them)
export ONFS_VERSION=v1.2.0 HORIZON_VERSION=v1.3.0 TRUFFLEBOX_VERSION=v1.3.0 NUMERIX_VERSION=v1.0.0

./start.sh

For step-by-step setup, Docker Compose details, sample data, and health checks, see the full Quick Start Guide.
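
If you script the Quick Start, a small pre-flight check can catch a mistyped version pin before anything starts. The sketch below is an illustration only: it assumes that `start.sh` reads the four version variables from the environment and that tags follow the `vMAJOR.MINOR.PATCH` pattern seen above. `build_env` and `launch` are hypothetical helper names, not part of the repository.

```python
import os
import re
import subprocess
from typing import Dict, Mapping, Optional

# Version pins copied from the Quick Start above.
PINS: Dict[str, str] = {
    "ONFS_VERSION": "v1.2.0",
    "HORIZON_VERSION": "v1.3.0",
    "TRUFFLEBOX_VERSION": "v1.3.0",
    "NUMERIX_VERSION": "v1.0.0",
}

# Assumed tag format: "v" followed by three dot-separated numbers.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def build_env(pins: Mapping[str, str],
              base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Validate the pins and merge them into a copy of the base environment."""
    bad = {name: tag for name, tag in pins.items() if not TAG_PATTERN.fullmatch(tag)}
    if bad:
        raise ValueError(f"malformed version tags: {bad}")
    env = dict(os.environ if base is None else base)
    env.update(pins)
    return env


def launch(script: str = "./start.sh") -> None:
    """Run the start script with the validated pins (call from quick-start/)."""
    subprocess.run([script], env=build_env(PINS), check=True)
```

Calling `launch()` from the `quick-start` directory would then run the script with the pinned versions; `build_env` can also be used on its own to inspect the resulting environment.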

Architecture

[Figure: BharatMLStack architecture diagram]

Use-Cases

BharatMLStack powers a wide range of ML-driven applications:

| Use-Case | What BharatMLStack Enables |
| --- | --- |
| Personalized Candidate Generation | Retrieve and rank millions of candidates in real time using feature vectors and embedding similarity |
| Personalized Ranking | Serve user, item, and context features at ultra-low latency to power real-time ranking models |
| Fraud & Risk Detection | Stream interaction signals and features to detect anomalies and fraudulent patterns in milliseconds |
| Image Search | Run embedding search at 500K QPS to match visual queries against massive product catalogs |
| LLM Recommender Systems | Orchestrate LLM inference pipelines with feature enrichment for next-gen recommendation engines |
| DL & LLM Deployments at Scale | Deploy and scale deep learning and large language models across GPU clusters with Inferflow orchestration |
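
As a toy illustration of candidate generation by embedding similarity, the sketch below ranks a tiny catalog against a query vector using brute-force cosine similarity. It is not Skye's API and does not reflect how a production vector index works at scale; the catalog, item names, and the `cosine` and `top_k` helpers are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          catalog: Dict[str, Sequence[float]],
          k: int) -> List[Tuple[str, float]]:
    """Return the k catalog items most similar to the query, best first."""
    scored = [(item_id, cosine(query, vec)) for item_id, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical two-dimensional embeddings for three catalog items.
catalog = {
    "saree": [1.0, 0.0],
    "kurta": [0.6, 0.8],
    "shoes": [0.0, 1.0],
}

if __name__ == "__main__":
    for item_id, score in top_k([1.0, 0.0], catalog, k=2):
        print(f"{item_id}: {score:.2f}")
```

Brute force scans every item per query, which is fine for a demonstration but is exactly the cost that approximate nearest-neighbor indexes are designed to avoid.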

Contributing

We welcome contributions from the community! Please see our Contributing Guide for details on how to get started.

Community & Support

License

BharatMLStack is open-source software licensed under the BharatMLStack Business Source License 1.1.


Built with ❤️ for the ML community from Meesho
If you find this useful, ⭐️ the repo — your support means the world to us!
