AlexandreManai/ML_pipeline


🚀 Machine Learning Pipeline

Welcome to the Machine Learning Pipeline repository! This project provides a complete MLOps training pipeline built on open-source technologies. It is intended as a foundation for machine learning workflows in both educational and production-level projects.

Overview 🎯

The purpose of this repository is twofold:

  1. To serve as a practical MLOps training tool.
  2. To offer a blueprint for building scalable and maintainable production ML pipelines.

Technologies Used 🛠️

  • DVC: For data version control.
  • MLflow: For experiment tracking and model registry.
  • Apache Airflow: For orchestrating the ML pipeline.
  • OmegaConf: For managing configuration.
  • Optuna: For hyperparameter optimization.
  • Docker: For containerization and isolation of the environment.
  • MinIO: For S3-compatible storage.
  • Flower: For monitoring Celery workers.

Getting Started 🏁

Follow these steps to set up and run the pipeline in your local environment:

1. Clone the repository:

git clone https://github.com/AlexandreManai/ML_pipeline.git

2. Install Docker:

Check Docker's official documentation and install it according to your operating system.

3. Set up environment variables:

echo -e "AIRFLOW_UID=$(id -u)" > .env
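
As a quick sanity check, the sketch below writes the same variable with printf, which behaves the same way across shells (echo -e does not), and then prints the file. It assumes the project follows the convention of the official Airflow Docker Compose setup, where AIRFLOW_UID determines which host user owns files created by the containers. That is an assumption, not something this README states.

```shell
# Write the current user's numeric ID into .env (same result as the echo command above).
printf 'AIRFLOW_UID=%s\n' "$(id -u)" > .env

# Show the file so you can confirm it holds a single AIRFLOW_UID=<number> line.
cat .env
```

Run this from the repository root so that .env sits next to the Compose file.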

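4. Install Docker Compose:

pip install docker-compose

This installs the legacy Compose v1 tool, which is invoked as docker-compose (with a hyphen). The docker compose commands used in the rest of this README are Compose v2 syntax, which recent Docker installations typically ship as a plugin. Use whichever form matches your installation.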
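
Compose can be present either as the standalone docker-compose tool or as the docker compose plugin, so it can help to check which one your machine has before launching. A minimal sketch follows; the COMPOSE variable name is only illustrative.

```shell
# Prefer the Compose v2 plugin; fall back to the standalone v1 binary.
if docker compose version >/dev/null 2>&1; then
    COMPOSE="docker compose"
elif command -v docker-compose >/dev/null 2>&1; then
    COMPOSE="docker-compose"
else
    COMPOSE=""
fi

# Report the result; an empty value means neither form was found.
echo "compose command: ${COMPOSE:-none found}"
```

If only the hyphenated form is reported, substitute docker-compose for docker compose in the commands that follow.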

5. Launch the pipeline:

docker compose up 
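
Running docker compose up in the foreground ties the stack to your terminal. If you would rather start it in the background and then check which services came up, something like the following may be convenient. start_pipeline is a hypothetical helper name, not part of this repository.

```shell
# Hypothetical helper: start the stack in the background, then list its services.
start_pipeline() {
    docker compose up --detach && docker compose ps
}
```

Call start_pipeline from the repository root. To watch the output afterwards, docker compose logs --follow streams the service logs.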

Access 🌐

Here are the links to access various components of the pipeline:

Cleanup 🧹

To stop and clean up resources, run the following commands:

docker compose stop  # Stops the project's running containers without removing them
docker compose down  # Stops and removes the project's containers and networks (add --volumes and --rmi to also remove volumes and images)
docker rm $(docker ps -aq)  # Removes all stopped containers on this host, not only this project's
docker rmi $(docker images -q)  # Removes all images on this host that no container is using
docker volume prune  # Removes unused volumes (asks for confirmation first)
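
The docker rm, docker rmi, and docker volume prune commands above act on every container, image, and unused volume on the machine, not only those belonging to this project. If you would rather limit cleanup to this project, a sketch along these lines may be safer. cleanup_project is a hypothetical helper name, not part of this repository.

```shell
# Hypothetical helper: remove only what this project's Compose file created.
cleanup_project() {
    # --volumes removes named volumes declared in the Compose file,
    # --rmi local removes images that have no custom tag,
    # --remove-orphans removes containers for services no longer defined.
    docker compose down --volumes --rmi local --remove-orphans
}
```

Call cleanup_project from the repository root. Removing volumes also deletes any data the services stored in them, so drop --volumes if you want to keep that data.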

Requirements 📋

Check out the Airflow environment requirements for necessary dependencies.

Disclaimer 📜

This project has been tested on macOS and Linux with:

  • Python 3.10.6
  • Docker version 20.10.10
  • Docker-compose version 1.29.2

This README serves as a living document and may be updated as the project evolves. 🔄

