Welcome to the Vehicle Insurance MLOps Project, a comprehensive end-to-end machine learning pipeline designed to automate, scale, and deploy predictive models for vehicle insurance classification. This project showcases a practical implementation of MLOps principles, including modular coding, cloud integration, CI/CD automation, and API deployment.
- Run template.py to scaffold the project structure with the necessary folders and files.
- Define the local package structure using setup.py and pyproject.toml.
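As a rough illustration of the scaffolding step, a script in the spirit of template.py could create the listed files and their parent folders. The file list and the scaffold helper below are assumptions made for this sketch, not the project's actual script.

```python
from pathlib import Path

# Hypothetical subset of the project layout; extend it to match your needs.
PROJECT_FILES = [
    "src/__init__.py",
    "src/components/__init__.py",
    "src/components/data_ingestion.py",
    "src/entity/config_entity.py",
    "config/schema.yaml",
    "requirements.txt",
    "setup.py",
]


def scaffold(root, files=PROJECT_FILES):
    """Create each listed file (and its parent folders) under root.

    Existing files are left untouched. Returns the paths that were created.
    """
    created = []
    for rel in files:
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Because existing files are skipped, the script can be re-run safely as the layout grows.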
conda create -n vehicle python=3.10 -y
conda activate vehicle
pip install -r requirements.txt
- Verify packages with:
pip list
- Create a MongoDB Atlas account and an M0 cluster.
- Set up a DB user and whitelist IP as 0.0.0.0/0.
- Get the connection string for the Python driver and update the password.
- Create notebook/mongoDB_demo.ipynb, add your dataset, and write code to upload it.
- Verify your uploaded data in Atlas > Browse Collections.
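The upload code in the notebook might look roughly like the sketch below. It assumes the dataset is a CSV file and that the connection string is available in the MONGODB_URL environment variable; the database and collection names are placeholders, and the helper names are hypothetical.

```python
import csv
import os


def load_records(csv_path):
    """Read a CSV file into a list of dicts, one per row (values stay strings)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def upload_records(collection, records):
    """Insert records into a pymongo-style collection and return the count."""
    if not records:
        return 0
    result = collection.insert_many(records)
    return len(result.inserted_ids)


def get_collection(db_name="vehicle_db", collection_name="vehicle_data"):
    """Connect using MONGODB_URL; the default names are placeholders."""
    from pymongo import MongoClient  # imported lazily; requires pymongo

    client = MongoClient(os.environ["MONGODB_URL"])
    return client[db_name][collection_name]
```

A typical call would be upload_records(get_collection(), load_records("data.csv")), where data.csv stands in for your dataset file.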
- Create reusable logging and exception utilities.
- Test them in demo.py.
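One possible shape for these utilities is sketched below using only the standard library. The names get_logger and PipelineException, and the log format, are assumptions for illustration rather than the project's actual modules.

```python
import logging
import sys


def get_logger(name="vehicle", level=logging.INFO):
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


class PipelineException(Exception):
    """Exception that records the file and line where the error originated."""

    def __init__(self, error):
        _, _, tb = sys.exc_info()
        location = ""
        if tb is not None:
            # Walk to the innermost frame, where the original error was raised.
            while tb.tb_next is not None:
                tb = tb.tb_next
            location = f" [{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}]"
        super().__init__(f"{error}{location}")
```

In demo.py, wrapping a deliberate failure such as 1 / 0 in try/except and re-raising it as PipelineException(exc) is a quick way to check that the location is captured.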
- Analyze patterns and prepare features in your EDA notebook under notebook/.
- Define:
  - MongoDB logic in configuration/mongo_db_connections.py
  - Raw data logic in data_access/proj1_data.py
  - Ingestion logic in components/data_ingestion.py
- Define config and artifact classes:
  - entity/config_entity.py
  - entity/artifact_entity.py
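Config and artifact classes are often plain dataclasses: the config holds a step's settings, and the artifact holds the paths that the step hands to the next one. The sketch below assumes that pattern; every field name and default value is a placeholder.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DataIngestionConfig:
    """Settings consumed by the ingestion step (all values are placeholders)."""

    artifact_dir: str = "artifact"
    collection_name: str = "vehicle_data"
    train_test_split_ratio: float = 0.25
    timestamp: str = field(
        default_factory=lambda: datetime.now().strftime("%m_%d_%Y_%H_%M_%S")
    )

    @property
    def ingestion_dir(self):
        """Folder where this run's ingestion outputs would be written."""
        return os.path.join(self.artifact_dir, self.timestamp, "data_ingestion")


@dataclass(frozen=True)
class DataIngestionArtifact:
    """Paths produced by the ingestion step and passed to the next step."""

    train_file_path: str
    test_file_path: str
```

Freezing the artifact keeps later steps from accidentally mutating what an earlier step reported.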
# For Bash
export MONGODB_URL="mongodb+srv://<username>:<password>...."
# For PowerShell
$env:MONGODB_URL = "mongodb+srv://<username>:<password>...."
- Add schema in config/schema.yaml
- Add validation logic in utils/main_utils.py
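A minimal sketch of a column check against the schema is shown below. It assumes the schema has already been loaded into a dict (for example with yaml.safe_load) and that it lists columns under numerical_columns and categorical_columns; those key names are assumptions, not the project's confirmed schema layout.

```python
def validate_columns(column_names, schema):
    """Compare column names against a schema and report what is missing.

    The "numerical_columns" and "categorical_columns" keys are assumed here;
    adjust them to match the actual contents of config/schema.yaml.
    """
    expected = set(schema.get("numerical_columns", [])) | set(
        schema.get("categorical_columns", [])
    )
    missing = sorted(expected - set(column_names))
    return {"is_valid": not missing, "missing_columns": missing}
```

Returning the missing columns, rather than only a boolean, makes a failed validation easier to diagnose from the logs.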
- Build transformation code in components/data_transformation.py
- Add estimator logic in entity/estimator.py
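The estimator logic can be thought of as a thin wrapper that keeps the fitted preprocessing object and the fitted model together, so prediction always applies the same transformation used in training. The class name VehicleEstimator is hypothetical; this is a sketch of the idea, not the project's code.

```python
class VehicleEstimator:
    """Bundle a fitted preprocessing object with a fitted model."""

    def __init__(self, preprocessor, model):
        self.preprocessor = preprocessor  # e.g. a fitted scikit-learn transformer
        self.model = model  # e.g. a fitted scikit-learn classifier

    def predict(self, raw_features):
        """Transform raw features, then predict with the trained model."""
        transformed = self.preprocessor.transform(raw_features)
        return self.model.predict(transformed)
```

Any objects exposing transform and predict methods will work, which also makes the wrapper easy to test with stand-ins.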
- Use transformed data to train models in components/model_trainer.py
- Create IAM user with AdministratorAccess
- Export AWS keys as environment variables:
export AWS_ACCESS_KEY_ID="your_key"
export AWS_SECRET_ACCESS_KEY="your_secret"
- Create S3 bucket: my-model-mlopsproj
- Use:
  - cloud_storage/aws_storage.py
  - entity/s3_estimator.py
  - constants/__init__.py for bucket info
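The storage helpers could be as small as the sketch below. It assumes a boto3 S3 client is passed in (for example boto3.client("s3"), which picks up the AWS keys exported above); the function names and the model.pkl key used in the usage note are hypothetical.

```python
def upload_model(s3_client, local_path, bucket, key):
    """Upload a local model file to S3 and return its s3:// URI."""
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


def model_exists(s3_client, bucket, key):
    """Return True if an object with exactly this key exists in the bucket."""
    # Keys come back in ascending order, so an exact match for the prefix
    # (if present) is the first result.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj["Key"] == key for obj in response.get("Contents", []))
```

With the bucket from this guide, a call might look like upload_model(client, "model.pkl", "my-model-mlopsproj", "model.pkl").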
- Use components/model_evaluation.py and model_pusher.py
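A common way to connect evaluation and pushing is to push the new model only when it beats the one currently in S3 by some margin. The sketch below assumes that rule; the function name and the 0.02 default threshold are illustrative choices, not values taken from the project.

```python
def should_push_model(new_score, production_score=None, min_improvement=0.02):
    """Decide whether a newly trained model should replace the current one.

    If no production model exists yet, the new model is always accepted.
    Otherwise it must improve the score by at least min_improvement.
    """
    if production_score is None:
        return True
    return (new_score - production_score) >= min_improvement
```

The score can be any metric where higher is better, such as F1 on the held-out test set.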
- Set up app.py with routes:
  - /training: Train pipeline
  - /predict: Make predictions
- Create static/ and template/ directories for frontend if needed
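The two routes could be wired up along the lines below. This is a sketch under assumptions: the handler bodies are stubs, the HTTP methods (GET for /training, POST for /predict) are guesses, and FastAPI is imported lazily inside create_app so the handlers stay testable without it.

```python
def train_route():
    """Handler for /training; a real version would run the training pipeline."""
    return {"status": "training started"}


def predict_route(features: dict):
    """Handler for /predict; a real version would call the trained model."""
    return {"received_features": sorted(features), "prediction": None}


def create_app():
    """Build the FastAPI app and attach the two routes."""
    from fastapi import FastAPI  # imported lazily; requires fastapi

    app = FastAPI()
    app.get("/training")(train_route)
    app.post("/predict")(predict_route)
    return app
```

To serve it on the port opened later in this guide, something like uvicorn.run(create_app(), host="0.0.0.0", port=5080) should work, assuming uvicorn is installed.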
- Create:
  - Dockerfile
  - .dockerignore
- Set up .github/workflows/aws.yaml
- Add GitHub Secrets:
  - AWS_ACCESS_KEY_ID
  - AWS_SECRET_ACCESS_KEY
  - ECR_REPO
  - AWS_DEFAULT_REGION
- Launch EC2 (Ubuntu 24.04, T2 Medium)
- Install Docker:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
- Register EC2 as a GitHub Runner
- Go to EC2 > Security > Edit Inbound Rules
- Add: Custom TCP | Port 5080 | Source: 0.0.0.0/0
- Visit: http://<your-ec2-public-ip>:5080
.
├── app.py
├── demo.py
├── Dockerfile
├── .dockerignore
├── requirements.txt
├── setup.py
├── pyproject.toml
├── notebook/
│   └── mongoDB_demo.ipynb
├── config/
│   ├── schema.yaml
│   └── model.yaml
└── src/
    ├── components/
    ├── configuration/
    ├── cloud_storage/
    ├── data_access/
    ├── constants/
    ├── entity/
    ├── exception/
    ├── logger/
    ├── pipeline/
    └── utils/
- Python 3.10
- MongoDB Atlas
- FastAPI
- Scikit-learn, Pandas, NumPy
- Docker, GitHub Actions, EC2, S3, ECR
Data Ingestion → Data Validation → Data Transformation → Model Training →
Model Evaluation → Model Registry (S3) → Deployment (EC2 + FastAPI + Docker) → CI/CD Pipeline
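The flow above amounts to running each stage in order and handing its output artifact to the next stage. A minimal sketch of that chaining follows; run_pipeline and the stage callables are hypothetical stand-ins for the real components.

```python
def run_pipeline(stages, initial=None):
    """Run stages in order, feeding each stage's artifact to the next.

    `stages` is a list of (name, callable) pairs. Returns the final artifact
    and the names of the stages that completed.
    """
    artifact = initial
    completed = []
    for name, stage in stages:
        artifact = stage(artifact)
        completed.append(name)
    return artifact, completed
```

If a stage raises, the exception propagates and later stages do not run, which matches the usual expectation that a failed validation stops training.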
Shreyank Kasable
GitHub: https://github.com/ShreyankKasable
If you found this project helpful, don't forget to star the repo!