ShreyankKasable/Vehicle-Insurance-Prediction

🚗 MLOps Project - Vehicle Insurance Prediction Pipeline

Welcome to the Vehicle Insurance MLOps Project, an end-to-end machine learning pipeline that automates training, evaluation, and deployment of a vehicle insurance classification model. The project demonstrates MLOps practices in a working setup: modular code, cloud integration, CI/CD automation, and API deployment.


📁 Project Setup and Structure

✅ Step 1: Generate Template

  • Run template.py to scaffold the project structure with necessary folders and files.
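
The contents of template.py are not shown in this README. As a rough sketch of what a scaffolding script of this kind can look like, the example below creates empty files and their parent folders. The file list is hypothetical and only loosely follows the folder structure listed further down.

```python
from pathlib import Path

# Hypothetical file list, loosely based on the folder structure in this README.
FILES = [
    "src/__init__.py",
    "src/components/__init__.py",
    "src/components/data_ingestion.py",
    "src/entity/config_entity.py",
    "src/entity/artifact_entity.py",
    "config/schema.yaml",
    "app.py",
    "requirements.txt",
]


def scaffold(root: str, files=FILES) -> list[Path]:
    """Create each missing file (and its parent folders) under root.

    Existing files are left untouched. Returns the paths that were created.
    """
    created = []
    for rel in files:
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Calling scaffold(".") from the project root would create the listed files; running it a second time creates nothing, so it is safe to re-run after adding entries to the list.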

✅ Step 2: Package Management

  • Define local package structure using setup.py and pyproject.toml.

✅ Step 3: Create and Activate Virtual Environment

conda create -n vehicle python=3.10 -y
conda activate vehicle
pip install -r requirements.txt
  • Verify packages with:
pip list

📊 MongoDB Setup and Data Management

🔧 Step 4: Setup MongoDB Atlas

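  1. Create a MongoDB Atlas account and an M0 cluster.
  2. Create a database user and add 0.0.0.0/0 to the IP access list (this allows connections from any address).
  3. Copy the connection string for the Python driver and replace <password> with the database user's password.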

📄 Step 5: Push Dataset to MongoDB

  • Create notebook/mongoDB_demo.ipynb, add your dataset and write code to upload it.
  • Verify your uploaded data in Atlas > Browse Collections.
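
The notebook code itself is not reproduced here. A minimal sketch of the upload, assuming the dataset is a CSV file and that pymongo is installed, could look like the following. The database and collection names are placeholders you choose yourself, and MONGODB_URL is the environment variable set in Step 9.

```python
import csv
import os


def load_records(csv_path: str) -> list[dict]:
    """Read a CSV file into a list of dicts, one per row (values stay strings)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def upload_records(records: list[dict], db_name: str, collection_name: str) -> int:
    """Insert the records into MongoDB and return how many were written.

    Assumes MONGODB_URL holds the Atlas connection string.
    """
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(os.environ["MONGODB_URL"])
    try:
        result = client[db_name][collection_name].insert_many(records)
        return len(result.inserted_ids)
    finally:
        client.close()
```

For example, upload_records(load_records("data.csv"), "my_db", "my_collection") would push every row of a hypothetical data.csv. Because csv.DictReader returns strings, numeric columns should be converted first if they need to be stored as numbers.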

📝 Logging, Exception Handling, and EDA

📌 Step 6: Setup Logging and Exception Handling

  • Create reusable logging and exception utilities.
  • Test them in demo.py.
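
One possible shape for these utilities, using only the standard library, is sketched below. The names get_logger and PipelineError are illustrative and are not taken from the repository.

```python
import logging


def get_logger(name: str = "vehicle") -> logging.Logger:
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


class PipelineError(Exception):
    """Wraps an exception and records the file and line where it was raised."""

    def __init__(self, error: Exception):
        tb = error.__traceback__
        # Walk to the innermost frame, where the error actually occurred.
        while tb is not None and tb.tb_next is not None:
            tb = tb.tb_next
        if tb is not None:
            where = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        else:
            where = "unknown location"
        super().__init__(f"{type(error).__name__} at {where}: {error}")
```

In demo.py, a quick check would be to wrap a failing statement such as 1 / 0 in try/except, build a PipelineError from the caught exception, and pass it to get_logger().error(...).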

📈 Step 7: Perform EDA and Feature Engineering

  • Analyze patterns and prepare features in your EDA notebook under notebook/.

📥 Data Ingestion

🔄 Step 8: Create Data Ingestion Pipeline

  • Define:

    • MongoDB logic in configuration/mongo_db_connections.py
    • Raw data logic in data_access/proj1_data.py
    • Ingestion logic in components/data_ingestion.py
  • Define config and artifact classes:

    • entity/config_entity.py
    • entity/artifact_entity.py
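
To illustrate the split between config and artifact classes, here is a small sketch built on dataclasses. The field names, default values, and the split_rows helper are assumptions for illustration, not the repository's actual definitions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIngestionConfig:
    """Inputs to the ingestion step (all defaults are hypothetical)."""
    artifact_dir: str = "artifact"
    collection_name: str = "vehicle_data"
    train_test_split_ratio: float = 0.25

    @property
    def train_path(self) -> str:
        return os.path.join(self.artifact_dir, "data_ingestion", "train.csv")

    @property
    def test_path(self) -> str:
        return os.path.join(self.artifact_dir, "data_ingestion", "test.csv")


@dataclass(frozen=True)
class DataIngestionArtifact:
    """Outputs of the ingestion step, handed to the next component."""
    trained_file_path: str
    test_file_path: str


def split_rows(rows: list, test_ratio: float) -> tuple[list, list]:
    """Split rows into (train, test), taking the last test_ratio share as test."""
    n_test = int(len(rows) * test_ratio)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]
```

The config object describes where data should go, and the artifact object reports where it ended up, so each later stage only needs the previous stage's artifact. A real ingestion step would normally shuffle before splitting, for example with scikit-learn's train_test_split.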

🌐 Step 9: Set MongoDB URL

# For Bash
export MONGODB_URL="mongodb+srv://<username>:<password>...."

# For PowerShell
$env:MONGODB_URL = "mongodb+srv://<username>:<password>...."
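
On the Python side, the variable can be read with os.getenv. The helper below is a small sketch (the function name is illustrative) that fails early with a clear message when the variable is missing.

```python
import os


def get_mongodb_url(env_var: str = "MONGODB_URL") -> str:
    """Return the MongoDB connection string from the environment.

    Raises RuntimeError if the variable is unset or empty, so the problem
    surfaces before any connection attempt is made.
    """
    url = os.getenv(env_var)
    if not url:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return url
```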

🔍 Data Validation, Transformation & Model Training

✅ Step 10: Data Validation

  • Add schema in config/schema.yaml
  • Add validation logic in utils/main_utils.py
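
The schema file and validation code are not shown in this README. The sketch below illustrates the general idea with a plain dict standing in for the parsed config/schema.yaml; the column names and the validate_columns function are assumptions for illustration.

```python
# Stand-in for the parsed contents of config/schema.yaml (hypothetical columns).
SCHEMA = {
    "columns": {"Age": "int", "Annual_Premium": "float", "Gender": "str"},
}


def validate_columns(rows: list[dict], schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the rows match the schema."""
    problems = []
    expected = schema["columns"]
    casts = {"int": int, "float": float, "str": str}
    for i, row in enumerate(rows):
        missing = set(expected) - set(row)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        for col, type_name in expected.items():
            try:
                casts[type_name](row[col])  # value must be convertible to the type
            except (TypeError, ValueError):
                problems.append(f"row {i}: {col}={row[col]!r} is not {type_name}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to write a full validation report before deciding whether the pipeline should continue.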

✅ Step 11: Data Transformation

  • Build transformation code in components/data_transformation.py
  • Add estimator logic in entity/estimator.py
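
A common pattern for estimator logic is a small wrapper that keeps the fitted preprocessing object and the trained model together, so prediction always applies the same transformation that was used in training. The sketch below assumes that pattern; the class names are hypothetical, and the two toy classes only stand in for a scikit-learn transformer and classifier.

```python
class PreprocessedModel:
    """Bundles a fitted preprocessing object with a trained model."""

    def __init__(self, preprocessing_object, trained_model_object):
        self.preprocessing_object = preprocessing_object
        self.trained_model_object = trained_model_object

    def predict(self, rows):
        """Transform raw rows, then delegate to the wrapped model."""
        transformed = self.preprocessing_object.transform(rows)
        return self.trained_model_object.predict(transformed)


class ScaleBy:
    """Toy transformer: multiplies every value by a fixed factor."""

    def __init__(self, factor: float):
        self.factor = factor

    def transform(self, rows):
        return [[value * self.factor for value in row] for row in rows]


class ThresholdModel:
    """Toy classifier: predicts 1 when the row sum exceeds 1.0."""

    def predict(self, rows):
        return [1 if sum(row) > 1.0 else 0 for row in rows]
```

Any object with a transform method and any object with a predict method can be combined this way, which is why the wrapper can be serialized and served as a single unit.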

✅ Step 12: Model Training

  • Use transformed data to train models in components/model_trainer.py

☁️ AWS Setup for Model Evaluation & Deployment

🔐 Step 13: Configure AWS

  1. Create IAM user with AdministratorAccess
  2. Export AWS keys as environment variables:
export AWS_ACCESS_KEY_ID="your_key"
export AWS_SECRET_ACCESS_KEY="your_secret"

🩒 Step 14: Model Registry using S3

  • Create S3 bucket: my-model-mlopsproj

  • Use:

    • cloud_storage/aws_storage.py
    • entity/s3_estimator.py
    • constants/__init__.py for bucket info
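
As a sketch of how S3 can serve as a simple model registry, the helpers below upload a model file and check whether one already exists. They assume boto3 is installed and that credentials come from the environment variables set in Step 13. The object key and function names are illustrative, not the repository's own.

```python
BUCKET_NAME = "my-model-mlopsproj"  # bucket name from this README
MODEL_KEY = "model.pkl"             # hypothetical object key


def s3_client():
    """Create an S3 client; credentials are read from the environment."""
    import boto3  # third-party: pip install boto3

    return boto3.client("s3")


def upload_model(client, local_path: str, bucket: str = BUCKET_NAME, key: str = MODEL_KEY) -> None:
    """Upload a local model file to the bucket under the given key."""
    client.upload_file(local_path, bucket, key)


def model_exists(client, bucket: str = BUCKET_NAME, key: str = MODEL_KEY) -> bool:
    """Return True if an object with exactly this key is present in the bucket."""
    response = client.list_objects_v2(Bucket=bucket, Prefix=key)
    return any(obj["Key"] == key for obj in response.get("Contents", []))
```

Passing the client in as an argument keeps the helpers easy to test with a stand-in object, without touching AWS.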

🚀 Model Evaluation, Pushing, and Prediction

✅ Step 15: Evaluate & Push Best Model

  • Use components/model_evaluation.py and components/model_pusher.py
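
The evaluation criteria are not spelled out in this README. One plausible rule, sketched below, is to accept the newly trained model only when it beats the model currently in S3 by some margin. The score metric, the 0.02 threshold, and all names here are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluationResult:
    is_model_accepted: bool
    new_score: float
    production_score: Optional[float]
    difference: float


def evaluate(new_score: float,
             production_score: Optional[float] = None,
             min_improvement: float = 0.02) -> EvaluationResult:
    """Accept the new model if no production model exists yet, or if it
    improves on the production score by more than min_improvement."""
    baseline = 0.0 if production_score is None else production_score
    difference = new_score - baseline
    accepted = production_score is None or difference > min_improvement
    return EvaluationResult(accepted, new_score, production_score, difference)
```

The pusher step would then upload the model only when is_model_accepted is True, which keeps a marginally different model from replacing the one already serving predictions.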

🌐 Step 16: Create FastAPI Prediction App

  • Setup app.py with routes:

    • /training: Train pipeline
    • /predict: Make predictions
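
A bare-bones version of such an app is sketched below, assuming fastapi is installed. The handler bodies are placeholders: the input fields and the toy decision rule are hypothetical and stand in for the real training pipeline and trained model.

```python
def train_route() -> dict:
    """Handler for /training; a real app would run the training pipeline here."""
    return {"status": "training started"}  # placeholder response


def predict_route(age: int, annual_premium: float) -> dict:
    """Handler for /predict; the rule below is a toy stand-in for the model."""
    label = 1 if annual_premium > 30000 and age > 30 else 0
    return {"prediction": label}


def create_app():
    """Build the FastAPI app and register both routes."""
    from fastapi import FastAPI  # third-party: pip install fastapi

    app = FastAPI()
    app.add_api_route("/training", train_route, methods=["GET"])
    app.add_api_route("/predict", predict_route, methods=["GET"])
    return app
```

With uvicorn installed, uvicorn.run(create_app(), host="0.0.0.0", port=5080) would serve the app on the port opened in Step 21. Because the handlers take simple typed arguments, FastAPI reads them from the query string, for example /predict?age=44&annual_premium=40000.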

🧱 Step 17: Add Static Files

  • Create static/ and template/ directories for frontend if needed

🔄 CI/CD Automation with GitHub Actions, Docker, EC2

🐳 Step 18: Dockerize

  • Create:

    • Dockerfile
    • .dockerignore

🧷 Step 19: GitHub Actions

  • Setup .github/workflows/aws.yaml

  • Add GitHub Secrets:

    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • ECR_REPO
    • AWS_DEFAULT_REGION

🗂️ Step 20: EC2 & Self-Hosted Runner

  1. Launch an EC2 instance (Ubuntu 24.04, t2.medium)
  2. Install Docker:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
  3. Register the EC2 instance as a self-hosted GitHub Actions runner

🌐 Step 21: Open EC2 Port

  • Go to EC2 > Security > Edit Inbound Rules

    • Add: Custom TCP | Port 5080 | Source: 0.0.0.0/0

✅ Step 22: Access Application

  • Visit: http://<your-ec2-public-ip>:5080

🧱 Folder Structure

.
├── app.py
├── demo.py
├── Dockerfile
├── .dockerignore
├── requirements.txt
├── setup.py
├── pyproject.toml
├── notebook/
│   └── mongoDB_demo.ipynb
├── config/
│   ├── schema.yaml
│   └── model.yaml
└── src/
    ├── components/
    ├── configuration/
    ├── cloud_storage/
    ├── data_access/
    ├── constants/
    ├── entity/
    ├── exception/
    ├── logger/
    ├── pipeline/
    └── utils/

🛠️ Tech Stack

  • Python 3.10
  • MongoDB Atlas
  • FastAPI
  • Scikit-learn, Pandas, NumPy
  • Docker, GitHub Actions, EC2, S3, ECR

🎯 End-to-End Workflow

Data Ingestion ➞ Data Validation ➞ Data Transformation ➞ Model Training ➞
Model Evaluation ➞ Model Registry (S3) ➞ Deployment (EC2 + FastAPI + Docker) ➞ CI/CD Pipeline

💬 Author

Shreyank Kasable

🌐 https://github.com/ShreyankKasable

If you found this project helpful, don't forget to ⭐ the repo!
