Welcome to the Machine Learning Mini-Workshop Repository! This workshop is designed especially for MBA practitioners and non-technical learners who want a simple, practical, and visual introduction to Machine Learning.
This README will guide you from installation to running notebooks, and walk you through every folder in this repository.
Before starting, ensure you have the following:
Git helps you download and manage this project.
Windows: download and install from https://git-scm.com/download/win
macOS: Git is often pre-installed; if it is not, get it from https://git-scm.com/download/mac
Linux (Debian/Ubuntu):
sudo apt install git
You need a GitHub account to access and store your projects.
Create your account: https://github.com/
Download Python from: https://www.python.org/downloads/
During installation on Windows, check "Add Python to PATH".
Open a terminal (CMD / PowerShell / Terminal) and run:
pip install jupyterlab notebook ipykernel
After installation, launch with:
jupyter lab
or
jupyter notebook
In your terminal:
git clone https://github.com/<your-username>/<repository-name>.git
cd <repository-name>
(Replace the URL above with your GitHub repo link.)
Run this inside your repo folder:
pip install pandas numpy scikit-learn matplotlib seaborn \
requests beautifulsoup4 lxml opendatasets kaggle \
tensorflow torch notebook jupyterlab transformers \
sentence-transformers nltk spacy
This installs everything needed for:
- data preprocessing
- regression, classification, and clustering models
- visualizations
- notebook execution
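As a quick sanity check after installation, a short script can confirm that the core packages are importable. This is a minimal sketch and not part of the repository: the helper name `missing_modules` and the module list are illustrative assumptions. Note that some packages import under a different name than the one used with pip (scikit-learn imports as `sklearn`, beautifulsoup4 as `bs4`).

```python
from importlib.util import find_spec

# Import names (not pip names) for a few of the packages installed above.
# Assumption: this short list is enough for a first check; extend it as needed.
WORKSHOP_MODULES = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn", "bs4"]


def missing_modules(names):
    """Return the module names that Python cannot find in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(WORKSHOP_MODULES)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All checked packages are installed.")
```

If anything is reported as missing, re-run the pip command in the same terminal you use to start Jupyter.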
This workshop repository contains multiple folders, each covering one stage of the learning pipeline.
This folder covers how to find datasets and research papers.
- How to download datasets (Kaggle, UCI, Google Dataset Search)
- How to structure research data
- Tools like Semantic Scholar, arXiv
- Spreadsheet basics
- data_sources.md
- Example dataset links
- Small sample CSV datasets
This contains the data cleaning workbook and explanations.
- Handling missing values
- Removing duplicates
- Encoding categories
- Scaling numeric data
- Train/test split
- data_cleaning_workbook.ipynb
- sales_data.csv (demo dataset)
- Completed code solution
- Worksheets you can teach from
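The cleaning steps listed for this folder can be sketched in a few lines of pandas. This is an illustrative sketch only, not the contents of data_cleaning_workbook.ipynb: the toy table, its column names, and the choice of median filling and min-max scaling are assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy table (NOT the repository's sales_data.csv); columns are made up.
raw = pd.DataFrame({
    "region": ["North", "South", "South", None, "East", "North"],
    "units": [10, 15, 15, 8, None, 12],
    "price": [2.5, 3.0, 3.0, 2.0, 4.0, 2.5],
})

# 1. Remove exact duplicate rows (the two "South" rows are identical).
clean = raw.drop_duplicates().reset_index(drop=True)

# 2. Handle missing values: median for numbers, a placeholder for categories.
clean["units"] = clean["units"].fillna(clean["units"].median())
clean["region"] = clean["region"].fillna("Unknown")

# 3. Encode the categorical column as one indicator column per category.
clean = pd.get_dummies(clean, columns=["region"])

# 4. Scale numeric columns to the 0-1 range (min-max scaling).
for col in ["units", "price"]:
    col_min, col_max = clean[col].min(), clean[col].max()
    clean[col] = (clean[col] - col_min) / (col_max - col_min)

# 5. Train/test split: 80% of rows for training, the rest held out for testing.
train = clean.sample(frac=0.8, random_state=42)
test = clean.drop(train.index)

print(clean)
print(len(train), "training rows,", len(test), "test rows")
```

The order here follows the topic list for readability. In a real project, the scaling minimum and maximum are usually computed on the training rows only, so that no information from the test rows leaks into training.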
This folder introduces the two main supervised ML families:
Regression: for predicting continuous values (e.g., sales, profit)
Classification: for predicting labels (e.g., churn: yes/no)
- A ready-to-run Jupyter Notebook
- Step-by-step explanation
- Intuition with simple plots (scatter plots, clusters, decision boundaries)
- Visual explanations for MBA participants
Example files:
- linear_regression.ipynb
- classification.ipynb
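To make the difference between the two families concrete, the sketch below predicts a number (regression) and a label (classification) using only the Python standard library. It is a hedged illustration, not the code in the notebooks: the data is made up, the function names are hypothetical, and the classifier is a simple nearest-centroid rule chosen because it is easy to follow, not necessarily the method the workshop teaches.

```python
from statistics import mean

# --- Regression: predict a number ---
# Made-up data: advertising spend vs. sales (arbitrary units).
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    variance = sum((x - mx) ** 2 for x in xs)
    slope = covariance / variance
    return slope, my - slope * mx


slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 6 + intercept  # output is a continuous value

# --- Classification: predict a label ---
# Made-up data: support calls per month and whether the customer churned.
calls = [0, 1, 1, 2, 6, 7, 8, 9]
churned = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]


def fit_centroids(xs, labels):
    """Average feature value for each label."""
    return {
        lab: mean(x for x, l in zip(xs, labels) if l == lab)
        for lab in set(labels)
    }


def predict_label(x, centroids):
    """Pick the label whose average is closest to x."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


centroids = fit_centroids(calls, churned)
print("Predicted sales for spend 6:", predicted_sales)
print("Customer with 3 calls churns?", predict_label(3, centroids))
```

The regression answer can be any number on a scale, while the classification answer is always one of the known labels.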
This section focuses on how models learn.
- Train/test split explanation
- Gradient descent intuition (MBA-friendly)
- Avoiding overfitting
- Hands-on training scripts
Files:
- training_basics.ipynb
- Visual plots for training curves
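The gradient descent idea can be shown with a one-parameter model: start with a guess, measure the error, and nudge the parameter in the direction that reduces it. The following is a minimal sketch under stated assumptions (made-up data, a model with a single weight `w`, and a learning rate picked for this toy example); it is not taken from training_basics.ipynb and does not cover overfitting.

```python
# Made-up data that follows y = 2 * x exactly, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def mse(w):
    """Mean squared error of the model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess
learning_rate = 0.05  # step size (assumed value that works for this toy data)
losses = []           # the "training curve": error recorded before each step

for step in range(20):
    losses.append(mse(w))
    w -= learning_rate * gradient(w)  # move against the slope of the error

print(f"learned w = {w:.4f}")
print("first losses:", [round(loss, 3) for loss in losses[:4]])
```

Plotting `losses` against the step number gives a training curve: it should fall quickly at first and then flatten as `w` approaches its best value.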
This folder shows how to evaluate models properly.
- Accuracy
- Precision
- Recall
- F1-score
- Confusion Matrix
- Cross-validation
Files:
- validation_metrics.ipynb
- Interactive graphs with matplotlib & seaborn
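The metrics listed for this folder all come from the same four counts in a confusion matrix. The sketch below computes them by hand so the arithmetic is visible. It is an illustration with made-up labels and hypothetical function names, not the code in validation_metrics.ipynb; scikit-learn provides equivalents in `sklearn.metrics` (for example `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, and `confusion_matrix`). Cross-validation is not shown here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,  # of predicted positives, how many were right
        "recall": recall,        # of actual positives, how many were found
        "f1": f1,                # harmonic mean of precision and recall
    }


# Made-up example: 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
```

In business terms, precision matters when acting on a false alarm is costly, and recall matters when missing a real case is costly.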
Open your terminal inside the repo folder:
jupyter notebook
or
jupyter lab
Then click on any .ipynb file to open and run it.
By the end of the workshop, participants will be able to:
- Understand data cleaning
- Build simple regression, classification, and clustering models
- Interpret model results
- Read plots and explain insights
- Work with Jupyter Notebook
- Apply ML thinking in business settings