PySpark_Practice_Project

A short practice project using PySpark on the Bank Marketing dataset from Kaggle. The project handles missing values, labels and encodes the categorical columns, scales the numeric columns, and trains a Random Forest model.

Dataset available at: https://www.kaggle.com/janiobachmann/bank-marketing-dataset