GUI-data-cleaning-automation-tool-python

  1. Project Overview

This project is a GUI-based Data Cleaning Automation Tool built in Python to simplify and automate common preprocessing tasks for Excel and CSV files.

The system allows non-technical users to upload a raw dataset, select cleaning options via checkboxes, and generate a cleaned output file, all without writing code.

This project models how data analysts automate repetitive data-cleaning workflows to improve efficiency and data quality.

  2. Problem Statement

Raw datasets often contain:

~ Duplicate rows

~ Blank rows

~ Inconsistent text formatting

~ Leading and trailing spaces

~ Missing or null values

~ Data inconsistencies and errors

Manual cleaning in Excel is:

~ Time-consuming

~ Error-prone

~ Not scalable

  3. Solution

This tool provides a Graphical User Interface (GUI) that enables users to:

✔ Browse and upload Excel/CSV files

✔ Remove duplicate rows

✔ Remove blank rows

✔ Trim leading & trailing spaces (text columns only)

✔ Convert text columns to Title Case

✔ Handle null and missing values

✔ Debug common data errors

All through a simple, user-friendly interface.
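
The cleaning options listed above can be illustrated with a minimal sketch. This README does not show the tool's actual code or name its data library, so the version below is an assumption: it uses only the Python standard library, treats every cell as text, and all function names (`trim_spaces`, `fill_missing`, and so on) and the `"N/A"` placeholder are hypothetical.

```python
# Hypothetical helpers, not the tool's real code. Each takes and returns
# a list of data rows, where a row is a list of text cells (header excluded).

def trim_spaces(rows):
    """Strip leading and trailing whitespace from every cell."""
    return [[cell.strip() for cell in row] for row in rows]


def remove_blank_rows(rows):
    """Drop rows whose cells are all empty or whitespace-only."""
    return [row for row in rows if any(cell.strip() for cell in row)]


def remove_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def to_title_case(rows):
    """Convert every cell to Title Case using str.title()."""
    return [[cell.title() for cell in row] for row in rows]


def fill_missing(rows, placeholder="N/A"):
    """Replace empty cells with a placeholder (an assumed policy)."""
    return [[cell if cell.strip() else placeholder for cell in row] for row in rows]


raw = [
    ["  alice smith ", "delhi"],
    ["  alice smith ", "delhi"],
    ["", "  "],
    ["bob jones", ""],
]

cleaned = raw
for step in (trim_spaces, remove_blank_rows, remove_duplicate_rows,
             to_title_case, fill_missing):
    cleaned = step(cleaned)

print(cleaned)
```

Trimming runs first in this sketch so that rows differing only by stray spaces are recognised as duplicates.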

  4. How It Works

4.1 User selects a raw Excel/CSV file

4.2 Chooses cleaning options via checkboxes

4.3 The system applies selected preprocessing steps

4.4 A cleaned output file is generated automatically

The tool applies cleaning logic programmatically while preserving dataset integrity.
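
As a rough sketch of steps 4.1 to 4.4, the workflow can be modelled as a function that reads a file, applies only the steps whose option is switched on, and writes a new file. The names `STEPS` and `clean_file`, the option keys, and the CSV-only scope are assumptions made for illustration; the actual tool also accepts Excel files and may be organised differently.

```python
import csv

# Hypothetical registry: option name -> function that transforms the data rows.
# Dict order fixes the order in which the selected steps run.
STEPS = {
    "trim_spaces": lambda rows: [[c.strip() for c in r] for r in rows],
    "remove_blank_rows": lambda rows: [r for r in rows if any(c.strip() for c in r)],
    "remove_duplicates": lambda rows: [list(t) for t in dict.fromkeys(tuple(r) for r in rows)],
}


def clean_file(src_path, dst_path, options):
    """Read a CSV, apply the selected steps, write the result.

    `options` maps option names to booleans, mirroring checkbox states.
    The source file is never modified. Returns the number of data rows written.
    """
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    header, data = (rows[0], rows[1:]) if rows else ([], [])

    for name, step in STEPS.items():
        if options.get(name):
            data = step(data)

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        if header:
            writer.writerow(header)
        writer.writerows(data)
    return len(data)
```

A call such as `clean_file("raw.csv", "raw_cleaned.csv", {"trim_spaces": True, "remove_duplicates": True})` would then correspond to ticking two checkboxes and running the tool (the file names here are placeholders).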

  5. Key Concepts & Skills Applied

~ Data preprocessing principles

~ Automation of repetitive workflows

~ GUI-based user interaction

~ Conditional logic & validation

~ Error handling

~ Data quality improvement strategies
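
To make the validation and error-handling points concrete, here is a minimal sketch of checking a selected file before cleaning starts. The exception class `InputFileError`, both function names, the list of accepted extensions, and the `_cleaned` output suffix are assumptions for illustration, not details taken from the tool.

```python
from pathlib import Path

# Assumed set of accepted file types; the README only says "Excel/CSV".
SUPPORTED_EXTENSIONS = {".csv", ".xlsx", ".xls"}


class InputFileError(Exception):
    """Raised with a user-readable message when the chosen file is unusable."""


def validate_input_path(path_text):
    """Return a Path for a usable input file, or raise InputFileError."""
    if not path_text or not path_text.strip():
        raise InputFileError("No file selected.")
    path = Path(path_text.strip())
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise InputFileError(f"Unsupported file type: {path.suffix or '(none)'}")
    if not path.is_file():
        raise InputFileError(f"File not found: {path}")
    return path


def default_output_path(input_path):
    """Place the cleaned file next to the input, e.g. sales.csv -> sales_cleaned.csv."""
    return input_path.with_name(f"{input_path.stem}_cleaned{input_path.suffix}")
```

In a GUI, the message carried by `InputFileError` could be shown in a dialog instead of letting a traceback reach the user.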

  6. Technologies Used

6.1 Python

6.2 File handling (Excel/CSV)

6.3 Data cleaning logic

6.4 GUI framework

6.5 Automation workflow design
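
The README does not name the GUI framework. Assuming Tkinter from the standard library, a file picker plus checkbox window could be sketched as follows; `OPTION_LABELS`, `selected_options`, `build_window`, and the `on_run` callback are all hypothetical names.

```python
# Hypothetical option names and the labels shown next to each checkbox.
OPTION_LABELS = {
    "remove_duplicates": "Remove duplicate rows",
    "remove_blank_rows": "Remove blank rows",
    "trim_spaces": "Trim leading & trailing spaces",
    "title_case": "Convert text columns to Title Case",
}


def selected_options(variables):
    """Return the names of options whose checkbox variable is set."""
    return [name for name, var in variables.items() if var.get()]


def build_window(on_run):
    """Build the window; on_run(file_path, option_names) fires on 'Clean'."""
    # Imported here so the module still loads on machines without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.title("Data Cleaning Tool")
    file_var = tk.StringVar(master=root)

    def browse():
        chosen = filedialog.askopenfilename(
            filetypes=[("Data files", "*.csv *.xlsx"), ("All files", "*.*")]
        )
        if chosen:
            file_var.set(chosen)

    tk.Button(root, text="Browse...", command=browse).pack(anchor="w")
    tk.Label(root, textvariable=file_var).pack(anchor="w")

    variables = {}
    for name, label in OPTION_LABELS.items():
        variables[name] = tk.BooleanVar(master=root, value=False)
        tk.Checkbutton(root, text=label, variable=variables[name]).pack(anchor="w")

    tk.Button(
        root,
        text="Clean",
        command=lambda: on_run(file_var.get(), selected_options(variables)),
    ).pack(anchor="w")
    return root
```

Running `build_window(print).mainloop()` on a desktop machine would open the window and print the chosen path and options when "Clean" is pressed.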

  7. Project Outcome

~ Reduced manual cleaning time

~ Improved dataset consistency

~ Automated repetitive preprocessing tasks

~ Built a reusable analyst-focused cleaning tool

  8. Future Enhancements

~ Add preview window before export

~ Add data profiling summary (basic statistics)

~ Integrate logging system

~ Add automated report generation

~ Convert into standalone executable (.exe)
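
As one possible starting point for the data profiling summary, the sketch below counts rows, exact duplicate rows, and per-column missing and distinct values. It is not part of the current tool; the function name `profile` and the shape of the returned dictionary are assumptions.

```python
def profile(header, rows):
    """Summarise a table given its header and a list of text rows."""
    summary = {
        "row_count": len(rows),
        # Rows that repeat an earlier row exactly.
        "duplicate_rows": len(rows) - len({tuple(r) for r in rows}),
        "columns": {},
    }
    for index, name in enumerate(header):
        # Short rows are treated as missing a value for this column.
        values = [r[index] for r in rows if index < len(r)]
        non_empty = [v for v in values if v.strip()]
        summary["columns"][name] = {
            "missing": len(rows) - len(non_empty),
            "distinct": len(set(non_empty)),
        }
    return summary


report = profile(
    ["name", "city"],
    [["alice", "delhi"], ["alice", "delhi"], ["bob", ""]],
)
print(report)
```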

About

A user-friendly Python GUI application that automates common data cleaning tasks for Excel and CSV files, reducing manual effort and improving dataset quality for analytics workflows.
