This project is an end-to-end data analytics solution built using Python, Pandas, and NumPy to analyze project and employee data. It focuses on data cleaning, transformation, integration, and applying business logic to generate actionable insights. The project simulates real-world scenarios where multiple datasets are combined to evaluate employee performance and project outcomes.
- Clean and preprocess raw data
- Handle missing values using a running average technique
- Merge multiple datasets for unified analysis
- Apply business rules to evaluate performance
- Generate insights such as bonuses, designation updates, and cost analysis
- Python
- Pandas
- NumPy
- Project Data: contains project details, cost, and status
- Employee Data: includes employee demographics
- Seniority Data: defines employee designation levels
- Handled missing values using running average logic
- Split and transformed columns for better usability
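
The two cleaning steps above can be sketched as follows. This is only one possible reading of "running average": each missing value is replaced by the mean of the non-missing values that come before it. The column names (`Cost`, `Name`) and the sample values are hypothetical, not taken from the project's datasets.

```python
import numpy as np
import pandas as pd


def fill_with_running_average(values: pd.Series) -> pd.Series:
    """Fill each NaN with the mean of the non-missing values that precede it."""
    # expanding().mean() ignores NaN entries; shift(1) makes sure a row
    # only sees values from earlier rows, never its own.
    running_mean = values.expanding(min_periods=1).mean().shift(1)
    return values.fillna(running_mean)


# Hypothetical cost column with two gaps.
cost = pd.Series([100.0, 200.0, np.nan, 400.0, np.nan])
filled = fill_with_running_average(cost)

# Hypothetical "Name" column split into two more usable columns.
names = pd.DataFrame({"Name": ["Asha Rao", "Liam Chen"]})
names[["First Name", "Last Name"]] = names["Name"].str.split(" ", n=1, expand=True)
```

Under this reading, a missing value in the very first row has nothing to average and stays missing, so it would need a separate fallback.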
- Merged multiple datasets using common keys (ID)
- Created a unified dataset for analysis
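
A minimal sketch of the merge step, assuming all three datasets share an `ID` column and that the employee and seniority tables hold one row per employee. The frames, the other column names, and the values below are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the three source datasets.
projects = pd.DataFrame({
    "ID": ["A001", "A002", "A002"],
    "Project": ["Project 1", "Project 2", "Project 3"],
    "Cost": [1000.0, 2000.0, 500.0],
    "Status": ["Finished", "Failed", "Ongoing"],
})
employees = pd.DataFrame({
    "ID": ["A001", "A002", "A003"],
    "Name": ["Asha Rao", "Liam Chen", "Ravi Kumar"],
    "City": ["Pune", "Delhi", "Pune"],
})
seniority = pd.DataFrame({
    "ID": ["A001", "A002", "A003"],
    "Designation Level": [2, 3, 1],
})

# Left joins keep every project row; validate= raises an error if an
# employee ID unexpectedly appears more than once on the right-hand side.
final = (
    projects
    .merge(employees, on="ID", how="left", validate="many_to_one")
    .merge(seniority, on="ID", how="left", validate="many_to_one")
)
```

Using `how="left"` here is a design choice for the sketch: it keeps one row per project, so employees without projects drop out of the unified dataset.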
- Calculated a 5% bonus for completed projects
- Adjusted designation levels for failed projects
- Promoted employees based on age criteria
- Added titles (Mr./Mrs.) based on gender
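
The four rules above might look roughly like this in Pandas. Several details are assumptions, since they are not stated here: the bonus is taken as 5% of project cost, a larger designation number is treated as more junior, and the status labels, `MAX_LEVEL`, and `PROMOTION_AGE` are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

MAX_LEVEL = 4        # hypothetical: most junior designation level
PROMOTION_AGE = 40   # hypothetical age threshold for promotion

df = pd.DataFrame({
    "ID": ["A001", "A002", "A003"],
    "Name": ["Ravi Kumar", "Asha Rao", "Liam Chen"],
    "Gender": ["M", "F", "M"],
    "Age": [45, 28, 33],
    "Cost": [1000.0, 2000.0, 500.0],
    "Status": ["Finished", "Failed", "Ongoing"],
    "Designation Level": [2, 3, 4],
})

# Rule 1: 5% bonus (assumed to be 5% of cost) for completed projects only.
df["Bonus"] = np.where(df["Status"] == "Finished", df["Cost"] * 0.05, 0.0)

# Rule 2: move employees on failed projects one level down, capped at MAX_LEVEL.
failed = df["Status"] == "Failed"
df.loc[failed, "Designation Level"] = (
    df.loc[failed, "Designation Level"] + 1
).clip(upper=MAX_LEVEL)

# Rule 3: promote employees above the age threshold to the top level (1).
df.loc[df["Age"] > PROMOTION_AGE, "Designation Level"] = 1

# Rule 4: prefix names with a title derived from gender.
df["Name"] = df["Gender"].map({"M": "Mr.", "F": "Mrs."}) + " " + df["Name"]
```

The rules are applied in order, so an employee who meets the age criterion ends at level 1 even if one of their projects failed. Whether that matches the intended business logic is an open question for this sketch.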
- Calculated total project cost per employee
- Filtered data based on conditions (e.g., city-based filtering)
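
As an illustration of these two analysis steps, total cost per employee is a group-by sum and city-based filtering is a boolean mask. The column names (`ID`, `City`, `Cost`) and the sample rows are assumptions made for the example.

```python
import pandas as pd

# Hypothetical unified dataset with one row per project.
df = pd.DataFrame({
    "ID": ["A001", "A002", "A002", "A003"],
    "City": ["Pune", "Delhi", "Delhi", "Pune"],
    "Cost": [1000.0, 2000.0, 500.0, 750.0],
})

# Total project cost per employee.
total_cost = (
    df.groupby("ID", as_index=False)["Cost"]
    .sum()
    .rename(columns={"Cost": "Total Cost"})
)

# Keep only the rows for one city.
pune_rows = df[df["City"] == "Pune"]
```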
- Improved understanding of employee contribution across projects
- Identified cost distribution and high-value contributors
- Enabled performance-based evaluation using project outcomes
- Clone the repository
- Install required libraries: `pip install pandas numpy`
- Run the Python script or Jupyter Notebook