I sit in the overlap between Data Analytics and Data Engineering: I build end-to-end pipelines that move data from messy reality into dashboards executives trust, and I write the SQL that answers what the business is actually asking.
- 🎓 Graduating MS Computer Science from George Mason University in May 2026
- 💼 Previously: ML Product Engineer Intern @ Aorbis · Software Developer Intern @ Xcellence-IT
- 🌎 Based in the Washington DC area — open to relocation
- 📬 Reach me at pgandhi6@gmu.edu

- 🚀 19 projects shipped across analytics & engineering
- 📈 10M+ records processed across AWS and GCP
- ⚡ 50% latency reduction on LLM inference pipelines
- 💰 $500K+ in quantified business insights delivered

- Languages: Python, SQL, R, JavaScript, Bash
- Data Engineering & Orchestration: Apache Airflow, dbt, Apache Spark, Apache Kafka, Mage AI
- Databases & Warehouses: Snowflake, BigQuery, Redshift, PostgreSQL, MySQL, MongoDB
- Analytics & BI: Tableau, Power BI, Looker Studio, pandas, NumPy, Excel
- Cloud & DevOps: AWS, GCP, Azure, Docker, Git, Jenkins

| Project | Stack | Highlight |
|---|---|---|
| Stock Market ETL & Predictive Pipeline | Airflow · dbt · Snowflake · Parquet | Cloud ETL ingesting data for 100+ companies · ML predictions at 70% accuracy |
| Geospatial Data Pipeline for Taxi Analytics | GCP · BigQuery · Mage AI · Looker Studio | ELT pipeline processing 10M+ trips · 40% query speedup |
| Project | Stack | Highlight |
|---|---|---|
| Customer Churn Analysis | SQL · Tableau · Python | Found the $130K churn driver in 7,043 telecom customers |
| Bank Loan Portfolio Analysis | MySQL · Tableau | Exposed a 5x risk gap across grades in a $435.8M, 38K-application portfolio |
| A/B Test Statistical Analysis | Python · Statistical Analysis | Caught an underpowered test (25.9% statistical power) · Prevented a costly launch on 290K sessions |
| E-Commerce Cohort & RFM Analysis | SQL · Power BI | Segmented 96K customers across $19.7M revenue · Quantified $356K uplift opportunity |
| HR Analytics Employee Attrition | Python · SQL · SQLite · scikit-learn · Tableau · pandas · Logistic Regression | 75% model accuracy with 77% recall · Identified overtime as the #1 attrition driver · Retention actions projected to save $330K–$660K annually |
| Project | Stack | Highlight |
|---|---|---|
| Credit Card Fraud Detection | CatBoost · XGBoost · LightGBM | Ensemble model with 80%+ detection accuracy on real-time risk scoring |
| Driver Drowsiness Detection System | CNN · OpenCV · Keras | Real-time alert system trained on 7,000+ eye-state images |
| AI Code Review and Security Auditor Agent | Python · LLMs · NLP · Security · AI | Detects critical vulnerabilities such as SQL injection and XSS · Suggests automated remediations |
📂 See all projects → github.com/Parshwa1504?tab=repositories
If you've been burned by a flaky pipeline at 2 AM, or watched a team argue for an hour about what a number means, we should talk.
Thanks for stopping by! ⭐ any project that catches your eye.