Inspired and built with SQLGlot
Updated Feb 18, 2026 - Python
🪂 Parachute: Single-Pass Bi-Directional Information Passing (VLDB'25)
Battle-tested Apache Spark tuning patterns with reproducible benchmarks. 10 techniques (partition pruning, broadcast joins, AQE, skew handling, Z-ORDER, and more) — each paired with measured before/after speedups runnable on a laptop.
PySpark ETL & analytics pipeline for taxi trip ETA, partitioned Parquet, windowed aggregations and performance patterns.
🚖 Ingest and analyze NYC yellow taxi data with a streamlined ETL pipeline, featuring data cleaning, analytics, and business-ready outputs.