From 14584822687509d2373bbc48f6e5d7e0adfa6414 Mon Sep 17 00:00:00 2001
From: SapientSapiens
Date: Sun, 22 Mar 2026 22:58:43 +0530
Subject: [PATCH] Update 2025 dezoomcamp project list data.csv

---
 Data/dezoomcamp/2025/data.csv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Data/dezoomcamp/2025/data.csv b/Data/dezoomcamp/2025/data.csv
index 5687851..bbcf26c 100644
--- a/Data/dezoomcamp/2025/data.csv
+++ b/Data/dezoomcamp/2025/data.csv
@@ -200,7 +200,7 @@ https://github.com/FeloXbit/Ethereum-Block-Analytics.git,Ethereum Network Perfor
 https://github.com/brukeg/lewis-hamilton-brilliance,Formula 1 Career Data Warehouse,Batch,"The repository uses Kestra as a workflow orchestrator with scheduled ETL jobs, dbt for transformations, and Terraform for infrastructure provisioning. The code shows a batch-oriented pipeline with ingestion scripts, scheduled transformations, and Looker dashboards for visualization. While Kestra can handle streaming, the configuration and structure indicate batch processing with scheduled data pulls and transformations.",GCP
 https://github.com/dmytrovoytko/stock-market-data-engineering,Unknown,Unknown,No files fetched,Unknown
 https://github.com/Juwon-Ogunseye/bitcoin-etl-pipeline,WBTC Blockchain Analytics Pipeline,Batch,"The repository uses Apache Airflow with DAGs (etl_dags.py, test_dag.py) to orchestrate scheduled ETL jobs. The pipeline runs daily with tasks for data extraction, loading to ClickHouse, and running dbt models. This is a classic batch processing architecture using workflow orchestrators.",AWS
-https://github.com/SapientSapiens/capstoneproject-2025-dez,NYC Taxi Analytics Pipeline,Batch,"The project uses Kestra workflow orchestrator with scheduled flows (hourly_air_quality, daily_air_quality) to run periodic ETL jobs. The code shows scheduled data fetching from APIs, loading to GCS/BigQuery, and dbt transformations - all characteristic of batch processing rather than continuous streaming.",GCP
+https://github.com/SapientSapiens/capstoneproject-2025-dez,Air Quality Analysis Data Pipeline,Batch,"The project uses Kestra workflow orchestrator with scheduled flows (hourly_air_quality, daily_air_quality) to run periodic ETL jobs. The code shows scheduled data fetching from APIs, loading to GCS/BigQuery, and dbt transformations - all characteristic of batch processing rather than continuous streaming.",GCP
 https://github.com/dmitrievdeveloper/de_project/tree/main/air_pollution,Unknown,Unknown,No files fetched,Unknown
 https://github.com/3d150n-marc3l0/de-zoomcamp-2025-capstone-baywheels,Bay Wheels Data Pipeline,Batch,"The project uses Kestra workflow orchestrator with scheduled flows (docker-compose.yml, flows/*.yaml) that run ETL jobs periodically (e.g., monthly scheduled data loading). No streaming components like Kafka, Kinesis, or Flink are present.",GCP
 https://github.com/hbg108/tfl-data-visualization/tree/main,TfL Footfall Data Transformation Pipeline,Batch,"The project uses Kestra workflow orchestrator with scheduled flows (e.g., 04_station_footfall_scheduled.yaml) to run periodic ETL jobs. Data is pulled from TfL sources, transformed, and loaded to BigQuery. The architecture includes dbt transformations and Looker Studio visualization, all characteristic of batch processing rather than continuous streaming.",GCP