A side-by-side reference guide that translates your R/tidyverse fluency into Python/pandas, from your first line of code to production ETL pipelines.
This guide is for data analysts and researchers who already work in R with the tidyverse and want to add Python without starting from scratch. If you know dplyr, ggplot2, and tidyr, you already understand 80% of the concepts; this guide gives you the syntax.
| Section | Topics |
|---|---|
| Foundations | Setup, syntax differences (0-indexing, indentation, method chaining), data types, lists vs vectors |
| Data Wrangling | Full dplyr → pandas verb mapping: select, filter, mutate, summarise, group_by, joins, reshaping, strings, dates, categoricals |
| ETL & Pipelines | Reading/writing data, complete ETL pipeline pattern with validation, APIs, web scraping, SQL from Python |
| Visualisation | ggplot2 → matplotlib/seaborn, plotly interactive charts |
| Statistics & ML | scipy.stats, statsmodels regression, scikit-learn pipelines |
| Production | Functions, error handling, CLI scripts, virtual environments, project structure |
Includes a complete R → Python cheat sheet table covering 30+ common operations.
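
To give a flavour of the side-by-side style, here is a minimal sketch of one dplyr pipeline and a pandas equivalent written with method chaining. The data frame and its column names (`species`, `mass`, `height`) are invented for illustration and are not taken from the guide; the pandas calls shown are one common way to express these verbs, not the only one.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "c"],
    "mass": [10.0, 20.0, 5.0, 15.0, 50.0],
    "height": [1.0, 2.0, 1.0, 3.0, 5.0],
})

# R equivalent:
#   df |>
#     filter(mass > 5) |>
#     mutate(ratio = mass / height) |>
#     group_by(species) |>
#     summarise(mean_ratio = mean(ratio), n = n())
result = (
    df.query("mass > 5")                                  # filter()
      .assign(ratio=lambda d: d["mass"] / d["height"])    # mutate()
      .groupby("species", as_index=False)                 # group_by()
      .agg(mean_ratio=("ratio", "mean"),                  # summarise()
           n=("ratio", "size"))
)

print(result)
```

Wrapping the chain in parentheses lets each method sit on its own line, which reads much like a pipe in R.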