diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 4edc581..fb80891 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -76,6 +76,15 @@
         "./quarto/quarto-authoring",
         "./quarto/quarto-alt-text"
       ]
+    },
+    {
+      "name": "tidyverse",
+      "description": "Collection of skills for tidyverse-style R development",
+      "source": "./",
+      "strict": false,
+      "skills": [
+        "./tidyverse/tidy-r"
+      ]
     }
   ]
 }
diff --git a/tidyverse/README.md b/tidyverse/README.md
index e840b29..d58365d 100644
--- a/tidyverse/README.md
+++ b/tidyverse/README.md
@@ -1,6 +1,10 @@
 # Tidyverse Skills
 
-Skills specific to using tidyverse packages and tidyverse-specific package development patterns.
+Skills for tidyverse-style R development, covering modern patterns, style guidelines, and best practices.
+
+## Skills
+
+- **[tidy-r](./tidy-r/)** - Modern tidyverse patterns, style guide, and migration guidance for R development. Covers native pipe usage, join_by() syntax, .by grouping, pick/across/reframe operations, filter_out/when_any/when_all, recode_values/replace_values/replace_when, tidy selection, stringr patterns, naming conventions, and migration from base R or older tidyverse APIs.
 
 ## Potential Skills
 
diff --git a/tidyverse/tidy-r/SKILL.md b/tidyverse/tidy-r/SKILL.md
new file mode 100644
index 0000000..4340cac
--- /dev/null
+++ b/tidyverse/tidy-r/SKILL.md
@@ -0,0 +1,119 @@
+---
+name: tidy-r
+description: >
+  Modern tidyverse patterns, style guide, and migration guidance for R development. Use this skill when writing R code, reviewing tidyverse code, updating legacy R code, or enforcing consistent style. Covers native pipe usage, join_by() syntax, .by grouping, pick/across/reframe, filter_out/when_any/when_all, recode_values/replace_values/replace_when, tidyselect helpers, .data/.env pronouns, stringr, naming conventions, and readr.
+metadata:
+  r_version: ">=4.5.0"
+  tidyverse_version: ">=2.0.0"
+  dplyr_version: ">=1.2.0"
+---
+
+# Modern Tidyverse R Reference
+
+Code from blog posts and StackOverflow often uses deprecated APIs, magrittr pipes, or base R patterns where a modern tidyverse function exists. This guide encodes the current recommended approach.
+
+## Reference files
+
+Consult the appropriate reference file for detailed patterns and examples:
+
+| Topic | Reference file | When to consult |
+|-------|---------------|-----------------|
+| **Joins** | [joins.md](references/joins.md) | Merging data, `*_join`, `join_by`, matching rows, lookup tables |
+| **Grouping & columns** | [grouping.md](references/grouping.md) | `.by`, `group_by`, `across`, `pick`, `reframe`, column operations |
+| **Recoding & replacing** | [recode-replace.md](references/recode-replace.md) | `recode_values`, `replace_values`, `replace_when`, `filter_out`, `when_any`, `when_all` |
+| **Strings** | [stringr.md](references/stringr.md) | String manipulation, regex, `str_*` functions, text processing |
+| **Tidy selection** | [tidyselect.md](references/tidyselect.md) | Column selection helpers, `where()`, `all_of()`, `any_of()`, boolean ops, `.data`/`.env` pronouns |
+| **Style** | [tidyverse-style.md](references/tidyverse-style.md) | Naming, formatting, spacing, error messages, `cli::cli_abort` |
+| **Migration** | [migration.md](references/migration.md) | Updating old code, base R conversion, deprecated functions |
+
+For requests that span multiple topics (e.g., "rewrite this old code" touches migration + style), read multiple files.
+
+## Core principles
+
+1. **Use modern tidyverse patterns** -- Prioritize dplyr 1.2+ features, native pipe, and current APIs
+2. **Write readable code first** -- Optimize only when necessary
+3. **Follow tidyverse style guide** -- Consistent naming, spacing, and structure
+
+## Quick reference
+
+### Pipe and lambda
+
+- Always `|>`, never `%>%`
+- Use `_` placeholder for non-first arguments: `x |> f(1, y = _)`. The placeholder must be named and used exactly once.
+- Always `\(x)`, never `function(x)` or `~` in map/keep/etc.
+
+### Code organization
+
+Use newspaper style: high-level logic first, helpers below. Don't define functions inside other functions unless they are very brief.
+
+### Grouping
+
+- Prefer `.by` for per-operation grouping; use `group_by()` when grouping must persist across multiple operations
+- Never add `ungroup()` before or after `.by` -- it always returns ungrouped data
+- Consolidate multiple `mutate(.by = x)` calls into one when they share the same `.by`; keep separate only when `.by` differs or a later column depends on an earlier one
+- Place `.by` on its own line for readability
+
+### Joins
+
+- Use `join_by()`, never `c("a" = "b")`
+- Use `relationship`, `unmatched`, `na_matches` for quality control
+
+### Recoding and replacing (dplyr >=1.2.0)
+
+| Task | Function |
+|------|----------|
+| Recode values (new column) | `recode_values()` |
+| Replace values in place | `replace_values()` |
+| Conditional update in place | `replace_when()` |
+| Complex conditional (new column) | `case_when()` |
+| Drop rows (NA-safe) | `filter_out()` |
+| OR conditions | `when_any()` |
+| AND conditions | `when_all()` |
+
+### Error handling
+
+Use `cli::cli_abort()` with problem statement + bullets, never `stop()`.
+
+### R idioms
+
+- `TRUE`/`FALSE`, never `T`/`F`
+- `message()` for info, never `cat()`
+- `map_*()` over `sapply()` for type stability
+- `set.seed()` with date-time, never 42
+
+## Example
+
+```r
+library(tidyverse)
+
+penguins <- penguins |>
+  filter_out(is.na(sex)) |>
+  mutate(size = case_when(
+    body_mass > 4500 ~ "large",
+    body_mass > 3500 ~ "medium",
+    .default = "small"
+  ))
+
+# Coordinates for spatial join below
+island_coords <- tribble(
+  ~island,      ~latitude,
+  "Biscoe",     -65.5,
+  "Dream",      -64.7,
+  "Torgersen",  -64.8
+)
+
+island_summary <- penguins |>
+  summarise(
+    mean_flipper = mean(flipper_len),
+    mean_mass = mean(body_mass),
+    n = n(),
+    .by = c(species, island)
+  ) |>
+  left_join(
+    island_coords,
+    by = join_by(island),
+    unmatched = "error"
+  ) |>
+  arrange(species, island)
+```
+
diff --git a/tidyverse/tidy-r/references/grouping.md b/tidyverse/tidy-r/references/grouping.md
new file mode 100644
index 0000000..c576c8d
--- /dev/null
+++ b/tidyverse/tidy-r/references/grouping.md
@@ -0,0 +1,296 @@
+# Modern Grouping and Column Operations (dplyr >=1.2.0)
+
+## Per-operation grouping with .by
+
+The `.by` argument is preferred for per-operation grouping. Use `group_by()` when grouping must persist across multiple operations. `.by` results are always ungrouped.
+
+### Basic usage
+
+```r
+data |>
+  summarise(
+    mean_value = mean(value),
+    .by = category
+  )
+```
+
+### Multiple grouping variables
+
+```r
+data |>
+  summarise(
+    total = sum(revenue),
+    .by = c(company, year)
+  )
+```
+
+### .by with mutate (window functions)
+
+```r
+data |>
+  mutate(
+    pct_of_group = revenue / sum(revenue),
+    rank = row_number(desc(revenue)),
+    .by = region
+  )
+```
+
+### .by with filter (group-level filtering)
+
+```r
+data |>
+  filter(
+    revenue == max(revenue),
+    .by = region
+  )
+```
+
+### Place .by on its own line
+
+```r
+# Good - readable
+data |>
+  summarise(
+    mean_value = mean(value),
+    .by = category
+  )
+
+# Avoid - crammed
+data |>
+  summarise(mean_value = mean(value), .by = category)
+```
+
+### Avoid for single operations - use .by instead
+
+```r
+# Avoid
+data |>
+  group_by(category) |>
+  summarise(mean_value = mean(value)) |>
+  ungroup()
+```
+
+### Avoid - redundant ungroup() around .by
+
+`.by` always returns ungrouped data, so `ungroup()` before or after is a no-op. Remove it.
+
+```r
+# Avoid - ungroup() is redundant
+data |>
+  ungroup() |>
+  mutate(
+    centered = x - mean(x),
+    .by = group
+  )
+
+# Good
+data |>
+  mutate(
+    centered = x - mean(x),
+    .by = group
+  )
+```
+
+### Consolidating mutate() calls
+
+When multiple columns share the same `.by`, combine them in a single `mutate()`.
+
+```r
+# Avoid - repeating .by = year across separate mutate() calls
+data |>
+  mutate(
+    above_med_a = a > median(a),
+    .by = year
+  ) |>
+  mutate(
+    above_med_b = b > median(b),
+    .by = year
+  )
+
+# Good - one mutate(), one .by
+data |>
+  mutate(
+    above_med_a = a > median(a),
+    above_med_b = b > median(b),
+    .by = year
+  )
+```
+
+**When to keep separate `mutate()` calls:**
+
+- **Different `.by` variables** between the calls
+- **Sequential dependency**: a later column uses a column created in an earlier `mutate()` within the same grouped context (the new column must exist before the group-level aggregate can reference it)
+
+```r
+# Separate calls needed: different .by variables
+data |>
+  mutate(
+    x_lag = dplyr::lag(x),
+    .by = id
+  ) |>
+  mutate(
+    above_med = x_lag > median(x_lag),
+    .by = year
+  )
+
+# Separate calls needed: b_rank depends on b_centered
+data |>
+  mutate(
+    b_centered = b - mean(b),
+    .by = group
+  ) |>
+  mutate(
+    b_rank = row_number(desc(b_centered)),
+    .by = group
+  )
+```
+
+## .by with tidyr::fill()
+
+tidyr supports `.by` in `fill()`, matching the dplyr pattern:
+
+```r
+# Good - per-operation grouping
+data |>
+  tidyr::fill(value, .by = group, .direction = "down")
+
+# Avoid - group_by/ungroup wrapper
+data |>
+  group_by(group) |>
+  tidyr::fill(value, .direction = "down") |>
+  ungroup()
+```
+
+## pick() for column selection
+
+Use `pick()` inside data-masking functions to select columns by name or tidyselect helpers:
+
+```r
+data |>
+  summarise(
+    n_x_cols = ncol(pick(starts_with("x"))),
+    n_y_cols = ncol(pick(starts_with("y")))
+  )
+```
+
+### pick() to pass selected columns to functions
+
+```r
+data |>
+  mutate(
+    row_mean = rowMeans(pick(where(is.numeric)))
+  )
+```
+
+## across() for applying functions
+
+Apply one or more functions to multiple columns:
+
+### Single function
+
+```r
+data |>
+  summarise(
+    across(where(is.numeric), \(x) mean(x)),
+    .by = group
+  )
+```
+
+### Multiple functions with naming
+
+```r
+data |>
+  summarise(
+    across(
+      c(revenue, cost),
+      list(mean = \(x) mean(x), sd = \(x) sd(x)),
+      .names = "{.fn}_{.col}"
+    ),
+    .by = region
+  )
+```
+
+### Conditional transformation
+
+```r
+data |>
+  mutate(
+    across(where(is.character), str_to_lower)
+  )
+```
+
+## reframe() for multi-row results
+
+When a summary returns multiple rows per group, use `reframe()` instead of `summarise()`:
+
+```r
+data |>
+  reframe(
+    quantile = c(0.25, 0.50, 0.75),
+    value = quantile(x, c(0.25, 0.50, 0.75)),
+    .by = group
+  )
+```
+
+## Data masking vs tidy selection
+
+Understand the difference for writing functions:
+
+- **Data masking** (`arrange`, `filter`, `mutate`, `summarise`): expressions evaluated in data context
+- **Tidy selection** (`select`, `relocate`, `across`, `pick`): column selection helpers
+
+### Embrace with {{ }} for function arguments
+
+```r
+my_summary <- function(data, summary_var) {
+  data |>
+    summarise(mean_val = mean({{ summary_var }}))
+}
+```
+
+### Character vectors in data-masked contexts use .data[[]]
+
+```r
+for (var in names(mtcars)) {
+  mtcars |> count(.data[[var]]) |> print()
+}
+```
+
+### Character vectors in tidy-select contexts use all_of()/any_of()
+
+The `across(all_of())` bridge is the canonical pattern for passing character vectors into tidy-select:
+
+```r
+vars <- c("mpg", "wt", "hp")
+
+# Good - across(all_of()) for character vectors
+mtcars |>
+  summarise(across(all_of(vars), mean))
+
+# Good - any_of() when some columns may not exist
+mtcars |>
+  select(any_of(vars))
+
+# Avoid - .data[[]] inside tidy-select (deprecated)
+mtcars |>
+  select(.data[["mpg"]], .data[["wt"]])
+```
+
+### Access calling-environment variables with .env
+
+Use `.env$var` to disambiguate when a local variable shares a name with a column:
+
+```r
+threshold <- 10
+data |>
+  filter(value > .env$threshold)
+```
+
+### Multiple columns use across()
+
+```r
+my_summary <- function(data, summary_vars) {
+  data |>
+    summarise(across({{ summary_vars }}, \(x) mean(x)))
+}
+```
diff --git a/tidyverse/tidy-r/references/joins.md b/tidyverse/tidy-r/references/joins.md
new file mode 100644
index 0000000..7ea7140
--- /dev/null
+++ b/tidyverse/tidy-r/references/joins.md
@@ -0,0 +1,90 @@
+# Modern Join Syntax (dplyr >=1.2.0)
+
+## Use join_by() instead of character vectors
+
+### Equality joins
+
+```r
+transactions |>
+  inner_join(companies, by = join_by(company == id))
+```
+
+### Same-name columns
+
+```r
+# When both tables share a column name, use a single name
+orders |>
+  left_join(customers, by = join_by(customer_id))
+```
+
+### Inequality joins
+
+```r
+transactions |>
+  inner_join(companies, by = join_by(company == id, year >= since))
+```
+
+### Rolling joins (closest match)
+
+```r
+transactions |>
+  inner_join(companies, by = join_by(company == id, closest(year >= since)))
+```
+
+### Overlap joins
+
+```r
+# Find events during each interval
+intervals |>
+  inner_join(events, by = join_by(start <= time, end >= time))
+```
+
+### Avoid - Old character vector syntax
+
+```r
+# Avoid
+transactions |>
+  inner_join(companies, by = c("company" = "id"))
+```
+
+## Relationship and match handling
+
+### Enforce expected cardinality with relationship
+
+```r
+# 1:1 - each row matches at most one row in the other table
+inner_join(x, y, by = join_by(id), relationship = "one-to-one")
+
+# Many-to-one - many x rows can match one y row (lookup pattern)
+left_join(x, y, by = join_by(id), relationship = "many-to-one")
+
+# One-to-many
+inner_join(x, y, by = join_by(id), relationship = "one-to-many")
+```
+
+### Ensure all rows match
+
+```r
+inner_join(x, y, by = join_by(id), unmatched = "error")
+```
+
+### Prevent NA matching (recommended)
+
+```r
+# By default, NA matches NA in joins -- usually not desired
+left_join(x, y, by = join_by(id), na_matches = "never")
+```
+
+### Combining guards for production code
+
+```r
+sales |>
+  left_join(
+    products,
+    by = join_by(product_id),
+    relationship = "many-to-one",
+    unmatched = "error",
+    na_matches = "never"
+  )
+```
+
diff --git a/tidyverse/tidy-r/references/migration.md b/tidyverse/tidy-r/references/migration.md
new file mode 100644
index 0000000..65d732a
--- /dev/null
+++ b/tidyverse/tidy-r/references/migration.md
@@ -0,0 +1,175 @@
+# Migration: Base R and Old Tidyverse to Modern Patterns (dplyr >=1.2.0)
+
+## Base R to Modern Tidyverse
+
+### Data manipulation
+
+```r
+subset(data, condition)          # -> filter(data, condition)
+data[order(data$x), ]            # -> arrange(data, x)
+aggregate(x ~ y, data, mean)     # -> summarise(data, mean(x), .by = y)
+merge(x, y, by = "id")           # -> inner_join(x, y, by = join_by(id))
+```
+
+### Functional programming
+
+```r
+sapply(x, f)                     # -> map(x, f)  # type-stable
+lapply(x, f)                     # -> map(x, f)
+vapply(x, f, numeric(1))         # -> map_dbl(x, f)
+```
+
+### String manipulation
+
+```r
+grepl("pattern", text)           # -> str_detect(text, "pattern")
+gsub("old", "new", text)         # -> str_replace_all(text, "old", "new")
+substr(text, 1, 5)               # -> str_sub(text, 1, 5)
+nchar(text)                      # -> str_length(text)
+strsplit(text, ",")              # -> str_split(text, ",")
+tolower(text)                    # -> str_to_lower(text)
+sprintf("Hello %s", name)        # -> str_glue("Hello {name}")
+```
+
+## Old to New Tidyverse Patterns
+
+### Pipes
+
+```r
+data %>% function()              # -> data |> function()
+```
+
+### Anonymous functions
+
+```r
+map(x, function(x) x + 1)       # -> map(x, \(x) x + 1)
+map(x, ~ .x + 1)                # -> map(x, \(x) x + 1)
+```
+
+### Grouping (dplyr >=1.2.0)
+
+```r
+group_by(data, x) |>
+  summarise(mean(y)) |>
+  ungroup()                      # -> summarise(data, mean(y), .by = x)
+```
+
+### Joins
+
+```r
+by = c("a" = "b")                # -> by = join_by(a == b)
+```
+
+### Column selection
+
+```r
+across(starts_with("x"))         # -> pick(starts_with("x"))  # for selection only
+```
+
+### Multi-row summaries
+
+```r
+summarise(data, x, .groups = "drop") # -> reframe(data, x)
+```
+
+### Data reshaping
+
+```r
+gather()/spread()                # -> pivot_longer()/pivot_wider()
+```
+
+### String separation (tidyr >=1.3.0)
+
+```r
+separate(col, into = c("a", "b"))
+# -> separate_wider_delim(col, delim = "_", names = c("a", "b"))
+
+extract(col, into = "x", regex)
+# -> separate_wider_regex(col, patterns = c(x = regex))
+```
+
+### Superseded purrr functions (purrr >=1.0.0)
+
+```r
+map_dfr(x, f)                    # -> map(x, f) |> list_rbind()
+map_dfc(x, f)                    # -> map(x, f) |> list_cbind()
+map2_dfr(x, y, f)                # -> map2(x, y, f) |> list_rbind()
+pmap_dfr(list, f)                # -> pmap(list, f) |> list_rbind()
+imap_dfr(x, f)                   # -> imap(x, f) |> list_rbind()
+```
+
+### Recoding and replacing (dplyr >=1.2.0)
+
+```r
+case_match(x, val ~ result)      # -> recode_values(x, val ~ result)
+recode(x, old = "new")           # -> recode_values(x, "old" ~ "new")
+                                 #    or replace_values(x, "old" ~ "new")
+
+# Conditional replacement: case_when with .default = x -> replace_when
+case_when(
+  cond1 ~ val1,
+  cond2 ~ val2,
+  .default = x
+)                                # -> x |> replace_when(cond1 ~ val1, cond2 ~ val2)
+
+# NA handling
+coalesce(x, default)             # -> replace_values(x, NA ~ default)
+na_if(x, val)                    # -> replace_values(x, val ~ NA)
+tidyr::replace_na(x, default)    # -> replace_values(x, NA ~ default)
+```
+
+### Filter family (dplyr >=1.2.0)
+
+```r
+# Dropping rows with NA-safe negation
+filter(x != val | is.na(x))      # -> filter_out(x == val)
+
+# Combining conditions with OR
+filter(cond1 | cond2 | cond3)    # -> filter(when_any(cond1, cond2, cond3))
+
+# Combining conditions with AND (explicit)
+filter(cond1 & cond2 & cond3)    # -> filter(when_all(cond1, cond2, cond3))
+```
+
+### Reading data
+
+```r
+read.csv("file.csv")                 # -> read_csv("file.csv")  # tibble, faster, better type detection
+read.csv("file.csv", sep = "\t")     # -> read_tsv("file.csv")
+read.csv2("file.csv")                # -> read_csv2("file.csv")  # semicolon-delimited
+```
+
+For large files (>100 MB), `vroom::vroom()` is faster than `read_csv()`. For small files the difference is negligible.
+
+### Serialization
+
+```r
+qs::qsave(x, "file.qs")         # -> qs2::qs_save(x, "file.qs2")
+qs::qread("file.qs")            # -> qs2::qs_read("file.qs2")
+```
+
+### Defunct in dplyr >=1.2.0 (now errors)
+
+```r
+# Underscored SE verbs (defunct since 1.2, deprecated since 0.7)
+mutate_()                        # -> mutate() with modern programming
+filter_()                        # -> filter()
+summarise_()                     # -> summarise()
+# ... all *_() variants
+
+# _each variants (defunct since 1.2, deprecated since 0.7)
+mutate_each()                    # -> mutate(across(...))
+summarise_each()                 # -> summarise(across(...))
+
+# Multi-row summarise (defunct since 1.2, deprecated since 1.1)
+summarise(data, x)               # -> reframe(data, x) for multi-row results
+```
+
+### For side effects
+
+```r
+for (x in xs) write_file(x)     # -> walk(xs, write_file)
+for (i in seq_along(data)) {
+  write_csv(data[[i]], paths[[i]])
+}                                # -> walk2(data, paths, write_csv)
+```
diff --git a/tidyverse/tidy-r/references/recode-replace.md b/tidyverse/tidy-r/references/recode-replace.md
new file mode 100644
index 0000000..9889a8b
--- /dev/null
+++ b/tidyverse/tidy-r/references/recode-replace.md
@@ -0,0 +1,188 @@
+# Recoding, Replacing, and Filtering (dplyr >=1.2.0)
+
+dplyr 1.2 introduced a family of functions for recoding and replacing values, and for NA-safe filtering. These replace older patterns (`case_match`, `recode`, `coalesce`, `na_if`, negated filters).
+
+## The recode/replace family
+
+|                           | **Recoding** (new column) | **Replacing** (update in place) |
+|---------------------------|---------------------------|---------------------------------|
+| **Match with conditions** | `case_when()`             | `replace_when()`                |
+| **Match with values**     | `recode_values()`         | `replace_values()`              |
+
+## recode_values()
+
+Use instead of `case_match()` or repetitive `case_when()` with `==`.
+
+### Formula interface
+
+```r
+score |>
+  recode_values(
+    1 ~ "Strongly disagree",
+    2 ~ "Disagree",
+    3 ~ "Neutral",
+    4 ~ "Agree",
+    5 ~ "Strongly agree"
+  )
+```
+
+### Lookup table interface
+
+```r
+likert |>
+  mutate(score = recode_values(score, from = lookup$from, to = lookup$to))
+```
+
+### With .unmatched = "error" for safety
+
+```r
+# Errors if any value has no match
+score |>
+  recode_values(
+    1 ~ "Low",
+    2 ~ "Medium",
+    3 ~ "High",
+    .unmatched = "error"
+  )
+```
+
+### Avoid
+
+```r
+# Avoid - repetitive case_when with ==
+case_when(score == 1 ~ "Strongly disagree", score == 2 ~ "Disagree", ...)
+
+# Avoid - case_match() is soft-deprecated in dplyr 1.2
+case_match(score, 1 ~ "Strongly disagree", 2 ~ "Disagree", ...)
+
+# Avoid - recode() is soft-deprecated
+recode(score, `1` = "Strongly disagree", `2` = "Disagree", ...)
+```
+
+## replace_values()
+
+Use for partial updates by value. Unmatched values pass through unchanged.
+
+### Replace specific values
+
+```r
+name |>
+  replace_values(
+    c("UNC", "Chapel Hill") ~ "UNC Chapel Hill",
+    c("Duke", "Duke University") ~ "Duke"
+  )
+```
+
+### Replace NA (replaces coalesce/tidyr::replace_na)
+
+```r
+x |> replace_values(NA ~ 0)
+```
+
+### Convert sentinel values to NA (replaces na_if)
+
+```r
+x |> replace_values(from = c(0, -99), to = NA)
+```
+
+## replace_when()
+
+Use for conditional updates. Type-stable on the input; unmatched values pass through unchanged.
+
+### Conditional updates
+
+```r
+racers |>
+  mutate(
+    time = time |>
+      replace_when(
+        id %in% id_banned ~ NA,
+        id %in% id_penalty ~ time + 1/3
+      )
+  )
+```
+
+### Avoid - case_when with .default
+
+```r
+# Avoid - buries the primary input, loses type info
+mutate(time = case_when(
+  id %in% id_banned ~ NA,
+  id %in% id_penalty ~ time + 1/3,
+  .default = time
+))
+```
+
+## case_when() with .unmatched = "error"
+
+Still the right choice for complex conditional recoding into a new column. Use `.unmatched = "error"` for safety:
+
+```r
+tier <- case_when(
+  time < 23 ~ "A",
+  time < 27 ~ "B",
+  time < 30 ~ "C",
+  .unmatched = "error"
+)
+```
+
+## filter_out()
+
+NA-safe row removal. Treats `NA` as `FALSE`, so you don't accidentally drop NA rows:
+
+```r
+# Good - clear intent, NA-safe
+data |> filter_out(deceased, date < 2012)
+
+# Avoid - easy to get wrong with NA
+data |> filter(!(deceased & date < 2012) | is.na(deceased) | is.na(date))
+```
+
+## when_any() and when_all()
+
+Combine conditions with comma-separated syntax instead of `|` and `&`:
+
+### OR conditions
+
+```r
+data |>
+  filter(when_any(
+    name %in% c("US", "CA") & between(score, 200, 300),
+    name %in% c("PR", "RU") & between(score, 100, 200)
+  ))
+```
+
+### Drop rows matching any condition
+
+```r
+data |>
+  filter_out(when_any(
+    is.na(value),
+    status == "invalid"
+  ))
+```
+
+### AND conditions
+
+```r
+data |>
+  filter(when_all(
+    score > 50,
+    !is.na(region),
+    status == "active"
+  ))
+```
+
+## Migration quick reference
+
+| Old pattern | New pattern |
+|-------------|-------------|
+| `case_match(x, val ~ result)` | `recode_values(x, val ~ result)` |
+| `recode(x, old = "new")` | `recode_values(x, "old" ~ "new")` |
+| `case_when(..., .default = x)` | `x \|> replace_when(...)` |
+| `coalesce(x, default)` | `replace_values(x, NA ~ default)` |
+| `na_if(x, val)` | `replace_values(x, val ~ NA)` |
+| `tidyr::replace_na(x, default)` | `replace_values(x, NA ~ default)` |
+| `filter(x != val \| is.na(x))` | `filter_out(x == val)` |
+| `filter(c1 \| c2 \| c3)` | `filter(when_any(c1, c2, c3))` |
+| `filter(c1 & c2 & c3)` | `filter(when_all(c1, c2, c3))` |
diff --git a/tidyverse/tidy-r/references/stringr.md b/tidyverse/tidy-r/references/stringr.md
new file mode 100644
index 0000000..fecadfd
--- /dev/null
+++ b/tidyverse/tidy-r/references/stringr.md
@@ -0,0 +1,117 @@
+# String Manipulation with stringr
+
+Use stringr over base R string functions. Benefits: consistent `str_` prefix, string-first argument order, pipe-friendly and vectorized.
+
+## Core patterns
+
+### Pipe-friendly chaining
+
+```r
+text |>
+  str_to_lower() |>
+  str_trim() |>
+  str_replace_all("pattern", "replacement") |>
+  str_extract("\\d+")
+```
+
+### Detection and extraction
+
+```r
+str_detect(text, "pattern")       # logical: does it match?
+str_which(text, "pattern")        # integer: which elements match?
+str_count(text, "pattern")        # integer: how many matches?
+str_extract(text, "pattern")      # first match
+str_extract_all(text, "pattern")  # all matches (returns list)
+str_match(text, "(\\w+)@(\\w+)")  # capture groups as matrix
+```
+
+### Replacement
+
+```r
+str_replace(text, "old", "new")       # first occurrence
+str_replace_all(text, "old", "new")   # all occurrences
+str_remove(text, "pattern")           # remove first match
+str_remove_all(text, "pattern")       # remove all matches
+```
+
+### Splitting and combining
+
+```r
+str_split(text, ",")                  # split into list
+str_split_fixed(text, ",", n = 3)     # split into matrix (fixed columns)
+str_split_i(text, ",", i = 2)         # extract ith piece directly
+str_c("a", "b", "c", sep = "-")      # combine with separator
+str_flatten(words, collapse = ", ")   # collapse vector to single string
+```
+
+### Substring operations
+
+```r
+str_sub(text, 1, 5)                   # extract positions 1-5
+str_sub(text, -3)                     # last 3 characters
+str_length(text)                      # character count
+str_trunc(text, 20)                   # truncate with ellipsis
+```
+
+### Formatting and case conversion
+
+```r
+str_to_lower(text)                    # lowercase
+str_to_upper(text)                    # uppercase
+str_to_title(text)                    # title case
+str_to_sentence(text)                # sentence case
+str_to_snake(text)                    # snake_case (stringr >=1.6.0)
+str_to_camel(text)                    # camelCase (stringr >=1.6.0)
+str_to_kebab(text)                    # kebab-case (stringr >=1.6.0)
+str_trim(text)                        # remove leading/trailing whitespace
+str_squish(text)                      # trim + collapse internal whitespace
+str_pad(text, 10, side = "left")      # pad to fixed width
+str_wrap(text, width = 80)            # word wrap
+```
+
+### Interpolation
+
+```r
+str_glue("Hello {name}, you scored {score}!")
+str_glue_data(df, "{name}: {value}")
+```
+
+### Case-insensitive matching (stringr >=1.6.0)
+
+```r
+str_ilike(text, "hello*")              # SQL ILIKE-style, case-insensitive glob
+# Replaces: str_like(text, "hello*", ignore_case = TRUE)
+# str_like() ignore_case argument is deprecated; use str_ilike() instead
+```
+
+## Pattern helpers
+
+Use these for clarity about what kind of matching you intend:
+
+```r
+str_detect(text, fixed("$"))           # literal match (no regex)
+str_detect(text, regex("\\d+"))        # explicit regex (default)
+str_detect(text, regex("hello", ignore_case = TRUE))  # case-insensitive
+str_detect(text, coll("e", locale = "fr"))  # locale-aware collation
+str_detect(text, boundary("word"))     # word boundaries
+```
+
+## stringr vs base R
+
+| stringr | base R | Notes |
+|---------|--------|-------|
+| `str_detect(text, "pat")` | `grepl("pat", text)` | Argument order differs |
+| `str_extract(text, "pat")` | `regmatches(text, regexpr(...))` | Much simpler |
+| `str_replace_all(text, "a", "b")` | `gsub("a", "b", text)` | Argument order differs |
+| `str_split(text, ",")` | `strsplit(text, ",")` | |
+| `str_length(text)` | `nchar(text)` | |
+| `str_sub(text, 1, 5)` | `substr(text, 1, 5)` | |
+| `str_to_lower(text)` | `tolower(text)` | |
+| `str_to_upper(text)` | `toupper(text)` | |
+| `str_to_title(text)` | `tools::toTitleCase(text)` | |
+| `str_to_snake(text)` | — | stringr >=1.6.0 |
+| `str_to_camel(text)` | — | stringr >=1.6.0 |
+| `str_to_kebab(text)` | — | stringr >=1.6.0 |
+| `str_ilike(text, "pat*")` | — | case-insensitive glob, stringr >=1.6.0 |
+| `str_trim(text)` | `trimws(text)` | |
+| `str_glue("Hello {x}")` | `sprintf("Hello %s", x)` | More readable |
diff --git a/tidyverse/tidy-r/references/tidyselect.md b/tidyverse/tidy-r/references/tidyselect.md
new file mode 100644
index 0000000..64fbff6
--- /dev/null
+++ b/tidyverse/tidy-r/references/tidyselect.md
@@ -0,0 +1,117 @@
+# Tidy Selection
+
+Tidy selection is the column selection language used by `select()`, `relocate()`, `rename()`, `across()`, `pick()`, `pivot_longer()`, `pivot_wider()`, and other tidyverse functions that accept column specifications.
+
+## Selection helpers
+
+```r
+starts_with("x")          # columns starting with "x"
+ends_with("_id")           # columns ending with "_id"
+contains("score")          # columns containing "score"
+matches("^x\\d+$")        # columns matching a regex
+num_range("x", 1:5)       # x1, x2, x3, x4, x5
+last_col()                 # rightmost column
+everything()               # all columns
+where(is.numeric)          # columns satisfying a predicate
+```
+
+## Selecting by name
+
+```r
+data |> select(name, age)                      # by name
+data |> select(name:age)                       # range
+data |> select(!age)                           # exclude
+data |> select(where(is.numeric) & !id)        # boolean combination
+```
+
+## Boolean algebra on selections
+
+Selections support `!` (complement), `&` (intersection), and `|` (union):
+
+```r
+data |> select(where(is.numeric) & !c(id, year))
+data |> select(starts_with("x") | ends_with("_total"))
+data |> select(!where(is.character))
+```
+
+## Character vectors: all_of() and any_of()
+
+Use `all_of()` for strict matching (errors if a name is missing) and `any_of()` for permissive matching (silently ignores missing names):
+
+```r
+vars <- c("mpg", "wt", "hp")
+
+data |> select(all_of(vars))       # errors if any name absent
+data |> select(any_of(vars))       # ignores missing names
+```
+
+### The across(all_of()) bridge pattern
+
+This is the canonical way to pass character vectors into data-masked contexts that use tidy selection:
+
+```r
+vars <- c("revenue", "cost")
+
+data |>
+  summarise(across(all_of(vars), mean))
+
+data |>
+  mutate(across(all_of(vars), \(x) x / 1000))
+```
+
+## .data and .env pronouns
+
+### .data in data-masked contexts
+
+Use `.data[[var]]` when the column name is a string variable inside data-masked functions (`filter`, `mutate`, `summarise`):
+
+```r
+var <- "mpg"
+mtcars |> filter(.data[[var]] > 20)
+```
+
+### .data is deprecated in tidy-select contexts
+
+Do NOT use `.data$col` or `.data[[var]]` inside tidy-select functions (`select`, `across`, `pick`). Use string names or `all_of()`/`any_of()` instead:
+
+```r
+var <- "mpg"
+
+# Good
+data |> select(all_of(var))
+data |> select(any_of(var))
+
+# Avoid (deprecated)
+data |> select(.data[[var]])
+```
+
+### .env for environment variables
+
+Use `.env$var` to access variables from the calling environment when they might collide with column names:
+
+```r
+threshold <- 10
+
+# Good - unambiguous
+data |> filter(value > .env$threshold)
+
+# Risky - if data has a "threshold" column, it shadows the local variable
+data |> filter(value > threshold)
+```
+
+`.env` is most useful inside functions where you cannot control what columns the data has:
+
+```r
+filter_above <- function(data, col, cutoff) {
+  data |> filter({{ col }} > .env$cutoff)
+}
+```
+
+## Tidy selection vs data masking
+
+| Context | Used by | Column selection | Character vector bridge |
+|---------|---------|-----------------|----------------------|
+| **Tidy selection** | `select`, `across`, `pick`, `relocate`, `pivot_*` | helpers like `where()`, `starts_with()` | `all_of(vars)` |
+| **Data masking** | `filter`, `mutate`, `summarise`, `arrange` | `.data[[var]]` | `across(all_of(vars))` |
+
+The two contexts have different rules. Tidy selection uses helper functions; data masking evaluates R expressions in the data frame environment. `{{ }}` (embrace) works in both contexts for forwarding a single function argument.
diff --git a/tidyverse/tidy-r/references/tidyverse-style.md b/tidyverse/tidy-r/references/tidyverse-style.md
new file mode 100644
index 0000000..95c098f
--- /dev/null
+++ b/tidyverse/tidy-r/references/tidyverse-style.md
@@ -0,0 +1,216 @@
+# Tidyverse Style Guide Summary
+
+Based on https://style.tidyverse.org/
+
+## Object Names
+
+- Use **snake_case**: lowercase letters, numbers, underscores only
+- Variables = **nouns**, functions = **verbs**
+- Avoid reusing common function/variable names
+- Prefix non-standard function arguments with `.` (e.g., `.data`, `.by`)
+- Avoid dots in names except for S3 methods
+
+```r
+# Good
+day_one
+calculate_mean
+user_data
+
+# Bad
+DayOne
+calculateMean
+day.one
+```
+
+## Spacing
+
+**Commas**: space after, never before
+
+```r
+# Good
+x[, 1]
+mean(x, na.rm = TRUE)
+
+# Bad
+x[,1]
+mean(x ,na.rm = TRUE)
+```
+
+**Infix operators**: surround with spaces (`==`, `+`, `-`, `<-`, etc.)
+
+```r
+# Good
+x == y
+z <- 2 + 2
+
+# Bad
+x==y
+z<-2+2
+```
+
+**No spaces** for high-precedence operators: `::`, `$`, `@`, `[`, `[[`, `^`, `:`
+
+```r
+# Good
+sqrt(x^2 + y^2)
+x <- 1:10
+pkg::fun()
+```
+
+## Assignment
+
+Use `<-`, not `=`
+
+```r
+# Good
+x <- 5
+
+# Bad
+x = 5
+```
+
+## Quotes
+
+Use double quotes `"`; single `'` only when text contains double quotes
+
+```r
+# Good
+"Text here"
+'They said "hello"'
+```
+
+## Line Length
+
+Limit to **80 characters**. For long function calls, put each argument on its own line:
+
+```r
+# Good
+do_something(
+  arg1 = "value",
+  arg2 = "value",
+  arg3 = "value"
+)
+```
+
+## Braces
+
+- `{` ends a line
+- Contents indented by **2 spaces**
+- `}` starts a line
+- `else` on same line as `}`
+
+```r
+if (condition) {
+  do_this()
+} else {
+  do_that()
+}
+```
+
+## Functions
+
+**Anonymous functions**: use `\(x)` for short lambdas
+
+```r
+# Good
+map(x, \(x) x + 1)
+
+# Bad
+map(x, function(x) x + 1)
+```
+
+**Return**: use `return()` only for early returns; rely on implicit return otherwise
+
+```r
+# Good
+add_one <- function(x) {
+  x + 1
+}
+
+# Early return
+check_input <- function(x) {
+  if (is.null(x)) {
+    return(NULL)
+  }
+  process(x)
+}
+```
+
+**Multi-line definitions**: single-indent style preferred
+
+```r
+long_function_name <- function(
+  a = "argument",
+  b = "argument"
+) {
+  # body
+}
+```
+
+## Pipes
+
+- Use `|>` (not `%>%`)
+- Space before pipe, newline after
+- Indent continuation by 2 spaces
+
+```r
+# Good
+data |>
+  filter(x > 0) |>
+  mutate(y = x * 2) |>
+  summarise(mean(y))
+
+# Bad
+data |> filter(x > 0) |> mutate(y = x * 2)
+```
+
+**Avoid pipes when**:
+- Manipulating multiple objects
+- Meaningful intermediate objects deserve names
+
+## Comments
+
+- Start with `# ` (hash + space)
+- Explain **why**, not what
+- Use sentence case
+
+```r
+# Skip NA values because downstream analysis requires complete cases
+data <- data |> filter(!is.na(value))
+```
+
+## Control Flow
+
+- Use `&&` and `||` in conditions (not `&` and `|`)
+- Use `TRUE`/`FALSE` (not `T`/`F`)
+- Never use semicolons
+- With `tidyna` loaded, `na.rm = TRUE` is the default for common aggregation functions -- write `mean(x)` instead of `mean(x, na.rm = TRUE)`
+
+## Error Messages
+
+Use `cli::cli_abort()` for errors. See https://style.tidyverse.org/errors.html
+
+**Problem statement**:
+- Start with concise problem in sentence case, ending with `.`
+- Use **"must"** when cause is clear: `` `n` must be a numeric vector, not a character vector.``
+- Use **"can't"** when you cannot state what was expected: ``Can't find column `b` in `.data`.``
+
+**Bullets**:
+- `x` (cross) for problems
+- `i` (info) for context
+- `!` (warning) for warnings
+
+**Formatting**:
+- Surround argument names in backticks: `` `x` ``
+- Use "column" to disambiguate (avoid "variable")
+- Keep under 80 characters; let cli wrap
+- List up to 5 issues, truncate with `...`
+
+**Hints**: place last with `i` bullet, end with `?`
+
+```r
+cli::cli_abort(c(
+  "{.arg x} must be a numeric vector, not {.obj_type_friendly {x}}.",
+  "i" = "Did you mean to use {.fn as.numeric}?"
+))
+```