From 9fe5927a655ccb57213617b497f636e8a7e1ff77 Mon Sep 17 00:00:00 2001 From: Colin Leach Date: Wed, 18 Mar 2026 15:54:24 -0700 Subject: [PATCH 1/4] {add concept] Functions --- concepts/functions/.meta/config.json | 7 + concepts/functions/about.md | 216 +++++++++++++++++++++++++++ concepts/functions/introduction.md | 1 + concepts/functions/links.json | 6 + config.json | 5 + 5 files changed, 235 insertions(+) create mode 100644 concepts/functions/.meta/config.json create mode 100644 concepts/functions/about.md create mode 100644 concepts/functions/introduction.md create mode 100644 concepts/functions/links.json diff --git a/concepts/functions/.meta/config.json b/concepts/functions/.meta/config.json new file mode 100644 index 00000000..97e3602d --- /dev/null +++ b/concepts/functions/.meta/config.json @@ -0,0 +1,7 @@ +{ + "authors": [ + "colinleach" + ], + "contributors": [], + "blurb": "Functions in R are first-class objects, which can be passed as arguments and included in return values." +} \ No newline at end of file diff --git a/concepts/functions/about.md b/concepts/functions/about.md new file mode 100644 index 00000000..2ad1c384 --- /dev/null +++ b/concepts/functions/about.md @@ -0,0 +1,216 @@ +# About + +Functions were introduced back in the [Basics][concept-basics] Concept, with examples such as this: + +```R +squareit <- function(x) { + x * x +} + +squareit(3) +#> [1] 9 + +# shorter form +squareit_short <- function(x) x ^ 2 +``` + +Looking more closely at the definition of `squareit`, we can identify various parts: + +- The function takes a formal argument `x`. +- There is a function body, usually in braces `{ }`. +- Once the function object is created, it is assigned to a variable `squareit`. + +A function is a first-class object in R, much like numbers and strings. +Thus, `squareit <- function...` is an assignment which is syntactically just like `x <- 42`. 
+ +~~~~exercism/advanced +Accessing the components of a function is rare in normal use, but quite easy. + +```R +class(squareit) +#> [1] "function" +formals(squareit) +#> $x + +body(squareit) +#> { +#> x * x +#> } +``` + +The arguments obtained with [`formals()`][ref-formals] look like a [list][concept-lists], with the `$x` syntax. +In fact, the type is [`pairlist`][ref-pairlist], a particular type of list containing key-value pairs. + +The `body` is executable code, with the type `language` (*not something you probably need to care about*). + +[ref-formals]: https://www.rdocumentation.org/packages/base/versions/3.3.0/topics/formals +[ref-pairlist]: https://www.rdocumentation.org/packages/base/versions/3.3.0/topics/list +[concept-lists]: https://exercism.org/tracks/r/concepts/lists +~~~~ + +## Arguments + +R makes no clear distinction between positional arguments and keyword arguments, in contrast to other scripting languages such as Python and Julia. + +Function calls can pass values either positionally or by name. +The latter is useful for complex functions with many arguments, where it is hard to remember their order. + +```R +f <- function(x, y) x / y + +# call positionally +f(4, 2) +#> [1] 2 + +# call by name +f(y = 2, x = 4) +#> [1] 2 +``` + +### Optional arguments + +Default argument values can be specified in the function definition, but must come after all the arguments without defaults. + +We can then choose whether to accept the default or override it. + +```R +g <- function(x, y = 2) x / y + +# default y value +g(6) +#> [1] 3 + +# explicit y value +g(6, 3) +#> [1] 2 +``` + +### Extra arguments + +To accept an arbitrary number of additional arguments, use a `...` (ellipsis) in the definition. +It is possible to convert any extra values in the function call to a list, but please read on for an alternative way to use these "dot args" (*called "varargs" in several other languages*). + +```R +var_f <- function(x, y, ...) 
{ + print(list(...)) +} + +var_f(2, 3, "opt1", "opt2") +[[1]] +#> [1] "opt1" + +[[2]] +#> [1] "opt2" +``` + +## Function Environment + +Previously, we said that the formal arguments and the body are both components of a function. + +In fact, there is a third component: the *environment* in which the function is defined. + +This can be illustrated with the case of nested functions: + +```R +outer_func <- function(x) { + inner_func <- function(y) { + x * y + } + + inner_func(3) +} + +outer_func(5) +#> [1] 15 +``` + +The function call passes `x = 5` to the outer function, and this value is available within that function body. + +The inner function is *part* of the outer function body, and has access to the value of `x`. +Worded differently, `x = 5` is in the *environment* of the inner function. + +Technically, this is known as a [closure][wiki-closure]. + +The environment is particularly important with dot args, as any values supplied this way can be passed through to function calls in the function body. +The outer function need not know or care what the dot args mean. + +```R +f_var <- function(x, ...) { + sum(x, ...) +} + +x <- c(1, 2, NA, 6) + +# for sum(), na.rm defaults to FALSE +f_var(x) +#> [1] NA + +# pass through the na.rm value +f_var(x, na.rm = TRUE) +#> [1] 9 +``` + +This technique is used extensively by Tidyverse libraries such as `stringr`. +Many of the `stringr` functions are a user-friendly wrapper around low-level functions from `stringi` and base R. +Extra arguments supplied to the `str_*()` functions are simply passed through to those low-level functions. + +## Anonymous Functions + +When we define a function, we usually bind the resulting function object to a variable: + +```R +squareit_short <- function(x) x ^ 2 +``` + +This makes it easy to use the function later in the script, but such binding is not necessary. +A short, use-once function can be useful in the immediate context. +Without name-binding, it is called an *anonymous function*. 
+ +Use of anonymous functions is so common that (*since R v4.1.0*) there is a shorthand syntax to define them: replace the word `function` with a backslash `\`. + +This section will make more sense once we reach the [Functional Programming][concept-funcprog] Concept. +Below is a preview, using [`sapply()`][ref-sapply] to square each number in a range: + +```R +sapply(1:5, \(x) x ^ 2) +#> [1] 1 4 9 16 25 +``` + +That is not a very useful example, because `(1:5) ^ 2` returns the same result more simply, but it illustrates how we define a function without bothering to think of a name for it. + +## Copy on Modify + +R allows assignment to individual elements of a vector. +If we pass in a vector as a function argument, and modify it in the function body before returning it, we get a modified vector. + +But what happened to the original vector? + +```R +f <- function(vec) { + vec[1] <- 42 + vec +} + +vals <- c(1, 3, 4) + +# f() returns a modified vector +f(vals) +#> [1] 42 3 4 + +# the original vector is unchanged. +vals +#> [1] 1 3 4 +``` + +R is a language designed for data science. +Collecting that data can cost a lot in time, effort, and potentially an eye-watering amount of money: *it is important not to corrupt it!* + +The general policy (with a few exceptions) is *copy on modify*. +If an object (such as a vector) is changed in a way that could cause later problems, R returns a *modified copy* and leaves the original untouched. + +Copying large data structures can be computationally expensive, but this is generally the lesser evil when the alternative is data corruption. 
+ +[concept-basics]: https://exercism.org/tracks/r/concepts/basics +[concept-funcprog]: https://exercism.org/tracks/r/concepts/functional-programming +[wiki-closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming) +[ref-sapply]: https://www.rdocumentation.org/packages/base/versions/3.3.0/topics/lapply diff --git a/concepts/functions/introduction.md b/concepts/functions/introduction.md new file mode 100644 index 00000000..e10b99d0 --- /dev/null +++ b/concepts/functions/introduction.md @@ -0,0 +1 @@ +# Introduction diff --git a/concepts/functions/links.json b/concepts/functions/links.json new file mode 100644 index 00000000..51f2f65e --- /dev/null +++ b/concepts/functions/links.json @@ -0,0 +1,6 @@ +[ + { + "url": "http://adv-r.had.co.nz/Functions.html", + "description": "Functions chapter in Advanced R." + } + ] \ No newline at end of file diff --git a/config.json b/config.json index 8a609a5a..6af4ef00 100644 --- a/config.json +++ b/config.json @@ -1119,6 +1119,11 @@ "slug": "lists", "name": "Lists" }, + { + "uuid": "b966f36b-9359-441e-9646-d8590c74d60d", + "slug": "functions", + "name": "Functions" + }, { "uuid": "85db3d8c-dfec-424c-9682-caa8611db8f8", "slug": "set-operations", From c6efbfd9b709c69046e0cbbbea925dc0fda3d8a8 Mon Sep 17 00:00:00 2001 From: Colin Leach Date: Wed, 18 Mar 2026 17:03:39 -0700 Subject: [PATCH 2/4] caution and blank lines --- concepts/functions/.meta/config.json | 2 +- concepts/functions/about.md | 16 +++++++++++++++- concepts/functions/links.json | 3 ++- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/concepts/functions/.meta/config.json b/concepts/functions/.meta/config.json index 97e3602d..65260cf8 100644 --- a/concepts/functions/.meta/config.json +++ b/concepts/functions/.meta/config.json @@ -4,4 +4,4 @@ ], "contributors": [], "blurb": "Functions in R are first-class objects, which can be passed as arguments and included in return values." 
-} \ No newline at end of file +} diff --git a/concepts/functions/about.md b/concepts/functions/about.md index 2ad1c384..80c2a90d 100644 --- a/concepts/functions/about.md +++ b/concepts/functions/about.md @@ -132,7 +132,9 @@ Worded differently, `x = 5` is in the *environment* of the inner function. Technically, this is known as a [closure][wiki-closure]. The environment is particularly important with dot args, as any values supplied this way can be passed through to function calls in the function body. -The outer function need not know or care what the dot args mean. +The outer function need not know or care what the dot a - recursion + - pipes (or else move that to functional-programming?) +rgs mean. ```R f_var <- function(x, ...) { @@ -153,6 +155,8 @@ f_var(x, na.rm = TRUE) This technique is used extensively by Tidyverse libraries such as `stringr`. Many of the `stringr` functions are a user-friendly wrapper around low-level functions from `stringi` and base R. Extra arguments supplied to the `str_*()` functions are simply passed through to those low-level functions. + - recursion + - pipes (or else move that to functional-programming?) ## Anonymous Functions @@ -210,6 +214,16 @@ If an object (such as a vector) is changed in a way that could cause later probl Copying large data structures can be computationally expensive, but this is generally the lesser evil when the alternative is data corruption. +~~~~exercism/caution +Beware of operations that lead to repeated copying of the same data. + +Paraphrasing (Tidyverse author) Hadley Wickham: + +> Loops are not inherently slow, but they make it dangerously easy to include slow operations within the loop. + +Vectorization or higher-order functions can help to protect you from this type of performance-killer. 
+~~~~ + [concept-basics]: https://exercism.org/tracks/r/concepts/basics [concept-funcprog]: https://exercism.org/tracks/r/concepts/functional-programming [wiki-closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming) diff --git a/concepts/functions/links.json b/concepts/functions/links.json index 51f2f65e..7f5fe4ee 100644 --- a/concepts/functions/links.json +++ b/concepts/functions/links.json @@ -3,4 +3,5 @@ "url": "http://adv-r.had.co.nz/Functions.html", "description": "Functions chapter in Advanced R." } - ] \ No newline at end of file + ] + \ No newline at end of file From efd73da3b588b53eb1f010a5e7a1c9cee54dcaac Mon Sep 17 00:00:00 2001 From: Colin Leach Date: Thu, 19 Mar 2026 08:02:30 -0700 Subject: [PATCH 3/4] cleanup --- concepts/functions/about.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/concepts/functions/about.md b/concepts/functions/about.md index 80c2a90d..b94ca168 100644 --- a/concepts/functions/about.md +++ b/concepts/functions/about.md @@ -132,9 +132,7 @@ Worded differently, `x = 5` is in the *environment* of the inner function. Technically, this is known as a [closure][wiki-closure]. The environment is particularly important with dot args, as any values supplied this way can be passed through to function calls in the function body. -The outer function need not know or care what the dot a - recursion - - pipes (or else move that to functional-programming?) -rgs mean. +The outer function need not know or care what the dot args mean. ```R f_var <- function(x, ...) { @@ -155,8 +153,6 @@ f_var(x, na.rm = TRUE) This technique is used extensively by Tidyverse libraries such as `stringr`. Many of the `stringr` functions are a user-friendly wrapper around low-level functions from `stringi` and base R. Extra arguments supplied to the `str_*()` functions are simply passed through to those low-level functions. - - recursion - - pipes (or else move that to functional-programming?) 
## Anonymous Functions From b48dee7235f03c5554c84954079652fbfe19778a Mon Sep 17 00:00:00 2001 From: Colin Leach Date: Tue, 28 Apr 2026 12:06:15 -0700 Subject: [PATCH 4/4] more formatting and links --- concepts/functions/about.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/concepts/functions/about.md b/concepts/functions/about.md index b94ca168..87f4b717 100644 --- a/concepts/functions/about.md +++ b/concepts/functions/about.md @@ -96,10 +96,10 @@ var_f <- function(x, y, ...) { } var_f(2, 3, "opt1", "opt2") -[[1]] +#> [[1]] #> [1] "opt1" -[[2]] +#> [[2]] #> [1] "opt2" ``` @@ -150,9 +150,9 @@ f_var(x, na.rm = TRUE) #> [1] 9 ``` -This technique is used extensively by Tidyverse libraries such as `stringr`. -Many of the `stringr` functions are a user-friendly wrapper around low-level functions from `stringi` and base R. -Extra arguments supplied to the `str_*()` functions are simply passed through to those low-level functions. +This technique is used extensively by Tidyverse libraries such as [`stringr`][ref-stringr]. +Many of the `stringr` functions are a user-friendly wrapper around low-level functions from [`stringi`][ref-stringi] and base R. +Extra arguments supplied to the `str_*()` functions are simply passed through to those lower-level functions. ## Anonymous Functions @@ -224,3 +224,5 @@ Vectorization or higher-order functions can help to protect you from this type o [concept-funcprog]: https://exercism.org/tracks/r/concepts/functional-programming [wiki-closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming) [ref-sapply]: https://www.rdocumentation.org/packages/base/versions/3.3.0/topics/lapply +[ref-stringr]: https://stringr.tidyverse.org/index.html +[ref-stringi]: https://stringi.gagolewski.com/