Merged
3 changes: 3 additions & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
@@ -5,3 +5,6 @@
^docs$
^pkgdown$
^\.github$
^README\.Rmd$
^doc$
^Meta$
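As a reading aid for the `.Rbuildignore` entries above: each line is a Perl-style regular expression that `R CMD build` matches case-insensitively against paths relative to the package root. The sketch below checks a handful of patterns from this hunk against some hypothetical paths (the `paths` vector is made up for illustration and is not taken from the repository). It only approximates the matching with `grepl()`; it does not run the build itself.

```r
# Patterns visible in the .Rbuildignore hunk above.
patterns <- c("^docs$", "^pkgdown$", "^\\.github$",
              "^README\\.Rmd$", "^doc$", "^Meta$")

# Hypothetical top-level paths, for illustration only.
paths <- c("README.Rmd", "README.md", "doc", "docs", "Meta", "R")

# A path is treated as ignored if at least one pattern matches it.
is_ignored <- vapply(
  paths,
  function(p) {
    any(vapply(patterns, grepl, logical(1),
               x = p, perl = TRUE, ignore.case = TRUE))
  },
  logical(1)
)

is_ignored
```

Because every pattern is anchored with `^` and `$`, `README.Rmd` is excluded while the generated `README.md` is still shipped, and `doc` and `docs` each need their own entry.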
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,3 +5,6 @@
.DS_Store
.quarto
docs
inst/doc
/doc/
/Meta/
4 changes: 4 additions & 0 deletions DESCRIPTION
@@ -35,3 +35,7 @@ Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.3
URL: https://github.com/ccsarapas/lighthouse.codebook, https://ccsarapas.github.io/lighthouse.codebook/
BugReports: https://github.com/ccsarapas/lighthouse.codebook/issues
Suggests:
    knitr,
    rmarkdown
VignetteBuilder: knitr
10 changes: 6 additions & 4 deletions R/cb_create.r
@@ -99,10 +99,12 @@
#' If labels set in `.user_missing` conflict with those in `metadata`, `.user_missing_conflict`
#' controls which labels are used.
#'
#' User missing values are not compatible with logical, date, or datetime (POSIXt)
#' variables. By default, these variables will be ignored if specified in `.user_missing`.
#' (i.e., user missing values will be applied only to compatible variables.) This behavior
#' can be changed using the `.user_missing_incompatible` argument.
#' User missings may be set for numeric, character, factor/ordered factor, and haven_labelled/haven_labelled_spss
#' vectors. For factors, user missings are set based on factor labels (not the underlying
#' integer codes). For `"haven_labelled"` vectors, user missings are set based on
#' values (not value labels). By default, variables with incompatible classes (e.g.,
#' logical, Date, POSIXt) will be ignored if specified in `.user_missing`. This
#' behavior can be changed using the `.user_missing_incompatible` argument.
#'
#' @examples
#' diamonds2 <- ggplot2::diamonds |>
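The revised documentation in this hunk distinguishes factor labels from integer codes, and `haven_labelled` values from value labels. The sketch below shows what each term refers to, using only base R and `haven` (assumed to be installed, and skipped otherwise). The variable names and data are made up for illustration, and the example does not call `cb_create()` or show how `.user_missing` is specified.

```r
# Factor: per the docs above, user missings are matched against the
# labels (levels), not the underlying integer codes.
smoker <- factor(c("Yes", "No", "Refused"),
                 levels = c("No", "Yes", "Refused"))
factor_labels <- as.character(smoker)  # "Yes" "No" "Refused"
factor_codes <- as.integer(smoker)     # 2 1 3

# haven_labelled: per the docs above, user missings are matched against
# the stored values, not the value labels attached to them.
if (requireNamespace("haven", quietly = TRUE)) {
  income <- haven::labelled(
    c(1, 2, -99),
    labels = c(Low = 1, High = 2, Refused = -99)
  )
  labelled_values <- haven::zap_labels(income)  # plain values: 1, 2, -99
  value_labels <- names(attr(income, "labels")) # "Low" "High" "Refused"
}
```

Under that reading, a "Refused" response would be identified by the label `"Refused"` in the factor and by the value `-99` in the labelled vector.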
130 changes: 130 additions & 0 deletions README.Rmd
@@ -0,0 +1,130 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#",
  out.width = "100%",
  fig.align = "center",
  fig.path = "man/figures/",
  eval = FALSE
)
```

# lighthouse.codebook

The lighthouse.codebook package includes tools to summarize a dataset into a formatted
Excel workbook, including a data dictionary and summaries. It incorporates external
metadata (such as variable labels, value labels, and user missing / non-response codes),
with functions for using metadata from SPSS and REDCap datasets. Codebooks can be
customized in a number of ways, including options for grouped summaries.

## Installation

You can install lighthouse.codebook by running:

```r
# install.packages("remotes")
remotes::install_github("ccsarapas/lighthouse.codebook")
```

## Creating codebooks
Creating a codebook involves two general steps:

1. Create a “codebook” object in R from a data frame (and,
optionally, metadata), using `cb_create()` or a specialized variant
(such as `cb_create_spss()` or `cb_create_redcap()`).

2. Write the codebook to disk using `cb_write()`.

``` r
library(lighthouse.codebook)

# create and write a codebook without metadata
dat |>
  cb_create() |>
  cb_write("cb.xlsx")

# with metadata
dat |>
  cb_create(metadata = dat1_metadata) |>
  cb_write("cb.xlsx")

# from SPSS data
dat_spss <- haven::read_sav("dat_spss.sav", user_na = TRUE)

dat_spss |>
  cb_create_spss() |>
  cb_write("cb_spss.xlsx")

# from REDCap data
dat_rc <- REDCapR::redcap_read(redcap_uri = rc_uri, token = rc_token)
meta_rc <- REDCapR::redcap_metadata_read(redcap_uri = rc_uri, token = rc_token)

dat_rc$data |>
  cb_create_redcap(metadata = meta_rc$data) |>
  cb_write("cb_rc.xlsx")
```

## Customizing codebooks

There are many options for controlling how data is interpreted, summarized, and
presented. See [**Introduction to lighthouse.codebook**](lighthouse-codebook.html) for
some of the most useful options, including grouped data summaries and specifying
user missing codes. Further options are detailed in the documentation for `cb_create()`
and `cb_write()`.
<!-- - The "Creating Codebooks" vignette covers options for controlling how data and
metadata are _interpreted,_ such as by applying value labels, specifying user missing
or nonresponse codes, and taking advantage of specialized metadata (e.g., from SPSS
or REDCap data).
- The "Writing Codebooks" vignette covers how data is _summarized and presented_
in the codebook written to disk, including options for grouped summaries and missing
data. -->

## Codebook contents

The codebook written to disk will include an _overview_ tab listing all variables
in the dataset; _summary_ tabs for numeric, categorical, and text variables; and,
if grouping variables are specified, _grouped summary_ tabs for numeric and categorical
variables.

The _overview_ tab includes one row for each variable in the dataset, with information
on variable types, labels, values, and missingness. By default, each variable is
hyperlinked to its location on the relevant summary tab.

```{r, overview, echo = FALSE, eval = TRUE}
knitr::include_graphics("man/figures/README-overview.png")
```

The _numeric summary_ tab includes descriptive statistics for all numeric variables
in the dataset:

```{r, numeric, echo = FALSE, eval = TRUE}
knitr::include_graphics("man/figures/README-numeric.png")
```

The _categorical summary_ tab includes frequencies for all categorical variables,
optionally with separate rows for user missing values:

```{r, categorical, echo = FALSE, eval = TRUE}
knitr::include_graphics("man/figures/README-categorical.png")
```

Finally, the _text summary_ tab includes frequencies for the most common values for all
text variables in the dataset. (The number of values shown can be adjusted using
the `n_text_vals` argument to `cb_write()`.)

```{r, text, echo = FALSE, eval = TRUE}
knitr::include_graphics("man/figures/README-text.png")
```

If `group_by` is specified in `cb_write()`, additional numeric and categorical summary
tabs will be included grouped by the specified variables.

## SPSS extension

Functionality from this package is also available as an SPSS extension command [here](https://github.com/ccsarapas/lighthouse.codebook.spss).
104 changes: 101 additions & 3 deletions README.md
@@ -1,16 +1,114 @@

<!-- README.md is generated from README.Rmd. Please edit that file -->

# lighthouse.codebook

The lighthouse.codebook package includes tools for summarizing datasets used by staff at the [Lighthouse Institute](https://www.chestnut.org/lighthouse-institute/), the research division of Chestnut Health Systems.
The lighthouse.codebook package includes tools to summarize a dataset
into a formatted Excel workbook, including a data dictionary and
summaries. It incorporates external metadata (such as variable labels,
value labels, and user missing / non-response codes), with functions for
using metadata from SPSS and REDCap datasets. Codebooks can be
customized in a number of ways, including options for grouped summaries.

## Installation

Install lighthouse.codebook by running:
You can install lighthouse.codebook by running:

``` r
# install.packages("remotes")
remotes::install_github("ccsarapas/lighthouse.codebook")
```

## Creating codebooks

Creating a codebook involves two general steps:

1. Create a “codebook” object in R from a data frame (and,
optionally, metadata), using `cb_create()` or a specialized variant
(such as `cb_create_spss()` or `cb_create_redcap()`).

2. Write the codebook to disk using `cb_write()`.

``` r
library(lighthouse.codebook)

# create and write a codebook without metadata
dat |>
  cb_create() |>
  cb_write("cb.xlsx")

# with metadata
dat |>
  cb_create(metadata = dat1_metadata) |>
  cb_write("cb.xlsx")

# from SPSS data
dat_spss <- haven::read_sav("dat_spss.sav", user_na = TRUE)

dat_spss |>
  cb_create_spss() |>
  cb_write("cb_spss.xlsx")

# from REDCap data
dat_rc <- REDCapR::redcap_read(redcap_uri = rc_uri, token = rc_token)
meta_rc <- REDCapR::redcap_metadata_read(redcap_uri = rc_uri, token = rc_token)

dat_rc$data |>
  cb_create_redcap(metadata = meta_rc$data) |>
  cb_write("cb_rc.xlsx")
```

## Customizing codebooks

There are many options for controlling how data is interpreted,
summarized, and presented. See [**Introduction to
lighthouse.codebook**](lighthouse-codebook.html) for some of the most
useful options, including grouped data summaries and specifying user
missing codes. Further options are detailed in the documentation for
`cb_create()` and `cb_write()`.
<!-- - The "Creating Codebooks" vignette covers options for controlling how data and
metadata are _interpreted,_ such as by applying value labels, specifying user missing
or nonresponse codes, and taking advantage of specialized metadata (e.g., from SPSS
or REDCap data).
- The "Writing Codebooks" vignette covers how data is _summarized and presented_
in the codebook written to disk, including options for grouped summaries and missing
data. -->

## Codebook contents

The codebook written to disk will include an *overview* tab listing all
variables in the dataset; *summary* tabs for numeric, categorical, and
text variables; and, if grouping variables are specified, *grouped
summary* tabs for numeric and categorical variables.

The *overview* tab includes one row for each variable in the dataset,
with information on variable types, labels, values, and missingness. By
default, each variable is hyperlinked to its location on the relevant
summary tab.

<img src="man/figures/README-overview.png" width="100%" style="display: block; margin: auto;" />

The *numeric summary* tab includes descriptive statistics for all
numeric variables in the dataset:

<img src="man/figures/README-numeric.png" width="100%" style="display: block; margin: auto;" />

The *categorical summary* tab includes frequencies for all categorical
variables, optionally with separate rows for user missing values:

<img src="man/figures/README-categorical.png" width="100%" style="display: block; margin: auto;" />

Finally, the *text summary* tab includes frequencies for the most common
values for all text variables in the dataset. (The number of values
shown can be adjusted using the `n_text_vals` argument to `cb_write()`.)

<img src="man/figures/README-text.png" width="100%" style="display: block; margin: auto;" />

If `group_by` is specified in `cb_write()`, additional numeric and
categorical summary tabs will be included grouped by the specified
variables.

## SPSS extension

Functionality from this package is also available as an SPSS extension command [here](https://github.com/ccsarapas/lighthouse.codebook.spss).
Functionality from this package is also available as an SPSS extension
command [here](https://github.com/ccsarapas/lighthouse.codebook.spss).
2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -10,4 +10,4 @@ authors:
Casey Sarapas:
href: "https://chestnut.org/li/scientists-and-project-directors/category/research-scientists/profile/casey-sarapas-phd"
Chestnut Health Systems:
href: "https://chestnut.org/"
href: "https://chestnut.org/"
Binary file added man/figures/README-categorical.png
Binary file added man/figures/README-numeric.png
Binary file added man/figures/README-overview.png
Binary file added man/figures/README-text.png
2 changes: 2 additions & 0 deletions vignettes/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R