-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME.Rmd
More file actions
122 lines (92 loc) · 5.49 KB
/
README.Rmd
File metadata and controls
122 lines (92 loc) · 5.49 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
---
output:
github_document:
html_preview: false
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
requireNamespace("hydroGOF", quietly = TRUE)
requireNamespace("bench", quietly = TRUE)
```
# tidyhydro
<!-- badges: start -->
<p align="center">
<a href="https://github.com/atsyplenkov/tidyhydro/releases">
<img src="https://img.shields.io/github/v/release/atsyplenkov/tidyhydro?style=flat&labelColor=1C2C2E&color=198ce7&logo=GitHub&logoColor=white"></a>
<!-- <a href="https://cran.r-project.org/package=tidyhydro">
<img src="https://img.shields.io/cran/v/tidyhydro?style=flat&labelColor=1C2C2E&color=198ce7&logo=R&logoColor=white"></a> -->
<a href="https://app.codecov.io/gh/atsyplenkov/tidyhydro">
<img src="https://img.shields.io/codecov/c/gh/atsyplenkov/tidyhydro?style=flat&labelColor=1C2C2E&color=256bc0&logo=Codecov&logoColor=white"></a>
<a href="https://github.com/atsyplenkov/tidyhydro/actions/workflows/check-r-pkg.yaml">
<img src="https://img.shields.io/github/actions/workflow/status/atsyplenkov/tidyhydro/check-r-pkg.yaml?style=flat&labelColor=1C2C2E&color=256bc0&logo=GitHub%20Actions&logoColor=white"></a>
</p>
<!-- badges: end -->
The `tidyhydro` package provides a set of commonly used metrics in hydrology (such as _NSE_, _KGE_, _pBIAS_) for use within a [`tidymodels`](https://www.tidymodels.org/) infrastructure. Originally inspired by the [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and [`hydroGOF`](https://github.com/hzambran/hydroGOF) packages, this library is mainly written in C++ and provides a very quick estimation of desired goodness-of-fit criteria.
Additionally, you'll find here a C++ implementation of lesser-known yet powerful metrics and descriptive statistics recommended in the United States Geological Survey (USGS) and the New Zealand National Environmental Monitoring Standards (NEMS) guidelines. Examples include _PRESS_ (Prediction Error Sum of Squares), _SFE_ (Standard Factorial Error), _MSPE_ (Model Standard Percentage Error) and others. Based on the equations from _Helsel et al._ ([2020](https://pubs.usgs.gov/publication/tm4A3)), _Rasmunsen et al._ ([2008](https://pubs.usgs.gov/tm/tm3c4/)), _Hicks et al._ ([2020](https://www.nems.org.nz/documents/suspended-sediment)) and etc. (see documentation for details).
## Performance metrics
The `tidyhydro` package follows the philosophy of [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and provides S3 class methods for vectors and data frames. For example, one can estimate `KGE`, `NSE` or `pBIAS` for a data frame like this:
```{r example}
library(tidyhydro)
str(avacha)
kge(avacha, obs, sim)
```
or create a [`metric_set`](https://yardstick.tidymodels.org/reference/metric_set.html) and estimate several parameters at once like this:
```{r metricset}
hydro_metrics <- yardstick::metric_set(nse, pbias)
hydro_metrics(avacha, obs, sim)
```
We do understand that sometimes one needs a qualitative interpretation of the model. Therefore, we populated some functions with a `performance` argument. When `performance = TRUE`, the metric interpretation will be returned according to Moriasi et al. ([2015](https://elibrary.asabe.org/abstract.asp?aid=46548&t=3&dabs=Y&redir=&redirType=)).
```{r interpretation}
hydro_metrics(avacha, obs, sim, performance = TRUE)
```
## Descriptive statistics
In addition to `metric`, inherited from `yardstick`, the `tidyhydro` introduces the `measure` objects. It aims to calculate descriptive statistics of a single dataset, such as `cv()` — coefficient of variation (a measure of variability) or `gm()` — geometric mean (a measure of central tendency):
```{r measures}
# Coefficient of Variation
cv(avacha, obs)
# Geometric mean
gm_vec(avacha$obs)
```
Similarly to `metric_set`, one can create a `measure_set` and estimate desired descriptive statistics at once:
```{r measureset}
ms <- measure_set(cv, gm)
ms(avacha, obs)
```
## Installation
You can install the development version of `tidyhydro` from [GitHub](https://github.com/atsyplenkov/tidyhydro) with:
``` r
# install.packages("pak")
pak::pak("atsyplenkov/tidyhydro")
```
## Benchmarking
Since the package uses `Rcpp` in the background, it performs slightly faster than base R and other R packages (see [benchmarks](https://atsyplenkov.github.io/tidyhydro/articles/benchmarks.html)). This is particularly noticeable with large datasets:
```{r benchmarking}
set.seed(12234)
x <- runif(10^6)
y <- runif(10^6)
nse <- function(truth, estimate, na_rm = TRUE) {
#fmt: skip
1 - (sum((truth - estimate)^2, na.rm = na_rm) /
sum((truth - mean(truth, na.rm = na_rm))^2, na.rm = na_rm))
}
bench::mark(
tidyhydro = tidyhydro::nse_vec(truth = x, estimate = y),
hydroGOF = hydroGOF::NSE(sim = y, obs = x),
baseR = nse(truth = x, estimate = y),
check = TRUE,
relative = TRUE,
filter_gc = FALSE,
iterations = 50L
)
```
## Code of Conduct
Please note that the tidyhydro project is released with a [Contributor Code of Conduct](https://atsyplenkov.github.io/tidyhydro/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
## See also
* [`hydroGOF`](https://github.com/hzambran/hydroGOF) - Goodness-of-fit functions for comparison of simulated and observed hydrological time series.
* [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) - tidy methods for models performance assessment.