From 10fbe4b9ef474332f54a5c737de6a4f89fb7abb8 Mon Sep 17 00:00:00 2001
From: atsyplenkov
Date: Wed, 27 Aug 2025 11:51:33 +1200
Subject: [PATCH 1/3] update articles

---
 vignettes/articles/available_metrics.qmd | 26 ++++++++++++++++++++++++
 vignettes/articles/benchmarks.qmd        |  9 ++++++++
 vignettes/articles/tidyhydro.qmd         | 20 +-----------------
 3 files changed, 36 insertions(+), 19 deletions(-)
 create mode 100644 vignettes/articles/available_metrics.qmd

diff --git a/vignettes/articles/available_metrics.qmd b/vignettes/articles/available_metrics.qmd
new file mode 100644
index 0000000..cd373de
--- /dev/null
+++ b/vignettes/articles/available_metrics.qmd
@@ -0,0 +1,26 @@
+---
+title: "Available metrics"
+format:
+  html:
+    toc: false
+knitr:
+  opts_chunk:
+    collapse: true
+    comment: '#>'
+    message: false
+---
+
+# Goodness-of-fit criteria
+
+| Name | Abbr. | Function calls | Reference |
+|----------------|--------|--------|-------------|
+| Kling-Gupta Efficiency | $KGE$ | `kge` | Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. (2009). *Journal of Hydrology*, 377(1–2), 80–91. |
+| Modified Kling-Gupta Efficiency | $KGE'$ | `kge2012` | Kling, H., Fuchs, M., & Paulin, M. (2012). *Journal of Hydrology*, 424–425, 264–277. |
+| Log-transformed Kling-Gupta Efficiency | $KGE_{log}$ | `kgelog`, `kgelog_low`, `kgelog_hi` | Kling, J. (2023). *Journal of Hydrology*, 620, 129414. |
+| Nash-Sutcliffe Efficiency | $NSE$ | `nse` | Nash, J. E., & Sutcliffe, J. V. (1970). *Journal of Hydrology*, 10(3), 282–290. |
+| Mean Squared Error | $MSE$ | `mse` | Clark, M. P., Vogel, R. M., Lamontagne, J. R., Mizukami, N., Knoben, W. J. M., Tang, G., Gharari, S., Freer, J. E., Whitfield, P. H., Shook, K. R., & Papalexiou, S. M. (2021). The Abuse of Popular Performance Metrics in Hydrologic Modeling. Water Resources Research, 57(9), e2020WR029001. |
+| Percent BIAS | $pBIAS$ | `pbias` | Gupta, H. V., S. Sorooshian, and P. O. Yapo. (1999). Status of automatic calibration for hydrologic models: Comparison with multilevel expert calibration. J. Hydrologic Eng. 4(2): 135-143 |
+| PRediction Error Sum of Squares | $PRESS$ | `press` | Rasmussen, P. P., Gray, J. R., Glysson, G. D. & Ziegler, A. C. Guidelines and procedures for computing time-series suspended-sediment concentrations and loads from in-stream turbidity-sensor and streamflow data. in U.S. Geological Survey Techniques and Methods book 3, chap. C4 53 (2009) |
+| Standard Factorial Error | $SFE$ | `sfe` | Herschy, R.W. 1978: Accuracy. Chapter 10 In: Herschy, R.W. (ed.) Hydrometry - principles and practices. John Wiley and Sons, Chichester, 511 p. |
+
+: Metrics currently implemented in `tidyhydro` v`r packageVersion("tidyhydro")`
diff --git a/vignettes/articles/benchmarks.qmd b/vignettes/articles/benchmarks.qmd
index 027fdec..4446fff 100644
--- a/vignettes/articles/benchmarks.qmd
+++ b/vignettes/articles/benchmarks.qmd
@@ -9,6 +9,8 @@ knitr:
 
 Since `tidyhydro` uses C++ under the hood, it performs slightly faster than similar R packages (like `hydroGOF`). The results are particularly noticeable in large datasets with $N$ observations exceeding 1000.
 
+Below are benchmarking results ran during CI process. See machine specs [below](#machine-specs).
+
 ```{r}
 #| label: setup
 library(tidyhydro)
@@ -79,4 +81,11 @@ bench::mark(
   filter_gc = FALSE
 )
 
+```
+
+# Machine specs
+
+```{r}
+system("lscpu | grep -v '^Flags'")
+system("lsmem")
 ```
\ No newline at end of file
diff --git a/vignettes/articles/tidyhydro.qmd b/vignettes/articles/tidyhydro.qmd
index a853473..faea09b 100644
--- a/vignettes/articles/tidyhydro.qmd
+++ b/vignettes/articles/tidyhydro.qmd
@@ -21,25 +21,7 @@ theme_set(
 )
 ```
 
-# Available metrics
-
-In `tidyhydro` v`r packageVersion("tidyhydro")`, `r length(getNamespaceExports("tidyhydro"))/2` metrics are implemented.
-
-| Name | Abbr. | Function calls | Reference |
-|----------------|--------|--------|-------------|
-| Kling-Gupta Efficiency | $KGE$ | `kge`, `kge_vec` | Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. (2009). *Journal of Hydrology*, 377(1–2), 80–91. |
-| Modified Kling-Gupta Efficiency | $KGE'$ | `kge2012`, `kge_vec` | Kling, H., Fuchs, M., & Paulin, M. (2012). *Journal of Hydrology*, 424–425, 264–277. |
-| Nash-Sutcliffe Efficiency | $NSE$ | `nse`, `nse_vec` | Nash, J. E., & Sutcliffe, J. V. (1970). *Journal of Hydrology*, 10(3), 282–290. |
-| Mean Squared Error | $MSE$ | `mse`, `mse_vec` | Clark, M. P., Vogel, R. M., Lamontagne, J. R., Mizukami, N., Knoben, W. J. M., Tang, G., Gharari, S., Freer, J. E., Whitfield, P. H., Shook, K. R., & Papalexiou, S. M. (2021). The Abuse of Popular Performance Metrics in Hydrologic Modeling. Water Resources Research, 57(9), e2020WR029001. |
-| Percent BIAS | $pBIAS$ | `pbias`, `pbias_vec` | Gupta, H. V., S. Sorooshian, and P. O. Yapo. (1999). Status of automatic calibration for hydrologic models: Comparison with multilevel expert calibration. J. Hydrologic Eng. 4(2): 135-143 |
-| PRediction Error Sum of Squares | $PRESS$ | `press`, `press_vec` | Rasmussen, P. P., Gray, J. R., Glysson, G. D. & Ziegler, A. C. Guidelines and procedures for computing time-series suspended-sediment concentrations and loads from in-stream turbidity-sensor and streamflow data. in U.S. Geological Survey Techniques and Methods book 3, chap. C4 53 (2009) |
-| Standard Factorial Error | $SFE$ | `sfe`, `sfe_vec` | Herschy, R.W. 1978: Accuracy. Chapter 10 In: Herschy, R.W. (ed.) Hydrometry - principles and practices. John Wiley and Sons, Chichester, 511 p. |
-
-: Metrics currently implemented in `tidyhydro` v`r packageVersion("tidyhydro")`
-
-# `avacha` dataset
-
-The package includes the mean daily water discharge values (`obs` in m³/s) measured at the state gauging station Avacha River — Elizovo City (site No. 2090). Alongside the measured water discharge, the mean water discharge in the last 24 hours derived from the [GloFAS-ERA5 v4.0](https://confluence.ecmwf.int/display/CEMS/GloFAS+v4.0) reanalysis is provided (`sim`).
+The package includes the mean daily water discharge values (`obs` in m³/s) measured at the state gauging station [Avacha River — Elizovo City](https://www.openstreetmap.org/#map=18/53.183111/158.393269) (site No. 2090). Alongside the measured water discharge, the mean water discharge in the last 24 hours derived from the [GloFAS-ERA5 v4.0](https://confluence.ecmwf.int/display/CEMS/GloFAS+v4.0) reanalysis is provided (`sim`).
 
 ```{r}
 #| fig-cap: Avacha River - Elizovo City hydrograph

From 00ae8c06e0061b2c93f481f90fa27a4fa0f8f613 Mon Sep 17 00:00:00 2001
From: atsyplenkov
Date: Wed, 27 Aug 2025 12:32:52 +1200
Subject: [PATCH 2/3] .gitignore update

---
 paper/src/.gitignore | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 paper/src/.gitignore

diff --git a/paper/src/.gitignore b/paper/src/.gitignore
new file mode 100644
index 0000000..075b254
--- /dev/null
+++ b/paper/src/.gitignore
@@ -0,0 +1 @@
+/.quarto/

From 262dc129f4c53a15a60610179f9a1ce74c18b3e1 Mon Sep 17 00:00:00 2001
From: atsyplenkov
Date: Wed, 27 Aug 2025 12:33:56 +1200
Subject: [PATCH 3/3] README update

---
 README.Rmd |  2 +-
 README.md  | 16 ++++++++--------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.Rmd b/README.Rmd
index 012439b..2323589 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -35,7 +35,7 @@ requireNamespace("bench", quietly = TRUE)
 
 The `tidyhydro` package provides a set of commonly used metrics in hydrology (such as _NSE_, _KGE_, _pBIAS_) for use within a [`tidymodels`](https://www.tidymodels.org/) infrastructure. Originally inspired by the [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and [`hydroGOF`](https://github.com/hzambran/hydroGOF) packages, this library is mainly written in C++ and provides a very quick estimation of desired goodness-of-fit criteria.
 
-Additionally, you'll find here a C++ implementation of lesser-known yet powerful metrics and descriptive statistics recommended in the United States Geological Survey (USGS) and the National Environmental Monitoring Standards (NEMS) guidelines. Examples include _PRESS_ (Prediction Error Sum of Squares), _SFE_ (Standard Factorial Error), _MSPE_ (Model Standard Percentage Error) and others. Based on the equations from _Helsel et al._ ([2020](https://pubs.usgs.gov/publication/tm4A3)), _Rasmunsen et al._ ([2008](https://pubs.usgs.gov/tm/tm3c4/)), _Hicks et al._ ([2020](https://www.nems.org.nz/documents/suspended-sediment)) and etc. (see documentation for details).
+Additionally, you'll find here a C++ implementation of lesser-known yet powerful metrics and descriptive statistics recommended in the United States Geological Survey (USGS) and the New Zealand National Environmental Monitoring Standards (NEMS) guidelines. Examples include _PRESS_ (Prediction Error Sum of Squares), _SFE_ (Standard Factorial Error), _MSPE_ (Model Standard Percentage Error) and others. Based on the equations from _Helsel et al._ ([2020](https://pubs.usgs.gov/publication/tm4A3)), _Rasmunsen et al._ ([2008](https://pubs.usgs.gov/tm/tm3c4/)), _Hicks et al._ ([2020](https://www.nems.org.nz/documents/suspended-sediment)) and etc. (see documentation for details).
 
 ## Performance metrics
 The `tidyhydro` package follows the philosophy of [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and provides S3 class methods for vectors and data frames. For example, one can estimate `KGE`, `NSE` or `pBIAS` for a data frame like this:
diff --git a/README.md b/README.md
index e733c2b..de6246e 100644
--- a/README.md
+++ b/README.md
@@ -30,11 +30,11 @@ desired goodness-of-fit criteria.
 
 Additionally, you’ll find here a C++ implementation of lesser-known yet
 powerful metrics and descriptive statistics recommended in the United
-States Geological Survey (USGS) and the National Environmental
-Monitoring Standards (NEMS) guidelines. Examples include *PRESS*
-(Prediction Error Sum of Squares), *SFE* (Standard Factorial Error),
-*MSPE* (Model Standard Percentage Error) and others. Based on the
-equations from *Helsel et al.*
+States Geological Survey (USGS) and the New Zealand National
+Environmental Monitoring Standards (NEMS) guidelines. Examples include
+*PRESS* (Prediction Error Sum of Squares), *SFE* (Standard Factorial
+Error), *MSPE* (Model Standard Percentage Error) and others. Based on
+the equations from *Helsel et al.*
 ([2020](https://pubs.usgs.gov/publication/tm4A3)), *Rasmunsen et al.*
 ([2008](https://pubs.usgs.gov/tm/tm3c4/)), *Hicks et al.*
 ([2020](https://www.nems.org.nz/documents/suspended-sediment)) and etc.
@@ -153,9 +153,9 @@ bench::mark(
 #> # A tibble: 3 × 6
 #> expression min median `itr/sec` mem_alloc `gc/sec`
 #>
-#> 1 tidyhydro 1 1 13.3 NaN NaN
-#> 2 hydroGOF 9.69 8.63 1 Inf Inf
-#> 3 baseR 5.80 5.54 2.27 Inf Inf
+#> 1 tidyhydro 1 1 16.5 NaN NaN
+#> 2 hydroGOF 9.74 11.5 1 Inf Inf
+#> 3 baseR 6.40 7.92 2.09 Inf Inf
 ```
 
 ## Code of Conduct