Skip to content

Move chunk_size to argument and retry_on_failure = TRUE#879

Open
ldecicco-USGS wants to merge 36 commits intoDOI-USGS:developfrom
ldecicco-USGS:develop
Open

Move chunk_size to argument and retry_on_failure = TRUE#879
ldecicco-USGS wants to merge 36 commits intoDOI-USGS:developfrom
ldecicco-USGS:develop

Conversation

@ldecicco-USGS
Copy link
Copy Markdown
Collaborator

After some testing, realized the "req_retry" function wasn't really doing much unless retry_on_failure = TRUE was set and httr2::req_perform_iterative needs to have on_error = "return" so that the retry actually gets triggered.

Additionally, moved the chunking argument to a top level argument.

@ldecicco-USGS
Copy link
Copy Markdown
Collaborator Author

Also a lot of argument rejiggering to follow tidyverse design principles:
https://design.tidyverse.org/dots-after-required.html

Copy link
Copy Markdown
Collaborator

@jzemmels jzemmels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updates look good. I have a few comments and questions, but nothing major. I ran check() locally and didn't see anything worrisome.

"skipGeometry"
)

comma_params <- c(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but a question: do you know if hydrologic_unit_codes from the monitoring-location endpoint is a single or comma parameter? I was just testing out passing a list of HUC8s in drpy, and I got an answer, but the count was way small and I don't even know what was actually used.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I last asked about that, huc's were not going to be comma parameters "any time soon" (so maybe someday?).

Remember HUCs are tricky. Our sites have a many combination of number of digits. So I think the issue was if you asked for c("012345", "012346"), you would only get those 2 HUCs, even if there were tons of sites with "01234567" and "01234678" for instance.

In dataRetrieval we send the HUC requests to a POST and that allows us to take advantage of the work the API developers put in to make HUCs special so that it really is going is as c("012345%", "012346%")

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait! I forgot I set up the huc POST stuff specifically:
https://github.com/DOI-USGS/dataRetrieval/blob/develop/R/construct_api_requests.R#L498

#' Available options are:
#' `r dataRetrieval:::get_properties_for_docs("parameter-codes", "parameter_code_id")`.
#' The default (`NA`) will return all columns of the data.
#' @param \dots Not used. Included to help differentiate official Water Data API arguments
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this means. Do you mean differentiate within the documentation file? Like a visual boundary between the API vs. dR-specific parameters?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Riffing off tidyverse's design principle:
https://design.tidyverse.org/dots-after-required.html

The idea is that the arguments after the ...'s are something rarely used and this requires that the users would specifically name them. Granted, in our functions, users mostly need to name the args anyway since there are so many (and no one should be defining ALL the arguments), but essentially the answer is yes - it's a good visual way to differentiate what are queries used in the API, and what we use outside the API (and therefore what users should either not mess with too often, and if they do they should understand what they're doing)

I also put "limit" in the dR-specific because it falls more into the category that users should rarely need to mess with it and it's confusing. They might think if they set the limit=10 they will only get 10 results, but it actually means they'll get 10 per page....so in that regard, since dR does automatic page walking.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I understand. Thanks for clarifying

Comment thread R/get_ogc_data.R Outdated
#' information. The default `NA` will not specify the argument in the request.
#'
#' @keywords internal
check_arguments_api <- function(bbox, skipGeometry, ...) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the dots used for here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing, pulling them out now...

#' download_and_parse = FALSE)
#' length(recent_query)
#'}
read_waterdata_ratings <- function(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I accidentally ran this while at the API limit and got back the following output:

trying URL '[https://api.waterdata.usgs.gov/stac-files/ratings/USGS.01104475.exsa.rdb](vscode-file://vscode-app/c:/Users/jzemmels/AppData/Local/Programs/Positron/resources/app/out/vs/code/electron-browser/workbench/workbench.html#)'
Warning message:
In utils::download.file(url = url, destfile = full_file_path) :
  cannot open URL '[https://api.waterdata.usgs.gov/stac-files/ratings/USGS.01104475.exsa.rdb](vscode-file://vscode-app/c:/Users/jzemmels/AppData/Local/Programs/Positron/resources/app/out/vs/code/electron-browser/workbench/workbench.html#)': HTTP status was '429 Unknown Error'

Do we want the retry and print functionality to extend to this function as well?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! I'll see if it acts the same way in the header as the ogc stuff

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the headers look the same for the request, but the download.file is outside httr2, so it won't be trivial to use something like the httr2::req_error(body = error_body) that we use with the APIs

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, got the download working with httr2 instead of the base download.file.

#' @export
#' @inherit read_waterdata_continuous details
#'
#' @return List of data frames which contain the requested rating curves.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a list and those column names the expected output from users? Do you think users would prefer a row-bound, "tidy" data frame with an extra column indicating the site ID? I'm not saying I'd prefer the tidy format, just wondering what a user would expect.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that there is a lot of important information hidden in the "comment" attribute. So somehow we'd need to bundle that information in as well. On the other hand, the list means that each data frame acts the same as the traditional 1 site returned from readNWISrating.

Comment thread R/read_waterdata_ratings.R Outdated
Co-authored-by: Joe Zemmels (he/him) <jzemmels@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants