replaced loop in windfetch() with a parallel backend loop from foreac…#2
replaced loop in windfetch() with a parallel backend loop from foreac…#2anguswg-ucsb wants to merge 3 commits intoblasee:mainfrom
Conversation
…h package, cuts processing time in ~half
|
Just added one more parameter: as_sf = TRUE. TRUE if you want to return an SF object, otherwise returns just fetch lengths. Thought this would come in handy if you're doing millions of fetch calculations and you don't want a cumbersome spatial object being returned. Also alot of the CRS checks in the beginning of the code could be simplified down to 3-4 ifelse statements. Will get to later. |
|
Hey @anguswg-ucsb! Thanks for contributing to windfetch! What you've done sounds great! I'll have a look in more detail when I get a chance in the next few days :-) |
blasee
left a comment
There was a problem hiding this comment.
Just a few minor things to tidy up. I am also keen to hear your thoughts :-). A couple of other general items:
- Please use
=assignment operator for consistency with the rest of the code. - Fix
R CMD CHECKNOTE
| max_dist = 300, | ||
| n_directions = 9, | ||
| quiet = FALSE, | ||
| progress_bar = TRUE, |
There was a problem hiding this comment.
The progress_bar argument is no longer required. I think maybe we should only have a progress bar when 1 core is selected. Thoughts?
| n_directions = 9, | ||
| quiet = FALSE, | ||
| progress_bar = TRUE, | ||
| as_sf = TRUE, |
There was a problem hiding this comment.
as_sf is misleading as the object returned is actually a windFetch object, which doesn't contain the sf class. I think we get rid of this argument, unless you have a good reason why the windfetch function shouldn't act as a constructor function for the windFetch class.
| quiet = FALSE, | ||
| progress_bar = TRUE, | ||
| as_sf = TRUE, | ||
| ncores = 2 |
There was a problem hiding this comment.
I think a good default is a single core.
| # name sites | ||
| if (any(grepl("^[Nn]ames{0,1}$", names(site_layer)))){ | ||
| name_col = grep("^[Nn]ames{0,1}$", names(site_layer)) | ||
| message("name section1") |
There was a problem hiding this comment.
This is an uninformative message.
| site_names = as.character(site_layer[[name_col]]) | ||
|
|
||
| } else { | ||
| message("name section2") |
| st_sfc(st_point(c(as.numeric(x['X0']), as.numeric(x['Y0'])))) | ||
| }), st_sfc), crs = st_crs(polygon_layer))) | ||
|
|
||
| message("Subset section") |
| poly_subset = subset(polygon_layer, lengths(st_intersects(polygon_layer, fetch_df)) > 0) | ||
| message("parralel") | ||
| # make a cluster | ||
| cl <- parallel::makeCluster(ncores) #not to overload your computer |
There was a problem hiding this comment.
I think we should include an error message if ncores > parallel::detectCores()
| sf::st_as_sf() %>% | ||
| sf::st_transform(sf::st_crs(polygon_layer)) | ||
|
|
||
| message("quadrant add") |
|
|
||
| # if no SF object is desired, just return fetch distances | ||
| if(as_sf == TRUE) { | ||
| message("new part") |
| setTxtProgressBar(pb, i) | ||
| } | ||
| cat("\n") | ||
| message("no new part") |
|
Hello @blasee sorry for not getting back to your change requests sooner. I have been working on my own raster based fetch R package (fetchr) to try and make the calculations faster when scaled to thousands of water cells/entire coastlines. Not sure if this is helpful for any of your applications! Runs that use to take between 2-4 hours (maybe crashing my session) for me now take 5-25 seconds ! |
replaced loop in windfetch() with a parallel backend loop from the foreach package. This makes windfetch() ~2x faster. For whatever reason when I was using devtools::check() to build the package, it didn't seem to like running any of the examples with more than 2 cores, so the default number of cores (ncores parameter) is 2. I would recommend using ~75% of your cores to really see improvements on large amounts of datapoints.
Working on rewriting this package using the terra package because that will likely improve some processing times....