Add download_dataset function (and one to Do It Live!) #49
Open
jashapiro wants to merge 16 commits into
Closes #35
The main addition here is the `download_dataset()` function, which really just does what it says and should behave more or less like the download project function, but takes a dataset instead. The main extra is that it checks that the dataset is actually ready before trying to download it.
I also added handling of expired datasets, since that is a thing that can happen.
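As a rough illustration of the readiness and expiry checks described above, here is a minimal sketch. It is not the code in this PR: the function name `assert_downloadable()` and the `is_ready` / `is_expired` fields are hypothetical, chosen only to show the order of the checks.

```r
# Hypothetical sketch: `assert_downloadable()`, `is_ready`, and `is_expired`
# are assumed names for illustration, not the package's actual API.
assert_downloadable <- function(dataset_info) {
  # An expired dataset can never be downloaded, so check that first
  if (isTRUE(dataset_info$is_expired)) {
    stop("Dataset has expired and can no longer be downloaded.", call. = FALSE)
  }
  # A dataset that is still processing is not an error forever, just for now
  if (!isTRUE(dataset_info$is_ready)) {
    stop("Dataset is not ready yet; try again later.", call. = FALSE)
  }
  invisible(TRUE)
}

# A ready dataset passes silently
ready_info <- list(is_ready = TRUE, is_expired = FALSE)
assert_downloadable(ready_info)

# A dataset that is still processing produces an informative error
pending_info <- list(is_ready = FALSE, is_expired = FALSE)
result <- tryCatch(
  assert_downloadable(pending_info),
  error = function(e) conditionMessage(e)
)
```

Checking expiry before readiness means an expired dataset gets the more specific message rather than a suggestion to wait.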
Finally, I also made a function that waits until a dataset is ready and then downloads. I wasn't sure if this should really be a separate function, but the additional option made me think that it probably should be... But I could reconsider and roll them together, with a separate internal function that just does the status polling. Let me know what you think!
The biggest challenge here was deciding how often to poll and how long to wait. For now I set polling to every 30 seconds: in my testing, processing for small datasets takes somewhere between 30 sec and 1 min, so faster polling didn't seem worth doing. But I did leave it as an option in case people really want to poll at a different rate. I also got to play around a bit with `cli` to make a spinner while waiting. The elapsed time you get with `cli` is a bit disappointing in that you can't control rounding, unless you want to do all the seconds-to-minutes translations yourself, so I just left it as the default.
I'm also now allowing downloads without automatic unzipping, which ends up in a bit of a strange place for the arg because I didn't want to break position-based calls any more than already.
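For reviewers weighing the polling interval and wait time, the general shape of a poll-until-ready loop might look like the sketch below. This is an assumption-laden illustration in base R, not the implementation in this PR: `wait_until_ready()`, `check_ready`, `interval`, and `timeout` are hypothetical names, and the `cli` spinner is left out to keep the example self-contained.

```r
# Hypothetical sketch of status polling; all names here are illustrative.
# `check_ready` is any zero-argument function that returns TRUE once the
# dataset is ready. `interval` and `timeout` are in seconds.
wait_until_ready <- function(check_ready, interval = 30, timeout = 600) {
  stopifnot(is.function(check_ready), interval > 0, timeout >= 0)
  start <- Sys.time()
  repeat {
    if (isTRUE(check_ready())) {
      return(TRUE)
    }
    elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
    # Give up if the next check would land past the timeout
    if (elapsed + interval > timeout) {
      return(FALSE)
    }
    Sys.sleep(interval)
  }
}

# Stand-in status check that reports ready on the third call
calls <- 0
fake_check <- function() {
  calls <<- calls + 1
  calls >= 3
}

# Short interval so the example finishes quickly
ready <- wait_until_ready(fake_check, interval = 0.1, timeout = 5)
```

Keeping the loop separate from the download step would also fit the option mentioned above of rolling the two user-facing functions together around one internal polling helper.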