I think it would be good to include the current size of the corpus (as of date x) in the README.
When I started processing, I wasn't sure how much hard drive space I would need.
FWIW, I downloaded and processed the corpus today, and my gutenberg directory is 76G.