Hi @fontclos @martingerlach,

I would like to ask whether this repo is still actively maintained. I have noticed that:

* `get_data.py` downloads the text data (corpus) from PG very slowly, one file after another
* there is no indication of the download size of the corpus (#29)
* it would be great to be able to change the server (#41)
* currently, `get_data.py` only works on Linux-based OSes; it would be great to have Windows support as well (I could help with that), see #37, #47, #42