yeeao/Scraper
Indexer: (indexer.py)

Before running:
- install BeautifulSoup (bs4)
- create 6 text files to serve as the index and 1 text file for the URL-docID mapping
- provide the path to the corpus in variable DEV_path on line 14
- provide the path to the URL-docID text file in variable URL_path on line 15
- provide the path to the first of the six index text files in variable Index1_path on line 16
- provide the path to the second of the six index text files in variable Index2_path on line 17
- provide the path to the third of the six index text files in variable Index3_path on line 18
- provide the path to the fourth of the six index text files in variable Index4_path on line 19
- provide the path to the fifth of the six index text files in variable Index5_path on line 20
- provide the path to the sixth of the six index text files in variable Index6_path on line 21

indexer.py should now run properly from the command line or through IDLE.

Search Engine: (Search.py)

Before running:
- install BeautifulSoup (bs4)
- provide the path to the URL-docID text file in variable URL_path on line 20
- provide the path to the first of the six index text files in variable Index1_path on line 14
- provide the path to the second of the six index text files in variable Index2_path on line 15
- provide the path to the third of the six index text files in variable Index3_path on line 16
- provide the path to the fourth of the six index text files in variable Index4_path on line 17
- provide the path to the fifth of the six index text files in variable Index5_path on line 18
- provide the path to the sixth of the six index text files in variable Index6_path on line 19

Search.py should now run properly from the command line or through IDLE.
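
The file setup for the indexer can also be scripted rather than done by hand. The following is a minimal sketch, not part of indexer.py or Search.py: the helper names `prepare_index_files` and `missing_paths`, the file names `index1.txt` through `index6.txt` and `url_docid.txt`, and the directory names `index_data` and `DEV` are all assumptions made for illustration. Only the variable names DEV_path, URL_path, and Index1_path through Index6_path come from the scripts described above.

```python
from pathlib import Path


def prepare_index_files(base_dir, count=6):
    """Create empty index files plus one URL-docID mapping file.

    Hypothetical helper: the file names below are placeholders,
    not names required by indexer.py.
    """
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    index_paths = [base / f"index{i}.txt" for i in range(1, count + 1)]
    url_path = base / "url_docid.txt"
    for path in index_paths + [url_path]:
        # Create the file if it is missing; existing content is left alone.
        path.touch(exist_ok=True)
    return url_path, index_paths


def missing_paths(paths):
    """Return the paths that do not exist, to catch a misconfigured variable early."""
    return [str(p) for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # "index_data" and "DEV" are placeholder locations (assumptions).
    URL_path, index_paths = prepare_index_files("index_data")
    (Index1_path, Index2_path, Index3_path,
     Index4_path, Index5_path, Index6_path) = index_paths
    DEV_path = Path("DEV")
    print("Missing:", missing_paths([DEV_path, URL_path, *index_paths]))
```

The printed paths are the values that would then be copied into the corresponding variables in indexer.py and Search.py. Any path reported as missing (for example the corpus directory) needs to be corrected before running either script.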