This project is a Python web scraper that extracts book data from Books to Scrape and saves it to a CSV file. It follows pagination, sends custom request headers, handles request errors, logs its progress, and pauses between requests, as a real-world automation tool would.
- Scrapes all pages of the website automatically
- Extracts Title, Price, Rating
- Saves data to output/books_data.csv
- Logs progress and errors in logs/scraper.log
- Includes error handling and request delays to prevent blocking
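
The pagination, header, delay, and error-handling features listed above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the site URL, the `User-Agent` string, the `li.next a` selector, and the function names `fetch_page`, `find_next_url`, and `crawl` are all assumptions made for the example.

```python
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Assumed values for illustration; the project may use different ones.
BASE_URL = "https://books.toscrape.com/"
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; book-scraper-example)"}


def fetch_page(url, timeout=10):
    """Return the page HTML, or None if the request fails."""
    try:
        response = requests.get(url, headers=HEADERS, timeout=timeout)
        response.raise_for_status()  # treat 4xx/5xx responses as errors
        return response.text
    except requests.RequestException:
        return None


def find_next_url(html, current_url):
    """Return the absolute URL of the next page, or None on the last page."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one("li.next a")  # assumed markup of the "next" button
    if link is None or not link.get("href"):
        return None
    # The href is relative, so resolve it against the page it appeared on.
    return urljoin(current_url, link["href"])


def crawl(start_url=BASE_URL, fetch=fetch_page, delay=1.0):
    """Yield (url, html) for each page, following "next" links until none remain."""
    url = start_url
    while url:
        html = fetch(url)
        if html is None:
            break  # stop on a failed request instead of guessing the next URL
        yield url, html
        url = find_next_url(html, url)
        if url:
            time.sleep(delay)  # pause between requests to be polite to the server
```

Passing the fetch function in as a parameter keeps the loop testable without network access; a stub that returns canned HTML can stand in for `fetch_page`.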
src/
    main.py       # Entry point
    scraper.py    # Handles scraping pages
    parser.py     # Extracts data from HTML
    storage.py    # Saves data to CSV
logs/             # Log file directory
output/           # CSV output directory
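
To make the roles of parser.py and storage.py concrete, here is a hedged sketch of what the extraction and CSV steps might look like. The function names `parse_books` and `save_books`, the CSS selectors, and the choice to store the rating as a number from 1 to 5 are assumptions for illustration; the actual modules may differ.

```python
from pathlib import Path

import pandas as pd
from bs4 import BeautifulSoup

# Assumption: the rating is encoded as a word in the element's class list.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_books(html):
    """Extract Title, Price, and Rating for every book on one page."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for article in soup.select("article.product_pod"):
        link = article.select_one("h3 a")
        price = article.select_one("p.price_color")
        rating_tag = article.select_one("p.star-rating")
        if link is None or price is None:
            continue  # skip entries that do not match the expected markup
        classes = rating_tag.get("class", []) if rating_tag else []
        rating = next((RATING_WORDS[c] for c in classes if c in RATING_WORDS), None)
        books.append({
            # Prefer the title attribute; the visible link text may be truncated.
            "Title": link.get("title") or link.get_text(strip=True),
            "Price": price.get_text(strip=True),
            "Rating": rating,
        })
    return books


def save_books(books, path="output/books_data.csv"):
    """Write the book records to a CSV file, creating the directory if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    frame = pd.DataFrame(books, columns=["Title", "Price", "Rating"])
    frame.to_csv(out, index=False, encoding="utf-8")
    return out
```

Fixing the column order in `save_books` keeps the CSV header stable even when a page yields no books.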
- Python
- Requests
- BeautifulSoup
- Pandas
- Logging
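
As one possible illustration of how the standard-library logging module could write to logs/scraper.log, here is a small sketch. The helper name `setup_logger`, the logger name `"scraper"`, and the message format are assumptions, not taken from the project.

```python
import logging
from pathlib import Path


def setup_logger(log_path="logs/scraper.log"):
    """Return a logger that writes to log_path, creating the directory if needed."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("scraper")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    # Attach the file handler only once, even if setup_logger is called again.
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A caller would then write, for example, `setup_logger().info("Scraped page 1")` for progress and `.error(...)` for failed requests.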
- Clone the repository:
git clone https://github.com/HothoLina/python-automation-web-scraper.git