A simple and efficient HTML scraper built in Rust, designed for easy command-line usage and interactive scraping.
Seascraper is a lightweight HTML scraper written in Rust that extracts text content from web pages using CSS selectors. It supports command-line arguments for quick, one-off scraping and an interactive mode that prompts for each input. It is aimed at developers, data analysts, and anyone who needs to scrape web data programmatically.
The project is built on reqwest for HTTP requests, scraper for HTML parsing, and clap for command-line argument handling.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
- Rust (version 1.70 or later)
- Cargo (comes with Rust)
You can install Rust using rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Clone the repository:
git clone <repository-url>
cd seascraper
- Build the project:
cargo build --release
- Run the scraper:
cargo run
This will start the interactive mode where you can enter the URL, selector, and amount to scrape.
Run the scraper without arguments to enter interactive mode:
cargo run
You'll be prompted to enter:
- URL to scrape
- CSS selector
- Number of items to scrape
Use command-line arguments for direct scraping:
cargo run -- <url> <selector> [amount]
Example:
cargo run -- https://scrapeme.live/shop/ "span.price" 5
- --bannerless: Run without displaying the ASCII banner
- --help: Display help information
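The invocation forms above (no arguments for interactive mode, `<url> <selector> [amount]` for direct scraping, plus the two options) can be sketched with the standard library alone. This is not Seascraper's actual code: the project uses clap, and the names `Mode`, `Invocation`, and `parse_args` are hypothetical. How the real tool treats a missing amount is not stated here, so the sketch simply records it as `None`.

```rust
/// How the tool was invoked (hypothetical types, for illustration only).
#[derive(Debug, PartialEq)]
enum Mode {
    /// No positional arguments: prompt for URL, selector, and amount.
    Interactive,
    /// `--help` was passed.
    Help,
    /// `<url> <selector> [amount]` were given on the command line.
    Direct {
        url: String,
        selector: String,
        amount: Option<usize>, // None when [amount] is omitted
    },
}

#[derive(Debug, PartialEq)]
struct Invocation {
    bannerless: bool,
    mode: Mode,
}

/// Parses the arguments that follow the program name.
fn parse_args<I, S>(raw: I) -> Result<Invocation, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut bannerless = false;
    let mut help = false;
    let mut positional: Vec<String> = Vec::new();

    // Separate the known options from the positional values.
    for arg in raw {
        match arg.as_ref() {
            "--bannerless" => bannerless = true,
            "--help" => help = true,
            flag if flag.starts_with("--") => {
                return Err(format!("unknown option: {}", flag));
            }
            value => positional.push(value.to_string()),
        }
    }
    if help {
        return Ok(Invocation { bannerless, mode: Mode::Help });
    }

    // Zero positionals means interactive mode; two or three mean direct mode.
    let mode = match positional.as_slice() {
        [] => Mode::Interactive,
        [url, selector] => Mode::Direct {
            url: url.clone(),
            selector: selector.clone(),
            amount: None,
        },
        [url, selector, raw_amount] => {
            let amount = raw_amount
                .parse::<usize>()
                .map_err(|e| format!("invalid amount '{}': {}", raw_amount, e))?;
            Mode::Direct {
                url: url.clone(),
                selector: selector.clone(),
                amount: Some(amount),
            }
        }
        _ => return Err("expected: <url> <selector> [amount]".to_string()),
    };
    Ok(Invocation { bannerless, mode })
}
```

In a real binary this would be called as `parse_args(std::env::args().skip(1))`. clap generates the help text and error messages automatically, which is one reason to prefer it over a hand-rolled parser like this one.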
Using the following arguments:
url: https://scrapeme.live/shop/
selector: span.price
amount: 5
The text is: £109.99
The text is: £109.99
The text is: £109.99
The text is: £109.99
The text is: £109.99
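The sample output above suggests that the amount caps how many matches are printed, one `The text is: ...` line per match. A minimal sketch of that step, assuming the matched texts have already been extracted; the function name `render_matches` is hypothetical and the input values in the usage note are made up for illustration:

```rust
/// Formats at most `amount` matched texts, one output line per match.
/// Hypothetical helper; the real tool's internals may differ.
fn render_matches(texts: &[&str], amount: usize) -> Vec<String> {
    texts
        .iter()
        .take(amount) // stop after `amount` matches, or earlier if fewer exist
        .map(|text| format!("The text is: {}", text))
        .collect()
}
```

For example, `render_matches(&["£1.00", "£2.00", "£3.00"], 2)` yields two lines, and asking for more items than were matched simply returns all of them.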
- Rust - Programming Language
- Clap - Command Line Argument Parser
- Tokio - Asynchronous Runtime
- Reqwest - HTTP Client
- Scraper - HTML Parsing Library
- Dialoguer - Interactive Prompts
- Colored - Terminal Colors
- Anyhow - Error Handling
- @rafainsights - Idea & Initial work
░█▀▀░█▀▀░█▀█░█▀▀░█▀▀░█▀▄░█▀█░█▀█░█▀▀░█▀▄
░▀▀█░█▀▀░█▀█░▀▀█░█░░░█▀▄░█▀█░█▀▀░█▀▀░█▀▄
░▀▀▀░▀▀▀░▀░▀░▀▀▀░▀▀▀░▀░▀░▀░▀░▀░░░▀▀▀░▀░▀
