A sophisticated web scraper that analyzes pizza deals from e-food.gr to find the best value-for-money offers in Greece. The scraper uses Playwright for automation and calculates Value For Money (VFM) metrics based on pizza size, price, and restaurant rating.
- Automated Scraping: Uses Playwright to scrape restaurant data and pizza deals
- API Integration: Fetches catalog data directly from e-food.gr API for faster processing
- Dynamic Size Discovery: Automatically detects store-specific pizza sizes from catalogs
- VFM Calculation: Ranks deals by value using the formula VFM = (pizza_area / price) × (rating / 5.0)
- Smart Filtering: Support for restaurant allowlists and blocklists
- Comprehensive Output: Generates CSV, JSON, and visualization charts
- Closed Store Support: Extracts ratings and deals even from currently closed restaurants
- Popup Handling: Automatically closes interrupting popups and modals
VFM Index = (Total Pizza Area / Price) × (Rating / 5.0)
Where:
- Total Pizza Area = π × (diameter/2)² × quantity
- Rating Factor = Rating / 5.0, a linear penalty ranging from 1.0 (5 stars) down to 0.2 (1 star)
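For illustration, here is a minimal sketch of the calculation. The real implementation lives in src/vfm.py and may differ in naming and edge-case handling; the function name and signature below are assumptions.

```python
# A minimal sketch of the VFM formula described above; not the project's exact code.
import math

def vfm_index(diameter_cm: float, quantity: int, price_eur: float, rating: float) -> float:
    """VFM = (total pizza area / price) x (rating / 5.0), in cm2/EUR."""
    total_area = math.pi * (diameter_cm / 2) ** 2 * quantity
    rating_factor = rating / 5.0  # 1.0 at 5 stars, 0.2 at 1 star
    return (total_area / price_eur) * rating_factor

# e.g. two 36 cm family pizzas for 12.00 EUR at a 4.5-star restaurant
print(round(vfm_index(36, 2, 12.0, 4.5), 2))
```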
- Python 3.10 or higher
- Git
- Clone the repository
git clone https://github.com/FueledByRedBull/efoodScraper.git
cd efoodScraper
- Create a virtual environment
python -m venv venv
- Activate the virtual environment
- Windows (PowerShell):
venv\Scripts\Activate.ps1
- Windows (Command Prompt):
venv\Scripts\activate.bat
- macOS/Linux:
source venv/bin/activate
- Install dependencies
pip install -r requirements.txt
- Install Playwright browsers
playwright install chromium

REQUIRED: Configure your delivery location before running:
- Copy the example file:
cp .env.example .env
- Find your coordinates from e-food.gr:
- Open e-food.gr in your browser and select your delivery address
- Navigate to any restaurant page (e.g., pizza category)
- Open Developer Tools (F12) → Network tab
- Look for a request to catalog?shop_id=... or a similar API endpoint
- Click on it and check the Query String Parameters or Payload:
- Copy the latitude value (e.g., 0.0)
- Copy the longitude value (e.g., 0.0)
- Copy the user_address ID from the URL or parameters
- Edit the .env file with your values:
EFOOD_USER_ADDRESS=YOUR_ADDRESS_ID
EFOOD_LATITUDE=0.0
EFOOD_LONGITUDE=0.0

# Optional settings
EFOOD_HEADLESS=false
EFOOD_USE_API=true
The configuration is loaded automatically from the .env file when you run the scraper.
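For reference, here is a minimal sketch of how these variables could be loaded with pydantic-settings. The field names, EFOOD_ prefix, and defaults below are assumptions inferred from the variables above; see src/config.py for the real definitions.

```python
# A hedged sketch of a settings class that reads the .env values shown above.
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class ScraperSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_prefix="EFOOD_")

    user_address: str = Field(..., description="Delivery address ID from e-food.gr")
    latitude: float = 0.0
    longitude: float = 0.0
    headless: bool = False
    use_api: bool = True
    max_restaurants: int | None = None

settings = ScraperSettings()  # values are read from .env automatically
print(settings.latitude, settings.headless)
```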
Create restaurant_filters.json to customize which restaurants to scrape:
- Copy the example file:
cp restaurant_filters.example.json restaurant_filters.json
- Edit the file:
{
  "skip_restaurants": [
    "Toronto",
    "Pizza Fan"
  ],
  "allowed_restaurants": []
}
- skip_restaurants: Blacklist (skip these restaurants)
- allowed_restaurants: Whitelist (only scrape these; overrides the blacklist)
Note: The configuration is automatically loaded when you run the scraper.
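As a hypothetical illustration, the allow/block lists could be applied like this when deciding whether to scrape a restaurant; the project's actual filtering logic may differ.

```python
# A hedged sketch of allowlist/blocklist filtering; not the project's exact code.
import json

def should_scrape(restaurant_name: str, filters_path: str = "restaurant_filters.json") -> bool:
    with open(filters_path, encoding="utf-8") as f:
        filters = json.load(f)

    allowed = filters.get("allowed_restaurants", [])
    skipped = filters.get("skip_restaurants", [])

    # A non-empty allowlist overrides the blocklist entirely
    if allowed:
        return restaurant_name in allowed
    return restaurant_name not in skipped
```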
You can customize behavior by editing .env:
| Setting | Description | Default |
|---|---|---|
| EFOOD_HEADLESS | Run browser in headless mode | false |
| EFOOD_USE_API | Use the API (faster) vs page scraping | true |
| EFOOD_MAX_RESTAURANTS | Limit the number of restaurants | None (all) |
python main.py

# PowerShell
$env:EFOOD_HEADLESS = "true"
python main.py
# Or edit config.py and set headless = True

Using allowlist (only scrape these):
# In src/config.py
allowed_restaurants: list[str] = Field(default_factory=lambda: [
"Papagalino",
"Pizza Crust",
"La Strada"
])

Using blocklist (skip these):
# In src/config.py
skip_restaurants: list[str] = Field(default_factory=lambda: [
"Pizza Fan",
"Toronto"
])

The scraper generates several outputs in the output/ directory:
- pizza_vfm.csv: Complete dataset with all deals and VFM metrics
- pizza_vfm.json: JSON format of the complete scrape results
- charts/vfm_distribution.png: Histogram of VFM scores
- charts/restaurant_comparison.png: Average VFM by restaurant
- charts/top10_deals.png: Bar chart of the top 10 deals
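After a run finishes, the CSV can be inspected with pandas, for example as below; note that the "vfm" column name is an assumption here and may differ in the actual export.

```python
# A quick, hedged way to inspect the generated CSV with pandas.
import pandas as pd

df = pd.read_csv("output/pizza_vfm.csv")
print(df.head())

# Show the ten best-value deals, assuming a column named "vfm"
if "vfm" in df.columns:
    print(df.sort_values("vfm", ascending=False).head(10))
```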
The scraper prints:
- Progress for each restaurant
- Top 10 deals for 2-pizza, 3-pizza, and 4-pizza categories
- Summary statistics
Example output:
==================================================
SCRAPING SUMMARY
==================================================
Total Restaurants: 45
Total Deals: 231
Average VFM: 145.67 cm2/EUR
Top 10 Deals with 2 Pizzas:
1. Papagalino (4.5) - 2 Οικογενειακές Πίτσες : 189.32 cm2/EUR (VFM: 170.39)
2. Pizza Crust (4.3) - 2 Γίγας της επιλογής σας : 176.45 cm2/EUR (VFM: 151.75)
...
efoodScraper/
├── src/
│ ├── __init__.py
│ ├── analysis.py # Analysis and reporting logic
│ ├── api_client.py # E-food API client
│ ├── catalog_parser.py # JSON catalog parser
│ ├── config.py # Configuration settings
│ ├── constants.py # Centralized constants
│ ├── export.py # CSV/JSON export functions
│ ├── logging_config.py # Logging configuration
│ ├── models.py # Data models (Restaurant, Deal, VFM)
│ ├── scraper.py # Main Playwright scraper
│ └── vfm.py # VFM calculation functions
├── output/ # Generated reports (gitignored)
├── main.py # Entry point
├── requirements.txt # Python dependencies
├── .gitignore
└── README.md
Default pizza size mappings (in cm diameter):
| Greek Name | English | Diameter |
|---|---|---|
| μικρή | Small | 25cm |
| κανονική | Regular | 30cm |
| μεσαία | Medium | 32cm |
| μεγάλη | Large | 36cm |
| οικογενειακή | Family | 36cm |
| γίγας | Giant | 40cm |
| jumbo | Jumbo | 45cm |
Note: The scraper dynamically discovers store-specific sizes that override these defaults.
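For illustration, the defaults above can be expressed as a simple mapping, together with the diameter-to-area step used by the VFM formula. The real constants live in src/constants.py and may be named differently.

```python
# Default diameter table from above, expressed as an illustrative mapping.
import math

DEFAULT_SIZES_CM = {
    "μικρή": 25,         # small
    "κανονική": 30,      # regular
    "μεσαία": 32,        # medium
    "μεγάλη": 36,        # large
    "οικογενειακή": 36,  # family
    "γίγας": 40,         # giant
    "jumbo": 45,
}

def pizza_area_cm2(size_name: str, quantity: int = 1) -> float:
    """Total area in cm2 for `quantity` pizzas of the given size."""
    diameter = DEFAULT_SIZES_CM[size_name]
    return math.pi * (diameter / 2) ** 2 * quantity

# e.g. two family pizzas: pi * 18^2 * 2 ~= 2035.75 cm2
print(round(pizza_area_cm2("οικογενειακή", 2), 2))
```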
If you discover restaurants with non-standard pizza sizes, you can create restaurant_overrides.json to override the dynamic size discovery:
- Copy the example file:
cp restaurant_overrides.example.json restaurant_overrides.json
- Edit restaurant_overrides.json with restaurant-specific sizes:
{
  "Pizza Mare": {
    "sizes": {
      "οικογενειακή": 30,
      "γίγας": 40
    },
    "url_patterns": ["pizza-mare", "pizzamare"]
  }
}
Note: This file is optional. The scraper automatically discovers sizes from each store's catalog, so overrides are only needed for edge cases.
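As a hypothetical illustration, overrides could be merged over dynamically discovered sizes like this; the scraper's actual merge logic may differ.

```python
# A hedged sketch of applying restaurant_overrides.json on top of discovered sizes.
import json

def sizes_for_restaurant(name: str, discovered: dict[str, int],
                         overrides_path: str = "restaurant_overrides.json") -> dict[str, int]:
    try:
        with open(overrides_path, encoding="utf-8") as f:
            overrides = json.load(f)
    except FileNotFoundError:
        return discovered  # the overrides file is optional

    restaurant = overrides.get(name, {})
    # Override entries win over dynamically discovered sizes
    return {**discovered, **restaurant.get("sizes", {})}
```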
Save authenticated cookies to cookies.json for accessing delivery-specific pricing:
[
{
"name": "session_id",
"value": "your_session_value",
"domain": ".e-food.gr",
"path": "/"
}
]

Security Warning: The cookies.json file contains sensitive personal information including session tokens, addresses, and coordinates. This file is gitignored by default, but you should:
- Never commit this file to version control
- Keep it private and do not share
- Regenerate cookies periodically
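For reference, here is a minimal sketch (not the project's exact code) of loading cookies.json into a Playwright browser context before scraping.

```python
# A hedged sketch of injecting saved cookies into a Playwright context.
import json
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()

    with open("cookies.json", encoding="utf-8") as f:
        context.add_cookies(json.load(f))  # expects the list-of-dicts format above

    page = context.new_page()
    page.goto("https://www.e-food.gr/")
    browser.close()
```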
# Reinstall Playwright
pip uninstall playwright
pip install playwright
playwright install chromium

The scraper automatically handles the "Tyxeri Peiniata" popup, but if issues persist:
- Set headless: bool = False to watch the browser
- The scraper presses ESC to close modals (see the sketch below)
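For illustration, dismissing such a modal in Playwright can be as simple as pressing Escape; this is a hedged sketch, not the scraper's exact code, and any selectors or timings are assumptions.

```python
# A minimal sketch of closing an interrupting popup by pressing Escape.
from playwright.sync_api import Page

def dismiss_popups(page: Page) -> None:
    # Pressing Escape closes most e-food.gr modals
    page.keyboard.press("Escape")
    # Give the overlay a moment to disappear before continuing
    page.wait_for_timeout(500)
```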
If ratings aren't extracted:
- The scraper navigates to each restaurant's detail page
- Closed stores have ratings extracted from their detail page
- Check the allowed_restaurants configuration
- Verify the restaurant has deals in the "Προσφορές" (Offers) section
- Run with headless=False to debug visually
- playwright - Browser automation
- aiohttp - Async HTTP client for API calls
- pandas - Data analysis and CSV export
- matplotlib - Chart generation
- pydantic - Configuration validation
See requirements.txt for full list.
MIT License - feel free to use and modify.
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
This tool is for educational purposes only. Please respect e-food.gr's terms of service and use rate limiting appropriately. The scraper includes delays between requests to be respectful of the service.
Made with love for finding the best pizza deals