Patrick-Vogt/MapMiner

πŸ—ΊοΈ MapMiner

Automate lead generation by extracting business data from Google Maps and enriching it with contact information from company websites. Perfect for sales teams, marketers, and researchers who need to build targeted contact lists at scale.

✨ Features

🎯 Smart Data Collection

  • Stage 1: Google Maps Mining - Extract business information (name, address, phone, website, rating, reviews)
  • Stage 2: Website Enrichment - Automatically visit websites to find email addresses and owner information
  • Intelligent Filtering - Filter by required words (e.g., "GmbH", "Co. KG", "AG") before processing
  • Website Requirement - Option to only save entries with websites for Stage 2 processing

πŸš€ Performance Optimized

  • Early Filtering - Checks company names in search results before clicking (massive speed boost)
  • Parallel Processing - Configurable workers for Stage 2 (default: 10 concurrent threads)
  • Smart Delays - Human-like behavior with randomized delays
  • Browser Selection - Choose Safari, Chrome, or Edge

🎨 Beautiful UI

  • Apple-Inspired Design - Clean, elegant interface with smooth animations
  • Real-Time Updates - Live progress tracking via WebSocket
  • Activity Logs - Color-coded logs for easy monitoring
  • Statistics Dashboard - Track businesses scraped, emails found, and more
  • Responsive Design - Works perfectly on all screen sizes

🌍 Multi-Language Support

  • English and German translations
  • Easy to extend with additional languages

πŸ“Έ Screenshots

Main Dashboard

Configuration Panel

Real-Time Progress

πŸš€ Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 20.19+ or 22.12+
  • Safari (built-in on Mac) or Chrome/Edge browser

Installation

  1. Clone the repository
git clone <your-repo-url>
cd MapMiner
  2. Backend Setup
cd backend
python3 -m venv venv
source venv/bin/activate  # On Mac/Linux
# OR
venv\Scripts\activate     # On Windows

pip install -r requirements.txt
  3. Frontend Setup
cd frontend
npm install

Running the Application

Option 1: Use the start script (Mac/Linux)

./start.sh

Option 2: Manual start

Terminal 1 - Backend:

cd backend
source venv/bin/activate
python app.py

Terminal 2 - Frontend:

cd frontend
npm run dev

Open your browser to http://localhost:5173

πŸ“– Usage Guide

Basic Configuration

  1. Search Term - What you're looking for (e.g., "Autohaus", "Restaurant", "Hotel")
  2. Cities - Comma-separated list (e.g., "Berlin, MΓΌnchen, Hamburg")
  3. Entries per City - How many matching results to collect per city
  4. Required Words - Filter by company type (e.g., "GmbH, Co. KG, AG")
  5. Browser - Choose Safari, Chrome, or Edge
  6. Require Website - Only save entries with websites (recommended for Stage 2)
  7. Run Stage 2 - Enable/disable website enrichment
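The settings above travel from the UI to the backend; an illustrative configuration payload might look like the following. The field names here are guesses for illustration, not the actual API schema:

```python
# Hypothetical configuration payload mirroring the UI options above.
config = {
    "search_term": "Autohaus",
    "cities": ["Berlin", "MΓΌnchen", "Hamburg"],
    "entries_per_city": 20,
    "required_words": ["GmbH", "Co. KG", "AG"],
    "browser": "chrome",        # "safari", "chrome", or "edge"
    "require_website": True,    # skip entries without a website
    "run_stage2": True,         # enable website enrichment
}
```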

Advanced Settings

  • Min/Max Delay - Random delays between actions (2-5s default)
  • Scroll Delays - Delays when scrolling results (3-7s default)
  • Max Workers - Parallel threads for Stage 2 (10 default, max 50)

How It Works

Stage 1: Google Maps Mining

  • Searches Google Maps for your search term in each city
  • Filters results by required words (if specified)
  • Only clicks listings that match your criteria (speed optimization!)
  • Extracts: name, address, phone, website, rating, reviews
  • Saves to CSV in backend/output/

Stage 2: Website Enrichment

  • Visits each website in parallel
  • Extracts email addresses using smart patterns
  • Finds owner/manager names (German patterns supported)
  • Updates CSV with contact information

Stage 3: Download

  • Download your completed CSV file with all data

πŸ“ Output

CSV files are saved in backend/output/ with the format:

{search_term}_{timestamp}.csv

Example: Autohaus_20260203_104530.csv
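The filename pattern above maps directly to a `strftime` call. A sketch (the function name is illustrative, not taken from the codebase):

```python
from datetime import datetime

def output_filename(search_term: str, now=None) -> str:
    """Build '{search_term}_{timestamp}.csv' as described above."""
    now = now or datetime.now()
    return f"{search_term}_{now.strftime('%Y%m%d_%H%M%S')}.csv"
```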

CSV Columns

  • name - Business name
  • address - Full address
  • phone - Phone number
  • website - Website URL
  • rating - Google Maps rating
  • reviews - Number of reviews
  • email - Email address(es) found
  • owner - Owner/manager name found

πŸ—οΈ Project Structure

MapMiner/
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ output/              # CSV output files
β”‚   β”œβ”€β”€ app.py              # Flask server + WebSocket
β”‚   β”œβ”€β”€ scraper_orchestrator.py  # Workflow manager
β”‚   └── requirements.txt    # Python dependencies
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ components/     # React components
β”‚   β”‚   β”œβ”€β”€ App.jsx        # Main app
β”‚   β”‚   └── translations.js # i18n
β”‚   └── package.json       # Node dependencies
β”œβ”€β”€ maps_scraper_configurable.py   # Stage 1: MapMiner
β”œβ”€β”€ website_scraper_configurable.py # Stage 2: Enrichment
└── README.md             # This file

βš™οΈ Configuration

Browser Setup

Safari (Mac only)

safaridriver --enable

Chrome/Edge

  • Automatically managed by Selenium
  • WebDriver downloads happen automatically

Environment Variables

None required! Everything is configured through the UI.

πŸ› Troubleshooting

"Could not find search box"

  • Google Maps structure may have changed
  • Try a different browser
  • Check your internet connection

"Failed to download CSV"

  • Ensure scraping completed successfully
  • Check backend/output/ directory exists
  • Restart backend server

"White screen in browser"

  • Check browser console for errors (F12)
  • Verify all dependencies installed: npm install
  • Clear cache and hard reload (Cmd+Shift+R / Ctrl+Shift+R)

Browser crashes

  • Try a different browser (Safari β†’ Chrome β†’ Edge)
  • Reduce entries per city
  • Increase delays in advanced settings

πŸš€ Performance Tips

  1. Use Required Words filter - Dramatically speeds up scraping by filtering before clicking
  2. Enable "Require Website" - Skips entries without websites (faster Stage 1)
  3. Increase Max Workers - More parallel processing in Stage 2 (20-30 for fast machines)
  4. Start small - Test with 1-2 cities and low entry count first
  5. Monitor logs - Watch for errors or rate limiting

⚠️ Legal & Ethical Use

  • Respect Google Maps Terms of Service
  • Respect robots.txt files
  • Use reasonable delays to avoid overwhelming servers
  • Use scraped data responsibly and ethically
  • Consider privacy laws (GDPR, etc.) when handling contact information

πŸ”„ Updates

Updating Dependencies

Backend:

cd backend
source venv/bin/activate
pip install --upgrade -r requirements.txt

Frontend:

cd frontend
npm update

πŸ“„ License

This project is for educational purposes. Please ensure you comply with all applicable laws and terms of service when using this tool.


Made with ❀️ for efficient data collection
