Graphclone is a structural website mapping tool that crawls complex site hierarchies and converts them into a normalized graph abstraction. The resulting structure feeds a TypeScript/React wireframe builder that generates visual documentation and exportable artifacts for engineering scoping and architectural analysis. Graphclone is intended for authorized environments such as internal systems, owned properties, or sites where you have explicit permission to perform automated analysis.
- Traverses nested directory structures and dynamic routing patterns while preventing infinite loops and redundant traversal.
- WAIT_FOR_LOGIN allows the user to log in with their credentials as they normally would.
- Keeps passwords private by never requiring your password to be entered into Graphclone.
- Enables scraping of gated content.
- Applies structural heuristics to normalize dynamic URLs
- Handles parameterized routes to prevent duplicate mapping
- Configurable thresholds for predictable outputs
- Raw HTML snapshot
- Page screenshot
- Canonicalized route metadata
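The loop prevention and URL normalization described above can be illustrated with a minimal sketch. It is not Graphclone's actual implementation: the names `canonicalize`, `crawl`, `get_links`, and `MAX_DEPTH`, the `:id` placeholder, and the specific heuristics (numeric or UUID path segments count as dynamic, tracking parameters are dropped) are all assumptions chosen for illustration.

```python
import re
from collections import deque
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed heuristic: numeric or UUID path segments are dynamic identifiers.
_DYNAMIC = re.compile(
    r"^(\d+|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$",
    re.IGNORECASE,
)
# Assumed list of query parameters that do not change page structure.
_IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}
MAX_DEPTH = 3  # hypothetical threshold, not a documented Graphclone setting


def canonicalize(url: str) -> str:
    """Collapse a concrete URL into a route key such as /users/:id."""
    parts = urlsplit(url)
    segments = [
        ":id" if _DYNAMIC.match(seg) else seg
        for seg in parts.path.split("/")
        if seg
    ]
    # Drop ignored parameters and sort the rest so ordering does not matter.
    query = urlencode(sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k not in _IGNORED_PARAMS
    ))
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        "/" + "/".join(segments),
        query,
        "",  # fragments never reach the server, so they are discarded
    ))


def crawl(start: str, get_links, max_depth: int = MAX_DEPTH) -> list[str]:
    """Breadth-first traversal that visits each canonical route once.

    get_links is a caller-supplied function (hypothetical here) that
    returns the outgoing links of a page.
    """
    first = canonicalize(start)
    seen = {first}
    order = [first]
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth threshold keeps the crawl bounded
        for link in get_links(url):
            route = canonicalize(link)
            if route not in seen:  # cycles and duplicates are skipped here
                seen.add(route)
                order.append(route)
                queue.append((link, depth + 1))
    return order
```

Keying the visited set on the canonical route rather than the raw URL is what lets `/users/1` and `/users/2` count as one page template, and the depth limit bounds the crawl even on sites that generate links endlessly.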
Structured outputs feed directly into a Next.js TypeScript application in /wireframe that reconstructs page layouts and component hierarchies for review, documentation, and scoping workflows.
- Scraper (main.py)
- Asynchronous Python
- Playwright-driven browser automation
- HTML snapshots
- screenshot.png
- Route metadata
- Wireframe Builder (/wireframe)
- Next.js
- TypeScript
- Consumes structural mapping output
- Renders interactive wireframes
- Supports PDF export
- Install Python Dependencies
pip install -r requirements.txt
playwright install chrome
- Set up the Next.js Wireframe Builder
cd wireframe
npm install
Usage
- Run the Mapper
python main.py
- Authenticated Mode (Optional)
In config.py:
WAIT_FOR_LOGIN = True
When enabled:
  - A visible browser window opens.
  - Log into your website as you normally would.
  - Return to the terminal and press Enter.
  - Crawling continues using the authenticated session.
Authenticated runs are labeled with the suffix _logged_in to distinguish them from public crawls.
- Feed the scrape results into an LLM of your choice (.github/agents/graphclone-builder.md is recommended) to output to /wireframe
- Generate your product requirements document with
python generate_prd.py
python -m pytest tests -q
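For the authenticated mode described earlier, the manual login handoff might look roughly like the sketch below. It uses Playwright's async API, but the function names `open_session` and `run_label`, the return value, and the exact form of the `_logged_in` suffix are assumptions for illustration, not code taken from main.py.

```python
import asyncio

WAIT_FOR_LOGIN = True  # mirrors the flag set in config.py


def run_label(site: str, logged_in: bool) -> str:
    """Hypothetical naming helper: mark authenticated runs with a suffix."""
    return f"{site}_logged_in" if logged_in else site


async def open_session(start_url: str, wait_for_login: bool = WAIT_FOR_LOGIN) -> dict:
    """Open a browser, optionally pause for a manual login, return session state."""
    # Imported lazily so run_label stays usable without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # A visible (headed) window is only needed when the user must log in.
        browser = await p.chromium.launch(
            headless=not wait_for_login, channel="chrome"
        )
        context = await browser.new_context()
        page = await context.new_page()
        await page.goto(start_url)
        if wait_for_login:
            # The user types credentials into the site itself; this script
            # never sees them. input() runs in a thread so the event loop
            # is not blocked while waiting for Enter.
            await asyncio.to_thread(
                input, "Log in in the browser window, then press Enter... "
            )
        # Cookies and local storage from the login now live in the context.
        state = await context.storage_state()
        await browser.close()
        return state
```

Calling `asyncio.run(open_session("https://example.com"))` would open the window and wait for Enter. In a real crawl the same browser context would be reused for subsequent page visits instead of being closed, so the authenticated session carries over.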