Merged
2 changes: 1 addition & 1 deletion .agent/README.md
@@ -379,7 +379,7 @@ npx @modelcontextprotocol/inspector scrapegraph-mcp
 ## 📅 Changelog

 ### April 2026
-- ✅ Migrated MCP client and tools to **API v2** ([scrapegraph-py#84](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84)): base `https://api.scrapegraphai.com/api/v2`, `SGAI-APIKEY` header (matches SDK wire format), new crawl/monitor/credits/history tools; removed sitemap, agentic_scrapper, status polling tools. Env vars aligned with SDK: `SGAI_API_URL`, `SGAI_TIMEOUT` (legacy alias `SGAI_TIMEOUT_S` still honored).
+- ✅ Migrated MCP client and tools to **API v2** ([scrapegraph-py#84](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84)): base `https://v2-api.scrapegraphai.com/api`, `SGAI-APIKEY` header (matches SDK wire format), new crawl/monitor/credits/history tools; removed sitemap, agentic_scrapper, status polling tools. Env vars aligned with SDK: `SGAI_API_URL`, `SGAI_TIMEOUT` (legacy alias `SGAI_TIMEOUT_S` still honored).
 - ✅ Added `monitor_activity` tool for paginated tick history (GET /monitor/:id/activity), mirroring `sgai.monitor.activity()` in scrapegraph-py v2.

 ### January 2026
8 changes: 4 additions & 4 deletions .agent/system/mcp_protocol.md
@@ -86,10 +86,10 @@ The **Model Context Protocol** (MCP) is an open standard that defines how AI ass
 └────────────┬────────────────────┘
              │ HTTPS API requests
-┌─────────────────────────────────┐
-│     ScrapeGraphAI API           │
-│  https://api.scrapegraphai.com  │
-└─────────────────────────────────┘
+┌───────────────────────────────────┐
+│     ScrapeGraphAI API             │
+│  v2-api.scrapegraphai.com/api     │
+└───────────────────────────────────┘
 ```

 ### FastMCP Framework
6 changes: 3 additions & 3 deletions .agent/system/project_architecture.md
@@ -130,7 +130,7 @@ AI Assistant (Claude/Cursor)
 ↓ (stdio via MCP)
 FastMCP Server (this project)
 ↓ (HTTPS API calls)
-ScrapeGraphAI API (default https://api.scrapegraphai.com/api/v2)
+ScrapeGraphAI API (default https://v2-api.scrapegraphai.com/api)
 ↓ (web scraping)
 Target Websites
 ```
@@ -141,7 +141,7 @@ The server follows a simple, single-file architecture:

 **`ScapeGraphClient` Class:**
 - HTTP client wrapper for ScrapeGraphAI API v2 ([scrapegraph-py#84](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84))
-- Base URL: `https://api.scrapegraphai.com/api/v2` (override with env `SGAI_API_URL`)
+- Base URL: `https://v2-api.scrapegraphai.com/api` (override with env `SGAI_API_URL`)
 - Auth: `SGAI-APIKEY`, `X-SDK-Version: scrapegraph-mcp@2.0.0` (matches scrapegraph-py v2)
 - v2 methods include `scrape_v2`, `extract`, `search_api`, `crawl_*`, `monitor_*`, `credits`, `history`, plus compatibility wrappers used by MCP tools
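
The client notes above describe the v2 wire format: an `SGAI-APIKEY` header, an `X-SDK-Version` header, and the new default base URL. As a rough sketch under stated assumptions, the headers and a monitor-activity URL might be assembled as follows. `build_headers` and `monitor_activity_url` are hypothetical helpers written for illustration, not functions from `server.py`, and appending the `/monitor/:id/activity` route from the changelog to the base URL is an assumption about how paths are joined.

```python
from urllib.parse import quote

# Values taken from the diff; the helper names below are illustrative only.
DEFAULT_API_BASE_URL = "https://v2-api.scrapegraphai.com/api"
MCP_SERVER_VERSION = "2.0.0"


def build_headers(api_key: str) -> dict[str, str]:
    """Return the two headers the diff documents for API v2 requests."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "SGAI-APIKEY": api_key,
        "X-SDK-Version": f"scrapegraph-mcp@{MCP_SERVER_VERSION}",
    }


def monitor_activity_url(monitor_id: str, base_url: str = DEFAULT_API_BASE_URL) -> str:
    """Join the base URL with the monitor activity route (assumed path layout)."""
    # Percent-encode the id so unusual characters cannot alter the path.
    return f"{base_url.rstrip('/')}/monitor/{quote(monitor_id, safe='')}/activity"
```

Keeping the base URL as a parameter mirrors the `SGAI_API_URL` override: the same helper works against the default host or a self-chosen one.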

@@ -391,7 +391,7 @@ If status is "completed":

 ### ScrapeGraphAI API

-**Base URL:** `https://api.scrapegraphai.com/api/v2` (configurable via `SGAI_API_URL`)
+**Base URL:** `https://v2-api.scrapegraphai.com/api` (configurable via `SGAI_API_URL`)

 **Authentication:**
 - Headers: `SGAI-APIKEY: <key>` (matches scrapegraph-py v2 wire format)
6 changes: 3 additions & 3 deletions README.md
@@ -28,11 +28,11 @@ A production-ready [Model Context Protocol](https://modelcontextprotocol.io/intr

 ## API v2

-This MCP server targets **ScrapeGraph API v2** (`https://api.scrapegraphai.com/api/v2`), aligned 1:1 with
+This MCP server targets **ScrapeGraph API v2** (`https://v2-api.scrapegraphai.com/api`), aligned 1:1 with
 [scrapegraph-py PR #84](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84). Auth uses the
 `SGAI-APIKEY` header. Environment variables mirror the Python SDK:

-- **`SGAI_API_URL`** — override the base URL (default `https://api.scrapegraphai.com/api/v2`)
+- **`SGAI_API_URL`** — override the base URL (default `https://v2-api.scrapegraphai.com/api`)
 - **`SGAI_TIMEOUT`** — request timeout in seconds (default `120`)
 - **`SGAI_API_KEY`** — API key (can also be passed via MCP `scrapegraphApiKey` or `X-API-Key` header)

@@ -670,7 +670,7 @@ For comprehensive developer documentation, see:

 ### API Integration
 - **ScrapeGraph AI API** - Enterprise web scraping service
-- **Base URL**: `https://api.scrapegraphai.com/v1`
+- **Base URL**: `https://v2-api.scrapegraphai.com/api`
 - **Authentication**: API key-based

 ## License
2 changes: 1 addition & 1 deletion server.json
@@ -24,7 +24,7 @@
 "name": "SGAI_API_KEY"
 },
 {
-"description": "Override API base URL (default https://api.scrapegraphai.com/api/v2)",
+"description": "Override API base URL (default https://v2-api.scrapegraphai.com/api)",
 "isRequired": false,
 "format": "string",
 "isSecret": false,
10 changes: 5 additions & 5 deletions src/scrapegraph_mcp/server.py
@@ -24,7 +24,7 @@
 Removed on v2 (no API equivalent): sitemap, agentic_scrapper, markdownify_status, smartscraper_status.

 Environment variables (match scrapegraph-py v2):
-- SGAI_API_URL (default https://api.scrapegraphai.com/api/v2) — base URL override
+- SGAI_API_URL (default https://v2-api.scrapegraphai.com/api) — base URL override
 - SGAI_TIMEOUT (default 120) — request timeout in seconds
 - SCRAPEGRAPH_API_BASE_URL — legacy alias for SGAI_API_URL (still honored)
 - SGAI_TIMEOUT_S — legacy alias for SGAI_TIMEOUT (still honored)
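
The environment variables listed above imply a lookup with fallbacks: the current name, then its legacy alias, then the default. A minimal sketch of that lookup is shown here, assuming the current name takes precedence over the alias (the diff does not show the actual precedence) and that a trailing slash is trimmed. `resolve_base_url` and `resolve_timeout` are hypothetical names and are not the server's own `_api_base_url()`.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Defaults as documented in the diff.
DEFAULT_API_BASE_URL = "https://v2-api.scrapegraphai.com/api"
DEFAULT_TIMEOUT_S = 120.0


def _first_set(env: Mapping[str, str], names: tuple[str, ...]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = (env.get(name) or "").strip()
        if value:
            return value
    return None


def resolve_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer SGAI_API_URL, then the legacy alias, then the default (assumed order)."""
    env = os.environ if env is None else env
    value = _first_set(env, ("SGAI_API_URL", "SCRAPEGRAPH_API_BASE_URL"))
    return value.rstrip("/") if value else DEFAULT_API_BASE_URL


def resolve_timeout(env: Optional[Mapping[str, str]] = None) -> float:
    """Prefer SGAI_TIMEOUT, then SGAI_TIMEOUT_S, then 120 seconds (assumed order)."""
    env = os.environ if env is None else env
    value = _first_set(env, ("SGAI_TIMEOUT", "SGAI_TIMEOUT_S"))
    return float(value) if value else DEFAULT_TIMEOUT_S
```

Passing the environment as a mapping keeps the lookup easy to exercise without mutating `os.environ`.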
@@ -88,8 +88,8 @@
 logger = logging.getLogger(__name__)

 MCP_SERVER_VERSION = "2.0.0"
-# Matches scrapegraph-py v2 (env.py): https://api.scrapegraphai.com/api/v2
-DEFAULT_API_BASE_URL = "https://api.scrapegraphai.com/api/v2"
+# Matches scrapegraph-py v2 (env.py): https://v2-api.scrapegraphai.com/api
+DEFAULT_API_BASE_URL = "https://v2-api.scrapegraphai.com/api"


 def _api_base_url() -> str:
@@ -662,7 +662,7 @@ def web_scraping_guide() -> str:
 1. Use **markdownify** or **scrape** before **smartscraper** when you only need readable text.
 2. Multi-page **AI** extraction: run **smartscraper** per URL, or use **monitor_create** on a schedule.
 3. Poll **smartcrawler_fetch_results** until the crawl finishes.
-4. Override API host with env **SGAI_API_URL** if needed (default `https://api.scrapegraphai.com/api/v2`).
+4. Override API host with env **SGAI_API_URL** if needed (default `https://v2-api.scrapegraphai.com/api`).
 """
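
Step 3 of the guide above (poll until the crawl finishes) can be illustrated with a small, transport-agnostic loop. This is only a sketch: `poll_until_completed` is a hypothetical helper, `fetch` stands in for whatever call wraps **smartcrawler_fetch_results**, and the response shape (a dict with a `status` key that eventually equals `"completed"`) is an assumption rather than a documented contract.

```python
import time
from typing import Any, Callable


def poll_until_completed(
    fetch: Callable[[], dict[str, Any]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch() repeatedly until it reports status == "completed"."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result.get("status") == "completed":
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts:
            sleep(interval_s)
    raise TimeoutError(f"crawl not completed after {max_attempts} attempts")
```

Injecting `sleep` keeps the loop testable without real delays; a caller would normally leave the default in place.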


@@ -727,7 +727,7 @@ def api_status() -> str:
     return """# ScapeGraph API Status (MCP v2)

 - **MCP package version**: 2.0.0 (matches [scrapegraph-py#84](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84) API surface)
-- **Default API base**: `https://api.scrapegraphai.com/api/v2` (override with `SGAI_API_URL`)
+- **Default API base**: `https://v2-api.scrapegraphai.com/api` (override with `SGAI_API_URL`)
 - **Auth headers**: `SGAI-APIKEY`, `X-SDK-Version: scrapegraph-mcp@2.0.0`

 ## Tools