docs: add Markdown MCP docs generator (Docusaurus- and pdoc-compatible) #1015
Introduces `scripts/generate_mcp_markdown.py` (exposed via `poe mcp-docs-md`) which introspects the MCP server with `fastmcp inspect` and renders a small set of Markdown files under `docs/mcp-generated/`:
- `index.md`: server overview + counts + TOC
- `tools.md`: one H2 per tool with a GFM parameters table and collapsible input/output JSON schemas
- `resources.md`: concrete resources and resource templates
- `prompts.md`: prompts and their arguments

Formatting is modeled on `mcpdocs-gen` (evaluated in PR #1013) but emitted as plain CommonMark + GFM + YAML front-matter + `<details>` blocks, so the pages render correctly in both Docusaurus and `pdoc` without MDX-only components. Every tool/resource/prompt has a stable slug anchor for deep-linking.
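To illustrate the slug anchors described above, here is a minimal sketch of how a stable anchor could be derived from a primitive name. The helpers `slugify` and `deep_link` and the normalization rules are hypothetical; the PR text does not show how the script actually computes its slugs.

```python
import re


def slugify(name: str) -> str:
    """Hypothetical helper: turn a primitive name into a URL-safe anchor."""
    # Lowercase, then collapse every run of characters outside
    # [a-z0-9_-] into a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", name.lower())
    # Trim hyphens left over from leading/trailing punctuation.
    return slug.strip("-")


def deep_link(page: str, name: str) -> str:
    """Hypothetical helper: build a page-relative deep link to one primitive."""
    return f"{page}#{slugify(name)}"
```

Because the slug depends only on the name, a link such as `deep_link("tools.md", "list_connectors")` stays valid across regenerations as long as the tool is not renamed.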
👋 Greetings, Airbyte Team Member! Here are some helpful tips and reminders for your convenience.
Testing This PyAirbyte Version
You can test this version of PyAirbyte using the following:
# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1776409252-mcp-markdown-docs' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1776409252-mcp-markdown-docs'
Community Support
Questions? Join the #pyairbyte channel in our Slack workspace.
Walkthrough
Adds a CLI script and Poe task to generate CommonMark docs for a FastMCP server.
Sequence Diagram
sequenceDiagram
participant User
participant Poe as "poe mcp-docs-md"
participant Script as "generate_mcp_markdown.py"
participant FastMCP as "fastmcp inspect"
participant Parser as "JSON parser"
participant Renderer as "Markdown writer"
participant FS as "filesystem (docs/mcp-generated/)"
User->>Poe: run mcp-docs-md
Poe->>Script: invoke script
Script->>FastMCP: execute `fastmcp inspect --server-spec`
FastMCP-->>Script: return JSON report
Script->>Parser: parse report & group primitives
Parser-->>Script: structured primitives
Script->>Renderer: render index + module pages (MD + anchors + schemas)
Renderer->>FS: write files to docs/mcp-generated/
Script-->>User: exit code / messages
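The "parse report & group primitives" and "render" steps in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's code: the `mcp_module` key is taken from the review summary, while the exact shape of the `fastmcp inspect` report, the function names, and the table columns are hypothetical.

```python
from collections import defaultdict


def group_by_module(tools: list[dict]) -> dict[str, list[dict]]:
    """Bucket tool entries by an assumed 'mcp_module' key; unknowns go to 'misc'."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for tool in tools:
        grouped[tool.get("mcp_module") or "misc"].append(tool)
    return dict(grouped)


def params_table(input_schema: dict) -> str:
    """Render a JSON-schema 'properties' mapping as a GFM table."""
    props = input_schema.get("properties", {})
    required = set(input_schema.get("required", []))
    if not props:
        return "_No parameters._"
    rows = ["| Name | Type | Required | Description |", "|---|---|---|---|"]
    for name, spec in props.items():
        # Pipes and newlines would break the table row, so neutralize them.
        desc = str(spec.get("description", "")).replace("|", "\\|").replace("\n", " ")
        flag = "yes" if name in required else "no"
        rows.append(f"| `{name}` | {spec.get('type', 'any')} | {flag} | {desc} |")
    return "\n".join(rows)
```

Escaping `|` in descriptions matters because GFM treats an unescaped pipe as a cell separator, even inside ordinary prose.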
Estimated code review effort: 4 (Complex), ~45 minutes
Would you like a deeper walkthrough of the generation script internals or unit-test suggestions, wdyt?
Pre-merge checks: 3 passed
- UTF-8 encoding on file I/O (Windows locale safety)
- json.dumps(v) instead of repr(v) for enum values (JSON-shaped output)
- Split 'FastMCP version' into 'Protocol version' + 'FastMCP version' lines
- Fix `{'./tools'}` no-op f-string → explicit `./tools.md` links
- Add minimal safety guard refusing to rmtree '/', HOME, or CWD
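The first two fixes in this commit can be illustrated with a short sketch. The helper name `fmt_enum` and the sample values are hypothetical; only the `json.dumps`-instead-of-`repr` choice and the explicit UTF-8 encoding come from the commit message.

```python
import json
import tempfile
from pathlib import Path


def fmt_enum(values: list) -> str:
    """Hypothetical helper: render enum values as JSON literals in inline code."""
    return ", ".join(f"`{json.dumps(v)}`" for v in values)


# repr() would give Python-shaped output ('full_refresh', None, True);
# json.dumps() gives JSON-shaped output that matches the schema itself.
enum_cell = fmt_enum(["full_refresh", None, True])

# Passing encoding="utf-8" explicitly avoids depending on the platform
# locale, which on Windows may not be UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "tools.md"
    page.write_text("Hints: `read-only` · `idempotent`\n", encoding="utf-8")
    round_trip = page.read_text(encoding="utf-8")
```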
- Strip backticks from H1/H2/H3 in generated markdown; pdoc's TOC
extractor was emitting an unbalanced <code> tag in the sidebar that
leaked through the page as monospace rendering.
- Add __all__ = [] to airbyte/mcp/{cloud,local,registry,prompts}.py so
pdoc hides the redundant Python-side tool declarations and uses the
markdown include as the single source of truth on the page.
- Bump pdoc markdown toc depth from 2 to 3 so per-tool H3 anchors show
up in the left-nav.
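A minimal sketch of the heading clean-up described in the first bullet, assuming the generator post-processes rendered Markdown line by line. The function name and the fence handling are hypothetical; the commit message only states that backticks are stripped from H1/H2/H3.

```python
import re

# Matches H1-H3 ATX headings only; H4 and deeper are left alone.
_HEADING = re.compile(r"^(#{1,3} )(.*)$")
# Built at runtime so this sketch stays easy to embed in Markdown.
_FENCE = "`" * 3


def strip_heading_backticks(markdown: str) -> str:
    """Remove inline-code backticks from H1-H3 headings, skipping fenced code."""
    out = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(_FENCE):
            # Toggle on every fence line and emit it unchanged.
            in_fence = not in_fence
            out.append(line)
            continue
        if not in_fence:
            match = _HEADING.match(line)
            if match:
                line = match.group(1) + match.group(2).replace("`", "")
        out.append(line)
    return "\n".join(out)
```

Body text keeps its inline code; only heading lines are touched, which is what matters for the sidebar TOC.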
- custom.css: progressively indent H3-and-deeper entries in pdoc's left sidebar TOC so per-tool anchors visually nest under the 'Tools (N)' H2 and the '<module> module' H1. pdoc's default layout.css uses a single indent step for all non-top-level entries, which made H2 and H3 render at the same depth.
- generate_mcp_markdown.py: drop the inline 'Index: tool_a, tool_b, …' row from module pages. The left nav now lists every tool under its section, so the inline list was redundant.
Pull request overview
Adds a first-party Markdown documentation generator for the PyAirbyte MCP server (via fastmcp inspect) and wires the generated output into the existing pdoc docs flow via module-level includes.
Changes:
- Introduces `scripts/generate_mcp_markdown.py` to render MCP surface area into `docs/mcp-generated/` Markdown (per-module pages + index).
- Adds a Poe task (`poe mcp-docs-md`) and gitignores the generated output directory.
- Updates `pdoc` generation to expose deeper (H3) anchors in the sidebar, and updates MCP modules to include generated Markdown + hide symbols from API listings.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/generate_mcp_markdown.py | New generator: runs fastmcp inspect, buckets primitives by mcp_module, emits Markdown pages + index. |
| pyproject.toml | Adds poe mcp-docs-md task to run the generator. |
| docs/generate.py | Monkey-patches pdoc TOC depth to include H3 entries in sidebar. |
| docs/CONTRIBUTING.md | Adds contributor instructions for regenerating MCP Markdown docs. |
| airbyte/mcp/cloud.py | Includes generated MCP Markdown and suppresses public API listing via __all__ = []. |
| airbyte/mcp/local.py | Includes generated MCP Markdown and suppresses public API listing via __all__ = []. |
| airbyte/mcp/registry.py | Includes generated MCP Markdown and suppresses public API listing via __all__ = []. |
| airbyte/mcp/prompts.py | Includes generated MCP Markdown and suppresses public API listing via __all__ = []. |
| .gitignore | Ignores docs/mcp-generated/ output directory. |
- scripts/generate_mcp_markdown.py: refresh module docstring to match current behavior (per-module output, no front-matter on module pages, DEFAULT_SERVER_SPEC is a .py path not a dotted module).
- scripts/generate_mcp_markdown.py: guard _render_index against empty instructions (splitlines()[0] raised IndexError).
- scripts/generate_mcp_markdown.py: tighten _prepare_output_dir to require --output to be strictly inside the current working directory (rejects /, ~, .., and arbitrary absolute paths outside the repo).
- docs/generate.py: regenerate docs/mcp-generated/ before pdoc so .. include:: directives resolve on a clean checkout (docs/mcp-generated is git-ignored). Falls back to a warning if generation fails.
- docs/CONTRIBUTING.md: describe actual per-module output layout (index.md + cloud/local/registry/prompts/misc) and deep-link shape.
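The `IndexError` fixed in the second bullet arises because `"".splitlines()` returns an empty list. A minimal sketch of a guard, with a hypothetical helper name (the actual code inside `_render_index` is not shown in this PR text):

```python
from typing import Optional


def first_line(instructions: Optional[str]) -> str:
    """Hypothetical helper: first line of the server instructions, or ''."""
    # "".splitlines() == [], so indexing [0] directly would raise IndexError
    # for servers that ship empty or whitespace-only instructions.
    lines = (instructions or "").strip().splitlines()
    return lines[0] if lines else ""
```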
…sfy deptry
The previous static 'from generate_mcp_markdown import ...' triggered deptry's DEP001 rule (the script lives under scripts/ which is not on sys.path, so deptry treated it as a missing external dependency). Use importlib.util.spec_from_file_location to load the module from its on-disk path instead.
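Loading a script by file path with `importlib.util.spec_from_file_location` generally looks like the sketch below. The helper name `load_module_from_path` and the throwaway `demo_script.py` are hypothetical stand-ins; the real call site in `docs/generate.py` is not shown here.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a Python file by path without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration with a temporary stand-in for the real script.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo_script.py"
    script.write_text("def generate():\n    return 'ok'\n", encoding="utf-8")
    demo = load_module_from_path("demo_script", script)
    result = demo.generate()
```

Because no static `import` statement names the module, a dependency checker that scans import statements has nothing to flag.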
…itives
Every tool / prompt / resource is now rendered in a stable alphabetical order inside each module page (case-insensitive sort by name/uri), and the 'misc' catch-all module is pinned last in the module table. Module order on the index is alphabetical.

For each tool we now surface MCP tool-annotation hints as inline-code badges right below the H3: 'read-only', 'destructive', 'idempotent', 'open-world'. Hints are only rendered when explicitly True, so a tool like 'list_cloud_workspaces' shows '`read-only` · `idempotent` · `open-world`' while 'permanently_delete_cloud_connection' shows '`destructive` · `open-world`'. An optional human-readable 'annotations.title' override (distinct from the top-level title) is also surfaced when present.
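The ordering and badge rules described in this commit could be sketched as follows. The function names are hypothetical, and the assumption that the inspect report carries the MCP annotation fields under their spec names (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`) is mine rather than something the commit message states.

```python
from typing import Optional

# (annotation key, badge label) in the order the badges are displayed.
_HINTS = (
    ("readOnlyHint", "read-only"),
    ("destructiveHint", "destructive"),
    ("idempotentHint", "idempotent"),
    ("openWorldHint", "open-world"),
)


def sort_primitives(items: list[dict]) -> list[dict]:
    """Case-insensitive sort by name, falling back to uri for resources."""
    return sorted(items, key=lambda i: str(i.get("name") or i.get("uri") or "").casefold())


def sort_modules(names: list[str]) -> list[str]:
    """Alphabetical order, with the 'misc' catch-all pinned last."""
    return sorted(names, key=lambda n: (n == "misc", n.casefold()))


def hint_badges(annotations: Optional[dict]) -> str:
    """Render only hints that are explicitly True as inline-code badges."""
    ann = annotations or {}
    return " · ".join(f"`{label}`" for key, label in _HINTS if ann.get(key) is True)
```

Checking `is True` rather than truthiness keeps a missing or `None` hint from being rendered as if the server had asserted it.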
…on, anchor output path to repo root
Addresses three CodeRabbit findings on commit dffeaef:

1. `_run_fastmcp_inspect` now passes `timeout=120` to `subprocess.run` and translates `TimeoutExpired` into an actionable `RuntimeError`. Previously a hung `fastmcp inspect` (blocking import, stalled network I/O during tool registration, etc.) would make `poe docs-generate` / `poe mcp-docs-md` hang indefinitely rather than fail loudly in CI.
2. `_resolve_extra_module_map` now iterates the private `fastmcp_extensions.decorators._REGISTERED_{PROMPTS,RESOURCES}` tuples *inside* the same `try`/`except Exception` that imports them. Previously any shape drift in those private tuples (third element added, `ann` becoming a dataclass, etc.) would escape the guard and abort doc generation; now the function falls back to an empty map exactly as its docstring promises.
3. `_prepare_output_dir` is now anchored to the repo root (derived from `__file__`), not `Path.cwd()`. `DEFAULT_OUTPUT` is a repo-relative path, so anchoring to cwd meant running `poe mcp-docs-md` from inside `docs/` (or anywhere other than the repo root) would silently write into the wrong directory while still passing the strict `is_relative_to(cwd)` guard. A new `_resolve_output_dir` helper encapsulates the relative-to-repo-root resolution; the existing safety guard semantics are preserved (repo root itself is rejected, absolute paths outside the repo root are rejected).
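The timeout handling in item 1 follows a common `subprocess` pattern, sketched below. The wrapper name `run_with_timeout` and the error wording are hypothetical; only the `timeout=120` value and the `TimeoutExpired` to `RuntimeError` translation come from the commit message.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout: float = 120) -> str:
    """Run a command, returning stdout; fail loudly instead of hanging."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before re-raising, so no process
        # is left behind; we only need to make the failure actionable.
        raise RuntimeError(
            f"{cmd[0]!r} did not finish within {timeout} seconds; "
            "check for blocking imports or stalled network I/O."
        ) from exc
    return proc.stdout


# Usage with a stand-in command that prints an empty JSON object.
report_text = run_with_timeout([sys.executable, "-c", "print('{}')"])
```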
Follow-up to cbb1364. Devin Review caught that `_prepare_output_dir` was resolving paths against `_REPO_ROOT` for mkdir/rmtree while the caller in `generate()` still used the raw (cwd-relative) `output` for `write_text`, so running from a subdirectory would prepare `<repo>/docs/mcp-generated/` but then try to write to `<cwd>/docs/mcp-generated/` (which doesn't exist) and raise `FileNotFoundError`. `_prepare_output_dir` now returns the resolved absolute path, and `generate()` routes all subsequent file writes through it, so the two always agree regardless of where the task is invoked from.
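The resolve-then-return contract described in these two commits can be sketched like this. The function names mirror the ones in the commit messages without the leading underscore, but the bodies are an assumption-laden reconstruction, and `repo_root` is passed explicitly here instead of being derived from `__file__`.

```python
import shutil
from pathlib import Path


def resolve_output_dir(output: Path, repo_root: Path) -> Path:
    """Resolve `output` against the repo root and refuse unsafe targets."""
    root = repo_root.resolve()
    resolved = (output if output.is_absolute() else root / output).resolve()
    # Reject the repo root itself and anything not strictly inside it.
    if resolved == root or root not in resolved.parents:
        raise ValueError(f"Refusing to use {resolved} as an output directory")
    return resolved


def prepare_output_dir(output: Path, repo_root: Path) -> Path:
    """Wipe and recreate the output directory; return the resolved path."""
    resolved = resolve_output_dir(output, repo_root)
    if resolved.exists():
        shutil.rmtree(resolved)
    resolved.mkdir(parents=True)
    # Callers should write through this return value so the directory
    # that was prepared and the directory that is written to always agree.
    return resolved
```

Returning the resolved path is what closes the gap Devin Review found: the caller no longer has a second, cwd-relative notion of where the files go.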
…oc-compatible)
Vendors the MCP markdown generator from airbytehq/PyAirbyte#1015 as a first-party public API under `fastmcp_extensions.utils.docs`, exposing a `generate_markdown_docs` function and a CLI entry point (`python -m fastmcp_extensions.utils.docs`). This lets consumers (PyAirbyte, airbyte-ops-mcp, ...) generate Markdown docs for their MCP server without carrying an inline copy of the generator script. Moving the generator into this package lets us drop the `noqa: PLC2701` that cross-package access to `_REGISTERED_*` required, because it is no longer a private-name access across packages; it is an internal reference.
Link to Devin session: https://app.devin.ai/sessions/52a3cf7bc9084a39b7dfda021c4116d5
Summary
Adds a small, first-party script that introspects the PyAirbyte MCP server via `fastmcp inspect` and renders a Markdown documentation site under `docs/mcp-generated/` (git-ignored). Modeled on the `mcpdocs-gen` output evaluated in #1013, but emits CommonMark + GFM tables + YAML front-matter + `<details>` blocks instead of a self-contained HTML site, so the pages can be hosted in Docusaurus or rendered alongside `pdoc` without MDX-only components.

Four files are produced:
- `index.md`: server overview (name, version, instructions, counts, TOC)
- `tools.md`: one H2 per tool with a parameters table and collapsible input/output JSON schemas
- `resources.md`: concrete resources + resource templates
- `prompts.md`: prompts and their arguments

Every tool/resource/prompt name gets a stable slug anchor (e.g. `tools.md#list_connectors`) for deep-linking.

New `poe mcp-docs-md` task wires up the script. `.gitignore` ignores the output directory. `docs/CONTRIBUTING.md` picks up a short section explaining how to regenerate.

Smoke-run on this branch produced a site for 51 tools, 1 resource, 1 prompt. Lint (`ruff check` / `ruff format --check`) and type-check (`pyrefly check`) pass locally.

This PR is deliberately parallel to #1013 (the `mcpdocs-gen` evaluation) so the two approaches can be compared side-by-side before picking one.

Review & Testing Checklist for Human
- … `docs/mcp-generated/` into a real Docusaurus site and confirm front-matter, `{#anchor}` heading IDs, tables, and `<details>` blocks all render as expected. The compatibility claim is based on spec-reading, not end-to-end rendering.
- `pdoc` compatibility: eyeball the generated pages next to the existing `docs/generated/` (pdoc) output and confirm they coexist cleanly. Same caveat: not end-to-end rendered.
- … `docs/mcp-generated/tools.md`: unions (`string | null`), enums, arrays of objects, and tools with no parameters. The `_fmt_type` helper handles `anyOf`/`oneOf`/`enum`/`array`/`object` but does not resolve `$ref` or `allOf`; if PyAirbyte tools start using those, the type column will collapse to `any`/`object`.
- `shutil.rmtree(output)` behavior: the generator unconditionally wipes and recreates `--output`. Fine for the default `docs/mcp-generated/`, worth a second look if you foresee callers pointing this at a shared/sensitive path.

Suggested manual test:
git fetch origin devin/1776409252-mcp-markdown-docs
git checkout devin/1776409252-mcp-markdown-docs
uv sync --group dev
poe mcp-docs-md
# Open docs/mcp-generated/tools.md in a Markdown previewer and spot-check 3 tools.

Notes
- `fastmcp-gen` is NOT added as a project dep; the script shells out to the `fastmcp` CLI that already ships in the dev group.
- … `poe mcp-docs-md`.

Link to Devin session: https://app.devin.ai/sessions/359e794efeb844b2a8adf02b5831f999
Requested by: Aaron ("AJ") Steers (@aaronsteers)
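For reviewers checking the type column, the behavior attributed to `_fmt_type` in the checklist (handles `anyOf`/`oneOf`/`enum`/`array`/`object`, does not resolve `$ref` or `allOf`) can be approximated by the sketch below. This is a hypothetical reconstruction for illustration, not the helper's actual code, and the `array<...>` notation is my own choice.

```python
import json


def fmt_type(schema: dict) -> str:
    """Hypothetical sketch: summarize a JSON-schema fragment as a short type label."""
    for key in ("anyOf", "oneOf"):
        if key in schema:
            # Unions such as string | null are rendered member by member.
            return " | ".join(fmt_type(sub) for sub in schema[key])
    if "enum" in schema:
        return " | ".join(json.dumps(value) for value in schema["enum"])
    kind = schema.get("type")
    if kind == "array":
        # Assumes 'items' is a single schema object, not a tuple-style list.
        return f"array<{fmt_type(schema.get('items', {}))}>"
    if isinstance(kind, list):
        return " | ".join(kind)
    # No '$ref' / 'allOf' resolution: such schemas have no usable 'type'
    # at this level and fall through to 'any'.
    return kind or "any"
```

The fall-through on the last line is the collapse the checklist warns about: a parameter defined only through `$ref` would show up as `any` until reference resolution is added.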