docs: add Markdown MCP docs generator (Docusaurus- and pdoc-compatible) #1015

Merged
Aaron ("AJ") Steers (aaronsteers) merged 10 commits into main from devin/1776409252-mcp-markdown-docs
Apr 17, 2026
Conversation

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Apr 17, 2026

Summary

Adds a small, first-party script that introspects the PyAirbyte MCP server via fastmcp inspect and renders a Markdown documentation site under docs/mcp-generated/ (git-ignored). Modeled on the mcpdocs-gen output evaluated in #1013, but emits CommonMark + GFM tables + YAML front-matter + <details> blocks instead of a self-contained HTML site, so the pages can be hosted in Docusaurus or rendered alongside pdoc without MDX-only components.

Four files are produced:

  • index.md — server overview (name, version, instructions, counts, TOC)
  • tools.md — one H2 per tool with a parameters table and collapsible input/output JSON schemas
  • resources.md — concrete resources + resource templates
  • prompts.md — prompts and their arguments

Every tool/resource/prompt name gets a stable slug anchor (e.g. tools.md#list_connectors) for deep-linking.
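As a rough illustration of the slug anchors described above, here is a minimal sketch. The `slugify` and `deep_link` helpers are hypothetical names, not taken from the PR's script, and the exact character rules the generator applies are an assumption; the only behavior confirmed by the description is that `list_connectors` maps to `tools.md#list_connectors`.

```python
import re


def slugify(name: str) -> str:
    """Hypothetical helper: lower-case and collapse unsafe runs into a hyphen.

    Underscores are kept so that `list_connectors` stays `list_connectors`.
    """
    slug = re.sub(r"[^a-z0-9_-]+", "-", name.strip().lower())
    return slug.strip("-")


def deep_link(page: str, name: str) -> str:
    """Build a `page#anchor` link for a tool, resource, or prompt name."""
    return f"{page}#{slugify(name)}"


print(deep_link("tools.md", "list_connectors"))
```

Keeping the slug a pure function of the primitive name is what makes the anchor stable across regenerations.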

New poe mcp-docs-md task wires up the script. .gitignore ignores the output directory. docs/CONTRIBUTING.md picks up a short section explaining how to regenerate.

Smoke-run on this branch produced a site for 51 tools, 1 resource, 1 prompt. Lint (ruff check / ruff format --check) and type-check (pyrefly check) pass locally.

This PR is deliberately parallel to #1013 (the mcpdocs-gen evaluation) so the two approaches can be compared side-by-side before picking one.

Review & Testing Checklist for Human

  • Docusaurus compatibility — drop docs/mcp-generated/ into a real Docusaurus site and confirm front-matter, {#anchor} heading IDs, tables, and <details> blocks all render as expected. The compatibility claim is based on spec-reading, not end-to-end rendering.
  • pdoc compatibility — eyeball the generated pages next to the existing docs/generated/ (pdoc) output and confirm they coexist cleanly. Same caveat: not end-to-end rendered.
  • Schema-shape edge cases — spot-check 3+ tool sections in docs/mcp-generated/tools.md: unions (string | null), enums, arrays of objects, and tools with no parameters. The _fmt_type helper handles anyOf/oneOf/enum/array/object but does not resolve $ref or allOf; if PyAirbyte tools start using those, the type column will collapse to any/object.
  • shutil.rmtree(output) behavior — the generator unconditionally wipes and recreates --output. Fine for the default docs/mcp-generated/, worth a second look if you foresee callers pointing this at a shared/sensitive path.
  • Decide whether to adopt this, feat: prototype MCP docs generation via mcpdocs-gen #1013, or both. This is a prototype; no tests added.
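To make the schema-shape caveat in the checklist concrete, the following is a simplified sketch of a type formatter in the spirit of `_fmt_type`. It is not the PR's implementation: the function name `fmt_type`, the `array<...>` notation, and the `any` fallback are assumptions chosen to mirror the behavior described (unions, enums, and arrays are handled; `$ref` and `allOf` are not resolved).

```python
from __future__ import annotations

import json
from typing import Any


def fmt_type(schema: dict[str, Any]) -> str:
    """Render a JSON Schema fragment as a short type label (illustrative only)."""
    if "enum" in schema:
        # Enum members are shown JSON-shaped, e.g. "a" | "b".
        return " | ".join(json.dumps(v) for v in schema["enum"])
    for key in ("anyOf", "oneOf"):
        if key in schema:
            return " | ".join(fmt_type(sub) for sub in schema[key])
    kind = schema.get("type")
    if kind == "array":
        return f"array<{fmt_type(schema.get('items', {}))}>"
    if isinstance(kind, list):
        return " | ".join(kind)
    if isinstance(kind, str):
        return kind
    # `$ref` and `allOf` are deliberately not resolved, so they end up here.
    return "any"


print(fmt_type({"anyOf": [{"type": "string"}, {"type": "null"}]}))
print(fmt_type({"$ref": "#/$defs/ConnectorConfig"}))
```

The second call shows the failure mode the checklist warns about: an unresolved `$ref` loses its real type in the parameters table.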

Suggested manual test:

git fetch origin devin/1776409252-mcp-markdown-docs
git checkout devin/1776409252-mcp-markdown-docs
uv sync --group dev
poe mcp-docs-md
# Open docs/mcp-generated/tools.md in a Markdown previewer and spot-check 3 tools.

Notes

  • No MCP server tool definitions, docstrings, or signatures were modified; any doc-quality gaps reflect the current state of tool annotations.
  • mcpdocs-gen is NOT added as a project dep; the script shells out to the fastmcp CLI that already ships in the dev group.
  • Output directory is git-ignored on purpose; regenerate on demand via poe mcp-docs-md.

Link to Devin session: https://app.devin.ai/sessions/359e794efeb844b2a8adf02b5831f999
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Summary by CodeRabbit

  • Documentation

    • New guidance for generating MCP server Markdown (index, modules, tools, prompts, resources) and inclusion into module help/docs.
    • Docs generator integrates outputs into the docs site, adds deeper H3 anchors in the TOC, and subtly adjusts sidebar nesting spacing.
    • Module docs now hide internal symbols from public API listings.
  • Chores

    • Added ignore rule for generated MCP docs and an automated task/CLI to produce them; docs are regenerated during doc builds (nonblocking on failure).

Introduces `scripts/generate_mcp_markdown.py` (exposed via `poe mcp-docs-md`)
which introspects the MCP server with `fastmcp inspect` and renders a small
set of Markdown files under `docs/mcp-generated/`:

- index.md — server overview + counts + TOC
- tools.md — one H2 per tool with a GFM parameters table and collapsible
  input/output JSON schemas
- resources.md — concrete resources and resource templates
- prompts.md — prompts and their arguments

Formatting is modeled on `mcpdocs-gen` (evaluated in PR #1013) but emitted
as plain CommonMark + GFM + YAML front-matter + `<details>` blocks, so the
pages render correctly in both Docusaurus and `pdoc` without MDX-only
components. Every tool/resource/prompt has a stable slug anchor for
deep-linking.
coderabbitai Bot commented Apr 17, 2026

Walkthrough

Adds a CLI script and Poe task to generate CommonMark docs for a FastMCP server into docs/mcp-generated/, updates CONTRIBUTING and pdoc config, git-ignores the generated output, and embeds the generated pages into several airbyte/mcp module docstrings while adding __all__ = [] to those modules.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Generation script: scripts/generate_mcp_markdown.py | New CLI that runs fastmcp inspect, parses the JSON report, groups primitives by module, and writes index.md plus one `<module>.md` per module into docs/mcp-generated/ with tables, anchors, and schema blocks; includes safety checks for output path and error reporting. |
| Project config & docs: pyproject.toml, docs/CONTRIBUTING.md | Added a Poe task mcp-docs-md to invoke the generator and documented the generation workflow, artifact layout, and regeneration instructions in CONTRIBUTING. |
| Generated docs ignore: .gitignore | Added docs/mcp-generated/ to ignore generated documentation files. |
| Module docstring includes & exports: airbyte/mcp/cloud.py, airbyte/mcp/local.py, airbyte/mcp/prompts.py, airbyte/mcp/registry.py | Inserted `.. include:: ../../docs/mcp-generated/<module>.md` into module docstrings and added `__all__: list[str] = []` to prevent doc tooling from exposing individual tool/helper symbols as the public API. |
| Docs build integration & styling: docs/generate.py, docs/templates/custom.css | Added invocation to regenerate MCP markdown before pdoc (tolerating failures) and set pdoc TOC depth to 3; adjusted sidebar CSS padding for deeper TOC nesting. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Poe as "poe mcp-docs-md"
    participant Script as "generate_mcp_markdown.py"
    participant FastMCP as "fastmcp inspect"
    participant Parser as "JSON parser"
    participant Renderer as "Markdown writer"
    participant FS as "filesystem (docs/mcp-generated/)"

    User->>Poe: run mcp-docs-md
    Poe->>Script: invoke script
    Script->>FastMCP: execute `fastmcp inspect --server-spec`
    FastMCP-->>Script: return JSON report
    Script->>Parser: parse report & group primitives
    Parser-->>Script: structured primitives
    Script->>Renderer: render index + module pages (MD + anchors + schemas)
    Renderer->>FS: write files to docs/mcp-generated/
    Script-->>User: exit code / messages

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Would you like a deeper walkthrough of the generation script internals or unit-test suggestions, wdyt?

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a Markdown documentation generator for MCP that works with both Docusaurus and pdoc, which is the core purpose of this PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 92.00%, which is sufficient. The required threshold is 80.00%. |

coderabbitai[bot]

This comment was marked as resolved.

- UTF-8 encoding on file I/O (Windows locale safety)
- json.dumps(v) instead of repr(v) for enum values (JSON-shaped output)
- Split 'FastMCP version' into 'Protocol version' + 'FastMCP version' lines
- Fix `{'./tools'}` no-op f-string → explicit `./tools.md` links
- Add minimal safety guard refusing to rmtree '/', HOME, or CWD
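Two of the fixes listed in this commit are easy to show in isolation. The sketch below uses hypothetical helper names (`fmt_enum`, `write_page`) and is not the script's code; it only demonstrates why `json.dumps` gives JSON-shaped enum values where `repr` would not, and why passing `encoding="utf-8"` explicitly avoids depending on the platform locale.

```python
import json
import tempfile
from pathlib import Path


def fmt_enum(values: list) -> str:
    """JSON-shaped rendering: "a", null, true (repr would give 'a', None, True)."""
    return ", ".join(json.dumps(v) for v in values)


def write_page(path: Path, text: str) -> None:
    """Write with an explicit encoding instead of the locale default."""
    path.write_text(text, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "tools.md"
    write_page(page, "Airbyte → MCP ✓\n")
    round_tripped = page.read_text(encoding="utf-8")

print(fmt_enum(["a", None, True]))
```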
github-actions Bot commented Apr 17, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   6m 13s ⏱️ +23s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 41d092c. ± Comparison against base commit ce1a589.

♻️ This comment has been updated with latest results.

github-actions Bot commented Apr 17, 2026

PyTest Results (Full)

413 tests  ±0   395 ✅ ±0   22m 21s ⏱️ - 5m 6s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 41d092c. ± Comparison against base commit ce1a589.

♻️ This comment has been updated with latest results.

- Strip backticks from H1/H2/H3 in generated markdown; pdoc's TOC
  extractor was emitting an unbalanced <code> tag in the sidebar that
  leaked through the page as monospace rendering.
- Add __all__ = [] to airbyte/mcp/{cloud,local,registry,prompts}.py so
  pdoc hides the redundant Python-side tool declarations and uses the
  markdown include as the single source of truth on the page.
- Bump pdoc markdown toc depth from 2 to 3 so per-tool H3 anchors show
  up in the left-nav.
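The heading clean-up in the first bullet could look roughly like the sketch below. The helper name `strip_heading_backticks` and the line-based regex are assumptions, not the script's actual code. Note that this naive version would also rewrite a `#` comment line inside a fenced code block, which a real implementation may need to skip.

```python
import re

# Matches H1-H3 headings only; deeper headings and body text are left alone.
_HEADING = re.compile(r"^(#{1,3} )(.*)$")


def strip_heading_backticks(markdown: str) -> str:
    """Remove backticks from H1/H2/H3 lines, leaving inline code elsewhere."""
    out = []
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            out.append(match.group(1) + match.group(2).replace("`", ""))
        else:
            out.append(line)
    return "\n".join(out)


print(strip_heading_backticks("### `list_connectors`\nCall `list_connectors` first."))
```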
@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review April 17, 2026 08:39
Copilot AI review requested due to automatic review settings April 17, 2026 08:39
- custom.css: progressively indent H3-and-deeper entries in pdoc's left
  sidebar TOC so per-tool anchors visually nest under the 'Tools (N)'
  H2 and the '<module> module' H1. pdoc's default layout.css uses a
  single indent step for all non-top-level entries, which made H2 and
  H3 render at the same depth.
- generate_mcp_markdown.py: drop the inline 'Index: tool_a, tool_b, …'
  row from module pages. The left nav now lists every tool under its
  section, so the inline list was redundant.
Copilot AI left a comment

Pull request overview

Adds a first-party Markdown documentation generator for the PyAirbyte MCP server (via fastmcp inspect) and wires the generated output into the existing pdoc docs flow via module-level includes.

Changes:

  • Introduces scripts/generate_mcp_markdown.py to render MCP surface area into docs/mcp-generated/ Markdown (per-module pages + index).
  • Adds a Poe task (poe mcp-docs-md) and gitignores the generated output directory.
  • Updates pdoc generation to expose deeper (H3) anchors in the sidebar, and updates MCP modules to include generated Markdown + hide symbols from API listings.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| scripts/generate_mcp_markdown.py | New generator: runs fastmcp inspect, buckets primitives by mcp_module, emits Markdown pages + index. |
| pyproject.toml | Adds poe mcp-docs-md task to run the generator. |
| docs/generate.py | Monkey-patches pdoc TOC depth to include H3 entries in sidebar. |
| docs/CONTRIBUTING.md | Adds contributor instructions for regenerating MCP Markdown docs. |
| airbyte/mcp/cloud.py | Includes generated MCP Markdown and suppresses public API listing via `__all__ = []`. |
| airbyte/mcp/local.py | Includes generated MCP Markdown and suppresses public API listing via `__all__ = []`. |
| airbyte/mcp/registry.py | Includes generated MCP Markdown and suppresses public API listing via `__all__ = []`. |
| airbyte/mcp/prompts.py | Includes generated MCP Markdown and suppresses public API listing via `__all__ = []`. |
| .gitignore | Ignores docs/mcp-generated/ output directory. |

Comment thread scripts/generate_mcp_markdown.py
Comment thread scripts/generate_mcp_markdown.py Outdated
Comment thread scripts/generate_mcp_markdown.py
Comment thread docs/CONTRIBUTING.md Outdated
Comment thread docs/generate.py
devin-ai-integration[bot]

This comment was marked as resolved.

- scripts/generate_mcp_markdown.py: refresh module docstring to match
  current behavior (per-module output, no front-matter on module pages,
  DEFAULT_SERVER_SPEC is a .py path not a dotted module).
- scripts/generate_mcp_markdown.py: guard _render_index against empty
  instructions (splitlines()[0] raised IndexError).
- scripts/generate_mcp_markdown.py: tighten _prepare_output_dir to
  require --output to be strictly inside the current working directory
  (rejects /, ~, .., and arbitrary absolute paths outside the repo).
- docs/generate.py: regenerate docs/mcp-generated/ before pdoc so
  .. include:: directives resolve on a clean checkout (docs/mcp-generated
  is git-ignored). Falls back to a warning if generation fails.
- docs/CONTRIBUTING.md: describe actual per-module output layout
  (index.md + cloud/local/registry/prompts/misc) and deep-link shape.
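The `IndexError` mentioned for `_render_index` comes from the fact that `"".splitlines()` returns an empty list, so indexing `[0]` fails. A minimal sketch of one possible guard follows; `first_line` is a hypothetical helper, and how the real script handles the empty case is not shown in this PR text.

```python
from __future__ import annotations


def first_line(instructions: str | None) -> str:
    """Return the first non-empty-stripped line, or "" when there is none."""
    lines = (instructions or "").strip().splitlines()
    return lines[0] if lines else ""


print(repr(first_line("")))
print(repr(first_line("PyAirbyte MCP server.\nMore detail here.")))
```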
…sfy deptry

The previous static 'from generate_mcp_markdown import ...' triggered
deptry's DEP001 rule (the script lives under scripts/ which is not on
sys.path, so deptry treated it as a missing external dependency). Use
importlib.util.spec_from_file_location to load the module from its
on-disk path instead.
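Loading a script by file path with `importlib.util.spec_from_file_location`, as this commit message describes, generally follows the pattern sketched here. The helper name `load_module_from_path` and the throwaway `demo_script.py` file are illustrative assumptions; the real call site in `docs/generate.py` is not reproduced.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a .py file that is not on sys.path, without a static import."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing, like a normal import
    spec.loader.exec_module(module)
    return module


# Demo with a temporary stand-in script.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo_script.py"
    script.write_text("VALUE = 42\n", encoding="utf-8")
    demo = load_module_from_path("demo_script", script)

print(demo.VALUE)
```

Because there is no `import` statement naming the script, a static dependency checker has nothing to flag as a missing package.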
…itives

Every tool / prompt / resource is now rendered in a stable alphabetical
order inside each module page (case-insensitive sort by name/uri), and
the 'misc' catch-all module is pinned last in the module table. Module
order on the index is alphabetical.

For each tool we now surface MCP tool-annotation hints as inline-code
badges right below the H3 — 'read-only', 'destructive', 'idempotent',
'open-world'. Hints are only rendered when explicitly True, so a tool
like 'list_cloud_workspaces' shows '`read-only` · `idempotent` ·
`open-world`' while 'permanently_delete_cloud_connection' shows
'`destructive` · `open-world`'. An optional human-readable
'annotations.title' override (distinct from the top-level title) is
also surfaced when present.
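The ordering and badge rules in this commit message can be sketched as follows. The helper names are hypothetical, and the assumption that the inspected JSON carries the MCP annotation keys `readOnlyHint`, `destructiveHint`, `idempotentHint`, and `openWorldHint` verbatim is mine; only the labels and the "render only when explicitly True" rule come from the text above.

```python
from __future__ import annotations

from typing import Any

# (annotation key, badge label) pairs, in display order.
_HINTS = (
    ("readOnlyHint", "read-only"),
    ("destructiveHint", "destructive"),
    ("idempotentHint", "idempotent"),
    ("openWorldHint", "open-world"),
)


def render_badges(annotations: dict[str, Any] | None) -> str:
    """Emit inline-code badges only for hints that are explicitly True."""
    ann = annotations or {}
    return " · ".join(f"`{label}`" for key, label in _HINTS if ann.get(key) is True)


def sort_primitives(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Case-insensitive sort by name, falling back to uri."""
    return sorted(items, key=lambda p: str(p.get("name") or p.get("uri") or "").casefold())


def sort_modules(names: list[str]) -> list[str]:
    """Alphabetical order with the 'misc' catch-all pinned last."""
    return sorted(names, key=lambda n: (n == "misc", n.casefold()))


print(render_badges({"destructiveHint": True, "openWorldHint": True}))
print(sort_modules(["registry", "misc", "cloud", "local"]))
```

Checking `is True` rather than truthiness means a missing or `None` hint never produces a badge.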
coderabbitai[bot]

This comment was marked as resolved.

…on, anchor output path to repo root

Addresses three CodeRabbit findings on commit dffeaef:

1. `_run_fastmcp_inspect` now passes `timeout=120` to `subprocess.run`
   and translates `TimeoutExpired` into an actionable `RuntimeError`.
   Previously a hung `fastmcp inspect` (blocking import, stalled
   network I/O during tool registration, etc.) would make
   `poe docs-generate` / `poe mcp-docs-md` hang indefinitely rather
   than fail loudly in CI.

2. `_resolve_extra_module_map` now iterates the private
   `fastmcp_extensions.decorators._REGISTERED_{PROMPTS,RESOURCES}`
   tuples *inside* the same `try`/`except Exception` that imports
   them. Previously any shape drift in those private tuples (third
   element added, `ann` becoming a dataclass, etc.) would escape the
   guard and abort doc generation; now the function falls back to an
   empty map exactly as its docstring promises.

3. `_prepare_output_dir` is now anchored to the repo root (derived
   from `__file__`), not `Path.cwd()`. `DEFAULT_OUTPUT` is a
   repo-relative path, so anchoring to cwd meant running
   `poe mcp-docs-md` from inside `docs/` (or anywhere other than the
   repo root) would silently write into the wrong directory while
   still passing the strict `is_relative_to(cwd)` guard. A new
   `_resolve_output_dir` helper encapsulates the relative-to-repo-root
   resolution; the existing safety guard semantics are preserved
   (repo root itself is rejected, absolute paths outside the repo
   root are rejected).
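Item 1 above, translating a `subprocess` timeout into an actionable error, can be sketched like this. The `run_with_timeout` name and the message wording are assumptions, and the command is passed in by the caller because the exact `fastmcp inspect` arguments used by the script are not shown in this PR text.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout: float = 120) -> str:
    """Run a command and fail loudly instead of hanging forever."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"{cmd[0]} did not finish within {timeout} seconds; "
            "check for blocking imports or stalled network I/O."
        ) from exc
    return result.stdout


# Demo with the current interpreter standing in for the real CLI.
print(run_with_timeout([sys.executable, "-c", "print('ok')"]).strip())
```

`subprocess.run` kills the child process when the timeout expires, so a CI job fails quickly rather than running until the runner's own limit.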
devin-ai-integration[bot]

This comment was marked as resolved.

Follow-up to cbb1364. Devin Review caught that `_prepare_output_dir`
was resolving paths against `_REPO_ROOT` for mkdir/rmtree while the
caller in `generate()` still used the raw (cwd-relative) `output` for
`write_text`, so running from a subdirectory would prepare
`<repo>/docs/mcp-generated/` but then try to write to
`<cwd>/docs/mcp-generated/` (which doesn't exist) and raise
`FileNotFoundError`.

`_prepare_output_dir` now returns the resolved absolute path, and
`generate()` routes all subsequent file writes through it, so the two
always agree regardless of where the task is invoked from.
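The fix described here amounts to resolving the output path once and using that same resolved path for both the wipe and the writes. Below is a simplified sketch under stated assumptions: the function names drop the leading underscore to mark them as illustrative, the repo root is passed in rather than derived from `__file__`, and the error wording is invented.

```python
import shutil
import tempfile
from pathlib import Path


def resolve_output_dir(output: Path, repo_root: Path) -> Path:
    """Resolve `output` against the repo root and refuse unsafe targets."""
    root = repo_root.resolve()
    candidate = output if output.is_absolute() else root / output
    resolved = candidate.resolve()
    # Reject the root itself and anything that escapes it (e.g. "..", "/").
    if resolved == root or not resolved.is_relative_to(root):
        raise ValueError(f"Refusing to use {resolved} as output; must be inside {root}")
    return resolved


def prepare_output_dir(output: Path, repo_root: Path) -> Path:
    """Wipe and recreate the directory, returning the path callers must write to."""
    resolved = resolve_output_dir(output, repo_root)
    if resolved.exists():
        shutil.rmtree(resolved)
    resolved.mkdir(parents=True)
    return resolved


# Demo: a temporary directory stands in for the repository root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    out = prepare_output_dir(Path("docs/mcp-generated"), root)
    (out / "index.md").write_text("# demo\n", encoding="utf-8")
    wrote_inside_root = (root.resolve() / "docs" / "mcp-generated" / "index.md").is_file()

print(wrote_inside_root)
```

Returning the resolved path from the prepare step removes the chance for a caller to rebuild it differently from the current working directory.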
Comment thread scripts/generate_mcp_markdown.py
Comment thread docs/CONTRIBUTING.md
Comment thread docs/generate.py
Comment thread scripts/generate_mcp_markdown.py
Comment thread scripts/generate_mcp_markdown.py
Comment thread scripts/generate_mcp_markdown.py
devin-ai-integration Bot added a commit to airbytehq/fastmcp-extensions that referenced this pull request Apr 17, 2026
…oc-compatible)

Vendors the MCP markdown generator from airbytehq/PyAirbyte#1015 as a first-party
public API under `fastmcp_extensions.utils.docs`, exposing a `generate_markdown_docs`
function and a CLI entry point (`python -m fastmcp_extensions.utils.docs`). This lets
consumers (PyAirbyte, airbyte-ops-mcp, ...) generate Markdown docs for their MCP
server without carrying an inline copy of the generator script.

Moving the generator into this package lets us drop the `noqa: PLC2701` that
cross-package access to `_REGISTERED_*` required, because it is no longer a
private-name access across packages; it is an internal reference.

Link to Devin session: https://app.devin.ai/sessions/52a3cf7bc9084a39b7dfda021c4116d5
@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 82bf1e4 into main Apr 17, 2026
21 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1776409252-mcp-markdown-docs branch April 17, 2026 15:24
@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title feat: add Markdown MCP docs generator (Docusaurus- and pdoc-compatible) docs: add Markdown MCP docs generator (Docusaurus- and pdoc-compatible) Apr 17, 2026