Skip to content

feat(mcp): expand MCP server with learn, status, session_list tools (#717)#729

Merged
magicpro97 merged 6 commits into
mainfrom
feat/i717-mcp-write-rebased
May 30, 2026
Merged

feat(mcp): expand MCP server with learn, status, session_list tools (#717)#729
magicpro97 merged 6 commits into
mainfrom
feat/i717-mcp-write-rebased

Conversation

@magicpro97
Copy link
Copy Markdown
Owner

Summary

Expands MCP server with write tools: learn, status, session_list.

Closes #717

Changes

  • mcp-server.py: Added learn, status, session_list MCP tools
  • test_fixes.py: 35 I717 tests (now correctly placed after sys.exit fix)

Fixes

  • Ruff CI errors fixed (rebased on main b0352a5)
  • I717 tests were dead code (after sys.exit) — fixed (original review by review-pr725)

Review

  • Original Opus review completed: dead code bug found and fixed
  • All 35 I717 tests now execute and pass ✅

Copilot AI review requested due to automatic review settings May 30, 2026 08:59
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Expands the stdio MCP server with three new tools — learn, status, and session_list — turning mcp-server.py from a read-only surface into a read/write knowledge surface, and adds an I717 test block in test_fixes.py exercising both source-level expectations and an end-to-end subprocess round-trip against a temporary $HOME and knowledge.db.

Changes:

  • Adds VALID_LEARN_CATEGORIES, three new TOOLS entries, dispatch wiring, and three handlers: _run_learn shells out to learn.py and parses an entry id from stdout; _run_status reports session/entry counts plus watcher liveness from .watcher.lock; _run_session_list returns recent sessions via parameterized SQL in read-only mode.
  • Adds ~35 I717-* tests in test_fixes.py covering source markers, dispatch wiring, SQL safety, and a JSON-RPC subprocess round-trip against a synthetic DB.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
mcp-server.py Registers and implements learn, status, session_list MCP tools and dispatch entries.
test_fixes.py Adds I717 source-check and integration tests for the new MCP write/read tools.

Comment thread mcp-server.py
Comment on lines +184 to +232
{
"name": "learn",
"description": "Write a knowledge entry (mistake, pattern, feature, discovery, etc.) to the local knowledge base. Issue #717.",
"inputSchema": {
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": sorted(VALID_LEARN_CATEGORIES),
"description": "Knowledge category.",
},
"title": {"type": "string", "description": "Short title for the knowledge entry."},
"description": {"type": "string", "description": "Content / body of the knowledge entry."},
"tags": {
"type": "string",
"description": "Comma-separated tags (optional).",
},
},
"required": ["category", "title", "description"],
"additionalProperties": False,
},
},
{
"name": "status",
"description": "Return a JSON health snapshot: session count, entry count, watcher status. Issue #717.",
"inputSchema": {
"type": "object",
"properties": {},
"required": [],
"additionalProperties": False,
},
},
{
"name": "session_list",
"description": "Return the most recent sessions from the local knowledge base. Issue #717.",
"inputSchema": {
"type": "object",
"properties": {
"limit": {
"type": "integer",
"minimum": 1,
"maximum": 100,
"description": "Maximum sessions to return (default 20).",
},
},
"required": [],
"additionalProperties": False,
},
},
Comment thread mcp-server.py Outdated
Comment on lines +505 to +511
combined = result.stdout + result.stderr
entry_id = None
m = re.search(r"#(\d+)", combined)
if m:
entry_id = int(m.group(1))

body = {"status": "ok", "message": "Entry recorded", "id": entry_id}
Comment thread mcp-server.py
Comment on lines +473 to +503
def _run_learn(arguments: dict[str, Any]) -> dict[str, Any]:
"""Write a knowledge entry by calling learn.py as a subprocess."""
category = _require_string(arguments, "category")
if category not in VALID_LEARN_CATEGORIES:
raise JsonRpcError(
JSONRPC_INVALID_PARAMS,
f"'category' must be one of: {', '.join(sorted(VALID_LEARN_CATEGORIES))}",
)
title = _require_string(arguments, "title")
description = _require_string(arguments, "description")
tags = _optional_string(arguments, "tags", max_length=500)

learn_py = TOOLS_DIR / "learn.py"
if not learn_py.exists():
raise JsonRpcError(JSONRPC_INTERNAL_ERROR, "learn.py not found")

flag = f"--{category}"
cmd = [sys.executable, str(learn_py), flag, title, description]
if tags:
cmd += ["--tags", tags]

try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
except subprocess.TimeoutExpired as exc:
raise JsonRpcError(JSONRPC_INTERNAL_ERROR, "learn.py timed out") from exc
except Exception as exc:
raise JsonRpcError(JSONRPC_INTERNAL_ERROR, f"learn.py subprocess error: {exc}") from exc

if result.returncode != 0:
msg = result.stderr.strip() or result.stdout.strip() or "learn.py failed"
raise JsonRpcError(JSONRPC_INTERNAL_ERROR, msg)
@magicpro97 magicpro97 force-pushed the feat/i717-mcp-write-rebased branch 3 times, most recently from a590699 to 8404fe0 Compare May 30, 2026 11:39
Copy link
Copy Markdown
Owner Author

@magicpro97 magicpro97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #729 (feat(mcp): expand MCP server with learn, status, session_list tools)

Three real bugs found. Style/formatting issues are not flagged.


🔴 Issue 1 — _run_learn: title and description have no length cap (DoS / subprocess arg overflow)

File: mcp-server.py, in _run_learn

title = _require_string(arguments, "title")         # ← no max_length
description = _require_string(arguments, "description")  # ← no max_length
tags = _optional_string(arguments, "tags", max_length=500)  # tags IS capped

_require_string enforces non-empty but has no max_length parameter. A caller can pass an arbitrarily large string (e.g. 10 MB); it will be forwarded as a subprocess command-line argument:

cmd = [sys.executable, str(learn_py), flag, title, description]

This can cause: OOM in the subprocess, OS arg-length rejection (ARG_MAX ~2 MB on macOS/Linux), or a timeout (the 30 s guard fires but the damage is already done). Tags are already capped, but the higher-risk fields (body content) are not.

Fix: add explicit length validation before the subprocess call, e.g.:

if len(title) > 500:
    raise JsonRpcError(JSONRPC_INVALID_PARAMS, "'title' must be 500 characters or fewer")
if len(description) > 10_000:
    raise JsonRpcError(JSONRPC_INVALID_PARAMS, "'description' must be 10 000 characters or fewer")

🔴 Issue 2 — _run_status and _run_session_list: SQLite connection not closed on unexpected exceptions

File: mcp-server.py, in _run_status and _run_session_list

Both new functions open a DB connection but only call db.close() on the happy path. The established pattern in this file (see _run_query_memory, the finally: db.close() block) is try / finally. Neither new function follows it.

In _run_status:

try:
    db = sqlite3.connect(db_uri, uri=True)
    ...
    db.close()        # ← skipped if ProgrammingError, MemoryError, etc.
except Exception:
    pass              # ← silently swallows everything; db never closed

In _run_session_list the outer except Exception re-raises as JsonRpcError but also never closes the connection when a non-OperationalError occurs mid-query.

CPython reference counting makes this unlikely to leak in practice, but it deviates from the invariant maintained everywhere else in the file.

Fix: use try / finally (matching _run_query_memory):

db = sqlite3.connect(db_uri, uri=True)
try:
    ...
finally:
    db.close()

🟡 Issue 3 — _run_learn: fragile entry_id regex can return the wrong ID

File: mcp-server.py, in _run_learn

combined = result.stdout + result.stderr
m = re.search(r"#(\d+)", combined)

re.search returns the first match of #N anywhere in stdout+stderr. If learn.py emits any other #N before the entry-ID line (progress counter, line reference, etc.), the returned id will silently be wrong — not null, but a misidentified integer.

Fix: anchor to the known output format of learn.py, or have it emit a machine-readable marker (e.g. ENTRY_ID=123) to parse unambiguously.


✅ What looks good

  • SQL injection: session_list correctly uses ? placeholder — no interpolation.
  • Category validation: VALID_LEARN_CATEGORIES whitelist checked server-side before subprocess call; also enforced in JSON schema.
  • Read-only DB: both new handlers open with ?mode=ro.
  • status missing-DB path: gracefully returns zeros.
  • No shell injection: subprocess.run uses list form (shell=False) despite user-supplied args.
  • 30 s timeout on learn.py subprocess is appropriate.
  • Tests I717-15..22: real end-to-end subprocess round-trips with synthetic DB — good coverage. I717-22 validates invalid-category rejection.

@magicpro97
Copy link
Copy Markdown
Owner Author

Fixed all reviewer issues:

  • Token gating: Added _check_auth(arguments) at the top of _run_learn — matches the pattern in _run_query_memory
  • Length caps: Added max_length=500 for title and max_length=10_000 for description via updated _require_string signature
  • Fragile regex: Now passes --json flag to learn.py and parses structured {"id": N} JSON output; only falls back to anchored regex Added new \w+ #(\d+) when JSON parse fails
  • Connection leaks in _run_status / _run_session_list: Wrapped db.close() calls in try/finally blocks

Requesting re-review. CI pending.

@magicpro97 magicpro97 force-pushed the feat/i717-mcp-write-rebased branch from 5afd762 to b89fd2d Compare May 30, 2026 12:11
Linh Ngo and others added 6 commits May 30, 2026 19:16
…717)

- mcp-server.py: add learn tool (writes knowledge entry via subprocess), status tool (DB health snapshot), session_list tool (last N sessions)
- Wire all 3 in _handle_tools_call() with proper input schemas
- test_fixes.py: +35 tests (I717)

Closes #717

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The 35 I717 test cases were placed after sys.exit() and never
executed. Move the terminal exit to after the I717 block and
add a proper final summary block.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…+ JSON ID parsing

- Add _check_auth() call to _run_learn for COPILOT_MCP_TOKEN gating (matching _run_query_memory)
- Add max_length=500 to title and max_length=10000 to description in _run_learn
- Update _require_string to accept optional max_length parameter
- Fix fragile #(\d+) regex: pass --json to learn.py and parse structured JSON output
  for entry_id; anchored fallback regex 'Added new \w+ #(\d+)' used only if JSON missing
- Fix connection leak in _run_status: use try/finally around db.close()
- Fix connection leak in _run_session_list: use try/finally around db.close()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@magicpro97 magicpro97 force-pushed the feat/i717-mcp-write-rebased branch from b89fd2d to 2552679 Compare May 30, 2026 12:16
@magicpro97 magicpro97 merged commit 2226bae into main May 30, 2026
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(mcp): expand MCP server with learn, status, session_list tools

2 participants