Merged
…res, cost tracking, PR blast radius
Adds 11 capabilities across the indexing pipeline, persistence layer, MCP
tools, and CLI. MCP tool count is unchanged; new functionality is folded
into existing tools (get_risk, get_overview, get_dead_code).
Pipeline & generation
---------------------
- ProcessPool-based parsing with sequential fallback; ingestion and git
stages now run concurrently via asyncio.gather
- RAG-aware doc generation: dependency summaries are pre-fetched from the
vector store and injected into the file_page prompt; pages generated in
topological order so leaves are summarized before their dependents
- Dynamic import hint extractors (Django INSTALLED_APPS/ROOT_URLCONF/
MIDDLEWARE/url include, pytest conftest fixtures, Node package.json
exports + tsconfig path aliases) wired into
GraphBuilder.add_dynamic_edges
Persistence
-----------
- AtomicStorageCoordinator with async transaction() context manager and
health_check() spanning SQL, in-memory graph, and vector store
- recompute_git_percentiles now uses a single SQL PERCENT_RANK() window
function instead of in-memory Python ranking
- New temporal_hotspot_score column on git_metadata, computed via exp
decay (180-day half-life) and used as the primary percentile sort key
- New llm_costs and security_findings tables; matching ORM models
- vector_store.get_page_summary_by_path() on all three backends
Cost tracking
-------------
- CostTracker with per-call recording, persisted to llm_costs; pricing
table covers Claude 4.6 family, GPT-4o, and Gemini 1.5/2.5/3.x variants
- Wired into Anthropic, Gemini, OpenAI, and LiteLLM providers
- Live USD column on the indexing progress bar
- New `repowise costs` CLI grouping by operation/model/day
Analysis
--------
- PRBlastRadiusAnalyzer: transitive ancestor BFS over graph_edges,
co-change warnings, recommended reviewers by temporal ownership, test
gaps, 0–10 overall risk score
- SecurityScanner: pattern-based scan for eval/exec/pickle/raw SQL/
hardcoded secrets/weak hashes; persisted at index time
MCP tool extensions
-------------------
- get_risk(changed_files=[...]) returns blast radius; per-file payload
now includes test_gap and security_signals
- get_overview returns knowledge_map with top owners, knowledge silos
(>80% ownership), and onboarding targets
- get_dead_code accepts min_confidence, include_internals,
include_zombie_packages, no_unreachable, no_unused_exports
CLI
---
- `repowise dead-code` exposes the same sensitivity flags
- `repowise doctor` adds a coordinator drift health check (Check #10)
- `repowise costs` command registered
Tests
-----
- test_models.py: expected table set updated to include llm_costs and
security_findings; full suite green (757 passed, 9 skipped)
- End-to-end validated against test-repos/microdot: 164 files ingested,
83 pages generated, 132 git_metadata rows with temporal hotspot score,
83 cost rows totaling $0.0258, 2 security findings, drift = 0
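For reviewers who want to sanity-check the temporal_hotspot_score and PERCENT_RANK() changes, here is a minimal sketch of how the two could fit together. It is not the code in this PR. It assumes the score is a sum of per-commit weights that halve every 180 days, and the file_path column, the helper names, and the in-memory SQLite database are all illustrative.

```python
import sqlite3

HALF_LIFE_DAYS = 180.0  # half-life stated in the commit message


def decay_weight(age_days: float) -> float:
    """Weight of one commit: 1.0 today, 0.5 after one half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def temporal_hotspot_score(commit_ages_days: list[float]) -> float:
    # Assumption: the score is the sum of per-commit decay weights.
    return sum(decay_weight(age) for age in commit_ages_days)


def rank_files(commit_ages_by_file: dict[str, list[float]]) -> dict[str, float]:
    """Return each file's percentile, ranked in SQL by PERCENT_RANK()."""
    conn = sqlite3.connect(":memory:")
    # file_path is a hypothetical column name; the other names come from the PR.
    conn.execute(
        "CREATE TABLE git_metadata (file_path TEXT, temporal_hotspot_score REAL)"
    )
    conn.executemany(
        "INSERT INTO git_metadata VALUES (?, ?)",
        [
            (path, temporal_hotspot_score(ages))
            for path, ages in commit_ages_by_file.items()
        ],
    )
    rows = conn.execute(
        "SELECT file_path, "
        "PERCENT_RANK() OVER (ORDER BY temporal_hotspot_score) "
        "FROM git_metadata"
    ).fetchall()
    conn.close()
    return dict(rows)
```

Ranking in one window-function query keeps the percentile logic inside the database, so the full table never has to be loaded into Python for sorting.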
…pact context default
Adds two new MCP tools, two supporting alembic migrations, and a set of
ingestion / generation improvements that make the wiki layer usable for
single-call agent workflows. All existing tools continue to work
unchanged. Bumps the public tool count from 8 to 10.
New MCP tools
-------------
- mcp_server/tool_answer.py (new): get_answer(question, scope?, repo?)
is a one-call RAG endpoint over the wiki layer. It runs an FTS pass
with a coverage re-ranker, splits relational questions on connectives
and boosts pages at the intersection of both halves, gates synthesis
on a top/second dominance ratio (>= 1.2x), and only invokes the LLM
when retrieval is clearly dominant. High-confidence responses include
a note explaining that the consumer can cite them directly without
verification reads. Ambiguous retrievals return ranked excerpts so the agent
grounds in source instead of anchoring on a wrong frame. Synthesised
answers are persisted to AnswerCache by question hash so repeat
questions return at zero LLM cost. Degrades cleanly to retrieval-only
mode when no provider is configured.
- mcp_server/tool_symbol.py (new): get_symbol(symbol_id) resolves a
qualified id of the form "path/to/file.py::Class::method" (also
accepts the dot separator) to its source body, signature, file
location, line range, and docstring. Recovers the rich on-disk
signature so base classes, decorators, and full type annotations
reach the LLM (the stripped DB form would lose these). Handles
duplicate-row resolution by canonical pick rather than raising
MultipleResultsFound.
- mcp_server/_meta.py (new): shared _meta envelope and per-tool hint
builders used by tool_answer / tool_context / tool_symbol so all
three return a consistent metadata block (timing, hint, page counts).
- mcp_server/__init__.py: re-exports the new tools, updates the
module docstring to "10 tools".
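The dominance gate in get_answer can be pictured roughly as follows. This is a hedged sketch rather than the code in tool_answer.py: the Hit dataclass, its field names, and the handling of single-hit and zero-score cases are assumptions. Only the 1.2 ratio comes from the description above.

```python
from dataclasses import dataclass

DOMINANCE_RATIO = 1.2  # threshold stated in the description


@dataclass
class Hit:
    page: str     # hypothetical field names, for illustration only
    score: float


def should_synthesise(hits: list[Hit], ratio: float = DOMINANCE_RATIO) -> bool:
    """True when the top hit beats the runner-up by at least `ratio`."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    if not ranked or ranked[0].score <= 0:
        return False  # nothing usable was retrieved
    if len(ranked) == 1:
        return True  # assumption: a sole positive hit counts as dominant
    # Multiplying avoids dividing by a zero runner-up score.
    return ranked[0].score >= ratio * ranked[1].score
```

With scores of 1.07 and 1.0 the gate stays closed and the caller would return ranked excerpts; with 2.4 and 1.0 it opens and synthesis can proceed.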
Schema migrations
-----------------
- alembic/versions/0012_page_summary.py (new): adds wiki_pages.summary
TEXT NOT NULL DEFAULT "". Stores a 1–3 sentence purpose blurb per
page so get_context can return narrative file-level descriptions
without shipping content_md on every turn. Server default backfills
existing rows on upgrade. Reversible downgrade defined.
- alembic/versions/0013_answer_cache.py (new): creates the answer_cache
table with (id, repository_id, question_hash, question, payload_json,
provider_name, model_name, created_at), a unique constraint on
(repository_id, question_hash), an index on repository_id, and a
CASCADE foreign key to repositories so dropping a repo cleans up its
cache automatically. Pure CREATE TABLE — no impact on existing data.
Reversible downgrade defined.
- core/persistence/models.py: adds the Page.summary column and the
AnswerCache ORM model matching the migrations above.
- core/persistence/crud.py: helpers for upserting page summaries and
reading/writing AnswerCache rows.
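As a rough illustration of the answer_cache shape, the sketch below builds an equivalent table in an in-memory SQLite database and exercises the unique constraint and the cascading delete. The column names follow the migration summary above. The column types, the index name, the SHA-256 hashing of a normalised question, and the helper functions are assumptions, not the contents of 0013_answer_cache.py or crud.py.

```python
import hashlib
import json
import sqlite3


def question_hash(question: str) -> str:
    # Assumption: hash of the lower-cased, whitespace-normalised question.
    normalised = " ".join(question.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for CASCADE
    conn.executescript(
        """
        CREATE TABLE repositories (id INTEGER PRIMARY KEY);
        CREATE TABLE answer_cache (
            id INTEGER PRIMARY KEY,
            repository_id INTEGER NOT NULL
                REFERENCES repositories(id) ON DELETE CASCADE,
            question_hash TEXT NOT NULL,
            question TEXT NOT NULL,
            payload_json TEXT NOT NULL,
            provider_name TEXT,
            model_name TEXT,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (repository_id, question_hash)
        );
        CREATE INDEX ix_answer_cache_repository_id
            ON answer_cache (repository_id);
        """
    )
    return conn


def put_answer(conn, repo_id: int, question: str, payload: dict) -> None:
    # Upsert keyed on the unique (repository_id, question_hash) pair.
    conn.execute(
        "INSERT INTO answer_cache "
        "(repository_id, question_hash, question, payload_json) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (repository_id, question_hash) "
        "DO UPDATE SET payload_json = excluded.payload_json",
        (repo_id, question_hash(question), question, json.dumps(payload)),
    )


def get_cached_answer(conn, repo_id: int, question: str):
    row = conn.execute(
        "SELECT payload_json FROM answer_cache "
        "WHERE repository_id = ? AND question_hash = ?",
        (repo_id, question_hash(question)),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Deleting a row from repositories removes its cached answers through the foreign key, which is the cleanup behaviour the migration describes.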
Existing MCP tools
------------------
- mcp_server/tool_context.py: get_context now defaults to compact=True.
Compact mode drops the structure block, the imported_by list, and
per-symbol docstring/end_line fields, keeping responses under ~10K
characters on dense files. Pass compact=False to get the full payload
on demand. Docstring trimmed to clean tool documentation. Internal
fallback labels reworded in plain English.
- mcp_server/tool_search.py: docstring expanded into clean tool
documentation; behaviour unchanged.
- mcp_server/tool_risk.py: cleanup pass; behaviour unchanged.
- server/chat_tools.py and docstring counts: updated to 10 tools.
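To make the compact default concrete, a sketch of the trimming step might look like this. The field names structure, imported_by, docstring, and end_line come from the description above. The overall payload shape (a dict with a symbols list) and the function name are assumptions, and the real tool_context.py may organise this differently.

```python
import copy

DROPPED_TOP_LEVEL = ("structure", "imported_by")
DROPPED_PER_SYMBOL = ("docstring", "end_line")


def compact_payload(payload: dict) -> dict:
    """Return a copy of `payload` without the fields compact mode drops."""
    slim = copy.deepcopy(payload)  # leave the caller's full payload intact
    for key in DROPPED_TOP_LEVEL:
        slim.pop(key, None)
    # Assumption: symbols are dicts stored under a "symbols" key.
    for symbol in slim.get("symbols", []):
        for key in DROPPED_PER_SYMBOL:
            symbol.pop(key, None)
    return slim
```

A caller that passes compact=False would skip this step and return the original payload unchanged.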
Ingestion / generation
----------------------
- core/generation/page_generator.py: _is_significant_file() now treats
any file tagged is_test=True (with at least one extracted symbol) as
significant, regardless of PageRank. Test files have near-zero
centrality because nothing imports them back, but they answer
"what test exercises X" / "where is Y verified" questions and the
doc layer is the right place to surface those. Filtering remains
available via --skip-tests.
- core/ingestion/traverser.py: removes the workaround that excluded
tests/, test/, spec/, specs/, __tests__ from the traversal. The
underlying pagerank-inflation bug it guarded against is fixed in
graph.py via the deterministic stem-priority disambiguation
(_stem_priority / _build_stem_map), so test files can now be
indexed safely while still being tagged is_test=True for downstream
filtering.
- core/ingestion/graph.py: prose cleanup in the stem-priority docstring
and _build_stem_map; explains the test-fixture-named-like-the-package
failure mode in neutral terms. Framework-aware synthetic-edge code
(_add_conftest_edges, _add_django_edges, _add_fastapi_edges,
_add_flask_edges, dispatched by add_framework_edges(tech_stack))
is unchanged.
- core/ingestion/parser.py, core/generation/models.py: small cleanups
feeding the new wiki_pages.summary field through the generation
pipeline.
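The test-file inclusion rule can be summarised in a few lines. The sketch below is illustrative only: the FileInfo dataclass, the PageRank threshold value, and the skip_tests argument (standing in for --skip-tests) are assumptions, and the real _is_significant_file() may weigh other signals.

```python
from dataclasses import dataclass

PAGERANK_THRESHOLD = 0.01  # hypothetical cut-off, not taken from the PR


@dataclass
class FileInfo:
    path: str
    pagerank: float
    is_test: bool
    symbol_count: int


def is_significant_file(info: FileInfo, skip_tests: bool = False) -> bool:
    """Decide whether a file gets its own wiki page."""
    if info.is_test:
        if skip_tests:
            return False  # tests filtered out on request
        if info.symbol_count >= 1:
            return True  # tests with symbols qualify regardless of PageRank
    # Everything else falls back to the centrality check.
    return info.pagerank >= PAGERANK_THRESHOLD
```

Under this rule a test module with near-zero centrality still gets a page as long as at least one symbol was extracted from it.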
CLI
---
- cli/main.py: minor wiring for the new tools and the compact default.
Tests
-----
- tests/unit/server/test_tool_symbol.py (new): unit tests for
_resolve_symbol covering separator-style mismatches between
Class.method and Class::method and MultipleResultsFound handling
on duplicate lookup keys.
- tests/unit/server/test_mcp.py: counter and fixture updates for the
10-tool surface.
- tests/unit/ingestion/test_graph.py: fixture updates around the
stem-priority cleanup.
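The separator handling that test_tool_symbol.py covers could be normalised along these lines. This is a sketch under stated assumptions, not _resolve_symbol itself: it assumes the file path is always separated from the symbol by the first "::" and that dots only stand in for "::" between symbol parts. Both function names are hypothetical.

```python
def split_symbol_id(symbol_id: str) -> tuple[str, list[str]]:
    """Split 'path/to/file.py::Class::method' into (path, ['Class', 'method'])."""
    path, sep, qualified = symbol_id.partition("::")
    if not sep or not path or not qualified:
        raise ValueError(f"expected 'path::Symbol', got {symbol_id!r}")
    # Treat '.' and '::' as equivalent separators inside the symbol part only,
    # so dots in the file name (app.py) are left alone.
    parts = [p for p in qualified.replace("::", ".").split(".") if p]
    if not parts:
        raise ValueError(f"no symbol name in {symbol_id!r}")
    return path, parts


def canonical_symbol_id(symbol_id: str) -> str:
    """Rewrite either separator style to the '::' form."""
    path, parts = split_symbol_id(symbol_id)
    return path + "::" + "::".join(parts)
```

Normalising to one canonical form before the lookup means Class.method and Class::method hit the same key.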
Docs
----
- README.md: bumps "Eight MCP tools" → "Ten MCP tools" in the headline,
abstract, comparison table, and competitor matrix; adds get_answer,
get_symbol, and compact-default rows to the tool table; documents
the test-files-in-wiki and single-call-answer additions in the
"What's new" section.
- docs/ARCHITECTURE.md: schema table now lists the summary column on
wiki_pages and the new answer_cache table; the page-generator
section documents the test-file inclusion rule; references to "8
tools" updated to 10.
- docs/CHANGELOG.md: Unreleased Added entries for get_answer,
get_symbol, the two migrations, and test-file indexing; Changed
entry for the get_context compact default.
- docs/USER_GUIDE.md: tool table updated to 10 entries.
- docs/architecture-guide.md, docs/CHAT.md: tool counts updated.
- packages/server/README.md, plugins/claude-code/DEVELOPER.md,
website/index.md, website/concepts.md, website/mcp-server.md,
website/claude-md-generator.md: tool counts updated; mcp-server.md
gains full sections (parameters, returns, examples) for get_answer
and get_symbol and documents the new compact parameter on
get_context.
Verified
--------
Ran `repowise init --index-only` end-to-end against pallets/flask:
125 files, 1,624 symbols, 125 nodes, 241 edges (191 imports + 28
framework + 22 dynamic), 8 languages, 14 hotspots, 13 dead-code
findings. SQL audit confirmed both new migrations applied
(answer_cache table present; wiki_pages.summary column present),
test files contributed 920 symbols, and conftest framework edges
fired. Live MCP-tool checks against the full-mode wiki: get_symbol
resolved src/flask/app.py::Flask to its source body and signature
across lines 109–508; get_context returned the LLM summary without
the structure / imported_by blocks (compact default); get_answer
ran retrieval, hit the dominance gate at 1.07× < 1.2×, and correctly
returned ranked excerpts instead of synthesising a wrong frame.
RaghavChamadiya
approved these changes
Apr 9, 2026