repowise-dev · RaghavChamadiya · Apr 7, 2026
@@ -0,0 +1,128 @@
+# Changelog
+
+All notable changes to repowise are documented here.
+This project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.2.0] — 2026-04-07
+
+A large overhaul: faster indexing, smarter doc generation, transactional storage,
+new analysis capabilities, and a completely revamped web UI that surfaces every
+new signal, all without changing the eight-tool MCP surface.
+
+### Added
+
+#### Pipeline & ingestion
+- **Parallel indexing.** AST parsing now runs across all CPU cores via
+  `ProcessPoolExecutor`. Graph construction and git history indexing run
+  concurrently with `asyncio.gather`. Per-file git history is fetched through a
+  thread executor, with a semaphore limiting concurrency.
+- **RAG-aware doc generation.** Pages are generated in topological order; each
+  generation prompt now includes summaries of the file's direct dependencies,
+  pulled from the vector store of already-generated pages.
+- **Atomic three-store coordinator.** New `AtomicStorageCoordinator` buffers
+  writes across SQL, the in-memory dependency graph, and the vector store, then
+  flushes them as a single transaction. Failure in any store rolls back all three.
+- **Dynamic import hint extractors.** The dependency graph now captures edges
+  that pure AST parsing misses: Django `INSTALLED_APPS` / `ROOT_URLCONF` /
+  `MIDDLEWARE`, pytest `conftest.py` fixture wiring, and Node/TS path aliases
+  from `tsconfig.json` and `package.json` `exports`.
+
+#### Analysis
+- **Temporal hotspot decay.** New `temporal_hotspot_score` column on
+  `git_metadata`, computed as `Σ exp(-ln2 · age_days / 180) · min(lines/100, 3)`
+  per commit. Hotspot ranking now uses this score; commits from a year ago
+  contribute ~25% as much as commits from today.
+- **Percentile ranks via SQL window function.** `recompute_git_percentiles()`
+  is now a single `PERCENT_RANK() OVER (PARTITION BY repo ORDER BY ...)` UPDATE
+  instead of an in-Python sort. Faster and correct on large repos.
+- **PR blast radius analyzer.** New `PRBlastRadiusAnalyzer` returns direct
+  risks, transitive affected files, co-change warnings, recommended reviewers,
+  test gaps, and an overall 0–10 risk score. Surfaced via `get_risk(changed_files=...)`
+  and a new web page.
+- **Security pattern scanner.** Indexing now runs `SecurityScanner` over each
+  file. Findings (eval/exec, weak crypto, raw SQL string construction,
+  hardcoded secrets, `pickle.loads`, etc.) are stored in a new
+  `security_findings` table.
+- **Knowledge map.** Top owners, "bus factor 1" knowledge silos (>80% single
+  owner), and high-centrality "onboarding targets" with thin documentation —
+  surfaced in `get_overview` and the web overview page.
+
+#### LLM cost tracking
+- New `llm_costs` table records every LLM call (model, tokens, USD cost).
+- `CostTracker` aggregates session totals; pricing covers Claude 4.6 family,
+  GPT-4.1 family, and Gemini.
+- New `repowise costs` CLI: `--since`, `--by operation|model|day`.
+- Indexing progress bar shows a live `Cost: $X.XXXX` counter.
+
+#### MCP tool enhancements (still 8 tools — strictly more capable)
+- `get_risk(targets, changed_files=None)` — when `changed_files` is provided,
+  returns the full PR blast-radius report (transitive affected, co-change
+  warnings, recommended reviewers, test gaps, overall 0–10 score). Per-file
+  responses now include `test_gap: bool` and `security_signals: list`.
+- `get_overview()` — now includes a `knowledge_map` block (top owners, silos,
+  onboarding targets).
+- `get_dead_code(min_confidence?, include_internals?, include_zombie_packages?)` —
+  sensitivity controls for false positives in framework-heavy code.
+
+#### REST endpoints (new)
+- `GET /api/repos/{id}/costs` and `/costs/summary` — grouped LLM spend.
+- `GET /api/repos/{id}/security` — security findings, filterable by file/severity.
+- `POST /api/repos/{id}/blast-radius` — PR impact analysis.
+- `GET /api/repos/{id}/knowledge-map` — owners / silos / onboarding targets.
+- `GET /api/repos/{id}/health/coordinator` — three-store drift status.
+- `GET /api/repos/{id}/hotspots` now returns `temporal_hotspot_score` and is
+  ordered by it.
+- `GET /api/repos/{id}/git-metadata` now returns `test_gap`.
+- Job SSE stream now emits `actual_cost_usd` (running cost since job start).
+
+#### Web UI (new pages and components)
+- **Costs page** — daily bar chart, grouped tables by operation/model/day.
+- **Blast Radius page** — paste files (or click hotspot suggestion chips) to
+  see risk gauge, transitive impact, co-change warnings, reviewers, test gaps.
+- **Knowledge Map card** on the overview dashboard.
+- **Trend column** on the hotspots table with flame indicator (default sort).
+- **Security Panel** in the wiki page right sidebar.
+- **"No tests" badge** on wiki pages with no detected test file.
+- **System Health card** on the settings page (SQL / Vector / Graph counts +
+  drift % + status).
+- **Live cost indicator** on the generation progress bar.
+
+#### CLI
+- `repowise costs [--since DATE] [--by operation|model|day]` — new command.
+- `repowise dead-code` — new flags `--min-confidence`, `--include-internals`,
+  `--include-zombie-packages`, `--no-unreachable`, `--no-unused-exports`.
+- `repowise doctor` — new Check #10 reports coordinator drift across all
+  three stores. `--repair` deletes orphaned vectors and rebuilds missing graph
+  nodes from SQL.
+
+### Fixed
+- C++ dependency resolution edge cases.
+- Decision extraction timeout on very large histories.
+- Resume / progress bar visibility for oversized files.
+- Coordinator `health_check` falsely reporting 100% drift on LanceDB / Pg
+  vector stores (was returning -1 for the count). Now uses `list_page_ids()`.
+- Coordinator `health_check` returning `null` graph node count when no
+  in-memory `GraphBuilder` is supplied. Now falls back to SQL `COUNT(*)`.
+
+### Internal
+- Three new Alembic migrations: `0009_llm_costs`, `0010_temporal_hotspot_score`,
+  `0011_security_findings`.
+- New module: `packages/core/.../persistence/coordinator.py`
+- New module: `packages/core/.../ingestion/dynamic_hints/` (5 files)
+- New module: `packages/core/.../analysis/pr_blast.py`
+- New module: `packages/core/.../analysis/security_scan.py`
+- New module: `packages/core/.../generation/cost_tracker.py`
+- New module: `packages/server/.../services/knowledge_map.py`
+
+### Compatibility
+- Existing repositories must run migrations: `repowise doctor` will detect
+  the missing tables and prompt; alternatively re-run `repowise init` to
+  rebuild from scratch.
+- The eight MCP tool names and signatures are backwards compatible — new
+  parameters are all optional.
+
+---
+
+## [0.1.31] — earlier
+
+See git history for releases prior to 0.2.0.
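The temporal hotspot formula in the 0.2.0 "Analysis" notes, `Σ exp(-ln2 · age_days / 180) · min(lines/100, 3)`, can be checked with a minimal sketch. This is not the project's implementation: the `Commit` dataclass, the function names, and the reading of `lines` as total lines changed in a commit are assumptions made for illustration. Only the 180-day half-life and the cap of 3 come from the changelog.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 180.0  # from the changelog: a commit's weight halves every 180 days
SIZE_CAP = 3.0          # from the changelog: min(lines/100, 3)


@dataclass(frozen=True)
class Commit:
    """Hypothetical record; field names are illustrative, not repowise's schema."""
    age_days: float
    lines_changed: int  # assumed meaning of "lines" in the formula


def commit_weight(age_days: float, lines_changed: int) -> float:
    """Contribution of one commit: exponential age decay times a capped size factor."""
    decay = math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)
    size = min(lines_changed / 100, SIZE_CAP)
    return decay * size


def temporal_hotspot_score(commits: list[Commit]) -> float:
    """Sum of per-commit contributions for one file."""
    return sum(commit_weight(c.age_days, c.lines_changed) for c in commits)


# Relative weight of a year-old commit versus one made today (same size).
year_old_ratio = commit_weight(365, 100) / commit_weight(0, 100)
```

With these definitions, `year_old_ratio` works out to 2^(-365/180), roughly 0.245, which matches the "~25%" figure quoted in the changelog. The size cap means a 1,000-line commit counts the same as a 300-line one.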
@@ -94,11 +94,11 @@ Most tools are designed around data entities — one module, one file, one symbo
 |---|---|---|
 | `get_overview()` | Architecture summary, module map, entry points | First call on any unfamiliar codebase |
 | `get_context(targets, include?)` | Docs, ownership, decisions, freshness for any targets — files, modules, or symbols | Before reading or modifying code. Pass all relevant targets in one call. |
-| `get_risk(targets)` | Hotspot scores, dependents, co-change partners, plain-English risk summary | Before modifying files — understand what could break |
+| `get_risk(targets?, changed_files?)` | Hotspot scores, dependents, co-change partners, blast radius, recommended reviewers, test gaps, security signals, 0–10 risk score | Before modifying files — understand what could break |
 | `get_why(query?)` | Three modes: NL search over decisions · path-based decisions for a file · no-arg health dashboard | Before architectural changes — understand existing intent |
 | `search_codebase(query)` | Semantic search over the full wiki. Natural language. | When you don't know where something lives |
 | `get_dependency_path(from, to)` | Connection path between two files, modules, or symbols | When tracing how two things are connected |
-| `get_dead_code()` | Unreachable code sorted by confidence and cleanup impact | Cleanup tasks |
+| `get_dead_code(min_confidence?, include_internals?, include_zombie_packages?)` | Unreachable code sorted by confidence and cleanup impact | Cleanup tasks |
 | `get_architecture_diagram(module?)` | Mermaid diagram for the repo or a specific module | Documentation and presentation |
 
 ### Tool call comparison — a real task
@@ -172,9 +172,13 @@ This is what happens when an AI agent has real codebase intelligence.
 | **Symbols** | Searchable index of every function, class, and method |
 | **Coverage** | Doc freshness per file with one-click regeneration |
 | **Ownership** | Contributor attribution and bus factor risk |
-| **Hotspots** | Ranked high-churn files with commit history |
+| **Hotspots** | Ranked by trend-weighted score (180-day decay) and churn |
 | **Dead Code** | Unused code with confidence scores and bulk actions |
 | **Decisions** | Architectural decisions with staleness monitoring |
+| **Costs** | LLM spend by day, model, or operation, with running session totals |
+| **Blast Radius** | Paste a PR file list, see transitive impact, reviewers, and test gaps |
+| **Knowledge Map** | Top owners, bus-factor silos, and onboarding targets on the dashboard |
+| **System Health** | SQL/vector/graph drift status from the atomic store coordinator |
 
 ---
 
@@ -333,9 +337,18 @@ repowise search "<query>"         # semantic search over the wiki
 repowise status                   # coverage, freshness, dead code summary
 
 # Dead code
-repowise dead-code                # full report
-repowise dead-code --safe-only    # only safe-to-delete findings
-repowise dead-code resolve <id>   # mark resolved / false positive
+repowise dead-code                          # full report
+repowise dead-code --safe-only              # only safe-to-delete findings
+repowise dead-code --min-confidence 0.8     # raise the confidence threshold
+repowise dead-code --include-internals      # include private/underscore symbols
+repowise dead-code --include-zombie-packages  # include unused declared packages
+repowise dead-code resolve <id>             # mark resolved / false positive
+
+# Cost tracking
+repowise costs                    # total LLM spend to date
+repowise costs --by operation     # grouped by operation type
+repowise costs --by model         # grouped by model
+repowise costs --by day           # grouped by day
 
 # Decisions
 repowise decision add             # record a decision (interactive)
@@ -348,7 +361,8 @@ repowise generate-claude-md       # regenerate CLAUDE.md
 
 # Utilities
 repowise export [PATH]            # export wiki as markdown files
-repowise doctor                   # check setup, API keys, connectivity
+repowise doctor                   # check setup, API keys, store drift
+repowise doctor --repair          # check and fix detected store mismatches
 repowise reindex                  # rebuild vector store (no LLM calls)
 ```
 

@@ -6,4 +6,4 @@
 AI-generated documentation.
 """
 
-__version__ = "0.1.31"
+__version__ = "0.2.0"
@@ -0,0 +1,157 @@
+"""``repowise costs`` — display LLM cost history from the cost ledger."""
+
+from __future__ import annotations
+
+from datetime import datetime
+from pathlib import Path
+from typing import Any
+
+import click
+from rich.table import Table
+
+from repowise.cli.helpers import (
+    console,
+    get_db_url_for_repo,
+    resolve_repo_path,
+    run_async,
+)
+
+
+def _parse_date(value: str | None) -> datetime | None:
+    """Parse a date string into a datetime; ``None`` passes through unchanged.
+
+    Tries ISO 8601 first, then falls back to ``dateutil`` for looser formats.
+    """
+    if value is None:
+        return None
+    try:
+        return datetime.fromisoformat(value)
+    except ValueError:
+        try:
+            from dateutil.parser import parse as _parse  # type: ignore[import-untyped]
+
+            return _parse(value)
+        except Exception as exc:
+            raise click.BadParameter(f"Cannot parse date '{value}': {exc}") from exc
+
+
+@click.command("costs")
+@click.argument("path", required=False, default=None)
+@click.option(
+    "--since",
+    default=None,
+    metavar="DATE",
+    help="Only show costs since this date (ISO format, e.g. 2026-01-01).",
+)
+@click.option(
+    "--by",
+    "group_by",
+    type=click.Choice(["operation", "model", "day"]),
+    default="operation",
+    show_default=True,
+    help="Group costs by operation, model, or day.",
+)
+@click.option(
+    "--repo-path",
+    "repo_path_flag",
+    default=None,
+    metavar="PATH",
+    help="Repository path (defaults to current directory).",
+)
+def costs_command(
+    path: str | None,
+    since: str | None,
+    group_by: str,
+    repo_path_flag: str | None,
+) -> None:
+    """Show LLM cost history for a repository.
+
+    PATH (or --repo-path) defaults to the current directory.
+    """
+    # Accept either positional PATH or --repo-path; PATH wins if both are given.
+    raw_path = path or repo_path_flag
+    repo_path = resolve_repo_path(raw_path)
+
+    repowise_dir = repo_path / ".repowise"
+    if not repowise_dir.exists():
+        console.print("[yellow]No .repowise/ directory found. Run 'repowise init' first.[/yellow]")
+        return
+
+    since_dt = _parse_date(since)
+
+    rows = run_async(_query_costs(repo_path, since=since_dt, group_by=group_by))
+
+    if not rows:
+        msg = "No cost records found"
+        if since_dt:
+            msg += f" since {since_dt.date()}"
+        msg += ". Run 'repowise init' with an LLM provider to generate costs."
+        console.print(f"[yellow]{msg}[/yellow]")
+        return
+
+    # Build table
+    group_label = group_by.capitalize()
+    table = Table(
+        title=f"LLM Costs — grouped by {group_by}",
+        border_style="dim",
+        show_footer=True,
+    )
+    table.add_column(group_label, style="cyan", footer="[bold]TOTAL[/bold]")
+    table.add_column("Calls", justify="right", footer=str(sum(r["calls"] for r in rows)))
+    table.add_column(
+        "Input Tokens",
+        justify="right",
+        footer=f"{sum(r['input_tokens'] for r in rows):,}",
+    )
+    table.add_column(
+        "Output Tokens",
+        justify="right",
+        footer=f"{sum(r['output_tokens'] for r in rows):,}",
+    )
+    table.add_column(
+        "Cost USD",
+        justify="right",
+        footer=f"[bold green]${sum(r['cost_usd'] for r in rows):.4f}[/bold green]",
+    )
+
+    for row in rows:
+        table.add_row(
+            str(row["group"] or "—"),
+            str(row["calls"]),
+            f"{row['input_tokens']:,}",
+            f"{row['output_tokens']:,}",
+            f"[green]${row['cost_usd']:.4f}[/green]",
+        )
+
+    console.print()
+    console.print(table)
+    console.print()
+
+
+async def _query_costs(
+    repo_path: Path,
+    since: datetime | None,
+    group_by: str,
+) -> list[dict[str, Any]]:
+    """Open the DB, look up the repo, and return aggregated cost rows."""
+    from repowise.core.generation.cost_tracker import CostTracker
+    from repowise.core.persistence import (
+        create_engine,
+        create_session_factory,
+        get_session,
+        init_db,
+    )
+    from repowise.core.persistence.crud import get_repository_by_path
+
+    url = get_db_url_for_repo(repo_path)
+    engine = create_engine(url)
+    await init_db(engine)
+    sf = create_session_factory(engine)
+
+    try:
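+        async with get_session(sf) as session:
+            repo = await get_repository_by_path(session, str(repo_path))
+            if repo is None:
+                return []
+            # Read the id while the session is still open; ORM attributes may be
+            # expired (and no longer loadable) once the session context exits.
+            repo_id = repo.id
+
+        tracker = CostTracker(session_factory=sf, repo_id=repo_id)
+        return await tracker.totals(since=since, group_by=group_by)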
+    finally:
+        await engine.dispose()
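The table-building code in `costs_command` expects each row from `CostTracker.totals` to carry the keys `group`, `calls`, `input_tokens`, `output_tokens`, and `cost_usd`. `CostTracker` itself is not part of this diff, so the following is only a sketch of an equivalent in-memory aggregation that produces rows of that shape. The `group_costs` helper and the ledger record fields (`operation`, `model`) are hypothetical; the real tracker presumably aggregates in SQL against the `llm_costs` table.

```python
from collections import defaultdict
from typing import Any


def group_costs(records: list[dict[str, Any]], group_by: str) -> list[dict[str, Any]]:
    """Aggregate per-call ledger records into rows keyed by `group_by`.

    Hypothetical helper: mirrors the row shape consumed by the table code,
    not the actual CostTracker.totals implementation.
    """
    totals: dict[Any, dict[str, Any]] = defaultdict(
        lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0, "cost_usd": 0.0}
    )
    for record in records:
        bucket = totals[record[group_by]]
        bucket["calls"] += 1
        bucket["input_tokens"] += record["input_tokens"]
        bucket["output_tokens"] += record["output_tokens"]
        bucket["cost_usd"] += record["cost_usd"]
    # One output row per group, in sorted group order.
    return [{"group": key, **totals[key]} for key in sorted(totals)]


# Illustrative ledger; model and operation names are placeholders.
ledger = [
    {"operation": "doc_generation", "model": "model-a",
     "input_tokens": 1200, "output_tokens": 300, "cost_usd": 0.25},
    {"operation": "doc_generation", "model": "model-b",
     "input_tokens": 800, "output_tokens": 200, "cost_usd": 0.5},
    {"operation": "decision_extraction", "model": "model-a",
     "input_tokens": 400, "output_tokens": 100, "cost_usd": 0.125},
]

rows = group_costs(ledger, "model")
# The footer cells in the CLI table are plain sums over the grouped rows.
footer_calls = sum(r["calls"] for r in rows)
footer_cost = sum(r["cost_usd"] for r in rows)
```

Because the footer sums over grouped rows rather than raw records, the totals are the same whichever `--by` grouping is chosen. Grouping by day would need a date field on each record, which this sketch omits.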