diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index a93d3b3..e2df5a9 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -30,7 +30,7 @@ jobs: - run: pnpm -r build - run: pnpm -r typecheck - name: unit tests - run: node --test packages/suppression/dist/*.test.js packages/triage/dist/*.test.js + run: node --test packages/shared/dist/*.test.js packages/suppression/dist/*.test.js packages/triage/dist/*.test.js - name: bundle smoke test (release artifact builds + parses) run: | pnpm build:release diff --git a/CHANGELOG.md b/CHANGELOG.md index b240a82..7479a44 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,44 @@ this project uses [semantic versioning](https://semver.org) (pre-1.0: minor vers ## [Unreleased] +## [0.9.0] — agent-led: the engine on tap + +Splus flips from engine-*led* (push a finding list through a gate) to agent-*led* +(a curious reviewer pulls deterministic signal on demand). The engine becomes a +toolbelt, a per-repo contract is read first, and a reviewer's diligence compounds. + +### Added +- **`inspect` — the engine on tap.** Ask one deterministic question instead of + triaging a list: `definition` / `callers` / `blast_radius` / `complexity` / + `exports` / `imports`. New engine `inspect` subcommand reusing the existing + analysis tier (graph, symbols, SCIP, complexity); exposed as an MCP tool so the + reviewer can investigate — open call sites, confirm blast radius, recurse on a + smell — rather than relay the floor. +- **`floor` tool** — re-ground on the deterministic finding set for any scope, + without the directive. +- **`splus.md` — the repo's review contract, read first.** A human-authored file + (`./splus.md` layered over `~/.splus/splus.md`): prose preferences/nits injected + into the reviewer, plus **binding** `mute:`/`skip:` rules enforced at review + time (matching findings dropped and reported, never silently). New `preferences` + tool + a `prefs` skill to author it. +- **Compounding memory.** `note` records a discovered convention; `accept` now also + stores a recallable memory; `recall` surfaces past confirmed findings and + conventions relevant to a hunk (embedding match over `.splus-cache/memory.json`, + on the existing `Embedder` seam). +- **Skill bundle** (`skills/`): a `review` orchestrator playbook that fans out + **fresh, unbiased sub-agents** per unit (finder ≠ verifier) and degrades to a + sequential pass on hosts without sub-agents; plus `investigate` / `lenses` / + `verify` / `dispatch` references and the `prefs` skill. + +### Changed +- **`review` now orients instead of just answering** — it injects the `splus.md` + contract, enforces its binding rules, returns the floor, and drives the agent + through investigate → verify → report → teach with the toolbelt. Backward + compatible (the floor is still returned). +- **Import resolution understands TS ESM `.js` specifiers** (`import … from + "./x.js"` for an `x.ts` source), so callers / blast radius resolve on modern + TypeScript codebases — improving the existing blast-radius collector too. + ## [0.8.1] — splus reviews splus A patch release of precision and honesty fixes surfaced by running a full diff --git a/Cargo.lock b/Cargo.lock index f8de30f..432e923 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -67,6 +67,12 @@ version = "1.0.102" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c" +[[package]] +name = "bitflags" +version = "2.12.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "84d7ced0ae9557296835c32bf1b1e02b44c746701f898460fb000d7eaa84f00a" + [[package]] name = "bytes" version = "1.11.1" @@ -83,6 +89,12 @@ dependencies = [ "shlex", ] +[[package]] +name = "cfg-if" +version = "1.0.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801" + [[package]] name = "clap" version = "4.6.1" @@ -141,12 +153,56 @@ version = "1.0.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f" +[[package]] +name = "errno" +version = "0.3.14" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb" +dependencies = [ + "libc", + "windows-sys", +] + +[[package]] +name = "fastrand" +version = "2.4.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9f1f227452a390804cdb637b74a86990f2a7d7ba4b7d5693aac9b4dd6defd8d6" + [[package]] name = "find-msvc-tools" version = "0.1.9" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582" +[[package]] +name = "foldhash" +version = "0.1.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d9c4f5dac5e15c24eb999c26181a6ca40b39fe946cbe4c263c7209467bc83af2" + +[[package]] +name = "getrandom" +version = "0.4.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0de51e6874e94e7bf76d726fc5d13ba782deca734ff60d5bb2fb2607c7406555" +dependencies = [ + "cfg-if", + "libc", + "r-efi", + "wasip2", + "wasip3", +] + +[[package]] +name = "hashbrown" +version = "0.15.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9229cfe53dfd69f0609a49f65461bd93001ea1ef889cd5529dd176593f5338a1" +dependencies = [ + "foldhash", +] + [[package]] name = "hashbrown" version = "0.17.1" @@ -159,6 +215,12 @@ version = "0.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" +[[package]] +name = "id-arena" +version = "2.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3d3067d79b975e8844ca9eb072e16b31c3c1c36928edf9c6789548c524d0d954" + [[package]] name = "indexmap" version = "2.14.0" @@ -166,7 +228,9 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d466e9454f08e4a911e14806c24e16fba1b4c121d1ea474396f396069cf949d9" dependencies = [ "equivalent", - "hashbrown", + "hashbrown 0.17.1", + "serde", + "serde_core", ] [[package]] @@ -190,18 +254,58 @@ version = "1.0.18" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682" +[[package]] +name = "leb128fmt" +version = "0.1.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "09edd9e8b54e49e587e4f6295a7d29c3ea94d469cb40ab8ca70b288248a81db2" + +[[package]] +name = "libc" +version = "0.2.186" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "68ab91017fe16c622486840e4c83c9a37afeff978bd239b5293d61ece587de66" + +[[package]] +name = "linux-raw-sys" +version = "0.12.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53" + +[[package]] +name = "log" +version = "0.4.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "953f07c43838f8e6f9758cab68bf5bed85465e7587ebe0b823f1bcd81978ad3a" + [[package]] name = "memchr" version = "2.8.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "6b947ae49db0d222b1dbc6b113ce7248a3fc3a6ca21b696717bfc000ba4484d8" +[[package]] +name = "once_cell" +version = "1.21.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50" + [[package]] name = "once_cell_polyfill" version = "1.70.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe" +[[package]] +name = "prettyplease" +version = "0.2.37" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "479ca8adacdd7ce8f1fb39ce9ecccbfe93a3f1344b3d0d97f20bc0196208f62b" +dependencies = [ + "proc-macro2", + "syn", +] + [[package]] name = "proc-macro2" version = "1.0.106" @@ -243,6 +347,12 @@ dependencies = [ "proc-macro2", ] +[[package]] +name = "r-efi" +version = "6.0.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f8dcc9c7d52a811697d2151c701e0d08956f92b0e24136cf4cf27b57a6a0d9bf" + [[package]] name = "regex" version = "1.12.3" @@ -272,6 +382,25 @@ version = "0.8.10" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a" +[[package]] +name = "rustix" +version = "1.1.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b6fe4565b9518b83ef4f91bb47ce29620ca828bd32cb7e408f0062e9930ba190" +dependencies = [ + "bitflags", + "errno", + "libc", + "linux-raw-sys", + "windows-sys", +] + +[[package]] +name = "semver" +version = "1.0.28" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8a7852d02fc848982e0c167ef163aaff9cd91dc640ba85e263cb1ce46fae51cd" + [[package]] name = "serde" version = "1.0.228" @@ -324,7 +453,7 @@ checksum = "f8fadd59c855ef2080decdef8ff161eb6661b86933c9d82e5ba29dc602a55aba" [[package]] name = "splus-engine" -version = "0.8.1" +version = "0.9.0" dependencies = [ "anyhow", "clap", @@ -332,6 +461,7 @@ dependencies = [ "regex", "serde", "serde_json", + "tempfile", "tree-sitter", "tree-sitter-bash", "tree-sitter-c", @@ -373,6 +503,19 @@ dependencies = [ "unicode-ident", ] +[[package]] +name = "tempfile" +version = "3.27.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "32497e9a4c7b38532efcdebeef879707aa9f794296a4f0244f6f69e9bc8574bd" +dependencies = [ + "fastrand", + "getrandom", + "once_cell", + "rustix", + "windows-sys", +] + [[package]] name = "tree-sitter" version = "0.25.10" @@ -549,12 +692,70 @@ version = "1.0.24" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75" +[[package]] +name = "unicode-xid" +version = "0.2.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853" + [[package]] name = "utf8parse" version = "0.2.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821" +[[package]] +name = "wasip2" +version = "1.0.3+wasi-0.2.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "20064672db26d7cdc89c7798c48a0fdfac8213434a1186e5ef29fd560ae223d6" +dependencies = [ + "wit-bindgen 0.57.1", +] + +[[package]] +name = "wasip3" +version = "0.4.0+wasi-0.3.0-rc-2026-01-06" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5428f8bf88ea5ddc08faddef2ac4a67e390b88186c703ce6dbd955e1c145aca5" +dependencies = [ + "wit-bindgen 0.51.0", +] + +[[package]] +name = "wasm-encoder" +version = "0.244.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "990065f2fe63003fe337b932cfb5e3b80e0b4d0f5ff650e6985b1048f62c8319" +dependencies = [ + "leb128fmt", + "wasmparser", +] + +[[package]] +name = "wasm-metadata" +version = "0.244.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "bb0e353e6a2fbdc176932bbaab493762eb1255a7900fe0fea1a2f96c296cc909" +dependencies = [ + "anyhow", + "indexmap", + "wasm-encoder", + "wasmparser", +] + +[[package]] +name = "wasmparser" +version = "0.244.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "47b807c72e1bac69382b3a6fb3dbe8ea4c0ed87ff5629b8685ae6b9a611028fe" +dependencies = [ + "bitflags", + "hashbrown 0.15.5", + "indexmap", + "semver", +] + [[package]] name = "windows-link" version = "0.2.1" @@ -570,6 +771,100 @@ dependencies = [ "windows-link", ] +[[package]] +name = "wit-bindgen" +version = "0.51.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d7249219f66ced02969388cf2bb044a09756a083d0fab1e566056b04d9fbcaa5" +dependencies = [ + "wit-bindgen-rust-macro", +] + +[[package]] +name = "wit-bindgen" +version = "0.57.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1ebf944e87a7c253233ad6766e082e3cd714b5d03812acc24c318f549614536e" + +[[package]] +name = "wit-bindgen-core" +version = "0.51.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ea61de684c3ea68cb082b7a88508a8b27fcc8b797d738bfc99a82facf1d752dc" +dependencies = [ + "anyhow", + "heck", + "wit-parser", +] + +[[package]] +name = "wit-bindgen-rust" +version = "0.51.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b7c566e0f4b284dd6561c786d9cb0142da491f46a9fbed79ea69cdad5db17f21" +dependencies = [ + "anyhow", + "heck", + "indexmap", + "prettyplease", + "syn", + "wasm-metadata", + "wit-bindgen-core", + "wit-component", +] + +[[package]] +name = "wit-bindgen-rust-macro" +version = "0.51.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0c0f9bfd77e6a48eccf51359e3ae77140a7f50b1e2ebfe62422d8afdaffab17a" +dependencies = [ + "anyhow", + "prettyplease", + "proc-macro2", + "quote", + "syn", + "wit-bindgen-core", + "wit-bindgen-rust", +] + +[[package]] +name = "wit-component" +version = "0.244.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9d66ea20e9553b30172b5e831994e35fbde2d165325bec84fc43dbf6f4eb9cb2" +dependencies = [ + "anyhow", + "bitflags", + "indexmap", + "log", + "serde", + "serde_derive", + "serde_json", + "wasm-encoder", + "wasm-metadata", + "wasmparser", + "wit-parser", +] + +[[package]] +name = "wit-parser" +version = "0.244.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ecc8ac4bc1dc3381b7f59c34f00b67e18f910c2c0f50015669dde7def656a736" +dependencies = [ + "anyhow", + "id-arena", + "indexmap", + "log", + "semver", + "serde", + "serde_derive", + "serde_json", + "unicode-xid", + "wasmparser", +] + [[package]] name = "zmij" version = "1.0.21" diff --git a/Cargo.toml b/Cargo.toml index fb697df..9e9d03c 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -6,7 +6,7 @@ resolver = "2" members = ["crates/splus-engine"] [workspace.package] -version = "0.8.1" +version = "0.9.0" edition = "2021" license = "MIT" repository = "https://github.com/kiwi-init/splus" diff --git a/README.md b/README.md index c59088d..d978a03 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,6 @@ vibes, runs a real **review protocol** (detect → impact → triage → remedia the grounding; your agent does the reviewing. No account, no token, nothing leaves your machine. [![CI](https://github.com/kiwi-init/splus/actions/workflows/ci.yml/badge.svg)](https://github.com/kiwi-init/splus/actions/workflows/ci.yml) -[![Splus self-review](https://github.com/kiwi-init/splus/actions/workflows/splus-review.yml/badge.svg)](https://github.com/kiwi-init/splus/actions/workflows/splus-review.yml) @@ -75,19 +74,38 @@ your editor is the reviewer. Your agent connects to the local server and calls these: -| Tool | What it does | -| ----------- | --------------------------------------------------------------------------- | -| `review` | Review `working` / `staged` / `base..HEAD` / whole-repo (`all`) changes. | -| `dismiss` | Teach Splus a finding is noise — it generalizes to close variants. | -| `accept` | Teach Splus a finding was real — it reinforces close variants going forward. | -| `mute` | Mute an entire rule for this repo. | -| `learnings` | List what's been learned on this repo. | -| `index` | Build a SCIP index locally for the precise (compiler-grade) blast-radius tier. | - -One flow: `review` returns the grounded deterministic floor plus a directive that drives your agent -through the full protocol (triage → discover → verify) over the changed code. No API key, ever — the -model already in your editor does the reasoning. Learnings (both `dismiss` and `accept`) are stored -per-repo in `.splus-cache/learnings.json` — they stay in your checkout. +| Tool | What it does | +| ------------- | --------------------------------------------------------------------------- | +| `review` | Read `splus.md`, return the deterministic floor + a directive, drive the review. | +| `inspect` | The engine **on tap**: `definition` · `callers` · `blast_radius` · `complexity` · `exports` · `imports` — investigate on demand. | +| `floor` | Re-ground on the deterministic finding floor for a scope (no directive). | +| `preferences` | Show the merged `splus.md` contract (repo + `~/.splus`). | +| `recall` | Surface past confirmed findings / conventions relevant to a hunk. | +| `note` | Remember a repo convention you discovered (→ `recall`). | +| `dismiss` | Teach Splus a finding is noise — it generalizes to close variants. | +| `accept` | Teach Splus a finding was real — reinforces, and becomes recallable. | +| `mute` | Mute an entire rule for this repo. | +| `learnings` | List what's been learned on this repo. | +| `report` | Render the review as a standalone offline HTML report. | +| `index` | Build a SCIP index locally for the precise (compiler-grade) blast-radius tier. | + +Agent-led, one flow: `review` injects the repo's `splus.md` contract and returns the grounded +deterministic floor; **you** drive the review — pull signal on demand with `inspect`, verify before +posting, then `report` and teach. No API key, ever — the model already in your editor does the +reasoning. Learnings stay per-repo in `.splus-cache/` (suppressions in `learnings.json`, memory in +`memory.json`) — they never leave your checkout. + +### `splus.md` — the repo's review contract + +Drop a `splus.md` at the repo root (layered over your personal `~/.splus/splus.md`). Splus reads it +**first** on every review: prose preferences/nits guide the reviewer, and binding `mute: ` / +`skip: ` lines drop matching findings (and say so — never silently). The `prefs` skill scaffolds one. + +### Skills + +The `skills/` bundle drives the agent-led flow: `review` (fans out **fresh, unbiased sub-agents** per +unit — finder ≠ verifier — and degrades to a sequential pass where sub-agents aren't available) and +`prefs` (author `splus.md`). **Full reference: [`docs/TOOLS.md`](docs/TOOLS.md)** — every tool, parameter, and return shape. diff --git a/crates/splus-engine/Cargo.toml b/crates/splus-engine/Cargo.toml index 4ba2e8c..8249fdb 100644 --- a/crates/splus-engine/Cargo.toml +++ b/crates/splus-engine/Cargo.toml @@ -45,3 +45,4 @@ tree-sitter-bash = "0.25" prost = "0.13" [dev-dependencies] +tempfile = "3" diff --git a/crates/splus-engine/src/analysis/graph.rs b/crates/splus-engine/src/analysis/graph.rs index 17fae19..977089f 100644 --- a/crates/splus-engine/src/analysis/graph.rs +++ b/crates/splus-engine/src/analysis/graph.rs @@ -70,19 +70,28 @@ impl RepoGraph { } let base = segs.join("/"); - if self.files.contains(&base) { - return Some(base); - } - for ext in EXTS { - let cand = format!("{base}{ext}"); - if self.files.contains(&cand) { - return Some(cand); + // TS ESM commonly imports the *compiled* `.js` path for a `.ts` source + // (`import … from "./embed.js"`). Try the source-extension forms of any + // js-family specifier too, then the literal path. + let stems: Vec<&str> = match base.rsplit_once('.') { + Some((stem, "js" | "jsx" | "mjs" | "cjs")) => vec![stem, base.as_str()], + _ => vec![base.as_str()], + }; + for stem in stems { + if self.files.contains(stem) { + return Some(stem.to_string()); } - } - for ext in EXTS { - let cand = format!("{base}/index{ext}"); - if self.files.contains(&cand) { - return Some(cand); + for ext in EXTS { + let cand = format!("{stem}{ext}"); + if self.files.contains(&cand) { + return Some(cand); + } + } + for ext in EXTS { + let cand = format!("{stem}/index{ext}"); + if self.files.contains(&cand) { + return Some(cand); + } } } None @@ -227,6 +236,17 @@ mod tests { assert!(info.transitive >= 1); } + #[test] + fn resolves_ts_esm_dot_js_imports() { + // ESM/NodeNext TS imports the compiled `.js` path for a `.ts` source. + let g = graph_from(&[ + ("packages/x/src/embed.ts", "export function hashEmbedder(){ return 1; }"), + ("packages/x/src/index.ts", "import { hashEmbedder } from './embed.js';\nhashEmbedder();"), + ]); + let info = g.callers_of("hashEmbedder", "packages/x/src/embed.ts"); + assert_eq!(info.direct, vec!["packages/x/src/index.ts".to_string()]); + } + #[test] fn does_not_match_same_name_from_other_file() { let g = graph_from(&[ diff --git a/crates/splus-engine/src/inspect.rs b/crates/splus-engine/src/inspect.rs new file mode 100644 index 0000000..d6ddfcb --- /dev/null +++ b/crates/splus-engine/src/inspect.rs @@ -0,0 +1,321 @@ +//! On-demand code intelligence — the engine "on tap". +//! +//! Where `review` pushes a finding floor, `inspect` lets the agent in the chair +//! *pull* one deterministic signal at a time while it investigates: who calls a +//! symbol, where it's defined, its blast radius, a file's complexity / exports / +//! imports. Every kind reuses an analysis the review pipeline already computes — +//! this module only re-exposes them as addressable, single-question queries. +//! +//! Resolution is the JS/TS name+import heuristic (same honest, known-confidence +//! tier as blast radius); the SCIP precise tier is used for `blast_radius` when an +//! index is present. Non-JS/TS targets resolve to an empty, honest answer. + +use crate::analysis::graph::RepoGraph; +use crate::analysis::complexity::function_complexities; +use crate::analysis::scip::ScipGraph; +use crate::analysis::symbols::{self, Symbol}; +use crate::collectors::Lang; +use anyhow::{bail, Result}; +use serde_json::{json, Value}; +use std::fs; +use std::path::Path; + +/// The questions a reviewer can ask the engine on demand. +pub const KINDS: &[&str] = &[ + "definition", + "callers", + "blast_radius", + "complexity", + "exports", + "imports", +]; + +/// Answer one `inspect` query as canonical JSON. +/// +/// `target` is a **symbol** for `definition` / `callers` / `blast_radius` and a +/// **file path** for `complexity` / `exports` / `imports`. `file` optionally pins +/// the defining file for a symbol query (disambiguates same-named symbols). +pub fn inspect( + root: &Path, + kind: &str, + target: &str, + file: Option<&str>, + scip_path: Option<&Path>, +) -> Result { + match kind { + "definition" => Ok(definition(root, target)), + "callers" => Ok(callers(root, target, file)), + "blast_radius" => Ok(blast_radius(root, target, file, scip_path)), + "complexity" => complexity(root, target), + "exports" => exports(root, target), + "imports" => imports(root, target), + other => bail!("unknown --kind: {other} (use {})", KINDS.join("|")), + } +} + +// --- symbol queries (JS/TS name heuristic) --------------------------------- + +/// Every definition of `name` across the indexed JS/TS surface. +fn definition(root: &Path, name: &str) -> Value { + let g = RepoGraph::build(root); + let defs = definitions(&g, root, name); + json!({ + "kind": "definition", + "symbol": name, + "definitions": defs.iter().map(|(f, s)| json!({ + "file": f, + "kind": s.kind.as_str(), + "line": s.start_line, + "exported": s.exported, + })).collect::>(), + "note": resolution_note(&g), + }) +} + +/// Files that import `name` from its defining file (direct + transitive count). +fn callers(root: &Path, name: &str, file: Option<&str>) -> Value { + let g = RepoGraph::build(root); + let Some(def_file) = resolve_def_file(&g, root, name, file) else { + return not_found(&g, name); + }; + let info = g.callers_of(name, &def_file); + json!({ + "kind": "callers", + "symbol": name, + "defFile": def_file, + "directCallers": info.direct, + "transitiveCallers": info.transitive, + "crossesApiBoundary": info.crosses_api, + "note": resolution_note(&g), + }) +} + +/// Full blast radius for `name` — SCIP-precise when an index is loaded, else the +/// name+import heuristic, always with an honest `resolutionConfidence`/`method`. +fn blast_radius(root: &Path, name: &str, file: Option<&str>, scip_path: Option<&Path>) -> Value { + let g = RepoGraph::build(root); + let Some(def_file) = resolve_def_file(&g, root, name, file) else { + return not_found(&g, name); + }; + + // Precise tier: resolve via the symbol's definition line against the SCIP index. + if let Some(scip) = scip_path.and_then(ScipGraph::load) { + if let Some(def_line) = def_line_of(&g, root, name, &def_file) { + if let Some(pc) = scip.resolve(&def_file, def_line, name) { + return json!({ + "kind": "blast_radius", + "symbol": name, + "defFile": def_file, + "directCallers": pc.direct.len(), + "transitiveCallers": pc.total_refs, + "filesAffected": pc.direct, + "crossesApiBoundary": pc.crosses_api, + "resolutionConfidence": 0.97, + "resolutionMethod": "scip (compiler-grade)", + }); + } + } + } + + let info = g.callers_of(name, &def_file); + json!({ + "kind": "blast_radius", + "symbol": name, + "defFile": def_file, + "directCallers": info.direct.len(), + "transitiveCallers": info.transitive, + "filesAffected": info.direct, + "crossesApiBoundary": info.crosses_api, + "resolutionConfidence": 0.6, + "resolutionMethod": "name+import heuristic (TS/JS)", + }) +} + +// --- file queries ---------------------------------------------------------- + +/// Cognitive complexity per function in a file (optionally one function). +fn complexity(root: &Path, file: &str) -> Result { + let src = read_file(root, file)?; + let mut fns = function_complexities(Lang::from_path(file), &src); + fns.sort_by(|a, b| b.score.cmp(&a.score)); + let max = fns.iter().map(|f| f.score).max().unwrap_or(0); + Ok(json!({ + "kind": "complexity", + "file": file, + "max": max, + "functions": fns.iter().map(|f| json!({ + "name": f.name, + "startLine": f.start_line, + "endLine": f.end_line, + "score": f.score, + })).collect::>(), + })) +} + +/// Exported symbols of a file. +fn exports(root: &Path, file: &str) -> Result { + let src = read_file(root, file)?; + let (syms, _) = symbols::extract(Lang::from_path(file), &src); + Ok(json!({ + "kind": "exports", + "file": file, + "exports": syms.iter().filter(|s| s.exported).map(|s| json!({ + "name": s.name, + "kind": s.kind.as_str(), + "line": s.start_line, + })).collect::>(), + })) +} + +/// Import statements of a file. +fn imports(root: &Path, file: &str) -> Result { + let src = read_file(root, file)?; + let (_, imps) = symbols::extract(Lang::from_path(file), &src); + Ok(json!({ + "kind": "imports", + "file": file, + "imports": imps.iter().map(|i| json!({ + "names": i.names, + "source": i.source, + })).collect::>(), + })) +} + +// --- helpers --------------------------------------------------------------- + +fn read_file(root: &Path, file: &str) -> Result { + let full = root.join(file); + fs::read_to_string(&full) + .map_err(|e| anyhow::anyhow!("cannot read {}: {e}", full.display())) +} + +/// All `(file, Symbol)` definitions of `name` across the indexed JS/TS files. +fn definitions(g: &RepoGraph, root: &Path, name: &str) -> Vec<(String, Symbol)> { + let mut files: Vec<&String> = g.files.iter().collect(); + files.sort(); + let mut out = Vec::new(); + for f in files { + if let Ok(src) = fs::read_to_string(root.join(f)) { + for s in symbols::extract(Lang::from_path(f), &src).0 { + if s.name == name { + out.push((f.clone(), s)); + } + } + } + } + out +} + +/// Pick the defining file for a symbol: the caller's `file` hint if it actually +/// defines `name`, else the first exported definition, else the first definition. +fn resolve_def_file(g: &RepoGraph, root: &Path, name: &str, file: Option<&str>) -> Option { + let defs = definitions(g, root, name); + if let Some(hint) = file { + if defs.iter().any(|(f, _)| f == hint) { + return Some(hint.to_string()); + } + } + defs.iter() + .find(|(_, s)| s.exported) + .or_else(|| defs.first()) + .map(|(f, _)| f.clone()) +} + +fn def_line_of(g: &RepoGraph, root: &Path, name: &str, def_file: &str) -> Option { + definitions(g, root, name) + .into_iter() + .find(|(f, _)| f == def_file) + .map(|(_, s)| s.start_line) +} + +fn not_found(g: &RepoGraph, name: &str) -> Value { + json!({ + "kind": "not_found", + "symbol": name, + "definitions": [], + "note": format!("No definition of `{name}` found in the indexed surface. {}", resolution_note(g)), + }) +} + +fn resolution_note(g: &RepoGraph) -> String { + let mut note = format!( + "Heuristic graph over {} JS/TS file(s); non-JS/TS symbols are not resolved here.", + g.indexed + ); + if g.truncated { + note.push_str(" Index truncated at the file cap — results may be partial."); + } + note +} + +#[cfg(test)] +mod tests { + use super::*; + use std::fs; + use tempfile::tempdir; + + fn write(dir: &Path, path: &str, src: &str) { + let full = dir.join(path); + fs::create_dir_all(full.parent().unwrap()).unwrap(); + fs::write(full, src).unwrap(); + } + + fn fixture() -> tempfile::TempDir { + let d = tempdir().unwrap(); + let r = d.path(); + write(r, "src/utils/auth.ts", "export function validateToken(t){ return !!t; }\nfunction helper(){ return 1; }\n"); + write(r, "src/api/login.ts", "import { validateToken } from '../utils/auth';\nexport function login(){ return validateToken('y'); }\n"); + d + } + + #[test] + fn definition_finds_exported_symbol() { + let d = fixture(); + let v = inspect(d.path(), "definition", "validateToken", None, None).unwrap(); + let defs = v["definitions"].as_array().unwrap(); + assert_eq!(defs.len(), 1); + assert_eq!(defs[0]["file"], "src/utils/auth.ts"); + assert_eq!(defs[0]["exported"], true); + } + + #[test] + fn callers_finds_importer_across_api_boundary() { + let d = fixture(); + let v = inspect(d.path(), "callers", "validateToken", None, None).unwrap(); + assert_eq!(v["directCallers"].as_array().unwrap().len(), 1); + assert_eq!(v["crossesApiBoundary"], true); + } + + #[test] + fn blast_radius_is_honest_without_scip() { + let d = fixture(); + let v = inspect(d.path(), "blast_radius", "validateToken", None, None).unwrap(); + assert_eq!(v["resolutionMethod"], "name+import heuristic (TS/JS)"); + assert_eq!(v["directCallers"], 1); + } + + #[test] + fn exports_lists_only_exported() { + let d = fixture(); + let v = inspect(d.path(), "exports", "src/utils/auth.ts", None, None).unwrap(); + let ex = v["exports"].as_array().unwrap(); + assert_eq!(ex.len(), 1); + assert_eq!(ex[0]["name"], "validateToken"); + } + + #[test] + fn imports_lists_specifiers() { + let d = fixture(); + let v = inspect(d.path(), "imports", "src/api/login.ts", None, None).unwrap(); + let im = v["imports"].as_array().unwrap(); + assert_eq!(im.len(), 1); + assert_eq!(im[0]["source"], "../utils/auth"); + } + + #[test] + fn unknown_symbol_is_honest_not_found() { + let d = fixture(); + let v = inspect(d.path(), "callers", "nope", None, None).unwrap(); + assert_eq!(v["kind"], "not_found"); + } +} diff --git a/crates/splus-engine/src/lib.rs b/crates/splus-engine/src/lib.rs index 602fcda..464636b 100644 --- a/crates/splus-engine/src/lib.rs +++ b/crates/splus-engine/src/lib.rs @@ -6,6 +6,7 @@ pub mod analysis; pub mod collectors; pub mod diff; +pub mod inspect; pub mod model; pub mod pipeline; pub mod render; diff --git a/crates/splus-engine/src/main.rs b/crates/splus-engine/src/main.rs index 3a11a42..8789cbb 100644 --- a/crates/splus-engine/src/main.rs +++ b/crates/splus-engine/src/main.rs @@ -27,6 +27,30 @@ struct Cli { enum Commands { /// Review a diff: staged changes, all working changes, or base..HEAD. Review(ReviewArgs), + /// Inspect code intelligence on demand — the engine "on tap". Answers one + /// question (definition / callers / blast_radius / complexity / exports / + /// imports) as JSON, so the agent can investigate instead of triaging a list. + Inspect(InspectArgs), +} + +#[derive(Args)] +struct InspectArgs { + /// Repository root. + #[arg(long, default_value = ".")] + root: PathBuf, + /// What to ask: definition | callers | blast_radius | complexity | exports | imports. + #[arg(long)] + kind: String, + /// The subject: a symbol name (definition/callers/blast_radius) or a file path + /// (complexity/exports/imports). + #[arg(long)] + target: String, + /// Pin the defining file for a symbol query (disambiguates same-named symbols). + #[arg(long)] + file: Option, + /// SCIP index for the precise blast-radius tier (else auto-detected). + #[arg(long)] + scip: Option, } #[derive(Args)] @@ -73,9 +97,40 @@ fn main() { exit(2); } }, + Commands::Inspect(args) => match run_inspect(args) { + Ok(()) => exit(0), + Err(e) => { + eprintln!("splus: error: {e:#}"); + exit(2); + } + }, } } +/// Auto-detect a SCIP index at the conventional locations under `root`. +fn detect_scip(root: &PathBuf, explicit: Option) -> Option { + if let Some(p) = explicit { + return p.exists().then_some(p); + } + ["index.scip", ".splus-cache/index.scip"] + .into_iter() + .map(|c| root.join(c)) + .find(|p| p.exists()) +} + +fn run_inspect(args: InspectArgs) -> Result<()> { + let scip = detect_scip(&args.root, args.scip); + let value = splus_engine::inspect::inspect( + &args.root, + &args.kind, + &args.target, + args.file.as_deref(), + scip.as_deref(), + )?; + println!("{}", serde_json::to_string_pretty(&value)?); + Ok(()) +} + fn run_review(args: ReviewArgs) -> Result { if !is_git_repo(&args.root) { anyhow::bail!("{} is not a git repository", args.root.display()); diff --git a/docs/TOOLS.md b/docs/TOOLS.md index 9211164..ebb8fd7 100644 --- a/docs/TOOLS.md +++ b/docs/TOOLS.md @@ -1,16 +1,22 @@ # MCP tools reference Splus is a local **MCP server** (`splus-mcp`). Your coding agent connects to it -over stdio and calls these six tools. Everything runs on your machine — there is -no API key and no cloud step; the coding agent connected over stdio is the reviewer. +over stdio and calls these tools. Everything runs on your machine — there is no +API key and no cloud step; the coding agent connected over stdio is the reviewer. | Tool | Mutates? | What it's for | |---|---|---| -| [`review`](#review) | no | Review the diff — grounded findings + drive the agent's discovery pass. | +| [`review`](#review) | no | Read `splus.md`, return the floor, drive the agent's review. | +| [`inspect`](#inspect) | no | The engine **on tap** — one code-intelligence question on demand. | +| [`floor`](#floor) | no | Re-ground on the deterministic finding floor for a scope. | +| [`preferences`](#preferences) | no | Show the merged `splus.md` contract. | +| [`recall`](#recall) | no | Surface confirmed findings / conventions for a hunk. | +| [`note`](#note) | yes | Remember a discovered repo convention (→ `recall`). | | [`dismiss`](#dismiss) | yes | Teach Splus a finding is **noise** (suppresses close variants). | -| [`accept`](#accept) | yes | Teach Splus a finding was **real** (reinforces close variants). | +| [`accept`](#accept) | yes | Teach Splus a finding was **real** (reinforces; becomes recallable). | | [`mute`](#mute) | yes | Silence an entire rule for this repo. | | [`learnings`](#learnings) | no | List what's been learned on this repo. | +| [`report`](#report) | no | Render the review as a standalone offline HTML report. | | [`index`](#index) | yes | Build a SCIP index for compiler-grade blast radius. | ## Typical flow @@ -41,10 +47,12 @@ Returns findings grouped `must-fix` / `concern` / `nit`, each with `file:line`, rule id, severity, confidence, a deterministic provenance **anchor**, an optional fix, and cross-file **blast radius**. Learned suppressions are applied first. -There is **one flow** and you are the driver: the response always ends with a -**discovery directive** that drives *you* (the agent) through the full protocol -(triage → discover → verify) over the changed files. No API key — Splus grounds -you with precise anchors; you do the reasoning. Run the protocol; don't relay. +There is **one flow** and you are the driver: the response begins with the repo's +[`splus.md`](#preferences) contract (preferences injected, binding `mute:`/`skip:` +rules already enforced) and ends with a **discovery directive** that drives *you* +(the agent) through the full protocol (triage → investigate → verify) over the +changed files. No API key — Splus grounds you with precise anchors and a toolbelt +(`inspect`, `floor`, `recall`); you do the reasoning. Run the protocol; don't relay. | Param | Type | Default | Description | |---|---|---|---| @@ -76,6 +84,91 @@ and the discovery directive that drives your review. --- +## `inspect` + +The engine **on tap** — ask one deterministic question while you investigate, +instead of triaging a list. JS/TS-aware (the same honest name+import tier as blast +radius, SCIP-precise for `blast_radius` when an index exists); non-JS/TS symbols +return an honest empty answer. + +| Param | Type | Description | +|---|---|---| +| `kind` | `definition` \| `callers` \| `blast_radius` \| `complexity` \| `exports` \| `imports` | Which question to ask. | +| `target` | string | A **symbol** (definition/callers/blast_radius) or a **file path** (complexity/exports/imports). | +| `file` | string | Pin the defining file for a symbol query (disambiguates same-named symbols). | +| `root` | string | Repo root. | + +```jsonc +// inspect(kind: "callers", target: "validateToken") +{ "kind": "callers", "symbol": "validateToken", "defFile": "src/utils/auth.ts", + "directCallers": ["src/api/login.ts"], "transitiveCallers": 0, "crossesApiBoundary": true } +``` + +> Use it: for every changed export, `inspect blast_radius` and open the call sites +> before you conclude the change is safe. Recurse when a hunk smells off. + +--- + +## `floor` + +Return the engine's deterministic finding **floor** for a scope as JSON — the same +grounded set `review` starts from, without the directive. The repo's `splus.md` +binding rules are applied; learned suppression is not. Use it to re-check a scope +mid-investigation. + +| Param | Type | Default | Description | +|---|---|---|---| +| `root` | string | server CWD | Repo root. | +| `mode` | `working` \| `staged` \| `base` \| `all` | `working` | Scope. | +| `base` | string | — | Base ref — required when `mode: "base"`. | + +--- + +## `preferences` + +Return the merged [`splus.md`](#) review contract for this repo (`./splus.md` +layered over `~/.splus/splus.md`), including its binding `mute:`/`skip:` rules. +`review` already injects it; call this to read it directly. + +`splus.md` is the repo's review contract, read **first** on every review: prose +preferences/nits guide the reviewer; `mute: ` and `skip: ` lines are +**binding** (matching findings are dropped and reported, never silently). The +`prefs` skill scaffolds one. + +| Param | Type | Description | +|---|---|---| +| `root` | string | Repo root. | + +--- + +## `recall` + +Surface past confirmed-real findings (`accept`) and discovered conventions +(`note`) most relevant to a hunk, symbol, or question — so a reviewer's diligence +compounds across sessions. Semantic (embedding) match over `.splus-cache/memory.json`. + +| Param | Type | Default | Description | +|---|---|---|---| +| `root` | string | server CWD | Repo root. | +| `query` | string | — | A hunk, symbol, error, or question to recall memories for. | +| `limit` | number | `5` | Max memories to return. | + +--- + +## `note` + +Remember a repo convention you discovered while reviewing (e.g. "this module uses +`Result`, never throws") so future reviews `recall` it. Complements `accept`. +Written to `.splus-cache/memory.json`; promotable into `splus.md` for a binding rule. + +| Param | Type | Description | +|---|---|---| +| `root` | string | Repo root. | +| `text` | string | The convention/context to remember, in one sentence. | +| `file` | string | The file/area it applies to, if specific. | + +--- + ## `dismiss` Teach Splus a finding is a false positive / noise on **this repo**, by its `id` @@ -137,6 +230,16 @@ List everything learned on this repo (dismissals, mutes, accepts) from --- +## `report` + +The final step of the review flow. Returns a self-contained HTML template (all +CSS/JS inline, opens offline) plus fill instructions; you fill the marked slots +with the verdict, your verified findings, and the file-level impact graph, and +write `splus-report.html` — the shareable artifact a dev keeps next to the diff. +Read-only; takes no parameters. + +--- + ## `index` Build a compiler-grade **SCIP index** so cross-file blast radius resolves precisely diff --git a/package.json b/package.json index 9d7d6bf..77a11d8 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "splus", - "version": "0.8.1", + "version": "0.9.0", "private": true, "description": "Splus — the precision-first, local-first code reviewer. A deterministic Rust engine your coding agent (Claude Code, Codex, OpenCode) calls over MCP. Open source, runs entirely on your machine.", "license": "MIT", diff --git a/packages/mcp/package.json b/packages/mcp/package.json index 2e2c816..bdca52d 100644 --- a/packages/mcp/package.json +++ b/packages/mcp/package.json @@ -1,6 +1,6 @@ { "name": "@splus/mcp", - "version": "0.8.1", + "version": "0.9.0", "type": "module", "description": "Splus MCP server (local) — a stdio MCP server your coding agent (Claude Code, Codex, OpenCode) connects to. Runs the deterministic review engine on your local checkout and applies this repo's learned suppressions. No account, no token, nothing leaves your machine.", "bin": { diff --git a/packages/mcp/src/index.ts b/packages/mcp/src/index.ts index 9c5e568..50b1167 100644 --- a/packages/mcp/src/index.ts +++ b/packages/mcp/src/index.ts @@ -21,13 +21,18 @@ * SPLUS_ENGINE path to the splus-engine binary (else auto-resolved / PATH) * * Tools: - * review — review staged / working / base..HEAD / whole-repo changes - * report — render the review as a standalone offline HTML report (final step) - * dismiss — teach Splus a finding is noise (generalizes semantically) - * accept — teach Splus a finding was real (reinforces close variants) - * mute — mute an entire rule for this repo - * learnings — list what's been learned on this repo - * index — build a SCIP index for the precise blast-radius tier + * review — review staged / working / base..HEAD / whole-repo changes (reads splus.md) + * inspect — the engine on tap: definition / callers / blast_radius / complexity / exports / imports + * floor — re-ground on the deterministic finding floor for a scope (no directive) + * preferences — show the merged splus.md contract (repo + ~/.splus) + * report — render the review as a standalone offline HTML report (final step) + * dismiss — teach Splus a finding is noise (generalizes semantically) + * accept — teach Splus a finding was real (reinforces + stores recallable memory) + * note — remember a discovered repo convention (→ recall) + * recall — surface confirmed findings / conventions relevant to a hunk + * mute — mute an entire rule for this repo + * learnings — list what's been learned on this repo + * index — build a SCIP index for the precise blast-radius tier * * Protocol note: on a stdio transport, stdout IS the MCP channel. Everything * human-facing goes to stderr or into a tool result — we never touch stdout. @@ -36,10 +41,22 @@ import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import type { CallToolResult } from "@modelcontextprotocol/sdk/types.js"; import { z } from "zod"; -import { listChangedFiles, runEngine, type DiffMode, type Finding, type Report } from "@splus/shared"; +import { + applyPolicy, + inspect as engineInspect, + listChangedFiles, + loadSplusConfig, + runEngine, + type DiffMode, + type Finding, + type InspectKind, + type Report, + type SplusConfig, +} from "@splus/shared"; import { applySuppression, candidateText, + FileMemoryStore, FileSuppressionStore, type SuppressedFinding, } from "@splus/suppression"; @@ -60,15 +77,15 @@ const VERSION = typeof __SPLUS_VERSION__ !== "undefined" ? __SPLUS_VERSION__ : " * driver, never ask about an API key" contract has to live: a directive inside * a tool *result* arrives too late to stop the agent from asking the user first. */ -const SERVER_INSTRUCTIONS = `Splus turns the coding agent already in this session into a disciplined, precision-first code reviewer. There is ONE way to use it and you are the driver — the review is agent-driven and needs NO API key. +const SERVER_INSTRUCTIONS = `Splus turns the coding agent already in this session into a disciplined, precision-first code reviewer. You are the reviewer in the chair — the engine is yours to interrogate, not a list to relay. There is no API key and no clock: curiosity and verification are the job, and a wrong comment costs more than a slow review. When the user asks you to review code: -1. Call \`review\` (mode: working/staged/base/all; add precise:true for compiler-grade blast radius). Do NOT ask whether to run a "deterministic-only" review, and do NOT ask about an ANTHROPIC_API_KEY — neither exists in this flow; the complete review requires neither. -2. The result gives you the deterministic FLOOR of findings plus a directive. Execute it: read the changed files and run the protocol yourself — TRIAGE each finding (keep/suppress), DISCOVER what determinism missed (logic / security / intent / concurrency), and VERIFY every finding you would post by trying to refute it against the cited line. Drop anything you can't defend — a wrong comment costs more than a missed nit. -3. Report the survivors as must-fix / concern / nit with file:line and a concrete fix. Don't just relay the engine's findings — running the protocol IS the review. -4. Teach the repo as you go: \`dismiss \` when a finding is noise, \`accept \` when it was real. +1. \`review\` (mode: working/staged/base/all; precise:true for compiler-grade blast radius). It reads the repo's \`splus.md\` contract (its preferences + binding nits — honor them; they come first), returns the deterministic FLOOR of findings, and hands you a directive. Do NOT ask about a "deterministic-only" mode or an ANTHROPIC_API_KEY — neither exists here. +2. INVESTIGATE, don't triage a list. The floor is grounding; the review is what you find. Pull deterministic signal on demand with \`inspect\` (kind: definition | callers | blast_radius | complexity | exports | imports) — when an export looks risky, open its callers and confirm the blast radius; recurse when something smells off. Use \`floor\` to re-ground a file subset. Read the changed code for what determinism can't see: logic, security, intent, concurrency. +3. VERIFY every finding by trying to refute it against the cited line. Drop any you can't defend. Then REPORT survivors as must-fix / concern / nit with file:line and a concrete fix. +4. TEACH the repo: \`dismiss \` for noise, \`accept \` for real, \`note\` to record a convention. \`preferences\` shows the active \`splus.md\`; \`recall\` surfaces what was learned here before. -Other tools: \`mute\` silences a rule for this repo, \`learnings\` lists what's been taught, \`index\` builds a SCIP index for precise blast radius.`; +Other tools: \`report\` renders the offline HTML deliverable, \`mute\` silences a rule, \`learnings\` lists what's been taught, \`index\` builds a SCIP index for precise blast radius.`; const server = new McpServer({ name: "splus", version: VERSION }, { instructions: SERVER_INSTRUCTIONS }); @@ -94,6 +111,11 @@ function learningsPath(root: string): string { return join(root, ".splus-cache", "learnings.json"); } +/** The repo-local compounding-memory store (one file per repo). */ +function memoryPath(root: string): string { + return join(root, ".splus-cache", "memory.json"); +} + function toMode(mode: ReviewMode, base: string | null): DiffMode { switch (mode) { case "staged": @@ -176,6 +198,36 @@ function toAgentReport(report: Report, suppressed: SuppressedFinding[]) { }; } +/** + * The `splus.md` contract block, prepended to the review so the agent reads the + * repo's standing preferences BEFORE planning — and an honest record of any + * finding the binding `mute:`/`skip:` rules dropped (never silent). + */ +function prefsBlock(cfg: SplusConfig, dropped: { ruleId: string; file: string; reason: string }[]): string { + const lines: string[] = ["=== splus.md · the repo's review contract (read first) ==="]; + if (cfg.source === "none") { + lines.push( + "No splus.md found. Reviewing with engine defaults. If the user has standing preferences or", + "repo nits, offer to scaffold one (the `prefs` skill) — it makes the next review serve their taste.", + ); + } else { + lines.push( + `These are standing preferences (${cfg.source}); they override engine defaults and your own taste:`, + "", + cfg.raw.trim(), + ); + } + if (dropped.length) { + lines.push( + "", + `Dropped by splus.md policy (${dropped.length}): ` + + dropped.map((d) => `${d.ruleId}@${d.file} (${d.reason})`).join("; "), + ); + } + lines.push("=== end splus.md ==="); + return lines.join("\n"); +} + function summaryLine(report: Report, suppressedCount: number): string { const s = report.summary; const supp = suppressedCount > 0 ? ` · ${suppressedCount} suppressed by learnings` : ""; @@ -196,19 +248,19 @@ function discoveryDirective(files: string[]): string { (shown.map((f) => ` - ${f}`).join("\n") || " (no changed files)") + (more > 0 ? `\n …and ${more} more` : ""); return [ - "=== Splus · you are the reviewer (one flow — no API key) ===", - "The findings above are the DETERMINISTIC FLOOR — high-precision, each anchored to a pattern, metric, or cross-file graph edge. They are NOT the review. You are the senior reviewer in the chair: run the full protocol over the changed code yourself.", + "=== Splus · you are the reviewer (one flow — no API key, no clock) ===", + "The findings above are the DETERMINISTIC FLOOR — high-precision, each anchored to a pattern, metric, or cross-file graph edge. They are NOT the review. You are the senior reviewer in the chair, seeing this code for the first time: run the full protocol over the changed code yourself. Take the time it takes — curiosity is the job.", "", "Changed files:", list, "", "1. TRIAGE — for each finding above, decide keep vs suppress. Optimize for signal: suppress test fixtures, idiomatic patterns for the file's role, and pure style; keep what a senior reviewer would genuinely want fixed before merge. A noisy comment costs more than a missed nit.", - "2. DISCOVER — read the changed code and find what determinism cannot, each grounded in a line that exists:", + "2. DISCOVER — read the changed code and find what determinism cannot. Don't guess — INTERROGATE THE ENGINE with `inspect` (kind: definition | callers | blast_radius | complexity | exports | imports); when a hunk smells off, open its callers and confirm the blast radius before you move on. Each finding must be grounded in a line that exists:", " • correctness — off-by-one, missing await / unhandled error path, wrong condition, null/undefined deref, resource leak, broken invariant", " • security — injection / path-traversal / SSRF reachable from input, authz/IDOR gaps, unsafe deserialization, secret & credential handling, command or eval", " • intent — does the code do what its name, comments, and the change claim? dead, contradictory, or silently fail-open logic", " • failure & concurrency — races, partial writes, retries, fail-open where it must fail-closed", - " • blast radius — for any changed export with callers (see findings above), open each call site and confirm it still holds", + " • blast radius — for any changed export, `inspect callers` / `inspect blast_radius` and open each call site to confirm it still holds", "3. VERIFY — before posting anything, re-read each candidate's cited line and try to REFUTE it. Drop any you can't defend (already handled nearby, speculative, the line doesn't actually demonstrate it). A wrong comment costs more than a missed nit.", "4. REPORT — the survivors as must-fix / concern / nit with file:line and a concrete fix. Never invent a finding; every claim cites a real line.", "5. TEACH — `dismiss ` when the user agrees something is noise, `accept ` when they act on a real one — Splus learns this repo both ways.", @@ -334,6 +386,13 @@ server.registerTool( ); } + // The repo contract (`splus.md`): inject its prose, enforce its binding + // `mute:`/`skip:` rules. This runs BEFORE learned suppression so a stated + // preference always wins over the engine's defaults. + const cfg = loadSplusConfig(repo); + const policy = applyPolicy(report.findings, cfg); + report = withFindings(report, policy.kept, report.summary.suppressed); + // Learned suppression (on by default): drop findings already dismissed on // this repo (exact, rule-mute, or semantically similar). Best-effort. let suppressed: SuppressedFinding[] = []; @@ -357,10 +416,121 @@ server.registerTool( const payload = toAgentReport(report, suppressed); const body = `${summaryLine(report, suppressed.length)}\n\n${JSON.stringify(payload, null, 2)}${reinforcedNote}${preciseNote}`; - // The handoff, and the whole product: ground the agent with the deterministic - // floor, then drive it through the protocol. The directive is ALWAYS appended - // — there is no other path, no key, no headless mode to choose between. - return ok(`${body}\n\n${discoveryDirective(listChangedFiles(repo, dmode))}`); + // The handoff, and the whole product: read the repo contract first, ground the + // agent with the deterministic floor, then drive it through the protocol. The + // directive is ALWAYS appended — there is no other path, no key, no headless + // mode to choose between. + return ok( + `${prefsBlock(cfg, policy.dropped)}\n\n${body}\n\n${discoveryDirective(listChangedFiles(repo, dmode))}`, + ); + }, +); + +// --- inspect (the engine on tap) ------------------------------------------- + +server.registerTool( + "inspect", + { + title: "Inspect code intelligence on demand", + description: + "The engine ON TAP — ask ONE deterministic question instead of triaging a list. Use it to " + + "investigate while you review: when a changed export looks risky, pull its callers and blast " + + "radius and open the call sites; recurse when a hunk smells off. Local, instant, grounded. " + + "Kinds: 'definition' (where a symbol is defined), 'callers' (files importing it), " + + "'blast_radius' (full cross-file impact, SCIP-precise when an index exists else the name " + + "heuristic), 'complexity' (cognitive complexity per function in a file), 'exports' / 'imports' " + + "(a file's surface). `target` is a SYMBOL for definition/callers/blast_radius and a FILE PATH " + + "for complexity/exports/imports. Resolution is JS/TS-aware; non-JS/TS symbols return an honest " + + "empty answer.", + inputSchema: { + kind: z + .enum(["definition", "callers", "blast_radius", "complexity", "exports", "imports"]) + .describe("Which question to ask."), + target: z + .string() + .describe("Symbol name (definition/callers/blast_radius) or file path (complexity/exports/imports)."), + file: z + .string() + .optional() + .describe("Pin the defining file for a symbol query (disambiguates same-named symbols)."), + root: z.string().optional().describe("Repo root (default: server CWD)."), + }, + annotations: { readOnlyHint: true, openWorldHint: false }, + }, + async ({ kind, target, file, root }) => { + try { + const value = await engineInspect({ root: rootOf(root), kind: kind as InspectKind, target, file }); + return ok(JSON.stringify(value, null, 2)); + } catch (e) { + return fail(`inspect failed: ${e instanceof Error ? e.message : String(e)}`); + } + }, +); + +// --- floor (re-ground on the deterministic findings) ----------------------- + +server.registerTool( + "floor", + { + title: "Re-ground on the deterministic finding floor", + description: + "Return the engine's deterministic finding FLOOR for a scope as JSON — the same grounded set " + + "`review` starts from, but without the directive. Use it to re-check a file subset or a " + + "different scope mid-investigation. The repo's `splus.md` binding rules are applied; learned " + + "suppression is not (this is the raw floor).", + inputSchema: { + root: z.string().optional().describe("Repo root (default: server CWD)."), + mode: z + .enum(["working", "staged", "base", "all"]) + .optional() + .describe("Scope (default 'working')."), + base: z.string().optional().describe("Base git ref — used when mode='base'."), + }, + annotations: { readOnlyHint: true, openWorldHint: false }, + }, + async ({ root, mode, base }) => { + const repo = rootOf(root); + const m = (mode ?? "working") as ReviewMode; + if (m === "base" && !base) return fail("mode='base' requires a `base` ref."); + try { + const report = await runEngine({ root: repo, mode: toMode(m, base ?? null) }); + const cfg = loadSplusConfig(repo); + const policy = applyPolicy(report.findings, cfg); + const filtered = withFindings(report, policy.kept, report.summary.suppressed); + return ok(JSON.stringify(toAgentReport(filtered, []), null, 2)); + } catch (e) { + return fail(`Could not run the Splus engine: ${e instanceof Error ? e.message : String(e)}.`); + } + }, +); + +// --- preferences (the splus.md contract) ----------------------------------- + +server.registerTool( + "preferences", + { + title: "Show the active splus.md contract", + description: + "Return the merged `splus.md` review contract for this repo (repo `./splus.md` layered over " + + "`~/.splus/splus.md`), including its binding `mute:`/`skip:` rules. This is the repo's standing " + + "preferences + nits — `review` already injects it, but call this to read it directly.", + inputSchema: { + root: z.string().optional().describe("Repo root (default: server CWD)."), + }, + annotations: { readOnlyHint: true, openWorldHint: false }, + }, + async ({ root }) => { + const cfg = loadSplusConfig(rootOf(root)); + if (cfg.source === "none") { + return ok( + "No splus.md found (repo root or ~/.splus/splus.md). Reviewing with engine defaults — the " + + "`prefs` skill scaffolds one so the next review serves the repo's taste.", + ); + } + return ok( + `splus.md (${cfg.source})\nmuted rules: ${cfg.mutedRules.join(", ") || "—"}\n` + + `skip paths: ${cfg.skipPaths.join(", ") || "—"}\n\n${cfg.raw.trim()}`, + ); }, ); @@ -488,16 +658,21 @@ server.registerTool( } } + const memText = text ?? (found ? candidateText(found) : ""); await store.record({ fingerprint: id, rule_id: found?.rule_id ?? ruleId ?? "unknown", - text: text ?? (found ? candidateText(found) : ""), + text: memText, scope: "fingerprint", signal: "accepted", }); + // Also store it as compounding memory so future reviews can `recall` it. + if (memText) { + await new FileMemoryStore(memoryPath(repo)).remember({ kind: "accepted", text: memText }); + } return ok( - `Accepted ${id}. Splus will reinforce findings like this one on this repo going forward.`, + `Accepted ${id}. Splus will reinforce findings like this one — and \`recall\` it on future reviews.`, ); }, ); @@ -551,6 +726,59 @@ server.registerTool( }, ); +// --- recall (compounding memory) ------------------------------------------- + +server.registerTool( + "recall", + { + title: "Recall what was learned here before", + description: + "Surface past confirmed-real findings (`accept`) and discovered conventions (`note`) most " + + "relevant to a hunk, symbol, or question — so a reviewer's diligence compounds across sessions " + + "instead of starting cold. Semantic (embedding) match over .splus-cache/memory.json. Call it " + + "while investigating a risky area: 'have we been burned here before?'", + inputSchema: { + root: z.string().optional().describe("Repo root (default: server CWD)."), + query: z.string().describe("A hunk, symbol, error, or question to recall memories for."), + limit: z.number().optional().describe("Max memories to return (default 5)."), + }, + annotations: { readOnlyHint: true, openWorldHint: false }, + }, + async ({ root, query, limit }) => { + const store = new FileMemoryStore(memoryPath(rootOf(root))); + const hits = await store.recall(query, { limit }); + if (!hits.length) { + return ok("No relevant memories yet. Use `accept` on real findings and `note` for conventions to build them."); + } + return ok(JSON.stringify(hits, null, 2)); + }, +); + +// --- note (teach the repo a convention) ------------------------------------ + +server.registerTool( + "note", + { + title: "Record a convention the review discovered", + description: + "Remember a repo convention or context you discovered while reviewing (e.g. 'this module uses " + + "Result, never throws' or 'auth/ requires every handler to call requireSession first'), so " + + "future reviews `recall` it. Complements `accept` (which remembers confirmed findings). Written " + + "to .splus-cache/memory.json; promotable into splus.md for a binding rule.", + inputSchema: { + root: z.string().optional().describe("Repo root (default: server CWD)."), + text: z.string().describe("The convention/context to remember, in one sentence."), + file: z.string().optional().describe("The file/area this convention applies to, if specific."), + }, + annotations: { readOnlyHint: false, idempotentHint: true }, + }, + async ({ root, text, file }) => { + const store = new FileMemoryStore(memoryPath(rootOf(root))); + await store.remember({ kind: "note", text, file }); + return ok(`Noted. Splus will recall this on future reviews of this repo. (Promote it into splus.md to make it a binding rule.)`); + }, +); + // --- index (precise blast-radius tier) ------------------------------------- interface IndexResult { @@ -654,7 +882,7 @@ async function main(): Promise { await server.connect(transport); process.stderr.write( "splus-mcp ready (stdio) — local engine, no network · you are the reviewer; " + - "tools: review, report, dismiss, accept, mute, learnings, index\n", + "tools: review, inspect, floor, preferences, report, dismiss, accept, note, recall, mute, learnings, index\n", ); } diff --git a/packages/shared/src/index.ts b/packages/shared/src/index.ts index a452fae..cb72def 100644 --- a/packages/shared/src/index.ts +++ b/packages/shared/src/index.ts @@ -181,6 +181,52 @@ export async function runEngine(opts: RunOptions): Promise { return Report.parse(parsed); } +/** The on-demand questions the engine answers via `inspect`. */ +export type InspectKind = + | "definition" + | "callers" + | "blast_radius" + | "complexity" + | "exports" + | "imports"; + +export interface InspectOptions { + root: string; + kind: InspectKind; + /** Symbol name (definition/callers/blast_radius) or file path (complexity/exports/imports). */ + target: string; + /** Pin the defining file for a symbol query (disambiguates same-named symbols). */ + file?: string; + enginePath?: string; +} + +/** + * The engine "on tap": answer one code-intelligence question as parsed JSON. + * Mirrors `runEngine`'s honest failure handling (surface engine stderr, never a + * cryptic JSON-parse error). + */ +export async function inspect(opts: InspectOptions): Promise { + const bin = resolveEngine(opts.enginePath); + const args = [ + "inspect", + "--root", + resolve(opts.root), + "--kind", + opts.kind, + "--target", + opts.target, + ...(opts.file ? ["--file", opts.file] : []), + ]; + const { stdout, stderr, code } = await exec(bin, args); + try { + return JSON.parse(stdout); + } catch { + throw new Error( + `splus-engine inspect failed (exit ${code}): ${stderr.trim() || stdout.slice(0, 200) || "no output"}`, + ); + } +} + /** Stream the engine's pretty output straight to this process (CLI default). */ export function runEnginePretty(opts: RunOptions & { failOn?: string; noColor?: boolean }): Promise { const bin = resolveEngine(opts.enginePath); @@ -254,3 +300,6 @@ export function exceedsThreshold(report: Report, failOn: Severity): boolean { const t = severityRank(failOn); return report.findings.some((f) => severityRank(f.severity) >= t); } + +// The per-repo review contract (`splus.md`): loader + binding policy. +export * from "./splusMd.js"; diff --git a/packages/shared/src/splusMd.test.ts b/packages/shared/src/splusMd.test.ts new file mode 100644 index 0000000..fe19ab6 --- /dev/null +++ b/packages/shared/src/splusMd.test.ts @@ -0,0 +1,84 @@ +import { test } from "node:test"; +import assert from "node:assert/strict"; +import { mkdtempSync, writeFileSync, mkdirSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { loadSplusConfig, applyPolicy, matchGlob, type SplusConfig } from "./splusMd.js"; +import type { Finding } from "./index.js"; + +function tmpRepo(splusMd?: string): string { + const dir = mkdtempSync(join(tmpdir(), "splus-md-")); + if (splusMd !== undefined) writeFileSync(join(dir, "splus.md"), splusMd); + return dir; +} + +function finding(over: Partial): Finding { + return { + id: "f1", + rule_id: "hygiene.console-log", + category: "hygiene", + severity: "low", + tier: "nit", + confidence: 0.8, + file: "src/api/users.ts", + region: { start_line: 1, start_col: 0, end_line: 1, end_col: 0 }, + title: "t", + message: "m", + anchor: { kind: "heuristic", detail: "d" }, + introduced: true, + source: "heuristics", + ...over, + }; +} + +test("loads repo contract and parses mute/skip directives", () => { + const repo = tmpRepo( + [ + "# splus.md", + "## nits", + "- console.log is fine in scripts.", + "- mute: hygiene.console-log", + "skip: generated/**", + ].join("\n"), + ); + // Isolate from any real ~/.splus by pointing home at the temp dir. + const cfg = loadSplusConfig(repo, repo); + assert.equal(cfg.source, "repo"); + assert.ok(cfg.raw.includes("console.log is fine")); + assert.deepEqual(cfg.mutedRules, ["hygiene.console-log"]); + assert.deepEqual(cfg.skipPaths, ["generated/**"]); +}); + +test("missing file yields an empty, non-throwing contract", () => { + const repo = tmpRepo(); + const cfg = loadSplusConfig(repo, repo); + assert.equal(cfg.source, "none"); + assert.equal(cfg.raw, ""); + assert.deepEqual(cfg.mutedRules, []); +}); + +test("applyPolicy drops muted rules and skip paths, keeps the rest", () => { + const cfg: SplusConfig = { + raw: "", + mutedRules: ["hygiene.console-log"], + skipPaths: ["generated/**"], + source: "repo", + }; + const findings = [ + finding({ id: "a", rule_id: "hygiene.console-log" }), // muted + finding({ id: "b", rule_id: "sec.sqli", file: "generated/db.ts" }), // skipped path + finding({ id: "c", rule_id: "sec.sqli", file: "src/api/users.ts" }), // kept + ]; + const { kept, dropped } = applyPolicy(findings, cfg); + assert.deepEqual(kept.map((f) => f.id), ["c"]); + assert.equal(dropped.length, 2); + assert.ok(dropped.every((d) => d.reason)); +}); + +test("matchGlob handles ** across separators and * within a segment", () => { + assert.ok(matchGlob("generated/**", "generated/a/b.ts")); + assert.ok(matchGlob("**/*.pb.go", "proto/gen/user.pb.go")); + assert.ok(matchGlob("examples/keys.sample.env", "examples/keys.sample.env")); + assert.ok(!matchGlob("src/*.ts", "src/a/b.ts")); + assert.ok(matchGlob("src/*.ts", "src/a.ts")); +}); diff --git a/packages/shared/src/splusMd.ts b/packages/shared/src/splusMd.ts new file mode 100644 index 0000000..170fcc2 Binary files /dev/null and b/packages/shared/src/splusMd.ts differ diff --git a/packages/suppression/src/index.ts b/packages/suppression/src/index.ts index 32b0d33..1820969 100644 --- a/packages/suppression/src/index.ts +++ b/packages/suppression/src/index.ts @@ -273,3 +273,6 @@ export async function applySuppression( } return { kept, suppressed, reinforced }; } + +// Compounding review memory (accept/note → recall). +export * from "./memory.js"; diff --git a/packages/suppression/src/memory.test.ts b/packages/suppression/src/memory.test.ts new file mode 100644 index 0000000..825ab53 --- /dev/null +++ b/packages/suppression/src/memory.test.ts @@ -0,0 +1,33 @@ +import { test } from "node:test"; +import assert from "node:assert/strict"; +import { mkdtempSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { FileMemoryStore } from "./memory.js"; + +function storePath(): string { + return join(mkdtempSync(join(tmpdir(), "splus-mem-")), "memory.json"); +} + +test("remembers and recalls the most relevant memory first", async () => { + const store = new FileMemoryStore(storePath()); + await store.remember({ kind: "accepted", text: "double-charge on retry of settleInvoice without idempotency key" }); + await store.remember({ kind: "note", text: "we use Result, never throw in the billing module" }); + + const hits = await store.recall("retry settleInvoice idempotency double charge"); + assert.ok(hits.length >= 1); + assert.match(hits[0]!.text, /double-charge/); + assert.ok(hits[0]!.score > hits[hits.length - 1]!.score || hits.length === 1); +}); + +test("dedupes near-identical memories of the same kind", async () => { + const store = new FileMemoryStore(storePath()); + const a = await store.remember({ kind: "note", text: "tests may use any; fixtures are not reviewed" }); + const b = await store.remember({ kind: "note", text: "tests may use any; fixtures are not reviewed" }); + assert.equal(a.id, b.id); +}); + +test("recall on an empty store returns nothing, never throws", async () => { + const store = new FileMemoryStore(storePath()); + assert.deepEqual(await store.recall("anything"), []); +}); diff --git a/packages/suppression/src/memory.ts b/packages/suppression/src/memory.ts new file mode 100644 index 0000000..07c4440 --- /dev/null +++ b/packages/suppression/src/memory.ts @@ -0,0 +1,115 @@ +/** + * Compounding review memory — the other half of learning. + * + * Where the suppression store teaches Splus what to STOP flagging, the memory + * store remembers what mattered: confirmed-real findings (`accept`) and + * conventions the reviewer discovered (`note`). `recall` then surfaces them for a + * new hunk, so a reviewer's diligence persists across sessions instead of + * evaporating. Embedding-based via the shared `Embedder` seam (the deterministic, + * offline `hashEmbedder` by default; a transformer model drops in unchanged). + */ + +import { mkdir, readFile, writeFile } from "node:fs/promises"; +import { dirname } from "node:path"; +import { cosine, hashEmbedder, type Embedder } from "./embed.js"; + +export type MemoryKind = "note" | "accepted"; + +export interface Memory { + id: string; + kind: MemoryKind; + text: string; + scope: "repo" | "user"; + file?: string; + ts: string; + embedding: number[]; +} + +export interface RecallHit { + id: string; + kind: MemoryKind; + text: string; + file?: string; + /** Cosine similarity to the query (0..1). */ + score: number; +} + +export interface RememberInput { + kind: MemoryKind; + text: string; + scope?: "repo" | "user"; + file?: string; +} + +/** Two memories of the same kind closer than this are treated as duplicates. */ +const DUP_THRESHOLD = 0.97; + +/** A repo-local memory store (one JSON file per repo). */ +export class FileMemoryStore { + constructor( + private readonly path: string, + private readonly embedder: Embedder = hashEmbedder(), + ) {} + + private async load(): Promise { + try { + const parsed = JSON.parse(await readFile(this.path, "utf8")); + return Array.isArray(parsed) ? (parsed as Memory[]) : []; + } catch { + return []; + } + } + + private async save(memories: Memory[]): Promise { + await mkdir(dirname(this.path), { recursive: true }); + await writeFile(this.path, JSON.stringify(memories, null, 2)); + } + + /** Persist a memory; returns an existing near-duplicate instead of duplicating. */ + async remember(input: RememberInput): Promise { + const memories = await this.load(); + const embedding = this.embedder.embed(input.text); + const dup = memories.find( + (m) => m.kind === input.kind && cosine(m.embedding, embedding) >= DUP_THRESHOLD, + ); + if (dup) return dup; + + const memory: Memory = { + id: "m_" + hash(input.kind + "\0" + input.text), + kind: input.kind, + text: input.text, + scope: input.scope ?? "repo", + file: input.file, + ts: new Date().toISOString(), + embedding, + }; + memories.push(memory); + await this.save(memories); + return memory; + } + + /** The memories most semantically similar to `query`, best first. */ + async recall(query: string, opts: { limit?: number; minScore?: number } = {}): Promise { + const memories = await this.load(); + const q = this.embedder.embed(query); + return memories + .map((m) => ({ m, score: cosine(m.embedding, q) })) + .filter((x) => x.score >= (opts.minScore ?? 0.2)) + .sort((a, b) => b.score - a.score) + .slice(0, opts.limit ?? 5) + .map((x) => ({ id: x.m.id, kind: x.m.kind, text: x.m.text, file: x.m.file, score: round(x.score) })); + } +} + +function hash(s: string): string { + let h = 0x811c9dc5; + for (let i = 0; i < s.length; i++) { + h ^= s.charCodeAt(i); + h = Math.imul(h, 0x01000193); + } + return (h >>> 0).toString(16).padStart(8, "0"); +} + +function round(n: number): number { + return Math.round(n * 1000) / 1000; +} diff --git a/skills/prefs/SKILL.md b/skills/prefs/SKILL.md new file mode 100644 index 0000000..68e6303 --- /dev/null +++ b/skills/prefs/SKILL.md @@ -0,0 +1,64 @@ +--- +name: Splus Preferences +description: This skill should be used when the user wants to set up or change how Splus reviews this repo — "add a splus nit", "splus should never flag X", "set up splus.md", "tell splus to skip generated files", "splus review preferences". +user-invocable: true +allowed-tools: + - mcp__splus__preferences + - mcp__splus__learnings + - Read + - Write + - Edit + - Glob + - Grep +--- + +# Splus Preferences — author `splus.md` + +`splus.md` is the repo's review contract. Splus reads it **first** on every +review: the prose guides the reviewer; a small structured subset is **binding**. +Repo `./splus.md` layers over the user's `~/.splus/splus.md`. + +## What goes in it +- **Prose** (injected into the reviewer, soft-binding): policy, conventions, nits, + hotspots, voice. e.g. "we use `Result`, never throw", "tests may use `any`", + "lead with the worst thing, no praise padding". +- **Binding directives** (enforced — matching findings are dropped at review time, + and reported as dropped, never silently): + - `mute: ` — drop every finding from that rule (e.g. `mute: hygiene.console-log`). + - `skip: ` — drop findings under a path (`**` spans dirs, `*` within a + segment), e.g. `skip: generated/**`, `skip: examples/keys.sample.env`. + +## To create one +1. Check for an existing `./splus.md` (Read it) and run `preferences` to see what's + active. Also glance at `learnings` — patterns already dismissed/muted are good + candidates to promote into a durable `mute:`. +2. Infer the repo's conventions from its code (test layout, error style, generated + dirs) and from what the user tells you. +3. Write `./splus.md` using the template below. Confirm the binding rules with the + user before adding them — they silently drop findings. + +## Template +```markdown +# splus.md — how this repo wants to be reviewed + +## policy +- signal budget: keep it tight; lead with the worst thing. +- scrutiny: billing/** and auth/** get max scrutiny. + +## nits & conventions +- We use Result, not exceptions — don't flag missing try/catch. +- Tests may use `any`; fixtures are not reviewed. +- mute: hygiene.console-log + +## skip +- skip: generated/** +- skip: **/*.pb.go +- skip: examples/keys.sample.env + +## voice +terse. no praise padding. +``` + +When you add a `mute:`/`skip:`, tell the user exactly what will stop being flagged. +To make a one-off correction instead, prefer `dismiss`/`mute` from the review flow; +use `splus.md` for the standing contract. diff --git a/skills/review/SKILL.md b/skills/review/SKILL.md new file mode 100644 index 0000000..8a7e1ee --- /dev/null +++ b/skills/review/SKILL.md @@ -0,0 +1,92 @@ +--- +name: Splus Review +description: This skill should be used when the user asks to "review this code", "review my changes", "review the PR", "review staged", "code review", or "blast radius / impact of this change". Drives an agent-led, precision-first review on top of the local Splus engine — no API key. +user-invocable: true +allowed-tools: + - mcp__splus__review + - mcp__splus__inspect + - mcp__splus__floor + - mcp__splus__preferences + - mcp__splus__recall + - mcp__splus__report + - mcp__splus__dismiss + - mcp__splus__accept + - mcp__splus__note + - mcp__splus__mute + - Task + - Read + - Grep + - Glob + - Bash +--- + +# Splus Review + +You are the senior reviewer in the chair. The Splus engine gives you a grounded, +deterministic floor and a toolbelt you can interrogate on demand — but **the +review is what you find, not the list you're handed.** There is no API key and no +clock. Curiosity and verification are the job; a wrong comment costs more than a +slow review. + +## The flow + +### 0. Read the contract — FIRST, always +Before anything else, read `splus.md` (the repo's review contract). `review` +injects it and `preferences` returns it, but treat it as step zero: it encodes +the repo's standing preferences, nits, and binding `mute:`/`skip:` rules. **It +overrides the engine's defaults and your own taste.** If there is none, you may +offer to scaffold one (see the `prefs` skill) — don't block on it. + +### 1. Ground — get the floor +Call `review` (mode: `working` | `staged` | `base` | `all`; `precise:true` for +compiler-grade blast radius). You get: the contract, the deterministic finding +floor, and a directive. The floor is where you *start*, not where you stop. + +### 2. Investigate — fan out fresh, unbiased reviewers +**This is the heart, and bias is the enemy.** The context that wrote the code is +the worst judge of it. So review in **fresh sub-agents** that see the diff cold: + +- Partition the changed files into a few coherent **units** (by directory / + subsystem). For a small diff, one unit is fine. +- For each unit, spawn a **fresh `Task` sub-agent** with the `references/investigate.md` + protocol. Hand it only: the unit's files, the `splus.md` contract, the floor for + those files (`floor`), and the stated intent (PR title / commit message) — **not** + this session's implementation narrative. +- Each sub-agent INVESTIGATES with the toolbelt instead of guessing: `inspect` + (`callers` / `blast_radius` / `definition` / `complexity` / `exports` / `imports`), + `recall` ("have we been burned here before?"), `floor` to re-ground. It recurses + when a hunk smells off, and returns *candidate* findings — each grounded in a real + line, with its investigation trail. +- See `references/dispatch.md` for how to fan out (and how to degrade to a single + sequential pass on hosts without sub-agents). + +### 3. Verify — a different agent tries to refute +Never let the finder grade its own homework. For the surviving candidates, run an +**independent** verification pass (a separate fresh `Task`, or at minimum a +distinct refutation pass) following `references/verify.md`: re-read each cited +line and try to REFUTE the claim. Drop anything that can't be defended. + +### 4. Report — the survivors +Synthesize the verified findings into must-fix / concern / nit, each with +`file:line` and a concrete fix. Honor the contract's voice. Then call `report` +and fill the returned HTML template — the offline artifact the dev keeps next to +the diff. + +### 5. Teach — make diligence compound +- `dismiss ` when the user agrees something is noise (generalizes). +- `accept ` when they act on a real one (reinforces + becomes recallable). +- `note ""` for anything you learned about the repo (→ future `recall`). +- `mute ` when a whole class is unwanted here. + +## Lenses +Within a unit, a thorough reviewer applies every lens — correctness, security, +intent, failure/concurrency, blast-radius. See `references/lenses.md`. For a large +unit, fan out one sub-agent per lens (each blind to the others) so different +failure modes get found instead of one pass. + +## The standard you're held to +Coverage, not speed. For every changed export, you should have `inspect`ed its +blast radius and opened the call sites that matter. Every posted finding cites a +real line and survived a refutation. Every floor finding was explicitly kept or +suppressed — never silently dropped. That is what a great review looks like; take +the time it takes. diff --git a/skills/review/references/dispatch.md b/skills/review/references/dispatch.md new file mode 100644 index 0000000..2cfe0f1 --- /dev/null +++ b/skills/review/references/dispatch.md @@ -0,0 +1,44 @@ +# Dispatch — fan out fresh reviewers (and degrade gracefully) + +Fresh context per reviewer is the point: no author bias, no session clutter, small +windows. The orchestrator (you) holds only `splus.md` + the unit list + each +sub-agent's compact result — never every file at once. This is what lets a huge +diff get a careful review. + +## On Claude Code (sub-agents available) +1. **Partition** the changed files into a few coherent units (by directory / + subsystem). Keep a unit small enough to review deeply. +2. For each unit, spawn a fresh `Task` with the `investigate.md` protocol. Give it + ONLY: + - the unit's file list, + - the `splus.md` contract, + - the floor for those files (call `floor` with the scope, or pass the relevant + slice of the `review` floor), + - the stated intent (PR title / commit message). + Do **not** pass this session's implementation narrative — that reintroduces the + bias you forked to escape. +3. Collect each unit's candidate findings (compact structured results, not file + dumps). +4. Spawn an **independent** verifier `Task` (the `verify.md` refuter) over the + candidates. The verifier must be a different agent than any finder. +5. You synthesize: dedup, signal-budget against `splus.md`, assign tiers, then + `report` + teach. + +For a large unit, fan out again **by lens** (one sub-agent per lens in +`lenses.md`), each blind to the others. + +## On hosts without sub-agents (Codex, OpenCode, …) +The same protocol runs **sequentially** in one context: +- Review units one at a time; summarize and clear between units so the window + stays small. +- Run the verifier as a **distinct pass** after discovery — re-read each candidate + cold and try to refute it. The finder≠verifier discipline becomes finder-pass ≠ + verifier-pass; still drop anything you can't defend. + +The tools are identical on every host — only the dispatch primitive changes. Never +hard-depend on forking; degrade, don't fail. + +## Don't over-fan +One unit + one verifier is the right shape for a small diff. Fan out for breadth +(big diffs) and depth (risky units), not for its own sake. Each spawn costs +context; spend it where the risk is. diff --git a/skills/review/references/investigate.md b/skills/review/references/investigate.md new file mode 100644 index 0000000..58e1f8e --- /dev/null +++ b/skills/review/references/investigate.md @@ -0,0 +1,47 @@ +# Investigate — the per-unit reviewer protocol + +You are a **fresh reviewer** seeing this code for the first time. You have no +attachment to it and no memory of why it was written — that is your advantage. +Judge the code against its *stated contract* (its names, comments, the change's +intent, and `splus.md`), not against any author's rationalization. + +There is no clock. Your reputation is precision. Curiosity is the job: when a hunk +smells off, open every call site before you move on. + +## Your inputs +- The unit's changed files (read them). +- The `splus.md` contract — binding preferences and nits. They come first. +- The deterministic floor for these files (from `floor`). +- The stated intent (PR title / commit message). + +## What to do +1. **Read the diff** for this unit end to end. Form a hypothesis of what it's + trying to do. +2. **Recall** — `recall("")`. Has this repo been burned + here before? Is there a convention you must respect? +3. **Triage the floor** — for each engine finding, keep or suppress. Optimize for + signal: suppress fixtures, role-idiomatic patterns, pure style; keep what a + senior reviewer genuinely wants fixed before merge. +4. **Investigate — don't guess.** Use the engine on tap: + - `inspect callers ` / `inspect blast_radius ` — for every + changed export, find who depends on it and **open those call sites** to + confirm the change still holds for them. + - `inspect definition ` — when you need to see what something actually is. + - `inspect complexity ` — where the risk concentrates. + - `inspect exports|imports ` — the surface and dependencies of a file. + - Recurse: if a call site looks wrong, inspect *its* callers. Follow the smell. +5. **Apply every lens** (see `lenses.md`): correctness, security, intent, + failure/concurrency, blast-radius. + +## What to return +A list of **candidate** findings. Each one MUST have: +- `file:line` pointing at a real line in the diff. +- a one-line claim (what's wrong and why it matters). +- the tier you'd assign (must-fix / concern / nit). +- a concrete fix. +- the trail: what you inspected to reach it (so the verifier can check your work). + +Do **not** post anything yet — candidates go to an independent verifier. Return the +candidates plus a short coverage note: which changed exports you inspected and +which call sites you opened. If you found nothing, say so plainly — a clean unit, +honestly covered, is a real result. diff --git a/skills/review/references/lenses.md b/skills/review/references/lenses.md new file mode 100644 index 0000000..222977c --- /dev/null +++ b/skills/review/references/lenses.md @@ -0,0 +1,39 @@ +# Lenses — the failure modes a thorough reviewer checks + +Apply every lens to a unit. For a large unit, fan out one fresh sub-agent per +lens (each blind to the others) so redundancy doesn't crowd out coverage — a +reviewer looking only for races finds races a generalist skims past. + +Every finding, in every lens, must cite a real changed line and survive a +refutation. Ground it with `inspect` rather than asserting it. + +## Correctness +Off-by-one and boundary errors; missing `await` / unhandled rejection; a swallowed +or mis-handled error path; wrong condition or inverted boolean; null/undefined +deref; resource leak (unclosed handle, unbounded growth); a broken invariant the +surrounding code relies on; an early return that skips required cleanup. + +## Security +Injection (SQL / command / template) reachable from input; path traversal; SSRF; +auth/authz gaps and IDOR; unsafe deserialization; secret or credential handling; +`eval`/dynamic require of attacker-influenced data; missing output encoding. Trace +from the input to the sink — is the path actually reachable? + +## Intent +Does the code do what its name, comments, and the change's stated purpose claim? +Look for: dead or unreachable branches; logic that contradicts a comment; +fail-OPEN where it should fail-CLOSED; a "fix" that doesn't address the described +problem; a renamed thing whose behavior silently changed. + +## Failure & concurrency +Races and check-then-act gaps; partial writes / non-atomic updates; retries that +aren't idempotent (double-charge, double-send); shared mutable state across async; +timeouts and cancellation; what happens when the dependency is down — does it +degrade safely? + +## Blast radius +For every changed export or signature, `inspect callers` / `inspect blast_radius` +and **open each call site**: does the change still hold for it? Did a default +change, a field disappear, a thrown error newly escape, an argument order shift? +Cross-API-boundary changes (routes/handlers/public API) get the most scrutiny. +This is the lens determinism grounds best — use the engine, then verify by reading. diff --git a/skills/review/references/verify.md b/skills/review/references/verify.md new file mode 100644 index 0000000..646f2ea --- /dev/null +++ b/skills/review/references/verify.md @@ -0,0 +1,31 @@ +# Verify — the adversarial refuter (finder ≠ verifier) + +The reviewer who found a candidate *wants* to have found a bug — that's motivated +reasoning. So verification is done by a **different, fresh agent** whose only job +is to **refute**. A finding is true only if it survives someone trying to kill it. + +## Your stance +Assume each candidate is WRONG until the cited line proves otherwise. You are not +looking for reasons it might be right; you are looking for the reason it's noise. + +## For each candidate +1. Read the cited `file:line` and enough surrounding code to judge it in context. +2. Try every refutation: + - **Already handled** — is the case guarded nearby (a check above, a wrapper, a + type that makes it impossible)? + - **Not reachable** — for a security/▸failure claim, is the dangerous path + actually reachable from real input? + - **Line doesn't show it** — does the cited line actually demonstrate the + claim, or is it speculation about code that isn't there? + - **Role-appropriate** — is this idiomatic and fine for this file's role (test, + fixture, script, generated)? + - **Contract says so** — does `splus.md` explicitly accept this? + - Use `inspect` to check: e.g. `inspect callers` to see whether a "breaking" + change actually has any callers; `inspect definition` to see what a symbol + truly is. +3. Verdict: **survives** (real, keep — confirm tier) or **dropped** (with the + one-line reason it failed). + +## Output +For each candidate: `survives` or `dropped: `. When in doubt, **drop it** — +a missed nit is cheaper than a wrong comment. Only survivors go in the report. diff --git a/splus.md b/splus.md new file mode 100644 index 0000000..5e8b6f2 --- /dev/null +++ b/splus.md @@ -0,0 +1,22 @@ +# splus.md — how the Splus repo wants to be reviewed + +Splus reviews itself. This contract is read first on every review. + +## policy +- Precision-first: lead with the worst thing; a wrong comment costs more than a missed nit. +- Max scrutiny on the engine's analysis tier (`crates/splus-engine/src/analysis/**`, + `collectors/**`) and the MCP tool surface (`packages/mcp/src/**`) — these are the moat. + +## nits & conventions +- The engine is zero-inference and deterministic. Flag anything that adds nondeterminism + to a collector or analysis pass. +- Honest confidence is mandatory: a blast radius or finding must never be presented as + more certain than its resolution method warrants. +- Tests build fixtures in-memory or under a tempdir; they are not production paths. + +## skip +- skip: dist-release/** +- skip: **/*.test.ts + +## voice +terse, technical, no praise padding.