From e9749428b6376ce2a6ab9eeb01c88371165d1deb Mon Sep 17 00:00:00 2001
From: ojowwalker77
Date: Fri, 5 Jun 2026 17:20:07 -0300
Subject: [PATCH] docs: AGENTS.md + CLAUDE.md, greenrun gate in SPLUS.md, skills-first docs refresh
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- AGENTS.md: canonical agent instructions (layout, build/test, the
  non-negotiable greenrun verification gate, lockstep conventions,
  bench-only triage rule, mcp directive ↔ skills mirroring, release flow);
  CLAUDE.md imports it
- SPLUS.md: every change set must pass greenrun before it ships; reviews
  flag work that skipped it (also fix stale lowercase heading)
- ARCHITECTURE.md: drop the stale mcp → triage edge (bench-only since
  0.9.2), add skills/ to the diagram and pieces table
- README: skills section reflects per-agent installation, repo layout
  gains skills/ + AGENTS.md, docs list links AGENTS.md
- CONTRIBUTING: bench-only triage description, skills/ in layout,
  pnpm -r test, greenrun in the develop loop, release notes mention
  skills packaging + lockstep version bumps
- TOOLS.md: document the CHANGED SYMBOLS block + contract-trace stage
---
 AGENTS.md            | 65 ++++++++++++++++++++++++++++++++++++++++++++
 CHANGELOG.md         | 13 +++++++++
 CLAUDE.md            |  1 +
 CONTRIBUTING.md      | 16 +++++++----
 README.md            | 11 ++++++--
 SPLUS.md             |  4 ++-
 docs/ARCHITECTURE.md | 10 +++++--
 docs/TOOLS.md        | 11 ++++++--
 8 files changed, 116 insertions(+), 15 deletions(-)
 create mode 100644 AGENTS.md
 create mode 100644 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..80f6013
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,65 @@
+# Splus — agent instructions
+
+Splus makes the coding agent in your editor a disciplined, precision-first code
+reviewer: a deterministic Rust engine (the grounding) + a thin TS layer (MCP
+server, memory) + the review protocol shipped as skills. 100% local.
+
+## Layout
+
+```
+crates/splus-engine/   # deterministic engine (Rust) — the source of truth, zero inference
+packages/
+  shared/              # canonical Finding model (TS ↔ Rust serde lockstep) + runEngine/inspect
+  suppression/         # per-repo memory — suppress (dismiss) + reinforce (accept)
+  triage/              # headless review pipeline — BENCH-ONLY, the MCP path never calls it
+  mcp/                 # the local stdio MCP server — the one and only usage path
+skills/                # the review protocol (review, prefs) — installed per agent by install.sh
+bench/                 # run.mjs (regression gate) + martian/ (competitive benchmark)
+install.sh             # curl|sh installer: binaries → ~/.splus, wires MCP + skills into agents
+```
+
+## Build & test
+
+```sh
+cargo build --release   # engine → target/release/splus-engine
+cargo test --locked     # engine tests
+pnpm install
+pnpm -r build && pnpm -r typecheck
+pnpm -r test            # unit tests (shared + suppression + triage)
+pnpm build:release      # bundle MCP server → dist-release/mcp.cjs
+node bench/run.mjs      # engine regression gate — MUST stay green
+```
+
+## Verification — non-negotiable
+
+**Always run `greenrun` after code changes** — it executes the repo's GitHub
+Actions locally and must PASS before work is called done or pushed:
+
+```sh
+greenrun --plain   # ~/.greenrun/bin/greenrun if not on PATH
+```
+
+Exit 0 = passed; treat anything else as a failure to fix, and never describe a
+partial run as green.
+
+## Conventions
+
+- The Rust model (`crates/splus-engine/src/model.rs`) and the TS model
+  (`packages/shared/src/index.ts`) stay in lockstep — change them together.
+- Versions bump in lockstep across ALL `package.json` files + `Cargo.toml`
+  (then `cargo build` to refresh `Cargo.lock`), with a `CHANGELOG.md` entry.
+- The MCP server is agent-led by design: it grounds and directs, the session
+  agent reasons. Never wire `@splus/triage` into it — that pipeline exists only
+  so the benchmark can measure the protocol headlessly.
+- The protocol lives in `skills/` — a directive change in
+  `packages/mcp/src/index.ts` (`discoveryDirective`) must be mirrored in
+  `skills/review/` and vice versa.
+- `SPLUS.md` (repo root) is the review contract; Splus reviews itself with it.
+- The engine is zero-inference and deterministic; anything nondeterministic in a
+  collector or analysis pass is a bug.
+
+## Release
+
+Tag `vX.Y.Z` and push — `.github/workflows/release.yml` cross-compiles the
+engine (macOS/Linux, arm64/x64), bundles `mcp.cjs`, packages `skills/`, and
+publishes tarballs + `SHA256SUMS` that `install.sh` consumes.
diff --git a/CHANGELOG.md b/CHANGELOG.md
index f3e264b..8286eb0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,19 @@ this project uses [semantic versioning](https://semver.org) (pre-1.0: minor vers
 
 ## [Unreleased]
 
+### Added
+- `AGENTS.md` (canonical instructions for coding agents working on this repo —
+  layout, build/test, the greenrun verification gate, conventions, release) and
+  `CLAUDE.md` (imports it).
+
+### Changed
+- `SPLUS.md` contract: every change set must pass `greenrun` (full CI, locally)
+  before it ships — reviews flag work that skipped it.
+- Docs refreshed for skills-first delivery: ARCHITECTURE diagram drops the stale
+  `mcp → triage` edge (bench-only) and adds `skills/`; README/CONTRIBUTING cover
+  skill installation, `pnpm -r test`, and the greenrun gate; TOOLS.md documents
+  the CHANGED SYMBOLS block and the contract-trace stage.
+
 ## [0.9.2] — 2026-06-05
 
 ### Changed
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..43c994c
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+@AGENTS.md
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 128e950..6d4ef05 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -9,8 +9,9 @@ stand up. Start with **[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)** for the fu
 - `crates/splus-engine/` — the deterministic engine (Rust). The source of truth for findings.
 - `packages/shared/` — the canonical `Finding`/`Report` model (TS, mirrors the Rust serde model) + `runEngine`, which shells out to the engine binary.
 - `packages/suppression/` — per-repo memory: suppress (`dismiss`) + reinforce (`accept`).
-- `packages/triage/` — the optional LLM layer: the multi-pass review (detect→…→verify), downstream of the engine.
+- `packages/triage/` — the headless review pipeline, **bench-only**: it exists so the Martian benchmark can measure the protocol without a human agent. The MCP path never calls it.
 - `packages/mcp/` — the local stdio MCP server your agent connects to. Tools: see [docs/TOOLS.md](docs/TOOLS.md).
+- `skills/` — the review protocol as agent skills (`review`, `prefs`); `install.sh` installs them into Claude Code / Codex / OpenCode. A directive change in `packages/mcp` must be mirrored here, and vice versa.
 - `bench/` — `run.mjs` (the regression gate) + `martian/` (the competitive benchmark).
 
 The splus.sh marketing site lives in its own repo, [kiwi-init/splus-lp](https://github.com/kiwi-init/splus-lp).
@@ -23,8 +24,9 @@ cargo test # engine tests
 pnpm install
 pnpm -r build                    # build the TS packages
 pnpm -r typecheck
-node --test packages/suppression/dist/*.test.js packages/triage/dist/*.test.js   # unit tests
+pnpm -r test                     # unit tests (shared + suppression + triage)
 node bench/run.mjs               # the regression gate — MUST stay green
+greenrun --plain                 # the full CI, locally — must PASS before pushing
 ```
 
 Run the freshly built engine directly:
@@ -57,13 +59,15 @@ big orchestration changes so quality is measured, not asserted.
 Tag a version and push:
 
 ```sh
-git tag v0.4.1 && git push --tags
+git tag v0.9.2 && git push --tags
 ```
 
 `.github/workflows/release.yml` cross-compiles the engine for macOS/Linux (arm64 + x64),
-bundles the MCP server into a single `.cjs` file (`scripts/build-release.mjs`), and
-publishes a GitHub Release with per-platform tarballs + `SHA256SUMS`. `install.sh` pulls these
-from the stable `releases/latest/download/splus--.tar.gz` URL.
+bundles the MCP server into a single `.cjs` file (`scripts/build-release.mjs`), packages
+`skills/`, and publishes a GitHub Release with per-platform tarballs + `SHA256SUMS`.
+`install.sh` pulls these from the stable `releases/latest/download/splus--.tar.gz`
+URL. Versions bump in lockstep across all `package.json` files + `Cargo.toml`, with a
+`CHANGELOG.md` entry.
 
 ## Principles
 
diff --git a/README.md b/README.md
index 269ef73..9b55564 100644
--- a/README.md
+++ b/README.md
@@ -110,9 +110,13 @@ Drop a `SPLUS.md` at the repo root (layered over your personal `~/.splus/SPLUS.m
 
 ### Skills
 
-The `skills/` bundle drives the agent-led flow: `review` (fans out **fresh, unbiased sub-agents** per
+The `skills/` bundle IS the review protocol: `review` (fans out **fresh, unbiased sub-agents** per
 unit — finder ≠ verifier — and degrades to a sequential pass where sub-agents aren't available) and
-`prefs` (author `SPLUS.md`).
+`prefs` (author `SPLUS.md`). The installer puts them directly into every agent it finds — Claude Code
+(`~/.claude/skills/splus-review`, `splus-prefs`), Codex (`/splus-review`, `/splus-prefs` prompts),
+OpenCode (`/splus-review`, `/splus-prefs` commands) — with the canonical copy at `~/.splus/skills`,
+refreshed on every `splus update`. The protocol is loaded explicitly, never inferred from tool
+descriptions.
 
 **Full reference: [`docs/TOOLS.md`](docs/TOOLS.md)** — every tool, parameter, and return shape.
 
@@ -190,6 +194,7 @@ pulls from. See [`CONTRIBUTING.md`](CONTRIBUTING.md).
 - **[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)** — how the engine + review protocol work (with diagrams).
 - **[docs/TOOLS.md](docs/TOOLS.md)** — the MCP tools your agent calls (every param + return).
 - **[CONTRIBUTING.md](CONTRIBUTING.md)** — build, test, and the release process.
+- **[AGENTS.md](AGENTS.md)** — working on this repo with a coding agent (build, verify, conventions).
 - **[bench/martian/](bench/martian/)** — score Splus on the independent Martian Code Review Bench.
 
 ## Repo layout
@@ -201,8 +206,10 @@ packages/
   suppression/   # per-repo memory — suppress (dismiss) + reinforce (accept)
   triage/        # benchmark harness — runs the protocol headlessly to measure it (not a usage path)
   mcp/           # the local MCP server your agent talks to — the one and only way to use Splus
+skills/          # the review protocol as skills — installed into your agents by install.sh
 bench/           # regression gate (run.mjs) + the Martian benchmark adapter (martian/)
 docs/            # ARCHITECTURE.md · TOOLS.md
+AGENTS.md        # instructions for coding agents working ON this repo (CLAUDE.md imports it)
 install.sh       # the one-line installer
 ```
 
diff --git a/SPLUS.md b/SPLUS.md
index 5e8b6f2..4ba4ceb 100644
--- a/SPLUS.md
+++ b/SPLUS.md
@@ -1,4 +1,4 @@
-# splus.md — how the Splus repo wants to be reviewed
+# SPLUS.md — how the Splus repo wants to be reviewed
 
 Splus reviews itself. This contract is read first on every review.
 
@@ -13,6 +13,8 @@ Splus reviews itself. This contract is read first on every review.
 - Honest confidence is mandatory: a blast radius or finding must never be presented as more
   certain than its resolution method warrants.
 - Tests build fixtures in-memory or under a tempdir; they are not production paths.
+- Every change set runs `greenrun` (the full CI, locally) before it ships. Flag any work
+  presented as done without a passing greenrun.
 
 ## skip
 - skip: dist-release/**
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 3329368..c47bed5 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -8,16 +8,19 @@ Nothing leaves your machine.
 ```mermaid
 flowchart TB
   agent["🤖 Your coding agent<br/>Claude Code · Codex · OpenCode"]
+  skills["skills/<br/>the review protocol (installed per agent)"]
   mcp["packages/mcp<br/>local MCP server (stdio)"]
   engine["crates/splus-engine<br/>deterministic floor (Rust)"]
-  triage["packages/triage<br/>multi-pass LLM review"]
+  bench["bench/martian<br/>benchmark"]
+  triage["packages/triage<br/>headless pipeline (bench-only)"]
   supp["packages/suppression<br/>per-repo memory"]
   shared["packages/shared<br/>Finding model + runEngine"]
 
+  skills -.->|drives| agent
   agent <-->|MCP tools| mcp
   mcp -->|grounds with| engine
-  mcp -->|optional · key or claude -p| triage
   mcp -->|dismiss / accept| supp
+  bench -->|measures| triage
   engine --> shared
   triage --> shared
   shared -.->|spawns the binary| engine
@@ -39,7 +42,8 @@ model than the one you already run — it makes that model disciplined.
 | `packages/shared` | TS | The canonical `Finding` / `Report` model (mirrors the Rust serde model) + `runEngine`, which shells out to the binary and validates its JSON. |
 | `packages/suppression` | TS | Per-repo learned memory: suppress what you `dismiss`, reinforce what you `accept`. The compounding moat. |
 | `packages/triage` | TS | The **benchmark harness** — runs the review protocol headlessly (key or `claude -p`) so the Martian bench can measure it without a human agent. Strictly downstream of the engine. **Not a usage path** — products only ship the MCP flow. |
-| `packages/mcp` | TS | The local stdio MCP server your agent connects to — **the one and only way to use Splus**. Wires the engine + suppression together and exposes the tools. |
+| `packages/mcp` | TS | The local stdio MCP server your agent connects to — **the one and only way to use Splus**. Wires the engine + suppression together and exposes the tools. It never calls `packages/triage`. |
+| `skills/` | md | The review protocol as first-class skills. `install.sh` installs them into every detected agent (Claude Code skills, Codex prompts, OpenCode commands; canonical copy at `~/.splus/skills`) so the protocol doesn't depend on MCP tool descriptions being read. |
 
 ## The deterministic floor (the engine)
 
diff --git a/docs/TOOLS.md b/docs/TOOLS.md
index df2fa99..952e72a 100644
--- a/docs/TOOLS.md
+++ b/docs/TOOLS.md
@@ -50,9 +50,14 @@ fix, and cross-file **blast radius**. Learned suppressions are applied first.
 There is **one flow** and you are the driver: the response begins with the repo's
 [`SPLUS.md`](#preferences) contract (preferences injected, binding `mute:`/`skip:`
 rules already enforced) and ends with a **discovery directive** that drives *you*
-(the agent) through the full protocol (triage → investigate → verify) over the
-changed files. No API key — Splus grounds you with precise anchors and a toolbelt
-(`inspect`, `floor`, `recall`); you do the reasoning. Run the protocol; don't relay.
+(the agent) through the full protocol (triage → trace contracts → investigate →
+verify) over the changed files. The directive includes a deterministic
+**CHANGED SYMBOLS** block — the exported symbols whose bodies the diff touches
+(engine tree-sitter exports ∩ diff hunks) — so the contract-trace stage starts
+aimed: enumerate each one's return/throw shape on every path, open every caller,
+report every assumption that no longer holds. No API key — Splus grounds you with
+precise anchors and a toolbelt (`inspect`, `floor`, `recall`); you do the
+reasoning. Run the protocol; don't relay.
 
 | Param | Type | Default | Description |
 |---|---|---|---|