Progressive Spec-Driven Development (SDD) toolkit for AI coding agents
Slash-command Skills · structured AI Knowledge · MCP server — for Claude Code, Copilot, Codex
This project is a fork of ci-yang/prospec
Prospec is a Skills-driven Spec-Driven Development (SDD) toolkit for AI coding agents. You drive day-to-day work through slash-command Skills inside your agent (Claude Code, Antigravity, Copilot, Codex); a thin CLI only bootstraps the project and regenerates Skills/Knowledge. The payoff: your agent follows a consistent story → plan → tasks → implement → review → verify → archive workflow, grounded in structured, version-controlled project knowledge.
Three pieces work together:
You ⇄ AI agent
│
├─ Skills .......... run the workflow: story → plan → tasks →
│ implement → review → verify → archive
│ ▲
│ │ read & grow
├─ AI Knowledge .... structured project memory (modules, specs, lessons)
│ ▲
│ │ generated / regenerated by
└─ CLI (prospec) ... bootstrap only: init, agent sync, knowledge init / re-scan structure
- Skills run the workflow inside your agent — the day-to-day surface.
- AI Knowledge is progressive project memory the Skills read and grow with each change.
- CLI is a one-time/occasional tool: it scaffolds the project and regenerates Skills + Knowledge — it is not in the runtime loop.
Who is it for? Developers using an AI coding agent who want repeatable, reviewable workflows on a new project (greenfield) or an existing codebase (brownfield).
| Challenge | How Prospec helps |
|---|---|
| AI doesn't know your codebase | prospec knowledge init + /prospec-knowledge-generate auto-scan and generate AI-readable docs |
| Context window limits | Progressive disclosure: load a summary first, details on-demand (70%+ token saving vs full-dump) |
| Inconsistent AI workflows | Structured Skills enforce story → plan → tasks → implement → review → verify → archive |
| Vendor lock-in | Works with 4+ AI CLIs; knowledge stored as universal Markdown |
| No design-to-code bridge | /prospec-design generates visual + interaction specs with MCP tool integration |
| Knowledge becomes stale | Archive's Entry Gate enforces a Knowledge Update every change |
| Verify passes but subtle bugs ship | /prospec-review — independent adversarial review between implement and verify |
| Lessons don't persist across sessions | /prospec-learn — recurring fixes promote (human-gated) into versioned team rules |
Each row maps to a Skill or command below — see AI Skills and CLI Commands.
From zero to your first AI-driven change in about five minutes.
- Node.js >= 22.13.0
- An AI CLI (one or more): Claude Code (recommended), Antigravity CLI, GitHub Copilot CLI, or Codex CLI
Prospec is a bootstrap/update CLI — once prospec quickstart has run (it chains init + agent sync), your agent works from the committed Skills and Knowledge (Markdown); the binary isn't needed again until you regenerate. So install it once, globally.
npm install -g github:benwu95/prospec # or: pnpm add -g github:benwu95/prospec
prospec --help # verify
Other install options (npx, or pin per-project)
Prospec is an unpublished fork — npm/pnpm clones the repo, installs dev deps, and builds it via the prepare script.
Run on demand with npx (clones + builds each time):
npx github:benwu95/prospec quickstart
Pin the version per-project so re-running agent sync regenerates identical Skills across contributors — and so downstream developers can run the deterministic prospec knowledge init --raw-scan-only (invoked by /prospec-knowledge-generate and /prospec-archive to keep raw-scan.md current) via pnpm exec / npx without a global install. For Node.js projects, install as a devDependency (other ecosystems: a global install is the path):
npm install -D github:benwu95/prospec # or: pnpm add -D github:benwu95/prospec
One command does the deterministic setup — it chains init + agent sync, skipping any step already done:
cd my-project # a new or existing project
prospec quickstart # → select AI assistants, choose doc language; creates .prospec.yaml + per-agent config + Skills
prospec quickstart runs agent sync, which writes Claude Code → CLAUDE.md + .claude/skills/; Antigravity / Codex / Copilot → AGENTS.md + .agents/skills/. Then finish onboarding inside your AI agent:
/prospec-quickstart # localize skill triggers, re-sync config, generate AI Knowledge
This one-time finisher is re-runnable and self-terminating; on an existing codebase it reads your modules into AI Knowledge so the agent understands them before your first change.
You don't have to remember the steps — describe the change in plain language and the agent drives the SDD loop, pausing only to ask you questions and to confirm each handoff:
You ▸ Use prospec to add a dark-mode toggle
The agent picks up the request and runs /prospec-ff:
• asks a few scoping / acceptance questions — you answer in plain language
• writes story → plan → tasks, then hands off at each stage:
"Run /prospec-implement now? (Y/n)" → Y
implement → "Run /prospec-review now? (Y/n)" → Y
review → "Run /prospec-verify now? (Y/n)" → Y
verify reaches grade A → prompts you to commit → Y
→ "Run /prospec-archive now? (Y/n)" → Y ✓ archived
Every stage ends by telling you what's next and waiting for your Y — answer n to stop and the suggestion stays, so you can resume later without tracking where you left off. /prospec-verify is the commit boundary: at grade S/A it prompts you to commit (it never commits for you), then offers to archive.
Prefer to drive each step yourself? Run them explicitly:
/prospec-explore # (optional) clarify the requirement first
/prospec-new-story add-my-feature # capture it as a structured story
/prospec-design # (optional) UI / interaction specs
/prospec-plan # design the implementation (a `quick`-scale change skips this)
/prospec-tasks # break the plan into an ordered task checklist
# ↑ collapse story → plan → tasks in one pass with: /prospec-ff add-my-feature
/prospec-implement # implement task-by-task (no commit yet)
/prospec-review # adversarial review → fix loop
/prospec-verify # validate; prompts you to commit at grade S/A
/prospec-archive # archive + sync specs & knowledge
/prospec-learn # (periodic) promote recurring lessons → team rules
That's the full SDD loop. Because /prospec-quickstart already seeded AI Knowledge, the agent starts from an understanding of your modules. The full greenfield & brownfield walkthroughs below break down every step prospec quickstart automates.
Greenfield vs. brownfield bootstrap — what the two commands expand to
Greenfield: prospec quickstart → /prospec-quickstart is the whole bootstrap:
mkdir my-project && cd my-project
prospec quickstart --name my-project # init + agent sync (interactive assistant + language selection)
# then, inside your AI agent:
/prospec-quickstart # localize triggers · re-sync · generate AI Knowledge
Those two commands expand to:
# `prospec quickstart` runs:
prospec init --name my-project # → select AI assistants (interactive checkbox)
# → choose the doc language (default: English, or
# --language "Traditional Chinese (Taiwan)"); a [MUST]
# Language Policy rule is seeded into CONSTITUTION.md —
# code and git commit messages stay in English
# → creates .prospec.yaml + directory structure
prospec agent sync # → per-agent config + Skills (Claude Code → CLAUDE.md +
# .claude/skills/; Antigravity / Codex / Copilot →
# AGENTS.md + .agents/skills/)
# `/prospec-quickstart` then, inside your AI agent:
# • non-English doc language? proposes native trigger words for `skill_triggers`
# in .prospec.yaml and re-runs agent sync once you confirm — skills then match
# requests phrased in your language
# • prospec knowledge init → /prospec-knowledge-generate (seeds AI Knowledge)
On a fresh repo, /prospec-knowledge-generate produces a minimal Knowledge base that fills in as you ship changes. Then run your first change exactly as in step 3 above.
Brownfield: the same two commands; /prospec-quickstart reads your existing code into AI Knowledge:
cd existing-project
prospec quickstart # auto-detects tech stack; runs init + agent sync
# then, inside your AI agent:
/prospec-quickstart # localize triggers · re-sync · knowledge init · /prospec-knowledge-generate
Those two commands expand to:
# `prospec quickstart` runs:
prospec init # → auto-detect tech stack; select AI assistants; choose doc
# language (default: English; --language to skip the prompt)
prospec agent sync # → per-agent config + Skills
# `/prospec-quickstart` then, inside your AI agent:
prospec knowledge init # → generates raw-scan.md + empty skeletons (_index.md, _conventions.md, module-map.yaml)
/prospec-knowledge-generate # → AI reads raw-scan.md, decides module partitioning,
# creates modules/*/README.md + fills _index.md
Here knowledge init reads your existing code, so /prospec-knowledge-generate produces a rich Knowledge base up front. Then run your first change exactly as in step 3 above — the develop loop is identical to greenfield.
knowledge init captures how your code is structured, but brownfield modules usually still lack a Feature Spec describing what they do. Closing that WHAT-layer gap is its own first-class flow — see Backfill: document existing code into the trust zone below. It is not part of bootstrap, so run it whenever you choose.
Directory layout after completing the Quickstart (prospec quickstart + /prospec-quickstart)
your-project/
├── .prospec.yaml # Prospec config
├── CLAUDE.md # Claude Code config (Layer 0, <100 lines)
├── AGENTS.md # Antigravity / Codex / Copilot config (agents.md standard)
├── {base_dir}/
│ ├── CONSTITUTION.md # Project rules (user-defined)
│ ├── specs/
│ │ ├── product.md # Product Spec (PRD entry point)
│ │ └── features/ # Living Feature Specs (accumulated)
│ └── ai-knowledge/
│ ├── _index.md # Module index (Markdown table)
│ ├── _conventions.md # Project conventions
│ ├── _playbook.md # Team lessons promoted by /prospec-learn (human-gated)
│ ├── _lessons-ledger.md # Accumulating lessons ledger, auto-fed at Archive (version-controlled)
│ ├── raw-scan.md # Auto-generated project scan data
│ ├── module-map.yaml # Module dependencies
│ ├── feature-map.yaml # Feature→module index (optional; bootstrapped at Archive)
│ └── modules/
│ └── {module}/
│ └── README.md # Module-specific docs
├── .prospec/ # Change management (not committed)
│ ├── changes/
│ │ └── {change-name}/
│ │ ├── proposal.md # User Story + acceptance criteria
│ │ ├── design-spec.md # Visual spec (optional, UI changes)
│ │ ├── interaction-spec.md # Interaction spec (optional)
│ │ ├── plan.md # Implementation plan
│ │ ├── tasks.md # Task breakdown (checkbox format)
│ │ ├── delta-spec.md # Patch Spec (ADDED/MODIFIED/REMOVED)
│ │ └── metadata.yaml # Change lifecycle metadata
│ └── archive/ # Archived completed changes
├── .claude/skills/ # Skills for Claude Code (one dir per skill)
│ ├── prospec-explore/
│ ├── prospec-new-story/
│ ├── prospec-design/
│ ├── prospec-plan/
│ ├── prospec-tasks/
│ ├── prospec-ff/
│ ├── prospec-implement/
│ ├── prospec-review/
│ ├── prospec-verify/
│ ├── prospec-archive/
│ ├── prospec-learn/
│ ├── prospec-knowledge-generate/
│ ├── prospec-knowledge-update/
│ ├── prospec-backfill-spec/
│ ├── prospec-promote-backfill/
│ ├── prospec-quickstart/ # one-time onboarding finisher (on disk, excluded from entry config)
│ └── prospec-upgrade/ # version-upgrade finisher (on disk, excluded from entry config)
└── .agents/skills/ # Same skills, agents.md format (Antigravity / Codex / Copilot)
└── prospec-*/
Prospec runs one linear flow, wrapped in two feedback loops that make it compound rather than merely repeat.
flowchart TD
E([Explore]) --> S([Story]) --> D(["Design (optional)"]) --> P([Plan]) --> T([Tasks]) --> I([Implement]) --> R([Review]) --> V([Verify]) --> KU([Knowledge Update]) -- Entry Gate --> A([Archive]) -- periodic --> L([Learn])
V -. quality_log .-> L
R -. findings .-> L
L -- human-approved --> RULES[("Constitution + _playbook<br/>team rules accumulate")]
KU --> AK[("AI Knowledge<br/>more complete every change")]
A -- Spec Sync --> FS[("Feature Specs<br/>graduate at archive")]
AK -.-> NEXT["next change starts from a<br/>richer, smarter baseline"]
FS -.-> NEXT
RULES -.-> NEXT
NEXT -. context .-> P
classDef asset fill:#eef7ff,stroke:#2b6cb0,stroke-width:2px;
classDef gain fill:#e9f9ee,stroke:#2f855a,stroke-width:2px;
class AK,FS,RULES asset;
class NEXT gain;
Every Archive enriches AI Knowledge (more complete with each change), and recurring lessons — review findings, the cross-stage quality_log, session corrections — promote, only with human approval, into an accumulating body of team rules (Constitution + _playbook). So the next change doesn't start from scratch; it starts from a richer, smarter baseline.
The flow is also scale-aware: a user-confirmed quick change skips the Plan stage entirely (story → tasks), with archive-time backstops — see Right-Sized Process.
Prospec enforces 6 principles over the assets it injects into your project — the generated Skills, configs, and directory structure:
- Progressive Disclosure First — never load all info at once; index → details
- Spec is Source of Truth — changes documented in specs before code
- Zero Startup Cost for Brownfield — no need to document the entire codebase upfront
- AI Agent Agnostic — works with any AI CLI via Markdown adapters
- User Controls the Rules — Constitution is user-defined, the tool enforces
- Language Policy — AI-generated docs in the language you choose at prospec init (default: English); code, technical terms, and git commit messages always in English
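To make the first principle concrete, index → details loading might look like the following Python sketch. The in-memory `KNOWLEDGE` mapping, the `load_context` helper, and the sample module names are hypothetical stand-ins, not Prospec's actual loader.

```python
# Hypothetical stand-in for files under ai-knowledge/ (not Prospec's real loader).
KNOWLEDGE = {
    "_index.md": "| Module | Summary |\n|---|---|\n| auth | login and sessions |\n| billing | invoices |",
    "modules/auth/README.md": "auth details ...",
    "modules/billing/README.md": "billing details ...",
}


def load_context(task: str) -> list[str]:
    """Always load the index; load a module README only when the task names that module."""
    loaded = ["_index.md"]
    for path in KNOWLEDGE:
        if path.startswith("modules/"):
            module = path.split("/")[1]
            if module in task.lower():
                loaded.append(path)
    return loaded


print(load_context("Fix the auth session timeout"))
```

Only the summary index and the one relevant README are loaded here; the billing README stays on disk until a task mentions it.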
Brownfield projects accumulate behavior that no Feature Spec describes. Backfill is a first-class, two-skill path that reverse-extracts that behavior from the code and graduates it into the spec trust zone (prospec/specs/features/) — and it never writes the trust zone by hand (archive stays the sole writer).
flowchart TD
CODE[("existing<br/>brownfield code")] --> BF([Backfill]) -- "draft + human review" --> PR([Promote]) -- "scale: backfill<br/>(no plan/tasks)" --> V([Verify]) -- "spec-fidelity → S/A" --> A([Archive])
A -- Spec Sync --> FS[("Feature Specs<br/>graduate into trust zone")]
classDef asset fill:#eef7ff,stroke:#2b6cb0,stroke-width:2px;
class CODE,FS asset;
- Extract — /prospec-backfill-spec reads the code (and tests, git history, docs) and stages a route-compatible backfill-draft.md; intent it cannot infer from code is marked [NEEDS CLARIFICATION], never fabricated.
- Review — resolve every [NEEDS CLARIFICATION] (the So that value, target role, ambiguous AC) and confirm the candidate feature slug. This is the human gate.
- Promote — /prospec-promote-backfill turns the reviewed draft into the change scaffold (proposal + delta-spec + metadata) marked scale: backfill, status: implemented. backfill is a light scale like quick — no hollow plan.md / tasks.md, because the code already exists.
- Verify — /prospec-verify grades spec-fidelity (each REQ's file:line must resolve), records pre-existing code-quality gaps (e.g. untested brownfield code) as informational tech debt, and only applies that relaxation when a backfill-draft.md proves provenance — so a faithful draft reaches S/A instead of being blocked by debt it merely documents, and the marker can't bypass quality gates for new code.
- Archive — /prospec-archive graduates the requirements into prospec/specs/features/{slug}.md. That is the only step that writes the trust zone.
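The spec-fidelity rule in the Verify step (each REQ's file:line must resolve) can be pictured with a small Python sketch. The `reference_resolves` helper and the `toggle.py` file are hypothetical; how /prospec-verify actually performs this check is not described here, so treat this as one plausible reading.

```python
import tempfile
from pathlib import Path


def reference_resolves(root: Path, ref: str) -> bool:
    """Return True when 'path:line' names an existing file with at least that many lines."""
    path_part, _, line_part = ref.rpartition(":")
    if not path_part or not line_part.isdigit():
        return False
    target = root / path_part
    if not target.is_file():
        return False
    line_count = len(target.read_text(encoding="utf-8").splitlines())
    return 1 <= int(line_part) <= line_count


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical two-line source file standing in for brownfield code.
    (root / "toggle.py").write_text("a = 1\nb = 2\n", encoding="utf-8")
    checks = [
        reference_resolves(root, "toggle.py:2"),   # line exists
        reference_resolves(root, "toggle.py:9"),   # past the end of the file
        reference_resolves(root, "missing.py:1"),  # file does not exist
    ]

print(checks)
```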
When a new prospec version ships, re-run the install to pull the latest (it's an unpublished GitHub fork, so this re-clones + rebuilds the current commit):
npm install -g github:benwu95/prospec # or: pnpm add -g github:benwu95/prospec
# pinned per-project devDependency: npm install -D github:benwu95/prospec
Then bring an existing project up to date in two steps — a deterministic CLI step, then a consent-gated AI step:
prospec upgrade # CLI (zero-LLM): record the new version + re-sync agents
/prospec-upgrade # in your AI agent: refresh init-doc formats + localize new-skill triggers (asks before each change)
- prospec upgrade (CLI) records the running prospec version in .prospec.yaml version (rewritten in canonical format), re-runs agent sync so your per-agent config and Skills match the new templates, and prints a migration report (version delta + any newly-added skills missing native-language triggers). It writes no docs — it never touches CONSTITUTION.md, _conventions.md, _index.md, the canonical convention docs, or any module README.
- /prospec-upgrade (Skill) finishes the judgment work the CLI can't do safely: it scans the files prospec init created, compares them to the latest templates, and offers to update any whose format has drifted — asking for your confirmation per file (it migrates format only, never your authored content). It then localizes triggers for any newly-added skills into your artifact_language (filling only the missing ones) and re-runs agent sync.
.prospec.yaml version is the prospec version the project last upgraded to (a legacy version: "1.0" is treated as stale and bumped on first prospec upgrade). Need to (re-)localize triggers after adding a skill? Just re-run prospec agent sync — it names any skill missing a skill_triggers entry, so you fill only the gap. You never need to delete .prospec.yaml.
Prospec generates 17 Skills — 15 guide AI through the full SDD lifecycle, plus two periodic finishers: /prospec-quickstart (onboarding) and /prospec-upgrade (version upgrade):
| Skill | Slash Command | Description |
|---|---|---|
| Explore | /prospec-explore | Think partner for requirement clarification |
| New Story | /prospec-new-story | Create structured change story |
| Design | /prospec-design | Generate visual + interaction specs (Generate/Extract modes) |
| Plan | /prospec-plan | Generate implementation plan + delta-spec |
| Tasks | /prospec-tasks | Break down into executable tasks |
| Fast-Forward | /prospec-ff | Generate story → plan → tasks in one go |
| Implement | /prospec-implement | Implement tasks one-by-one with MCP-first design reading |
| Review | /prospec-review | Adversarial review → fix loop; verifier-confirmed criticals auto-fixed, spec-aware lens |
| Verify | /prospec-verify | 5+1 dimension audit with quality grade (S/A/B/C/D); prompts commit at S/A |
| Archive | /prospec-archive | Archive changes + Spec Sync + Knowledge sync Entry Gate |
| Learn | /prospec-learn | Feedback promotion: recurring lessons → team _playbook / Constitution (auditable, human-gated) |
| Knowledge Generate | /prospec-knowledge-generate | AI-driven module analysis and knowledge creation |
| Knowledge Update | /prospec-knowledge-update | Incremental knowledge update from delta-spec |
| Backfill Spec | /prospec-backfill-spec | Reverse-extract a Feature Spec draft from existing brownfield code (stages a draft, never writes the trust zone) |
| Promote Backfill | /prospec-promote-backfill | Formalize a reviewed backfill draft into the backfill change scaffold (proposal + delta-spec + metadata, scale: backfill, status: implemented; a light scale — no plan/tasks); never writes the trust zone |
| Quickstart | /prospec-quickstart | After prospec quickstart runs init + agent sync, localize skill triggers into your artifact language, prepare the Knowledge scan, and chain into /prospec-knowledge-generate to seed AI Knowledge; never writes the trust zone |
| Upgrade | /prospec-upgrade | After prospec upgrade records the new version and re-runs agent sync, localize triggers for newly-added skills (fill-missing only) + migrate flagged curated-doc formats — each with confirmation + a diff preview; never auto-writes the trust zone |
Periodic finishers — /prospec-quickstart (run once after prospec quickstart) and /prospec-upgrade (run after prospec upgrade on a version bump) finish the judgment steps the CLI cannot do deterministically. Both are deployed as Skills on disk but kept out of the always-loaded entry config, so they add no recurring token cost.
Beyond the linear flow, every workflow Skill carries built-in quality machinery:
- Output Contract — each Skill self-reports Met N/M | Overall: PASS|WARN|FAIL against objective criteria, so you don't hand-check artifacts.
- Entry / Exit gates — a Skill checks preconditions before running (Entry) and Constitution compliance after (Exit); WARN/FAIL records persist to a cross-stage quality_log so an earlier stage's concern surfaces at the next.
- Skill instruction quality — per-phase gate checklists (finer-grained than the skill-level Entry/Exit gates); a status-aware next-step handoff at the end of each linear-flow Skill (plan→tasks→implement→review→verify→archive) (Run <next-step> now? (Y/n) — your Y is the trigger, never a silent auto-run); new-session detection of in-progress changes to resume; /prospec-implement re-anchors Progress X/Y | Goal | Next after each task; and /prospec-explore / /prospec-knowledge-generate warn when the Constitution is still substantively empty (its gates would otherwise be no-ops).
- Executable Constitution — rules carry RFC-2119 severity (MUST→FAIL / SHOULD→WARN / MAY→advisory); /prospec-verify grades against them.
- Deterministic drift gate — prospec check machine-verifies spec ↔ code ↔ knowledge referential integrity with zero tokens; /prospec-verify consumes its report at dev time and the scaffolded CI workflow enforces it on every PR. With an optional feature-map.yaml (feature→module index, bootstrapped at archive) it adds two governance checks: REQ-prefix legality (WARN) and the feature→module edge (FAIL).
- Adversarial review — /prospec-review sits between implement and verify: an independent fresh-context reviewer audits the whole change diff; only verifier-confirmed, drop-in criticals are auto-fixed, the rest escalate to you. The commit boundary is after verify reaches grade S/A, so implement + review + verify fixes land in one atomic commit (prospec prompts; it never auto-commits).
- Feedback promotion — every Archive auto-harvests a change's recurring lessons into a version-controlled ledger (_lessons-ledger.md); /prospec-learn then scores them with an explicit reproducible rule (frequency + impact modules) and — only with explicit human approval — promotes them into the team _playbook.md or the Constitution.
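The severity mapping in the Executable Constitution item (MUST→FAIL / SHOULD→WARN / MAY→advisory) can be sketched in Python as below. The `overall` helper, the `ADVISORY` label, and the "worst outcome wins" aggregation are assumptions for illustration; the source states only the per-severity mapping.

```python
# Mapping taken from the text; the aggregation rule below is an assumption.
SEVERITY_TO_OUTCOME = {"MUST": "FAIL", "SHOULD": "WARN", "MAY": "ADVISORY"}
RANK = {"PASS": 0, "ADVISORY": 1, "WARN": 2, "FAIL": 3}


def overall(violated_severities: list[str]) -> str:
    """Return the worst outcome among violated rules, or PASS when nothing is violated."""
    outcomes = [SEVERITY_TO_OUTCOME[s] for s in violated_severities]
    return max(outcomes, key=RANK.__getitem__, default="PASS")


print(overall(["MAY", "SHOULD"]))
```

Under this assumed rule, a single violated MUST rule is enough to turn the whole result into FAIL, while violated MAY rules never rise above advisory.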
Not every change deserves the full ceremony. At story time, /prospec-new-story (or /prospec-ff) assesses complexity against explicit criteria and proposes a scale — you confirm before it is written to metadata.yaml:
| Scale | What changes |
|---|---|
| quick | Slim proposal (single story, no FR/SC enumeration), plan phase skipped entirely (story → tasks), no module-README loading; review/verify report their delta-spec dimensions as not-applicable (never a fake PASS) |
| standard (default; absent on existing changes) | The current concise flow — plan ≤ 120 lines |
| full | Complete architecture analysis — expanded Technical Summary, per-entry-point Call Chains |
Two honest backstops keep quick from becoming a spec-drift hole: a change expected to touch spec-covered behavior is vetoed out of quick at assessment time, and the /prospec-archive Entry Gate re-checks the actual diff — spec impact blocks archiving until a minimal Spec Impact section is added, and the knowledge-sync gate derives affected modules from diff paths instead of the absent delta-spec. Engineering discipline is not scaled down: TDD, adversarial review, and Constitution audits run at every scale.
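As a rough picture of how scale changes the stage sequence, here is a Python sketch covering the three scales in the table. `stages_for` is a hypothetical helper, and treating a missing scale as standard follows the "absent on existing changes" note; the separate backfill scale is deliberately left out.

```python
from typing import Optional

FULL_FLOW = ["story", "plan", "tasks", "implement", "review", "verify", "archive"]


def stages_for(scale: Optional[str] = None) -> list[str]:
    """Return the stage sequence for a scale; a missing scale is treated as standard."""
    scale = scale or "standard"
    if scale == "quick":
        # quick skips the plan stage entirely (story -> tasks).
        return [stage for stage in FULL_FLOW if stage != "plan"]
    if scale in ("standard", "full"):
        # standard and full differ in plan depth, not in which stages run.
        return list(FULL_FLOW)
    raise ValueError(f"scale not covered by this sketch: {scale!r}")


print(stages_for("quick"))
```

Review, verify, and archive appear at every scale, which matches the statement that engineering discipline is not scaled down.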
Tasks also carry a kind marker ([M] manual, [V] verification, unmarked = code): completion rates count code tasks only, so an unchecked "run this command manually" reminder never blocks or distorts a gate.
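The counting rule can be illustrated with a short Python sketch. It assumes the kind marker sits directly after the checkbox (for example `- [ ] [M] ...`); that placement, the `code_completion` helper, and the sample tasks are assumptions, not the documented tasks.md format.

```python
import re

# Assumed line shape: "- [x] [M] description"; the kind marker is optional.
TASK = re.compile(r"^\s*- \[(?P<done>[ xX])\]\s*(?:\[(?P<kind>[MV])\])?")


def code_completion(tasks_md: str) -> tuple[int, int]:
    """Return (done, total) counting only unmarked (code) tasks."""
    done = total = 0
    for line in tasks_md.splitlines():
        match = TASK.match(line)
        if match is None or match.group("kind"):
            continue  # not a task line, or a manual / verification task
        total += 1
        done += match.group("done") != " "
    return done, total


SAMPLE = """\
- [x] add toggle component
- [x] persist preference
- [ ] [M] run the migration manually
- [x] [V] confirm contrast ratio
"""
print(code_completion(SAMPLE))
```

The unchecked `[M]` reminder is ignored, so both code tasks count as complete and the gate is not distorted.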
Cache-Stable Prefix Ordering (advanced internals)
Every skill's Startup Loading section is ordered static-first so provider prompt caches
(Anthropic explicit cache_control, OpenAI/Gemini automatic prefix caching) can reuse the
longest possible prefix across triggers. Each loading item carries one of two markers:
- [STABLE] — changes only on agent sync or governance edits: the skill's own references/ format specs, the Constitution, _conventions.md. These load first.
- [DYNAMIC] — changes per knowledge update, per change, or per trigger: _index.md (first after the cache boundary), module READMEs, _playbook.md, Feature/Product Specs, and .prospec/changes/ artifacts. These load last.
The classification criterion is cross-request prefix stability, not "is it generated":
the entry config's Available Skills list is per-project fixed (it changes only when the
skill set changes), so it is [STABLE]. Extension authors adding skills must follow the
same ordering — static loads before the boundary, dynamic after — or they break the cache
prefix for every trigger. What the harness measures is the prospec assembly pipeline
(its corpus assembles knowledge files, not the skill templates themselves) — see Token
Measurement below. The template-level reorder takes effect at the agent deployment layer,
outside the harness's observable scope (a deliberate exclusion): its benefit follows from
the providers' documented prefix-caching semantics, not from a direct before/after measurement.
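The effect of static-first ordering on a shared prefix can be shown with a small Python sketch. `LoadItem`, `cache_friendly_order`, `shared_prefix_len`, and the two change names are hypothetical; real provider caches work on tokens rather than file lists, so this only illustrates the ordering idea.

```python
from typing import NamedTuple


class LoadItem(NamedTuple):
    path: str
    marker: str  # "STABLE" or "DYNAMIC"


def cache_friendly_order(items: list[LoadItem]) -> list[LoadItem]:
    """Put STABLE items first; sorted() is stable, so order within each group is kept."""
    return sorted(items, key=lambda item: item.marker != "STABLE")


def shared_prefix_len(a: list[LoadItem], b: list[LoadItem]) -> int:
    """Count the leading items two load sequences have in common."""
    count = 0
    for left, right in zip(a, b):
        if left != right:
            break
        count += 1
    return count


stable = [LoadItem("CONSTITUTION.md", "STABLE"), LoadItem("_conventions.md", "STABLE")]
# Two triggers that differ only in their per-change artifact (hypothetical change names).
trigger_a = [LoadItem(".prospec/changes/a/plan.md", "DYNAMIC"), *stable]
trigger_b = [LoadItem(".prospec/changes/b/plan.md", "DYNAMIC"), *stable]

unordered = shared_prefix_len(trigger_a, trigger_b)
ordered = shared_prefix_len(cache_friendly_order(trigger_a), cache_friendly_order(trigger_b))
print(unordered, ordered)
```

With the dynamic artifact loaded first the two triggers share nothing; loading the stable items first lets both reuse the same two-item prefix.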
| Command | Description |
|---|---|
| prospec quickstart [options] | One-command onboarding: runs init + agent sync (skipping completed steps), then hands off to /prospec-quickstart in your AI agent for trigger localization + Knowledge generation. Same --name/--agents/--language options as init |
| prospec upgrade [--cwd <dir>] | After a prospec version bump: record the prospec version in .prospec.yaml (canonical format), re-run agent sync, and print a migration report, then hand off to /prospec-upgrade. Writes no docs — init-created / canonical-doc format updates are the consent-gated skill's job |
| prospec init [options] | Initialize Prospec project structure (--language sets the AI-generated document language; default English) |
| prospec knowledge init [--depth <n>] [--dry-run] [--raw-scan-only] | Scan project → generate raw-scan.md + curated skeletons (module-map.yaml / _index.md / _conventions.md, only if absent). --raw-scan-only regenerates only raw-scan.md (deterministic, no LLM), leaving curated files untouched — run after code changes or before /prospec-knowledge-generate to refresh the structure snapshot |
| prospec agent sync [--cli <name>] | Sync AI agent configs + generate Skills (reads skill_triggers from .prospec.yaml for native-language trigger words) |
Agent config layout — agent sync writes each detected agent's entry config + Skills:
- Claude Code → CLAUDE.md + .claude/skills/
- Antigravity / Codex / GitHub Copilot → AGENTS.md + .agents/skills/ (the shared agents.md open standard; written once even when several are enabled)
Your edits are safe: entry configs carry prospec:auto / prospec:user blocks. agent sync (and init for AGENTS.md) refresh only the auto block and preserve whatever you write in the user block; a pre-existing hand-written CLAUDE.md / AGENTS.md is migrated into the user block on first sync rather than clobbered.
Upgrading from an older Prospec? After re-syncing, remove the now-unused GEMINI.md, .gemini/skills/, .codex/skills/, .github/copilot-instructions.md, and .github/instructions/.
prospec knowledge init (incl. --raw-scan-only) detects the following into raw-scan.md. Detection is deterministic (no LLM, no network) and best-effort; coverage differs by section:
| Language | Tech Stack | Dependencies | Entry Points | Config Files |
|---|---|---|---|---|
| JavaScript / TypeScript | ✅ (+ framework) | ✅ package.json | ✅ | ✅ |
| Python | ✅ | ✅ pyproject.toml / requirements.txt | ✅ | ✅ |
| Go | ✅ | ✅ go.mod | ✅ | ✅ |
| Rust | ✅ | ✅ Cargo.toml | ✅ | ✅ |
| Java / Kotlin | ✅ Maven / Gradle | ✅ pom.xml ¹ | ✅ | ✅ |
| C# | ✅ | ✅ *.csproj | ✅ | ✅ |
| Ruby | ✅ | — ² | ✅ | ✅ |
| PHP | ✅ | ✅ composer.json | — | ✅ |
| C | ✅ ³ | ✅ vcpkg.json / conanfile.txt ⁴ | ✅ | ✅ |
| C++ | ✅ ³ | ✅ vcpkg.json / conanfile.txt ⁴ | ✅ | ✅ |
| Swift | ✅ Package.swift | — ⁵ | ✅ | ✅ |
¹ Java dependencies are read from Maven pom.xml only — the Gradle Groovy/Kotlin DSL is not statically parsed. ² Ruby dependencies are not parsed (Gemfile is a Ruby DSL). ³ C vs C++ is inferred from source-file extensions; set tech_stack in .prospec.yaml to override. ⁴ C/C++ dependencies are read from declarative manifests only — CMakeLists.txt and conanfile.py are imperative and not parsed. ⁵ Swift dependencies are not parsed (Package.swift is imperative Swift). Any unrecognized language still appears in the Directory Tree and File Stats sections.
A language outside this table? It still scans — the Directory Tree and File Stats sections are always populated, and /prospec-knowledge-generate reads the source directly. The Tech Stack line falls back to unknown; declare it authoritatively in .prospec.yaml tech_stack (free-form — it overrides auto-detection and is reported with Source: config):
tech_stack:
language: zig
package_manager: zig build
Entry Points, Dependencies, and Config Files have no per-language override — they stay empty for an unrecognized language until detection patterns are added (the scan never invents them).
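The precedence described above (a configured tech_stack overrides auto-detection and is reported with Source: config) can be sketched in Python. `resolve_tech_stack` and the `auto-detect` source label are hypothetical; only the `config` source value and the `unknown` fallback come from the text.

```python
from typing import Optional


def resolve_tech_stack(detected: Optional[dict], configured: Optional[dict]) -> dict:
    """Prefer the configured tech_stack; otherwise use detection; otherwise report unknown."""
    if configured:
        return {**configured, "source": "config"}
    if detected:
        return {**detected, "source": "auto-detect"}  # label is an assumption
    return {"language": "unknown", "source": "auto-detect"}


print(resolve_tech_stack(None, {"language": "zig", "package_manager": "zig build"}))
```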
| Command | Description |
|---|---|
| prospec change story <name> | Create change story (scaffold) |
| prospec change plan [--change <name>] [--force] | Generate implementation plan (scaffold); refuses to overwrite an existing plan/delta-spec unless --force |
| prospec change tasks [--change <name>] [--force] | Break down tasks (scaffold); refuses to overwrite an existing tasks.md unless --force |
Note: These commands scaffold empty change artifacts. The Skills (/prospec-new-story, /prospec-ff, …) now create .prospec/changes/<name>/ and its files directly, so the workflow doesn't call them — they remain available for manual or scripted scaffolding.
A read-only, stdio MCP server that exposes the project's truth — architecture, specs, dependency direction, promoted playbook, and knowledge freshness — to any MCP-capable agent, even one without Prospec Skills installed.
| Command | Description |
|---|---|
| prospec mcp serve [--cwd <path>] | Start a read-only MCP server on stdio — any MCP-capable agent (even one without Prospec Skills installed) can query the project's architecture truth, spec truth, dependency direction, promoted playbook, and knowledge freshness. --cwd pins the project root so one agent can run several project servers regardless of where it was launched |
Resources (re-read from disk on every request — clients always see current file state):
| URI | Content |
|---|---|
| knowledge://index | AI Knowledge module index (_index.md) |
| knowledge://module/{name} | One module's Recipe-First README |
| knowledge://module-map | Module boundaries + depends_on (module-map.yaml) |
| knowledge://feature-map | feature → module index + REQ prefixes (feature-map.yaml) |
| knowledge://playbook | Human-approved team lessons (_playbook.md) |
| knowledge://health | Per-module staleness + coverage — same pure function as prospec check |
| spec://product | Product spec — PRD entry point + feature map (product.md) |
| spec://feature/{name} | Feature specs (REQ source of truth); archived specs are excluded by the same rule prospec check uses |
Tools: search_modules (which module owns a concept — normalized term-OR match over the curated
index columns, so drift checker finds drift-checker) and get_dependency_direction (may from
import to? — answered from module-map depends_on, or the Constitution chain when no map exists;
the answer states which source it used).
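One way to read "normalized term-OR match" is sketched below in Python. The tokenization (lowercase, split on anything that is not a letter or digit), the `search_modules` signature, and the sample index rows are assumptions for illustration; the server's actual normalization may differ.

```python
import re


def normalize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_modules(query: str, index: dict[str, str]) -> list[str]:
    """Return modules whose name or summary shares at least one term with the query."""
    terms = normalize(query)
    return [name for name, summary in index.items()
            if terms & normalize(f"{name} {summary}")]


# Hypothetical index rows: module name -> summary column.
INDEX = {
    "drift-checker": "spec, code and knowledge referential integrity",
    "agent-sync": "writes entry configs and Skills",
}
print(search_modules("drift checker", INDEX))
```

Because the hyphen is treated as a separator, the query `drift checker` and the module name `drift-checker` reduce to the same two terms.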
Registering — point your agent's MCP config at prospec mcp serve --cwd <project-root>. --cwd
pins the project so the server resolves its .prospec.yaml no matter where the agent was launched —
which also lets one agent register several projects at once. Assumes the recommended global install
(prospec on PATH).
Claude Code:
claude mcp add project-name -- prospec mcp serve --cwd /path/to/project
Other agents — the same command in the agent's JSON MCP config:
{
"mcpServers": {
"project-name": {
"command": "prospec",
"args": ["mcp", "serve", "--cwd", "/path/to/project"]
}
}
}
To serve several projects from any directory, register one entry per project — each with a unique
name and its own --cwd (Claude Code: add -s user so it's available everywhere):
claude mcp add -s user prospec-a -- prospec mcp serve --cwd /path/to/A
claude mcp add -s user prospec-b -- prospec mcp serve --cwd /path/to/B
Pinned prospec as a devDependency rather than installed globally? Route through npx: in the Claude Code command, put npx before prospec (… -- npx prospec mcp serve --cwd /path/to/project), or in JSON set "command": "npx" with "prospec" as the first arg (["prospec", "mcp", "serve", "--cwd", "/path/to/project"]).
Honest boundaries: the server is read-only (no tool or resource can modify files), serves one project
per process (the root given by --cwd), and is a pure add-on — no Skill or CLI command depends on it,
so everything works unchanged when it is not running. Transport is stdio only; HTTP/SSE is
deliberately not included in this version.
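Because each server process serves exactly one project, a multi-project setup is just one config entry per project. As a minimal sketch, the JSON registration shown earlier could be generated like this; buildMcpEntry and buildMcpConfig are hypothetical helper names, not part of prospec, and the viaNpx flag models the devDependency case.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

// Build one "mcpServers" entry for a project root.
// viaNpx = true routes through npx for a devDependency install.
function buildMcpEntry(projectRoot: string, viaNpx = false): McpServerEntry {
  const serveArgs = ["mcp", "serve", "--cwd", projectRoot];
  return viaNpx
    ? { command: "npx", args: ["prospec", ...serveArgs] }
    : { command: "prospec", args: serveArgs };
}

// One entry per project, each with a unique name and its own --cwd.
function buildMcpConfig(
  projects: Record<string, string>,
  viaNpx = false,
): { mcpServers: Record<string, McpServerEntry> } {
  const mcpServers: Record<string, McpServerEntry> = {};
  for (const [name, root] of Object.entries(projects)) {
    mcpServers[name] = buildMcpEntry(root, viaNpx);
  }
  return { mcpServers };
}

const config = buildMcpConfig({
  "prospec-a": "/path/to/A",
  "prospec-b": "/path/to/B",
});
console.log(JSON.stringify(config, null, 2));
```

Where the resulting JSON is stored depends on the agent, so writing it to a file is left out here.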
Token Measurement — make the token-efficiency claim verifiable
| Command | Description |
|---|---|
| pnpm measure:tokens [-- --provider <p>] [-- --budget <usd>] | Run the offline benchmark: assemble full-dump / naive-rag / prospec contexts from the live repo and record real provider API usage (requires an API key; default budget US$10 per provider) |
| prospec measure [--report <path>] | Display the measurement report (read-only: never calls an API, never burns tokens) |
The harness makes the token-efficiency claim verifiable instead of asserted: for each corpus task
(tests/fixtures/token-corpus/, version-controlled task descriptions only — contexts are assembled
at run time) it sends each assembled context twice (cold + warm) and reads the provider's real usage.
Agent → measured provider (copilot/codex have no public benchmark API; they are measured via their model provider, not the agent harness itself):
| Agent | Provider API | Default model |
|---|---|---|
| claude | Anthropic | claude-haiku-4-5 |
| codex, copilot | OpenAI | gpt-4.1-mini |
| antigravity | Google Gemini | gemini-2.5-flash |
How to read the numbers (honest boundaries):
- The efficiency claim is input-token cost vs the full-dump baseline; the naive-rag baseline is always shown alongside, where the margin is smaller. Output tokens are unaffected and listed honestly.
- warm* numbers are synthetic cache hits (two back-to-back calls); production hit rates depend on whether triggers land within the provider's cache TTL. Providers also enforce a minimum cacheable prefix (e.g. 4,096 tokens on claude-haiku-4-5), so a small prospec assembly below that floor honestly records a 0% hit rate even though the mechanism works at production context sizes.
- Cache discount structures differ per provider (Anthropic explicit cache_control, OpenAI/Gemini automatic prefix caching), so numbers are comparable only within the same provider, never across providers or repo snapshots (the report records the git commit it measured).
- No thresholds, no CI gating: the report informs humans; it does not pass or fail anything.
- Any "token saving" figure quoted in this project must come from this harness — estimates are not data.
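The two quantities discussed above, input-token saving against a baseline and the warm-call cache hit rate, reduce to simple ratios. The sketch below assumes a hypothetical normalized Usage record; real provider responses name and split these fields differently, and the sample numbers are placeholders chosen for easy arithmetic, not measurements.

```typescript
// Hypothetical normalized usage record (not any provider's real schema).
interface Usage {
  inputTokens: number;       // total input tokens for the call, cached or not
  cachedInputTokens: number; // portion of the input served from the provider cache
}

// Input-token saving of a candidate context relative to a baseline, as a fraction.
function inputSaving(baseline: Usage, candidate: Usage): number {
  if (baseline.inputTokens <= 0) {
    throw new Error("baseline must have input tokens");
  }
  return 1 - candidate.inputTokens / baseline.inputTokens;
}

// Cache hit rate of a warm call: the cached share of its input tokens.
function cacheHitRate(warm: Usage): number {
  return warm.inputTokens === 0 ? 0 : warm.cachedInputTokens / warm.inputTokens;
}

// Placeholder values for illustration only; they are NOT harness output.
const fullDump: Usage = { inputTokens: 40000, cachedInputTokens: 0 };
const prospecCold: Usage = { inputTokens: 10000, cachedInputTokens: 0 };
const prospecWarm: Usage = { inputTokens: 10000, cachedInputTokens: 8000 };

const saving = inputSaving(fullDump, prospecCold);
const hitRate = cacheHitRate(prospecWarm);
```

Both ratios are only meaningful within one provider and one repo snapshot, which is why the report records the git commit it measured.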
Drift Check (CI gate) — deterministic spec ↔ code ↔ knowledge integrity
| Command | Description |
|---|---|
| prospec check [--json] [--strict] | Deterministic, zero-LLM drift check across spec ↔ code ↔ knowledge: dangling REQ references, broken markdown links, module-map-driven import direction, knowledge freshness (git commit timestamps, WARN-only), kind-aware task completion, README declared-count veracity (e.g. "registers N resources" vs the code it names, WARN-only), and, when feature-map.yaml is present, REQ-prefix legality (WARN) and the feature→module edge (FAIL). --json writes machine-readable prospec-report.json; --strict exits 1 on any FAIL (warn/skipped never affect the exit code) |
| prospec check --init-ci | Scaffold a supply-chain-hardened GitHub Actions gate (.github/workflows/prospec-check.yml): SHA-pinned actions, least-privilege permissions, report artifact upload, and a sticky PR comment posted from a job that never checks out source |
Honesty rules: an unavailable source degrades the check to skipped with an explicit reason —
never a fake PASS — and semantic spec↔code consistency stays with /prospec-review (the report
permanently marks it not-checked). /prospec-verify consumes the same report at dev time, so
the developer and the CI gate always see the same facts, token-free.
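The exit-code and honesty rules can be summarized in a few lines. This is a sketch under assumptions, not the real implementation: the CheckResult shape, the check names, and the helper names are hypothetical, and it assumes a run without --strict always exits 0.

```typescript
type Status = "pass" | "warn" | "fail" | "skipped";

// Hypothetical result record; the real prospec-report.json schema may differ.
interface CheckResult {
  check: string;
  status: Status;
  reason?: string; // explicit reason, expected when status is "skipped"
}

// An unavailable source degrades to "skipped" with a reason, never a fake pass.
function skipped(check: string, reason: string): CheckResult {
  return { check, status: "skipped", reason };
}

// Only a FAIL under --strict changes the exit code; warn and skipped never do.
// Assumption: without --strict the exit code stays 0.
function exitCode(results: CheckResult[], strict: boolean): number {
  const anyFail = results.some((r) => r.status === "fail");
  return strict && anyFail ? 1 : 0;
}

const clean: CheckResult[] = [
  { check: "dangling-req", status: "pass" },
  { check: "knowledge-freshness", status: "warn" },
  skipped("import-direction", "module-map.yaml not found"),
];
const broken: CheckResult[] = [
  ...clean,
  { check: "feature-module-edge", status: "fail" },
];
```

Keeping skipped as its own status, rather than folding it into pass, is what lets a reader of the report see what was not checked.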
Prospec follows a Pragmatic Layered Architecture, in line with CLI development best practices:
src/
├── cli/ — Commander.js commands + formatters
├── services/ — Business logic (14 services)
├── lib/ — Pure utility functions (config, fs, logger, etc.)
├── types/ — Zod schemas + TypeScript types
└── templates/ — Handlebars templates (56 .hbs files)
└── skills/ — 17 Skill templates + 19 reference templates
- CLI Framework: Commander.js 14 + @inquirer/prompts 8
- Validation: Zod 4
- Templating: Handlebars 4.7
- File Scanning: fast-glob 3.3
- YAML: eemeli/yaml 2.x (preserves comments)
- Testing: Vitest 4.0 + memfs
- TypeScript: 5.9
# Run all tests (1748 tests)
pnpm test
# Watch mode
pnpm run test:watch
# Type check
pnpm run typecheck
# Lint
pnpm run lint
# End-to-end check: build, run `init` + `agent sync` in a throwaway
# project, and assert the generated Skills / system md are well-formed
pnpm run verify:skills
Test Coverage: 1748 tests across 4 categories:
- Unit tests (types + lib + services + cli): 1137 tests
- Contract tests (CLI output + Skill format): 554 tests
- Integration tests: 16 tests
- E2E tests: 41 tests
verify:skills complements the suite with a real init + agent sync run, asserting agent-specific reference paths, no dangling references, canonical convention docs, base_dir-relative spec paths, and Copilot inlining.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development uses pnpm (Node 22.13+, pnpm 11+).
# Clone and install
git clone https://github.com/benwu95/prospec.git
cd prospec
pnpm install
# Run in dev mode
pnpm run dev
# Build
pnpm run build
# Test
pnpm test
Local install (test the prospec CLI globally)
# First time: install deps, build, then register the bin globally
pnpm install && pnpm run build && pnpm add -g .
# After making changes, just rebuild — the global bin picks up the new dist/
pnpm run build
# Remove it when finished
pnpm uninstall -g prospec
A first-time global install requires running pnpm setup once (it configures the global bin directory).
The single lockfile is pnpm-lock.yaml; after changing dependencies, run pnpm install and commit it. See CONTRIBUTING.md.
MIT License - see LICENSE for details.
Prospec is a fork of ci-yang/prospec by Ci Yang — the upstream project this codebase originates from.
Beyond that lineage, Prospec draws inspiration from:
- OpenSpec — Delta Specs, Fast-Forward, Archive
- Spec-Kit — Constitution validation
- cc-sdd — Steering analysis, template customization
- BMAD — Analyst role (prospec-explore)
Prospec's unique contribution: Skills-driven SDD with a thin CLI — Skills run the workflow inside your AI agent; the CLI only bootstraps and regenerates. Plus AI Knowledge as Context Engineering — structured, versioned, progressive project memory for AI agents.
prospec-verify and prospec-review adapt engineering heuristics (failure-recovery triage, and security / performance / maintainability lens criteria) from addyosmani/agent-skills (MIT) — vendored into prospec's own self-contained reference templates, so no plugin install is required for prospec to work. If you want the fuller standalone treatment, that plugin is worth a look as optional further reading: marketplace addy-agent-skills, plugin agent-skills (invocable as agent-skills:*). Attribution: see THIRD-PARTY-NOTICES.
Made with care for the AI-powered development community