Merged
44 commits
6253940
refactor(v3.0.1): extract jsonl_store amendment-overlay primitives
sachinshelke May 28, 2026
c1352d7
fix(v3.0.1): session_id default is unique per call, not literal 'ad-hoc'
sachinshelke May 28, 2026
618710a
feat(v3.1.0): M1 Phase A origin tagging — storage layer
sachinshelke May 28, 2026
ff06b3d
feat(v3.1.0): M1 Phase A origin tagging — IDE config injection
sachinshelke May 28, 2026
2a7b3ad
feat(v3.1.0): M2 Phase 1 — working_store storage layer
sachinshelke May 29, 2026
9640c4a
feat(v3.1.0): M2 Phase 2 — working memory MCP tools
sachinshelke May 29, 2026
9dcf561
feat(v3.1.0): M2 Phase 3 — engine fanout + get_session_context panel …
sachinshelke May 29, 2026
972ee1a
feat(v3.1.0): M3 Phase 1 — skills_store storage layer
sachinshelke May 29, 2026
d0f2798
feat(v3.1.0): M3 Phase 2 — FTS5 skills table + 6 MCP tools
sachinshelke May 29, 2026
96bf32f
feat(v3.1.0): M4 Phase 1 — activity_store + memory_fanout/decisions i…
sachinshelke May 29, 2026
d10181f
feat(v3.1.0): M4 Phase 2 — neighborhoods + affordances + 4 spatial MC…
sachinshelke May 29, 2026
02f179d
feat(v3.1.0): M5 — skill induction wired to outcomes_writer
sachinshelke May 29, 2026
96f2639
feat(v3.1.0): M6 Phase B — cross-IDE consensus check (read-only)
sachinshelke May 29, 2026
5b1f421
feat(v3.1.0): M7 — consensus Phase C handshake (opt-in, default off)
sachinshelke May 29, 2026
2b7e4e5
feat(v3.1.0): M8 — reflections (storage + sanitization + CLI/MCP surf…
sachinshelke May 29, 2026
b28ce39
docs(v3.1.0): M9 — CLAUDE.md memory catalog + CHANGELOG entry + versi…
sachinshelke May 29, 2026
e3b127b
feat(graph): interactive viewer — pan/zoom/drag + hover focus + clutt…
sachinshelke May 29, 2026
db4518b
feat(graph): multi-lens interactive viewer + memory subsystem overlays
sachinshelke May 29, 2026
7614d8b
fix(memory): close 4 product bugs + comprehensive M1-M8 test sweep
sachinshelke May 29, 2026
d811e3e
docs: sync AGENTS.md decision block (D00001G, D00001H, +109 more)
sachinshelke May 29, 2026
7a7021d
fix(memory): hardening sweep - sanitize all stores + 4 bug fixes + sc…
sachinshelke May 30, 2026
6d2a6d6
tune(engine): bump relevance_inject _DEFAULT_MIN_SCORE 0.10 -> 0.25
sachinshelke May 30, 2026
6d05f48
refactor(graph): extract template.html from cli_graph.py monolith
sachinshelke May 30, 2026
48fed50
docs: sync AGENTS.md decision-tail count (109 -> 230)
sachinshelke May 30, 2026
b4416a9
fix(release): release-verify-version sed \s not portable on BSD/macOS
sachinshelke May 30, 2026
dddffef
docs: sync AGENTS.md decision-tail count (230 -> 290)
sachinshelke May 30, 2026
aedc2ae
feat(graph): ranked search + Q&A + outcome lens + lineage trace
sachinshelke May 30, 2026
35ae6d1
fix(graph): paranoia pass — debounce search, clarify outcome legend, …
sachinshelke May 30, 2026
aac00fd
docs: sync AGENTS.md decision-tail (290 -> 310) post viewer-overhaul
sachinshelke May 30, 2026
da8c0e5
revert(engine): _DEFAULT_MIN_SCORE 0.25 -> 0.10 — broke cross-tool wedge
sachinshelke May 30, 2026
50a3027
docs: sync AGENTS.md after threshold revert + e2e gate widening
sachinshelke May 30, 2026
9af20bc
feat(release): G3 real-IDE smoke — implement the last stubbed gate
sachinshelke May 30, 2026
78af055
feat(sync): auto-classify outcomes via observe-git tail step
sachinshelke May 30, 2026
6874fcf
test(engine): pin the cross-tool wedge at the unit level
sachinshelke May 30, 2026
947164d
docs: sync AGENTS.md after G3+sync+wedge-test commits
sachinshelke May 30, 2026
3bef507
release(v3.1.1): docs + version bump + CHANGELOG freshness gate
sachinshelke May 30, 2026
df5dd06
fix(release): drop multi-line comments inside recipe — shell syntax err
sachinshelke May 30, 2026
0f87052
docs: sync AGENTS.md after v3.1.1 docs + gauntlet
sachinshelke May 30, 2026
d5eb67f
feat(engine): v3.2.0 opener — session_log_enforcer (warn-mode)
sachinshelke May 31, 2026
6c567e0
feat(graph): v3.2.0 — Q&A vocab expansion (who/when/compare)
sachinshelke May 31, 2026
329254e
feat(reflect): v3.2.0 — real MCP sampling/createMessage path
sachinshelke May 31, 2026
c1a1107
feat(decisions): v3.2.0 — do_not_revert soft-expire + reaffirm
sachinshelke May 31, 2026
34a4eb9
release(v3.2.0): version bump + CHANGELOG date stamp
sachinshelke Jun 1, 2026
724ceea
docs: sync AGENTS.md decision-tail count (650 -> 670)
sachinshelke Jun 1, 2026
4 changes: 4 additions & 0 deletions AGENTS.md
@@ -30,6 +30,10 @@
- **D00001D** 3.0.0 BUILD COMPLETE (this session). All enhancements are built, tested, committed to local main, and the full release …
- **D00001E** fix: Python 3.10 CI failure — agents_md_generator._project_name() and cli_init used bare `import tomllib` (stdlib 3.11+…
- **D00001F** RESOLVED the known release-smoke tooling bug: replaced `head -1` with `sed -n '1p'` in Makefile release-smoke + wheel-n… · `Makefile` · _makefile, release, tooling_
- **D00001G** v3.0.x storage prereq IMPLEMENTATION COMPLETE on branch release/3.0.1 (commits 6253940 + c1352d7). Patches 1+2+3 done: … · _memory, prereq, storage, v3.0.1_
- **D00001H** M1 Phase A origin tagging IMPLEMENTATION COMPLETE on release/3.0.1 (commits 618710a storage + ff06b3d ide_inject). orig… · _consensus, m1, memory, origin, v3.1.0_

_+670 more decision(s) — full log in `.codevira/decisions.jsonl`._


For the full decision log + outcomes + reverts, see `.codevira/decisions.jsonl` or run `codevira list-decisions`.
470 changes: 470 additions & 0 deletions CHANGELOG.md

Large diffs are not rendered by default.

76 changes: 74 additions & 2 deletions CLAUDE.md
@@ -27,7 +27,7 @@ Call these MCP tools at the moments the description matches your action — they

- For commits that fix a bug, prefer commit messages starting with `fix:`, `bug:`, `hotfix:`, or `fixes #N`. Codevira scans these into a fix-history database used by the Anti-Regression hero to block silent re-introduction of fixed bugs.

### Before you finish a meaningful unit of work — NON-NEGOTIABLE
### Before you finish a meaningful unit of work — STRONG RECOMMENDATION

Codevira's promise is "the project remembers what you did." That promise breaks if you don't write to it. Before you respond to the user with the final result of a meaningful change, **call ONE of these**:

@@ -36,7 +36,9 @@ Codevira's promise is "the project remembers what you did." That promise breaks
- **`complete_changeset(changeset_id, decisions=[...])`** — if you closed a multi-file fix
- **`write_session_log(...)`** — at minimum, at the end of any session that produced commits or non-trivial output

A session that ships code WITHOUT a codevira write call leaves the project's memory stale for the next AI. That's the most common way the wedge breaks. Treat it as part of the definition-of-done, not optional.
A session that ships code WITHOUT a codevira write call leaves the project's memory stale for the next AI. That's the most common way the wedge breaks. Treat it as part of the definition-of-done.

**Engine enforcement (v3.2.0+):** The `session_log_enforcer` policy fires on `Stop` events. If the session shipped commits AND no `write_session_log` was called between `SESSION_START` and now, it emits a `warn` via Claude Code's `systemMessage` channel. Default mode is `warn` (non-blocking nudge); set `CODEVIRA_SESSION_LOG_ENFORCER_MODE=block` to force the AI to retry, or `off` to disable. v3.2.1 plans to default to `block` once warn-mode instrumentation confirms low noise. Logging is still your judgment call for what counts as "meaningful" — if you only answered a question with no commits, the policy stays silent.
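
The decision rule described above can be sketched roughly as follows. This is a minimal illustration, not the shipped policy: the function name `enforcer_verdict` and the fallback to `warn` for unrecognized values are assumptions; only the environment variable name and the three modes come from the text.

```python
import os

VALID_MODES = {"warn", "block", "off"}


def enforcer_verdict(shipped_commits, wrote_session_log, env=None):
    """Hypothetical helper: return 'silent', 'warn', or 'block' for a Stop event."""
    env = os.environ if env is None else env
    mode = env.get("CODEVIRA_SESSION_LOG_ENFORCER_MODE", "warn").strip().lower()
    if mode not in VALID_MODES:
        mode = "warn"  # assumption: unknown values fall back to the default mode
    if mode == "off":
        return "silent"
    # The policy only fires when commits shipped and no session log was written.
    if shipped_commits and not wrote_session_log:
        return mode
    return "silent"
```

A question-only session (no commits) stays silent in every mode, which matches the "judgment call" note above.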

### When you see "Roadmap drift detected" in your SessionStart context

Expand All @@ -51,6 +53,76 @@ That warning fires when codevira's claimed phase hasn't been updated for several

- **`search_decisions(query="X")`** is the answer. Don't guess — surface the actual decision log.

## Memory subsystems (v3.1.0)

v3.1.0 added five memory subsystems on top of the existing decision log. Each has a specific moment to call it; together they cover the gap between "episodic" (decisions) and "the agent's day-to-day state."

### Working memory — intra-session scratchpad

`.codevira-cache/working.jsonl` (per-machine, ephemeral, gitignored). Capacity-bounded, decay-scored.

- **`working_add(content, kind="observation"|"goal", importance=5, links=[])`** — record an observation (something you saw) or a goal (something you're trying). `Edit`/`Write`/`Bash` calls auto-populate this via the post_tool_use hook; explicit calls add narrative + intent the auto path can't see.
- **`working_get(top_k=10, kind=?)`** — top-K live entries by decay score: `importance × exp(-Δt/τ) + 0.5 × access_count`, with τ = 6h. Tombstoned entries are excluded.
- **`working_promote(entry_id, to="decision"|"skill"|"playbook", ...)`** — move an observation/goal into LTM. Calls `check_conflict` first; tombstones the source on success.
- **`get_working_context(top_k=5)`** — compact markdown for ReAct-loop injection.

Working memory persists into `get_session_context` (top-3 panel) so the next call sees your recent scratchpad.
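
To make the decay score concrete, here is a small sketch of how the ranking could be computed. The entry fields (`importance`, `created_at`, `access_count`, `tombstoned`) and the helper names are assumptions for illustration; the formula and the 6-hour time constant are the ones stated for `working_get`.

```python
import math

TAU_SECONDS = 6 * 3600  # τ = 6h, as stated for working_get


def decay_score(importance, age_seconds, access_count):
    """importance × exp(-Δt/τ) + 0.5 × access_count."""
    return importance * math.exp(-age_seconds / TAU_SECONDS) + 0.5 * access_count


def top_k(entries, now, k=10):
    """Rank live (non-tombstoned) entries by decay score, highest first."""
    live = [e for e in entries if not e.get("tombstoned")]
    live.sort(
        key=lambda e: decay_score(
            e["importance"], now - e["created_at"], e.get("access_count", 0)
        ),
        reverse=True,
    )
    return live[:k]
```

Under this formula an entry loses about 63% of its importance term after six hours, while each access adds a flat 0.5 that does not decay.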

CLI escape hatch: `codevira working commit <session_id>` archives a session's live entries to `.codevira/working_archived/<session_id>.jsonl` (canonical, team-shareable).

### Skill library — procedural memory

`.codevira/skills.jsonl` (canonical, team-shareable). FTS5-backed retrieval with composite ranking (BM25 + tag-Jaccard + recency).

- **`record_skill(name, procedure, summary, triggers, do_not_revert, force)`** — author a reusable procedure ("how we rebase in this repo", "the project's commit-message convention"). Conflict-checked against existing skills.
- **`get_skill(query, top_k=5, file_path=?)`** — composite-ranked search. Returns `score_breakdown` so you can see WHY each skill surfaced.
- **`apply_skill_outcome(skill_id, success)`** — manual reinforcement. The *canonical* signal comes from git via `outcomes_writer` fan-out (M5) — this tool is the override.
- **`list_skills(status="active"|"archived"|"superseded"|"all", source, tags)`** — daily-driver `active` filter by default.
- **`supersede_skill(old_id, name, procedure, ...)`** — version a skill; amendment chain preserves audit.
- **`promote_skill_to_playbook(skill_id, task_type, name?, force)`** — write a skill's procedure as a playbook markdown so `get_playbook(task_type)` finds it.

Auto-archive at 5 consecutive failures OR `unused_days ≥ 90` (configurable). Skills with `do_not_revert=true` are exempt.
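
The auto-archive rule can be read as a simple predicate. The sketch below assumes a hypothetical `SkillStats` record; the thresholds (5 failures, 90 days) and the `do_not_revert` exemption come from the rule above, and treating both thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SkillStats:
    """Hypothetical per-skill counters used only for this illustration."""
    consecutive_failures: int
    unused_days: int
    do_not_revert: bool = False


def should_auto_archive(stats, max_failures=5, max_unused_days=90):
    """Archive on too many consecutive failures OR long disuse, unless protected."""
    if stats.do_not_revert:
        return False  # protected skills are exempt from auto-archive
    return (
        stats.consecutive_failures >= max_failures
        or stats.unused_days >= max_unused_days
    )
```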

CLI: `codevira induce-skills [--apply] [--yes]` — cluster productive sessions (≥80% kept, tag-Jaccard ≥ 0.5) and propose induced skills. Without `--apply`: writes to `.codevira/induction_proposals.jsonl` for review.
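
The two induction thresholds can be illustrated with a short sketch. The helper names are hypothetical and the real clustering is not shown in this document; only the cut-offs (at least 80% kept, tag-Jaccard of at least 0.5) are taken from the CLI description.

```python
def tag_jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # assumption: two untagged sessions are not considered similar
    return len(a & b) / len(a | b)


def is_productive(kept, total, min_kept_ratio=0.8):
    """A session counts as productive when enough of its outcomes were kept."""
    return total > 0 and kept / total >= min_kept_ratio


def can_cluster(tags_a, tags_b, min_jaccard=0.5):
    """Two productive sessions are candidates for the same cluster."""
    return tag_jaccard(tags_a, tags_b) >= min_jaccard
```

For example, sessions tagged `{git, rebase}` and `{git, rebase, ci}` share two of three distinct tags, so they clear the 0.5 threshold.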

### Spatial memory — code-as-space

Activity heatmap (`.codevira-cache/activity.jsonl`, per-machine) + folder-tree neighborhoods + affordances.

- **`spatial_nearby(file_path, k=5)`** — files topologically near a file (BFS ≤ 2 hops over import/call edges + same-neighborhood), ranked by recent activity. Use when navigating unfamiliar code.
- **`spatial_heat(top_k=20, since_days=?)`** — where attention has concentrated. Use for "what changed this week?".
- **`spatial_neighborhood(file_path)`** — the folder-tree-derived (or yaml-overridden) neighborhood + members.
- **`spatial_affordances(file_path)`** — what task_types apply here. E.g., a file under `mcp_server/tools/` typically affords `{add_tool, write_test}`. Combine with `get_playbook(task_type)` for relevant rules.

Override files: `.codevira/neighborhoods.yaml` (re-label folder mapping); `.codevira/affordances.yaml` (project-specific affordances on top of `mcp_server/data/affordances.yaml`).
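
As a rough picture of what `spatial_nearby` describes, the sketch below runs a hop-bounded BFS over an adjacency map and ranks the results by an activity count. The graph and activity shapes are assumptions, the tie-breaking order is invented for determinism, and the same-neighborhood expansion is left out for brevity.

```python
from collections import deque


def nearby(graph, start, activity, max_hops=2, k=5):
    """Files within max_hops of start, most recently active first.

    graph: dict mapping a file to its neighbors over import/call edges (assumed shape).
    activity: dict mapping a file to a recent-activity count (assumed shape).
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    candidates = [n for n in hops if n != start]
    # Rank by activity (descending), then by hop distance, then by name.
    candidates.sort(key=lambda n: (-activity.get(n, 0), hops[n], n))
    return candidates[:k]
```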

### Consensus — cross-IDE awareness

Tracks which IDE wrote each decision so contradictions across IDEs surface.

- **`consensus_check()`** — run a scan (read-only) for cross-IDE conflicts since this IDE's last checkpoint. Materializes matches to `.codevira/pending_conflicts.jsonl`.
- **`consensus_status(top_k=3)`** — count + top-K pending conflicts (`get_session_context` also surfaces a panel).
- **`origin_of(decision_id)`** — provenance lookup (always available — provenance is M1).

Phase C (opt-in handshake, default off) — gated by `memory.consensus.handshake_enabled` in `.codevira/config.yaml`:
- **`consensus_propose_supersession(target_decision_id, new_decision, reason)`** — open a proposal against a foreign IDE's `do_not_revert` decision. Same-IDE fast-path bypasses the handshake.
- **`consensus_resolve(proposal_id, action="approved"|"rejected"|"withdrawn", comment?)`** — record the response.
- 14-day timeout default; expired proposals can be force-finalized via `expired_unilateral=True` (with audit row).
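
The timeout check behind that last bullet might look like the following. This is only a sketch: the function name is hypothetical, and whether a proposal counts as expired at exactly 14 days is an assumption.

```python
from datetime import datetime, timedelta, timezone


def is_expired(opened_at, now, timeout_days=14):
    """True once a proposal has been open for at least timeout_days."""
    return now - opened_at >= timedelta(days=timeout_days)


opened = datetime(2026, 5, 1, tzinfo=timezone.utc)
still_open = is_expired(opened, opened + timedelta(days=13))    # False
force_ready = is_expired(opened, opened + timedelta(days=14))   # True
```

Only an expired proposal would be eligible for the `expired_unilateral=True` path described above.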

CLI: `codevira consensus check`.

### Reflections — episodic abstraction

`.codevira/reflections.jsonl` (committed). LLM-generated abstractions over recent decisions + sessions.

- **`reflect(period_days=7, dry_run=True)`** — build the source context + render the prompt. v3.1.0 returns `sampling_supported: False` + `rendered_prompt` (the MCP sampling/createMessage RPC ships in v3.2). Use the CLI to commit an LLM response.
- **`get_reflections(top_k=5)`** — most recent reflections.
- **`list_reflections(since?, tags?, limit=50)`** — filtered list.

CLI: `codevira reflect [--period 7d] [--from-file PATH] [--apply] [--yes]`.

Sanitization pass strips api keys / Bearer tokens / passwords / AWS AKIA / long hex / long base64 from the source context before the LLM sees it.
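
A sanitization pass of this kind is typically a list of regex rules applied in order. The patterns below are illustrative assumptions, not the contents of the real sanitizer, and they cover only some of the listed categories (base64 is omitted).

```python
import re

# Illustrative rules only; the project's actual patterns are not shown here.
_RULES = [
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    (re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{8,}", re.IGNORECASE), "Bearer [REDACTED]"),
    (re.compile(r"\b(api[_-]?key|password)(\s*[:=]\s*)\S+", re.IGNORECASE), r"\1\2[REDACTED]"),
    (re.compile(r"\b[0-9a-fA-F]{40,}\b"), "[REDACTED_HEX]"),
]


def scrub(text):
    """Apply each rule in order and return the redacted text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text
```

Running the rules at the boundary, before text reaches the LLM, means a missed pattern is the only way a secret can leak, so erring toward over-redaction is usually the safer trade-off.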

## Tool budget discipline

Codevira is **token-efficient by design**:
16 changes: 14 additions & 2 deletions Makefile
@@ -69,7 +69,7 @@ test-unit:
$(PYTHON) -m pytest tests/ -q --ignore=tests/e2e --ignore=tests/integration

test-e2e:
$(PYTHON) -m pytest tests/e2e/test_first_contact.py tests/e2e/test_product_invariants.py -v
$(PYTHON) -m pytest tests/e2e/test_first_contact.py tests/e2e/test_product_invariants.py tests/e2e/test_cross_tool_universality.py -v

# v2.1.2 hardening — integration suite (slower; runs in gauntlet):
# MCP round-trip, help-text linter, sandboxed-parent. Skipped from
@@ -252,7 +252,7 @@ release-verify-version:
@echo " ✓ In sync with origin (or no upstream tracking)"
@# 4. Cross-check version in __init__.py if it declares __version__.
@if [ -f mcp_server/__init__.py ]; then \
INIT_VER=$$(grep -E "^__version__" mcp_server/__init__.py 2>/dev/null | sed -E 's/.*=\s*"([^"]+)".*/\1/'); \
INIT_VER=$$(grep -E "^__version__" mcp_server/__init__.py 2>/dev/null | sed -E 's/.*= *"([^"]+)".*/\1/'); \
if [ -n "$$INIT_VER" ] && [ "$$INIT_VER" != "$(VERSION)" ]; then \
echo " ✗ Version drift: pyproject.toml=$(VERSION) but mcp_server/__init__.py=$$INIT_VER"; \
exit 1; \
@@ -268,6 +268,18 @@ release-verify-version:
echo " Promote the [Unreleased] section to [$(VERSION)] before releasing."; \
exit 1; \
fi; \
\
NEWER=$$(find mcp_server indexer -type f \( -name "*.py" -o -name "*.html" \) -newer CHANGELOG.md 2>/dev/null | wc -l | tr -d ' '); \
if [ "$$NEWER" -gt "0" ]; then \
echo " ✗ CHANGELOG.md is OLDER than $$NEWER source file(s) under mcp_server/ + indexer/."; \
echo " The current $(VERSION) entry is probably stale relative to the wheel."; \
echo " Either: (a) update the entry to cover the new commits, OR"; \
echo " (b) bump the patch version + add a new entry."; \
echo " First offenders:"; \
find mcp_server indexer -type f \( -name "*.py" -o -name "*.html" \) -newer CHANGELOG.md 2>/dev/null | head -5 | sed 's/^/ /'; \
exit 1; \
fi; \
echo " ✓ CHANGELOG.md is fresh (newer than every tracked source file)"; \
fi
@# 6. Tag check: if tag exists, must point at HEAD.
@if git rev-parse "v$(VERSION)" >/dev/null 2>&1; then \
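
For readers who want to reason about the CHANGELOG freshness gate outside of `make`, the `find ... -newer CHANGELOG.md` check in the recipe above corresponds roughly to the sketch below. The function name `stale_sources` is hypothetical; the roots and suffixes mirror the recipe.

```python
from pathlib import Path


def stale_sources(changelog, roots, suffixes=(".py", ".html")):
    """Return source files modified more recently than the changelog."""
    reference_mtime = Path(changelog).stat().st_mtime
    offenders = []
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # mirrors the recipe's 2>/dev/null for missing directories
        for path in root.rglob("*"):
            if (
                path.is_file()
                and path.suffix in suffixes
                and path.stat().st_mtime > reference_mtime
            ):
                offenders.append(path)
    return sorted(offenders)
```

Called as `stale_sources("CHANGELOG.md", ["mcp_server", "indexer"])`, a non-empty result corresponds to the gate failing. Note that the check compares modification times, so a fresh checkout or a `touch` can change the outcome without any content change.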
22 changes: 22 additions & 0 deletions README.md
@@ -92,6 +92,28 @@ every AI tool, on every project, on your local machine.**

---

## What's new in v3.1.1 — hardening + interrogable memory

> 3.1.1 supersedes the briefly-published 3.1.0 (which shipped
> without this README/CHANGELOG entry). Same code shape; this
> release is the documented one. Brings five memory subsystems
> (M1–M9 from 3.1.0) plus the v3.1.1 hardening + viewer overhaul.

| Area | What you get |
|---|---|
| **Five memory subsystems** | Origin tagging (M1), working memory (M2), skill library with FTS5 ranking (M3), spatial memory + activity heatmap (M4), skill induction wired to outcomes (M5), cross-IDE consensus check + handshake (M6/M7), reflections (M8). 22 new MCP tools. |
| **Secret scrubbing everywhere** | Decisions, sessions, working memory, skills, reflections — every store scrubs api-key / Bearer / password / AWS AKIA / long hex / long base64 at the write boundary. One shared `mcp_server/storage/sanitize.py`. |
| **Counter-decision discipline** | `record_decision` now accepts `alternatives_considered: list[str]` and `would_re_examine_if: str` — losing options + invalidation trigger surface in the viewer's rich-detail panel. Optional + back-compat. |
| **Interrogable graph viewer** | `codevira graph` is no longer a passive force-layout. Free-text search → top-K ranked panel with score + outcome badge. Q&A intent detection ("what did we decide about X", "what got reverted", "what's protected"). Outcome lens (kept/modified/reverted). Lineage trace mode for supersession chains. |
| **Auto outcome classification** | `codevira sync` now runs `observe-git` as a best-effort tail step — outcomes flow into the viewer's outcome lens automatically. |
| **G3 real-IDE smoke** | The last permanently-skipped gauntlet gate now ships as a real check. Verifies codevira is registered in each detected IDE config + MCP `tools/list` round-trips in <1s. |
| **AGENTS.md no more churn** | `regenerate()` is now idempotent — no rewrite when content unchanged, no perpetual uncommitted-drift loop. |
| **4 silent bugs fixed** | `commit_session("../escape")` rejected; `triggers.tags="git"` rejected; `list_all(limit=0)` returns `[]`; spatial BFS catches query-time sqlite errors. |

Full v3.1.1 release notes: [CHANGELOG.md](CHANGELOG.md#311--2026-05-30--hardening-viewer-overhaul-g3-sync-observe-git).

---

## What's new in v3.0.0 — audited, lean, opinionated

> Major version. v3.0.0 is the biggest API contraction since v2.0