
fix: snapshot git-pollution (#625), structural-tool steering (#626), convergence governor (#624) #631

Open
justrach wants to merge 4 commits into main from fix/issues-624-625-626

@justrach

Owner

Fixes the remaining cluster of issues filed in the last 2 days. Each carries a test; full zig build test is green.

#625 — codedb.snapshot pollutes git

The 22.8 MB in-tree index showed up in git status and corrupted working-tree diffs. After writing the in-tree snapshot, codedb.snapshot is appended to .git/info/exclude (local, untracked — leaves the user's .gitignore alone). Best-effort + idempotent; only the in-tree write triggers it, not the central ~/.codedb store.

  • Test: issue-625: in-tree snapshot is added to .git/info/exclude (test_snapshot.zig)
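A minimal sketch of the idempotence check, assuming the helper compares whole trimmed lines of `.git/info/exclude` against `codedb.snapshot`. `excludeHasEntry` is a hypothetical name; the actual helper in snapshot.zig may match differently, and all file I/O is left out on purpose.

```zig
const std = @import("std");

/// Hypothetical helper: true when `needle` already appears as its own line
/// in the contents of an exclude file, so appending it again can be skipped.
fn excludeHasEntry(contents: []const u8, needle: []const u8) bool {
    var lines = std.mem.splitScalar(u8, contents, '\n');
    while (lines.next()) |raw| {
        // Tolerate CRLF endings and stray indentation.
        const line = std.mem.trim(u8, raw, " \t\r");
        if (std.mem.eql(u8, line, needle)) return true;
    }
    return false;
}

test "entry detection is line-based" {
    const needle = "codedb.snapshot";
    try std.testing.expect(!excludeHasEntry("# local ignores\n*.log\n", needle));
    try std.testing.expect(excludeHasEntry("*.log\ncodedb.snapshot\n", needle));
    // A longer path that merely contains the needle is not a match.
    try std.testing.expect(!excludeHasEntry("old/codedb.snapshot.bak\n", needle));
}
```

With a check like this, a second index run sees the entry and leaves the file untouched, which is what makes the append idempotent.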

#626 — agents skip the structural tools

Tool descriptions + the MCP initialize instructions now steer agents to symbol/callers/deps/outline first and frame search as a fallback; codedb_deps is surfaced as the impact/blast-radius tool. (Cherry-picked from the issue-626-structural-steering work.)

  • Tests in test_mcp.zig.

#624 — non-convergent nav runaways (3–5× tokens)

New per-session ConvergenceGovernor: an 8-deep ring buffer of recent nav call signatures. When the same search/find/word/read/outline call recurs ≥3× in the window, an in-band nudge is appended steering the agent to a structural tool, a direct read, or a refined query. The nudge never alters a tool's result; write/admin tools aren't governed.

  • Test: issue-624: convergence governor flags a repeated identical call (test_mcp.zig)
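The windowed repeat count can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `GovernorSketch`, its fields, and the choice to hash the tool name plus an argument string into a `u64` signature are all hypothetical; only the depth of 8 and the threshold of 3 come from the description.

```zig
const std = @import("std");

/// Sketch only: names and the hashing scheme are assumptions.
const GovernorSketch = struct {
    const WINDOW = 8; // ring depth from the description
    const WARN_AT = 3; // repeat count that triggers the nudge

    ring: [WINDOW]u64 = [_]u64{0} ** WINDOW,
    len: usize = 0, // slots filled so far, at most WINDOW
    head: usize = 0, // next slot to overwrite

    /// Records one navigation call and returns how many times the same
    /// signature now appears in the window (including this call).
    fn record(self: *GovernorSketch, tool: []const u8, args: []const u8) usize {
        var hasher = std.hash.Wyhash.init(0);
        hasher.update(tool);
        hasher.update("\x00"); // separator so ("ab","c") differs from ("a","bc")
        hasher.update(args);
        const sig = hasher.final();

        self.ring[self.head] = sig;
        self.head = (self.head + 1) % WINDOW;
        if (self.len < WINDOW) self.len += 1;

        var count: usize = 0;
        for (self.ring[0..self.len]) |seen| {
            if (seen == sig) count += 1;
        }
        return count;
    }

    fn shouldNudge(occurrences: usize) bool {
        return occurrences >= WARN_AT;
    }
};

test "third identical call trips the nudge" {
    var gov = GovernorSketch{};
    try std.testing.expect(gov.record("codedb_search", "query=foo") == 1);
    try std.testing.expect(gov.record("codedb_read", "path=a.zig") == 1);
    try std.testing.expect(gov.record("codedb_search", "query=foo") == 2);
    const n = gov.record("codedb_search", "query=foo");
    try std.testing.expect(GovernorSketch.shouldNudge(n));
}
```

Because old signatures fall out of the ring after eight further calls, a repeat only counts while it is recent; an agent that moves on and returns much later starts from a clean count.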

Closes #624
Closes #625
Closes #626

🤖 Generated with Claude Code

justrach and others added 4 commits June 19, 2026 12:11
The index was written to {root}/codedb.snapshot and showed up in
`git status` (22.8 MB in one real repo), so it was easy to commit by
accident and it corrupted any tooling that diffs the working tree
(a plain `git add -A && git diff` swept the binary into the patch).

After writing the in-tree snapshot, append `codedb.snapshot` to the
repo's `.git/info/exclude` — a local, untracked ignore file — so git
never sees it, without touching the user's tracked `.gitignore`.
Best-effort and idempotent: not-a-git-repo, worktrees where `.git` is a
file, or any I/O error are silently skipped so indexing never fails.
isRootSnapshot guards so only the in-tree write triggers it, not the
central ~/.codedb store.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Agents on the codedb MCP surface default to search -> read -> edit and skip
the structural tools (symbol/callers/deps/outline), so the code graph goes
unexercised. Make the structural path the path of least resistance.

- Reframe tool descriptions + server instructions to prescribe the structural
  tools first and cast codedb_search as a substring/phrase fallback.
- Runtime nudge on search: a bare identifier that resolves to an indexed
  symbol prepends a one-line pointer to codedb_symbol/codedb_callers
  (text output only, skipped for format=json).
- Runtime nudge on read: a whole-file read (>=400 lines, no range) prepends a
  pointer to codedb_outline; wired into both the cached and uncached paths.
- Tests (issue-626) cover the gating logic: isBareIdentifier and
  fullFileReadHint.
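The two gates named above might look roughly like this. The function names are suffixed with `Sketch` because the signatures, the identifier rule (ASCII letters, digits, underscore, no leading digit), the `has_range` flag, and the hint wording are all assumptions; only the 400-line threshold and the "no range" condition come from the commit message.

```zig
const std = @import("std");

/// Assumed rule: letters, digits and underscores only, not starting with a digit.
fn isBareIdentifierSketch(query: []const u8) bool {
    if (query.len == 0) return false;
    if (std.ascii.isDigit(query[0])) return false;
    for (query) |c| {
        if (!(std.ascii.isAlphanumeric(c) or c == '_')) return false;
    }
    return true;
}

/// Returns a hint for a whole-file read of a large file, else null.
/// `has_range` is a hypothetical flag meaning "the caller asked for a line range".
fn fullFileReadHintSketch(line_count: usize, has_range: bool) ?[]const u8 {
    const min_lines = 400; // threshold stated in the commit message
    if (has_range or line_count < min_lines) return null;
    // Placeholder wording, not the PR's actual hint text.
    return "hint: large file, consider codedb_outline before reading it whole";
}

test "gating" {
    try std.testing.expect(isBareIdentifierSketch("handleCall"));
    try std.testing.expect(!isBareIdentifierSketch("fn handleCall("));
    try std.testing.expect(fullFileReadHintSketch(400, false) != null);
    try std.testing.expect(fullFileReadHintSketch(399, false) == null);
    try std.testing.expect(fullFileReadHintSketch(5000, true) == null);
}
```

Keeping both gates as pure functions is what lets the tests cover the gating logic without standing up an index or a file reader.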

Closes #626. (#623 closed separately as a duplicate; its distinct
loop/redundancy-detection guardrail is not addressed here.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#626)

Follow-up to the #626 structural steering. Auditing the tool surface showed
mcpGenerateGuidance already steers most graph tools as "-> next" hints
(callers->callpath, edit->changes, hot->outline, the symbol/search/outline/word
chain). The single genuine gap is codedb_deps: nothing points to it and it has
no next-hint.

- Add depsHint: after a single-definition codedb_symbol hit (the moment before
  an edit, when blast radius matters), prepend a one-line pointer to codedb_deps.
  Pure + count-gated (results.len == 1), text-only, mirrors fullFileReadHint.
- Upgrade three passive differentiator descriptions to prescriptive: codedb_deps
  (impact/blast-radius), codedb_hot (orientation), codedb_changes (what-changed).

No callpath nudge: codedb_callers already emits "-> next: codedb_callpath", so an
inline one would duplicate it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Large-repo trajectories showed high variance with occasional
non-convergent runaways (3–5× tokens) — an agent firing the same
search/read over and over without progress. Add a per-session
ConvergenceGovernor: an 8-deep ring buffer of recent navigation
call signatures (tool name + argument values). When the same nav call
(search/find/word/read/outline) recurs >= 3 times in the window,
handleCall appends a one-line in-band nudge steering the agent to a
structural tool (symbol/callers/deps), a direct read, or a refined
query.

The nudge is appended to the assistant-visible output only — it never
changes a tool's result, and write/admin tools are not governed.
Session-less callers pass a null governor (no-op).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions


👋 Thanks for the contribution! Quick heads-up: this repo lands changes on the current release/* branch, not main.

Please retarget this PR via Edit → base branch to the active release branch (currently release/0.2.5825).

(Automated hint — reply here if you need a hand.)

@github-actions


Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
| --- | ---: | ---: | ---: | ---: | --- |
| codedb_bundle | 106140 | 116880 | +10.12% | +10740 | NOISE |
| codedb_changes | 9931 | 10725 | +8.00% | +794 | OK |
| codedb_context | 800620 | 810542 | +1.24% | +9922 | OK |
| codedb_deps | 306 | 333 | +8.82% | +27 | OK |
| codedb_edit | 68231 | 37428 | -45.15% | -30803 | OK |
| codedb_find | 2559 | 3022 | +18.09% | +463 | NOISE |
| codedb_hot | 24497 | 24917 | +1.71% | +420 | OK |
| codedb_outline | 35662 | 38532 | +8.05% | +2870 | OK |
| codedb_read | 16074 | 20138 | +25.28% | +4064 | NOISE |
| codedb_search | 13959 | 53298 | +281.82% | +39339 | NOISE |
| codedb_snapshot | 81910 | 76207 | -6.96% | -5703 | OK |
| codedb_status | 9294 | 9589 | +3.17% | +295 | OK |
| codedb_symbol | 52413 | 59079 | +12.72% | +6666 | NOISE |
| codedb_tree | 19909 | 30634 | +53.87% | +10725 | NOISE |
| codedb_word | 11645 | 11834 | +1.62% | +189 | OK |
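The two-threshold rule behind the Status column can be sketched as below. This is a reading of the report, not the CI script: the `fail` label, the strict `>` comparisons at the boundaries, and the choice to treat speedups as OK are assumptions inferred from the rows shown.

```zig
const std = @import("std");

const Status = enum { ok, noise, fail };

/// Classifies one benchmark row from its base and head timings in nanoseconds.
fn classify(base_ns: u64, head_ns: u64) Status {
    if (head_ns <= base_ns) return .ok; // faster or unchanged
    const delta = head_ns - base_ns;
    // delta / base > 10%  is the same as  delta * 100 > base * 10 in integer math.
    const pct_exceeded = delta * 100 > base_ns * 10;
    if (!pct_exceeded) return .ok;
    // Percentage tripped: only a large absolute delta fails CI.
    return if (delta > 50_000) .fail else .noise;
}

test "rows from the report" {
    try std.testing.expect(classify(13959, 53298) == .noise); // codedb_search
    try std.testing.expect(classify(9931, 10725) == .ok); // codedb_changes
    try std.testing.expect(classify(68231, 37428) == .ok); // codedb_edit (faster)
}
```

Under this reading codedb_search is the row closest to failing: its percentage is far over the limit, and its absolute delta of 39,339 ns sits under the 50,000 ns cutoff.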

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e30cc2676

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp.zig
Comment on lines +1255 to +1256
if (occurrences >= ConvergenceGovernor.WARN_AT) {
out.appendSlice(alloc, "\n\n[codedb] You have issued this exact call several times — repeating it will not surface anything new. Change strategy: use a structural tool (codedb_symbol for a definition, codedb_callers for usages, codedb_deps for impact), open the file directly with codedb_read, or refine the query.") catch {};


[P2] Preserve JSON responses when adding loop hints

When a governed tool that supports structured output is repeated with format=json (for example the third identical codedb_search call), this appends a plain-text convergence hint after the handler has already written the JSON payload. The MCP response is still marked successful, but the assistant-visible text is no longer parseable JSON, defeating the advertised format=json contract; the governor should skip or separate hints for JSON-formatted tool calls.
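One way to apply the "skip" option is a small gate evaluated before the append. This is a sketch under assumptions: `shouldAppendHint` is a hypothetical helper, `format` stands in for the tool's optional `format` argument, and `WARN_AT = 3` mirrors the "≥3×" wording in the PR description rather than a value read from the code.

```zig
const std = @import("std");

const WARN_AT = 3; // assumed to match ConvergenceGovernor.WARN_AT

/// Decides whether the plain-text convergence hint may be appended.
/// `format` is the hypothetical value of the call's optional `format` argument.
fn shouldAppendHint(occurrences: usize, format: ?[]const u8) bool {
    if (occurrences < WARN_AT) return false;
    if (format) |f| {
        // Appending prose after a JSON payload would make it unparseable.
        if (std.mem.eql(u8, f, "json")) return false;
    }
    return true;
}

test "json output is never decorated" {
    try std.testing.expect(shouldAppendHint(3, null));
    try std.testing.expect(!shouldAppendHint(3, "json"));
    try std.testing.expect(!shouldAppendHint(2, "text"));
}
```

The "separate" option the review mentions would instead return the hint as its own content item next to the JSON; that depends on how the server builds its MCP response and is not sketched here.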


Comment thread src/snapshot.zig
defer info_dir.close(io);

const needle = "codedb.snapshot";
const existing: ?[]u8 = info_dir.readFileAlloc(io, "exclude", allocator, .limited(1024 * 1024)) catch null;


[P2] Do not overwrite unreadable exclude files

If .git/info/exclude already exists but cannot be read here (for example it exceeds the 1 MiB limit), existing becomes null and the later createFile path rewrites the file as if it were absent, dropping the user's existing local ignore rules. Since this helper is best-effort, it should only create a new exclude on FileNotFound and otherwise return without modifying the file.
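The suggested behavior amounts to branching on the read error instead of collapsing every failure to `null`. A minimal sketch, assuming a stand-in error set (`ReadError`) and a hypothetical `planExcludeUpdate` helper; the real read call in snapshot.zig can return other errors, and the substring check here is a simplification.

```zig
const std = @import("std");

const Plan = enum { create, append, leave_alone };

/// Hypothetical error set standing in for whatever the real read can return.
const ReadError = error{ FileNotFound, StreamTooLong, AccessDenied };

/// Maps the outcome of reading the exclude file to the only safe next step.
fn planExcludeUpdate(read: ReadError![]const u8, needle: []const u8) Plan {
    if (read) |contents| {
        // Readable file: append only when the entry is missing.
        return if (std.mem.indexOf(u8, contents, needle) != null) .leave_alone else .append;
    } else |err| switch (err) {
        // Only a missing file is safe to create from scratch.
        error.FileNotFound => return .create,
        // Too large, permission denied, and so on: never clobber existing rules.
        else => return .leave_alone,
    }
}

test "unreadable file is left alone" {
    const needle = "codedb.snapshot";
    const existing: []const u8 = "*.log\n";
    try std.testing.expect(planExcludeUpdate(existing, needle) == .append);
    try std.testing.expect(planExcludeUpdate(error.FileNotFound, needle) == .create);
    try std.testing.expect(planExcludeUpdate(error.StreamTooLong, needle) == .leave_alone);
}
```

The `StreamTooLong` case corresponds to the over-1-MiB example in the review: the file exists, so the helper backs off rather than rewriting it.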

