Merged
51 commits
40f5aa2
Current final tree for codex/restore-ast-first-retrieval
TheGreenCedar Jun 12, 2026
4a108b9
fix review branch checks
TheGreenCedar Jun 12, 2026
9040bdd
measure agent ab harness
TheGreenCedar Jun 12, 2026
3cd999e
measure generalized packet gate
TheGreenCedar Jun 13, 2026
615befb
generalize packet source evidence
TheGreenCedar Jun 13, 2026
0c27647
remove packet benchmark steering
TheGreenCedar Jun 13, 2026
0e35c0f
centralize language support registry
TheGreenCedar Jun 13, 2026
fa3a1fe
wire language support registry
TheGreenCedar Jun 13, 2026
57dc7cb
surface packet sidecar gaps
TheGreenCedar Jun 13, 2026
8fee9af
clarify files count semantics
TheGreenCedar Jun 13, 2026
898fe1f
test files summary truncation label
TheGreenCedar Jun 13, 2026
de6d1b2
document retrieval remediation boundaries
TheGreenCedar Jun 13, 2026
182e464
add remediation planning artifacts
TheGreenCedar Jun 13, 2026
44d57b6
scrub local review evidence paths
TheGreenCedar Jun 13, 2026
99e47e7
condense remediation plan artifact
TheGreenCedar Jun 13, 2026
dbf286e
log remediation e2e stats
TheGreenCedar Jun 13, 2026
59cd1f2
remove remediation spec artifacts
TheGreenCedar Jun 13, 2026
907374f
fail closed packet sidecar resolution
TheGreenCedar Jun 13, 2026
af6c64e
fix language support claims
TheGreenCedar Jun 13, 2026
5f7ff52
fix semantic doc language claim
TheGreenCedar Jun 13, 2026
aecf65c
tighten language support invariants
TheGreenCedar Jun 13, 2026
0b68390
add branch remediation plan
TheGreenCedar Jun 13, 2026
02e9d97
gate benchmark probes
TheGreenCedar Jun 13, 2026
ca062fa
mark packet steering task done
TheGreenCedar Jun 13, 2026
10da559
prove language regressions
TheGreenCedar Jun 13, 2026
d854965
mark language regression task done
TheGreenCedar Jun 13, 2026
e9a9270
bound semantic doc file cache
TheGreenCedar Jun 13, 2026
36309ce
mark semantic cache task done
TheGreenCedar Jun 13, 2026
7aaccfb
clean durable docs
TheGreenCedar Jun 13, 2026
ba745f3
clean benchmark task docs
TheGreenCedar Jun 13, 2026
f2a8508
log final review stats
TheGreenCedar Jun 13, 2026
68bf856
plan second pass merge cleanup
TheGreenCedar Jun 13, 2026
c8cff3d
remove production benchmark family gates
TheGreenCedar Jun 13, 2026
41071cc
harden compact marker lint
TheGreenCedar Jun 13, 2026
571a34e
clarify language evidence limits
TheGreenCedar Jun 13, 2026
fbe8b61
log second pass verification
TheGreenCedar Jun 13, 2026
af26a56
mark final review complete
TheGreenCedar Jun 13, 2026
b1849bf
move holdout claims behind eval probes
TheGreenCedar Jun 13, 2026
294c430
log third pass verification
TheGreenCedar Jun 13, 2026
b0159ad
align language filter registry
TheGreenCedar Jun 13, 2026
e0bf15f
tighten review proof trail
TheGreenCedar Jun 13, 2026
12ebbf9
clarify final proof row
TheGreenCedar Jun 13, 2026
20a5539
tighten language support audit
TheGreenCedar Jun 13, 2026
2871790
tighten packet sufficiency semantics
TheGreenCedar Jun 14, 2026
69c033c
tighten packet evidence support
TheGreenCedar Jun 14, 2026
0f7020e
abstract packet output cleanup
TheGreenCedar Jun 14, 2026
3291c4f
Improve grounding and retrieval pipeline
TheGreenCedar Jun 14, 2026
bafb3db
Harden and stabilize
TheGreenCedar Jun 14, 2026
f83cd2e
remove specs folder
TheGreenCedar Jun 14, 2026
c075813
consolidate documentation
TheGreenCedar Jun 14, 2026
e6cd2a2
bump version to 0.7.0
TheGreenCedar Jun 14, 2026
15 changes: 12 additions & 3 deletions .agents/skills/codestory-grounding/SKILL.md
@@ -41,9 +41,11 @@ checkout is only the tool artifact unless the user is editing CodeStory itself.
- When `packet` reports `sufficient` and `follow_up_commands` is empty, answer
from the packet; budget truncation alone is not a gap. Preserve supported-claim
wording and include a compact "Support files" list from `answer.citations` and
-`sufficiency.avoid_opening`. Do not run ordinary source reads, `rg`, `grep`, or
-`git show` only to verify packet citations; run more commands only for a named
-unresolved gap, an edit target, or a user-requested worktree proof.
+`sufficiency.avoid_opening_paths`. The older `sufficiency.avoid_opening` field
+is human-readable compatibility prose, not the raw path contract. Do not run
+ordinary source reads, `rg`, `grep`, or `git show` only to verify packet
+citations; run more commands only for a named unresolved gap, an edit target,
+or a user-requested worktree proof.
- When `packet` reports `partial`, read `sufficiency.follow_up_commands` and run
those commands in order. Prefer listed targeted `search --why` commands before
escalating to a larger packet budget. As soon as a follow-up packet becomes
@@ -61,6 +63,13 @@ checkout is only the tool artifact unless the user is editing CodeStory itself.
failed, treat product retrieval as unavailable until `retrieval_mode=full` is
restored. Repo-text output is diagnostic only; do not use it as a substitute
for mandatory sidecar evidence.
+- Under `graph_first_v1`, `retrieval_mode=full` means graph and lexical sidecars
+are complete, generated `symbol_search_doc` and component-report virtual docs
+are current, and Qdrant is complete only for selected dense anchors. A zero
+dense-anchor manifest is valid only when reported explicitly; otherwise
+Qdrant mismatch or unavailability is fail-closed. Search evidence should name
+provenance such as `exact`, `lexical_source`, `symbol_doc`, `graph_neighbor`,
+`component_report`, or `dense_anchor`.
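
The `retrieval_mode=full` rule added in this hunk can be read as a small predicate. The sketch below is illustrative only: the `RetrievalManifest` type, its field names, and `is_full_retrieval` are hypothetical and are not taken from CodeStory's manifest schema. It assumes "Qdrant is complete" means the stored point count equals the selected dense-anchor count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetrievalManifest:
    """Hypothetical summary of sidecar state; every field name is an assumption."""
    graph_complete: bool
    lexical_complete: bool
    virtual_docs_current: bool          # symbol_search_doc and component-report docs
    dense_anchor_count: int             # anchors selected for dense embedding
    zero_anchors_reported: bool         # manifest explicitly reports zero anchors
    qdrant_point_count: Optional[int]   # None when Qdrant is unavailable


def is_full_retrieval(m: RetrievalManifest) -> bool:
    """Return True only when the graph_first_v1 conditions described above hold."""
    # Graph, lexical, and generated virtual docs are always required.
    if not (m.graph_complete and m.lexical_complete and m.virtual_docs_current):
        return False
    # A zero dense-anchor corpus is acceptable only when reported explicitly.
    if m.dense_anchor_count == 0:
        return m.zero_anchors_reported
    # Otherwise fail closed on Qdrant unavailability (None) or a count mismatch.
    return m.qdrant_point_count == m.dense_anchor_count
```

The predicate fails closed by construction: any state it does not recognise as complete yields `False` rather than a degraded "mostly full" answer.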

## Command Routing

4 changes: 2 additions & 2 deletions .agents/skills/codestory-grounding/references/doctor.md
@@ -22,7 +22,7 @@ Reads project/cache/index/retrieval health without mutating the index. Use it at
| Path | Command | Expected result |
|------|---------|-----------------|
| Normal path | `<codestory-cli> doctor --project <target-workspace>` | Reports project root, cache path, indexed stats, retrieval state, sidecar embedding setup, environment hints, and next commands. |
-| Failure path | If cache or index checks warn, run `index --project <target-workspace> --refresh full`; if mandatory sidecars are missing or stale, run the setup/index commands surfaced by `doctor`; if semantic reports `semantic partial`, `semantic stale`, or `semantic failed`, rebuild before trusting broad packet/search evidence. | Separates missing index, stale semantic docs, partial semantic docs, and mandatory retrieval setup failures. |
+| Failure path | If cache or index checks warn, run `index --project <target-workspace> --refresh full`; if mandatory sidecars are missing or stale, run the setup/index commands surfaced by `doctor`; if symbol docs, dense anchors, policy version, Qdrant counts, or semantic health report partial/stale/failed state, rebuild before trusting broad packet/search evidence. | Separates missing index, stale symbol docs, partial dense anchors, and mandatory retrieval setup failures. |
| Integration edge | Use doctor before `ground`, `search --why`, `explore`, `context`, or `serve`; its next commands are the safe follow-up loop. | Prevents read commands from silently querying the wrong or empty cache. |

## Notes
@@ -31,5 +31,5 @@ Reads project/cache/index/retrieval health without mutating the index. Use it at
- The `attention:` block repeats warnings first so agents do not miss semantic partial/stale/failure messages buried in the full check list.
- Environment rows report retrieval-related variables such as `CODESTORY_EMBED_BACKEND`, `CODESTORY_EMBED_LLAMACPP_URL`, and sidecar enablement flags.
- The embedding checks distinguish product llama.cpp sidecar state from hash, ONNX, disabled, or stale diagnostic states.
-- Treat `semantic ok` plus `retrieval_mode=full` as the health state suitable for broad repository explanation prompts. Treat `semantic partial`, `semantic stale`, `semantic failed`, and non-`full` retrieval modes as instructions to repair setup or rebuild before trusting agent-facing evidence.
+- Treat `semantic ok` plus `retrieval_mode=full` as the health state suitable for broad repository explanation prompts. Under `graph_first_v1`, `full` may explicitly skip Qdrant only when dense-anchor count is zero and graph/lexical artifacts are current. Treat `semantic partial`, `semantic stale`, `semantic failed`, Qdrant count mismatch, and non-`full` retrieval modes as instructions to repair setup or rebuild before trusting agent-facing evidence.
- Prefer JSON for CI or doc-contract checks.
2 changes: 1 addition & 1 deletion .agents/skills/codestory-grounding/references/files.md
@@ -35,7 +35,7 @@ claims about what the graph can see.

- `files` reads persisted `FileInfo`; it does not scan the repo live unless `--refresh` asks for an index refresh.
- Treat `index usable` with incomplete or error counts as a partial-coverage signal, not a failure.
-- `summary.framework_route_coverage` is the support matrix for framework route extraction. It includes `status`, `fixture_status`, `confidence_floor`, `handler_link_support`, `unsupported_patterns`, `known_gaps`, and `promotable`. Treat `partial`, `heuristic`, text-only handler support, and `promotable=false` as review prompts, not proof of full framework parity.
+- `summary.framework_route_coverage` is the support matrix for framework route extraction. It includes `status`, `coverage_evidence`, `confidence_floor`, `handler_link_support`, `unsupported_patterns`, `known_gaps`, and `promotable`. Treat `partial`, `heuristic`, text-only handler support, and `promotable=false` as review prompts, not proof of full framework parity.
- Route coverage statuses:
- `supported`: fixture-backed behavior is passing and documented coverage is met.
- `heuristic`: pattern-backed evidence that needs source review.
29 changes: 17 additions & 12 deletions .agents/skills/codestory-grounding/references/index.md
@@ -1,7 +1,8 @@
# `index` - Build or Refresh the Symbol Index

Discovers project files, extracts symbols and edges, persists graph/search state
-to SQLite, and synchronizes semantic docs when embedding assets are available.
+to SQLite, writes graph-native symbol docs and component reports, and
+synchronizes selected dense anchors when embedding assets are available.

## Usage

@@ -15,7 +16,7 @@ to SQLite, and synchronizes semantic docs when embedding assets are available.
|--------|---------|-----|
| `--project <path>` / `--path <path>` | `.` | Target repository root. Always pass this explicitly. |
| `--cache-dir <path>` | auto | Override the per-project cache root. |
-| `--refresh <auto|full|incremental|none>` | `auto` | Choose the graph/snapshot/semantic refresh mode. |
+| `--refresh <auto|full|incremental|none>` | `auto` | Choose the graph/snapshot/symbol-doc/dense-anchor refresh mode. |
| `--format <markdown|json>` | `markdown` | Use JSON for automation and timing analysis. |
| `--output-file <path>` | stdout | Write output to a file with an existing parent directory. |
| `--dry-run` | off | Show workspace discovery and planned adds/removals without writing storage. |
@@ -28,19 +29,21 @@ to SQLite, and synchronizes semantic docs when embedding assets are available.
| Mode | Behavior |
|------|----------|
| `auto` | Use `full` for an empty cache and `incremental` otherwise. |
-| `full` | Rebuild the project graph and semantic docs from the discovered workspace. |
-| `incremental` | Reindex changed/new/unindexed files, remove disappeared files, and prune touched semantic docs. |
+| `full` | Rebuild the project graph, symbol docs, component reports, and dense anchors from the discovered workspace. |
+| `incremental` | Reindex changed/new/unindexed files, remove disappeared files, and prune touched symbol docs or dense anchors. |
| `none` | Inspect the existing cache without refreshing it. Use only after a known-good same-session index. |
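
The `auto` row in the table above amounts to a one-line decision. A minimal sketch, assuming a hypothetical helper name `resolve_refresh_mode` and a caller that already knows whether the cache is empty (neither is taken from the CodeStory source):

```python
def resolve_refresh_mode(requested: str, cache_is_empty: bool) -> str:
    """Map a --refresh value to the mode that would effectively run."""
    valid = {"auto", "full", "incremental", "none"}
    if requested not in valid:
        raise ValueError(f"unknown refresh mode: {requested!r}")
    if requested == "auto":
        # `auto` uses `full` for an empty cache and `incremental` otherwise.
        return "full" if cache_is_empty else "incremental"
    # Explicit modes are passed through unchanged.
    return requested
```

Passing `--refresh full` explicitly stays the safer choice whenever cache or schema state is uncertain, since `auto` only inspects emptiness.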

Use `--refresh full` for first-time indexes, cache/schema uncertainty, and fixes
for historical indexing failures. Incremental runs can leave stale error rows
when previously failing files are not touched.

-## Semantic Retrieval
+## Symbol Docs And Dense Anchors

-There is no `index --semantic off` flag. Semantic docs are part of the default
-index contract when embedding assets are ready. On a fresh machine, check the
-setup plan first:
+There is no `index --semantic off` flag. Graph-native `symbol_search_doc` rows
+are part of the default index contract. Under `graph_first_v1`, dense vectors
+are only written for selected anchors such as entrypoints, public APIs,
+documented nontrivial symbols, central graph nodes, component reports, and
+unstructured docs. On a fresh machine, check the setup plan first:

```text
<codestory-cli> setup embeddings --project <target-workspace> --dry-run --format json
@@ -53,25 +56,27 @@ High-signal environment toggles:

| Variable | Use |
|----------|-----|
-| `CODESTORY_SEMANTIC_DOC_SCOPE=all` | Include all-symbol semantic docs. Accepted all-symbol aliases are `all`, `full`, `all-symbols`, and `all_symbols`; omitted or other values default to durable symbols. |
+| `CODESTORY_SEMANTIC_DOC_SCOPE=all` | Include the broader all-symbol symbol-doc scope for diagnostics. Accepted aliases are `all`, `full`, `all-symbols`, and `all_symbols`; omitted or other values default to durable symbols. |
| `CODESTORY_EMBED_BACKEND=llamacpp` | Use the mandatory local llama.cpp embedding sidecar. |
| `CODESTORY_EMBED_LLAMACPP_URL=http://127.0.0.1:8080/v1/embeddings` | Product embedding endpoint for bge-base sidecar vectors. |
| `CODESTORY_SUMMARY_ENDPOINT=local` | Enable deterministic local summaries with `--summarize`. |
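
The alias rule for `CODESTORY_SEMANTIC_DOC_SCOPE` in the table above can be sketched as follows. The function name and the returned labels `"all"` and `"durable"` are hypothetical, and exact, case-sensitive matching is an assumption; the docs do not say whether the real parser trims or case-folds the value.

```python
import os
from typing import Mapping, Optional

# Aliases listed in the table as selecting the all-symbol scope.
ALL_SYMBOL_ALIASES = frozenset({"all", "full", "all-symbols", "all_symbols"})


def semantic_doc_scope(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "all" for an accepted alias, otherwise "durable" (hypothetical labels)."""
    env = os.environ if env is None else env
    value = env.get("CODESTORY_SEMANTIC_DOC_SCOPE")
    # Omitted or unrecognised values fall back to durable symbols.
    return "all" if value in ALL_SYMBOL_ALIASES else "durable"
```

Accepting an explicit mapping keeps the sketch testable without mutating the process environment.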

Use other embedding, alias, batch-size, tokenizer, provider, hash, ONNX, and
summary tuning variables only for focused diagnostics or historical comparisons.
Agent packet/search readiness requires retrieval status to report
-`retrieval_mode=full`.
+`retrieval_mode=full`. A zero dense-anchor corpus is valid only when the
+manifest reports it explicitly; otherwise stale or unavailable Qdrant state
+fails closed.

## Output

Markdown returns a compact index summary. JSON exposes the same data for tools:

- project and storage path
- refresh mode and discovered file/error counts
-- local navigation readiness notes and semantic doc counts
+- local navigation readiness notes, symbol-doc counts, dense-anchor counts, and policy reason counts
- parse, flush, resolve, cleanup, cache, and semantic timing buckets
-- resolution counters and semantic reuse/embed/prune counts
+- resolution counters plus symbol-doc write and dense-anchor reuse/embed/skip/prune counts

Important timing fields are `timings_ms.parse`, `timings_ms.flush`,
`timings_ms.resolve`, `timings_ms.cleanup`, `cache_ms.search_index`,
2 changes: 1 addition & 1 deletion .agents/skills/codestory-grounding/references/packet.md
@@ -31,7 +31,7 @@ tracing, ownership discovery, or change-impact analysis.
|------|---------|-----------------|
| Normal path | `<codestory-cli> packet --project <target-workspace> --question "How does indexing flow from CLI to storage?" --budget compact` | Markdown packet with cited claims, budget usage, gaps, and follow-up commands. |
| Failure path | If the packet reports `partial` or `insufficient`, follow its `follow_up_commands`, usually deeper packet budget or concrete `search`, `context`, `trail`, or `snippet` calls. | Broad exploration is bounded by reported gaps instead of drifting into repeated file reads. |
-| Integration edge | Use JSON output for harnesses and stdio clients. If `sufficiency.status` is `sufficient` and `follow_up_commands` is empty, answer from packet supported claims and include a compact support-file list from `answer.citations` and `sufficiency.avoid_opening`; budget truncation alone is not a gap. | Makes benchmark traces and agent loops comparable across runs. |
+| Integration edge | Use JSON output for harnesses and stdio clients. If `sufficiency.status` is `sufficient` and `follow_up_commands` is empty, answer from packet supported claims and include a compact support-file list from `answer.citations` and `sufficiency.avoid_opening_paths`; budget truncation alone is not a gap. Treat `sufficiency.avoid_opening` as compatibility prose only. | Makes benchmark traces and agent loops comparable across runs. |
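
For a harness consuming packet JSON, the Integration edge row above might translate into something like the sketch below. Only the field names `sufficiency.status`, `sufficiency.follow_up_commands`, `sufficiency.avoid_opening_paths`, and `answer.citations` come from the docs. The function name, the assumption that each citation is an object with a `path` key, and the returned dictionary shape are all hypothetical.

```python
from typing import Any, Dict, List


def plan_from_packet(packet: Dict[str, Any]) -> Dict[str, Any]:
    """Decide whether to answer from the packet or run its follow-up commands."""
    sufficiency = packet.get("sufficiency", {})
    follow_ups: List[str] = list(sufficiency.get("follow_up_commands", []))

    if sufficiency.get("status") == "sufficient" and not follow_ups:
        support_files: List[str] = []
        # Assumption: each citation is a dict exposing a "path" key.
        for citation in packet.get("answer", {}).get("citations", []):
            path = citation.get("path")
            if path and path not in support_files:
                support_files.append(path)
        # Use the raw path list, not the prose `avoid_opening` field.
        for path in sufficiency.get("avoid_opening_paths", []):
            if path not in support_files:
                support_files.append(path)
        return {"action": "answer", "support_files": support_files}

    # Any other state: run the listed follow-up commands in order.
    return {"action": "follow_up", "commands": follow_ups}
```

The sketch deliberately ignores `sufficiency.avoid_opening`, matching the row's note that the older field is compatibility prose rather than a path contract. Treating a `sufficient` packet with non-empty follow-ups as a follow-up case is a conservative assumption, since the docs only define the empty case.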

## Notes
