Skip to content

[codex] restore ast-first retrieval readiness#26

Closed
TheGreenCedar wants to merge 5 commits into
mainfrom
codex/restore-ast-first-retrieval
Closed

[codex] restore ast-first retrieval readiness#26
TheGreenCedar wants to merge 5 commits into
mainfrom
codex/restore-ast-first-retrieval

Conversation

@TheGreenCedar

Copy link
Copy Markdown
Owner

Summary

This branch restores the AST-first retrieval path and gives operators a clearer readiness surface for both local navigation and agent packet/search work. The core change is a new readiness model and CLI command that answers whether CodeStory is ready for a specific goal, and when it is not, gives both the minimum next command and the fuller repair command.

It also makes report handoffs explicit, threads readiness through the existing status/error surfaces, and removes Windows-only assumptions from generated commands, scripts, and docs so Linux/macOS users get POSIX examples first while PowerShell remains documented as the Windows path.

What changed

  • Added a new codestory-cli ready command with --goal local|agent and JSON/Markdown output.
  • Added readiness DTOs for goal/status/index snapshot/sidecar snapshot/verdict data so CLI and protocol surfaces share the same contract.
  • Threaded readiness verdicts through index, doctor, codestory://status, stdio recommended next calls, and API error details.
  • Split remediation guidance into minimum_next and full_repair so agents do not have to infer whether a local refresh or sidecar rebuild is required.
  • Added cache-busy and repair messaging in the runtime CLI error path so stale/locked cache failures point at actionable commands.
  • Added report --profile handoff, plus handoff metadata in report JSON, so generated reports carry freshness/readiness/sidecar caveats instead of looking like timeless source-of-truth artifacts.
  • Kept the AST-first graph_first_v1 sidecar evidence path and updated the e2e stats log with a fresh release run.
  • Made generated shell examples host-aware: PowerShell quoting on Windows and POSIX shell quoting on Unix.
  • Made display.rs and runtime recovery commands escape shell-sensitive single quotes correctly per host.
  • Updated benchmark scripts so Unix defaults use codestory-cli and llama-server, while Windows keeps the .exe path.
  • Updated the A/B benchmark prompt/analyzer so packet-first commands are recognized in both PowerShell and POSIX forms.
  • Reworked README, usage docs, contributor docs, retrieval sidecar docs, benchmark docs, and the Docker env example so they are no longer Windows-only.
  • Added/updated tests for the readiness command, report handoff metadata, stdio readiness contracts, onboarding snippets, search JSON output, and benchmark analyzer behavior.

Why

Before this branch, readiness was implicit and scattered. Operators could see index status, doctor output, sidecar status, and protocol recommendations, but there was no compact contract that said whether the repo was ready for local navigation versus agent packet/search retrieval. That made it too easy to treat a local cache as proof that agent sidecars were ready.

The docs and generated commands also leaned Windows/PowerShell-first. That created friction for Unix users and made WSL validation a real compatibility check rather than a documentation polish item.

This branch makes readiness a first-class runtime concept, keeps report handoffs honest about freshness and sidecar trust, and makes the command surfaces portable across Windows and Unix shells.

User and developer impact

  • codestory-cli ready --project . --goal local --format markdown gives a direct local-navigation verdict.
  • codestory-cli ready --project . --goal agent --format json gives machine-readable agent packet/search readiness.
  • codestory-cli report --profile handoff emits report output with handoff metadata and caveats.
  • index, doctor, and stdio status consumers get readiness verdicts without having to duplicate cache/sidecar interpretation.
  • Error payloads can now expose both a small next step and a fuller repair command.
  • Unix/macOS/Linux users can follow docs and generated benchmark prompts without translating from PowerShell.
  • Windows users keep explicit PowerShell examples and Windows-specific notes where they matter.

Verification

  • cargo build --release -p codestory-cli
  • cargo test -p codestory-cli --test ready_command --test report_export --test onboarding_contracts
  • node --test scripts/tests/codestory-agent-ab-analyzer.test.mjs
  • cargo test -p codestory-cli command_quoting_single_quotes_shell_sensitive_values
  • cargo test -p codestory-runtime recovery_commands_quote_shell_sensitive_project_paths
  • WSL/Linux quoting checks:
    • wsl -e bash -lc 'cd /mnt/c/Users/alber/source/repos/codestory && cargo --config "build.rustc-wrapper=\"\"" test -p codestory-cli command_quoting_single_quotes_shell_sensitive_values'
    • wsl -e bash -lc 'cd /mnt/c/Users/alber/source/repos/codestory && cargo --config "build.rustc-wrapper=\"\"" test -p codestory-runtime recovery_commands_quote_shell_sensitive_project_paths'
  • Release e2e stats:
    • $env:CODESTORY_ALLOW_SKIP_REAL_REPO_DRILL_CASES = '1'; cargo test -p codestory-cli --test codestory_repo_e2e_stats -- --ignored --nocapture
    • proof_tier: full_sidecar
    • warnings: none
    • index: 68.23s
    • graph phase: 10.11s
    • semantic phase: 49.85s
    • semantic embedding: 48.89s
    • symbol search docs: 11,505
    • dense anchors: 708
    • dense skips: 10,797
    • retrieval index: 10.95s
    • retrieval mode: full
    • repeat full refresh: 20.56s with 0 embedded
    • report: 2.59s total, 1.09s markdown, 1.50s JSON
    • nodes: 83,735; edges: 70,803; files: 222; index errors: 0
    • search dir unchanged: true
  • cargo fmt --check
  • git diff --check

Remaining risk and follow-up

  • The real-repo drill half of the e2e suite was intentionally skipped because CODESTORY_REAL_REPO_DRILL_CASES was not set. The stats-only release harness passed with CODESTORY_ALLOW_SKIP_REAL_REPO_DRILL_CASES=1, and the skip is recorded in the stats log.
  • Downstream consumers of doctor, index, stdio status, and API errors should review the new readiness fields and remediation command fields as an additive contract change.
  • WSL validation required disabling the inherited Windows sccache.exe rustc wrapper via cargo --config "build.rustc-wrapper=\"\""; that is an environment setup note rather than a repo code change.

@TheGreenCedar TheGreenCedar deleted the codex/restore-ast-first-retrieval branch June 13, 2026 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant