Canonical reference for the copilot-session-knowledge architecture, data pipeline, and coding conventions.
This repo is a set of standalone Python CLI scripts, not a package or library. Each script is independently runnable and normally duplicates its own constants. Documented helper exceptions such as _tentacle_core.py, _tentacle_goal.py, _tentacle_pr.py, _tentacle_dispatch.py, and _tentacle_review.py are allowed only when they preserve an existing CLI/module contract and are listed in the script inventory. The default goal therefore remains operator-first simplicity, not framework cohesion.
sk thin front door: sk.py is a thin dispatcher that maps memorable sub-commands (sk briefing, sk query, sk index build, …) to the underlying standalone scripts. It adds nothing to the data pipeline or business logic — it is purely a routing layer. After the standard install (install.py --test), a managed cross-platform sk launcher is provisioned automatically on your PATH — no manual alias or pip install needed. Windows PowerShell users without a PATH update can invoke python ~/.copilot/tools/sk.py directly as an equivalent fallback. All standalone scripts remain directly invocable as a fallback or for advanced use. The sk front door does not change the architectural contract below. When SK_HARNESS=1, dispatches are wrapped by the harness/ middleware package (pre/post hooks, timing telemetry, dry-run). See docs/HARNESS.md for API reference.
Session files (.md / .jsonl)
│
▼
build-session-index.py ──→ SQLite FTS5 (knowledge.db)
│ │
▼ ▼
extract-knowledge.py ──→ 7 knowledge categories
│ (mistake, pattern, decision,
│ tool, feature, refactor, discovery)
│ + knowledge_relations
│ (SAME_SESSION†, SAME_TOPIC†, TAG_OVERLAP†,
│ RESOLVED_BY†, SEMANTIC_PROXIMITY*)
▼
query-session.py / briefing.py / mcp-server.py ──→ Search, recall, MCP tools
│
▼
watch-sessions.py ──→ Incremental re-indexing (adaptive polling)
*SEMANTIC_PROXIMITY is populated by the native Rust TF-IDF cosine implementation (watch.rs/tfidf.rs). extract-knowledge.py --semantic-only is available for manual/fallback use only; sk watch (Rust binary) never auto-spawns Python.
†Deterministic relations (SAME_SESSION, SAME_TOPIC, TAG_OVERLAP, RESOLVED_BY) and residual helpers (backfill_affected_files, infer_task_ids, confidence decay) are extracted natively by the Rust binary (sk-rust/src/index/extract.rs). extract-knowledge.py is an intentional manual operator tool — NOT auto-called by sk watch in the default Rust build.
Phases:
1. `build-session-index.py`: Phase 1 (session metadata) + Phase 2 (event content) via `providers/` → SQLite FTS5 (schema v8; current migration level v17)
2. `extract-knowledge.py`: classifies into 7 types, deduplicates by content hash; category-aware confidence floors (pattern=0.5, others=0.4); recurrence reward (+0.03 per upsert, capped). Intentional manual operator tool: `sk watch` (Rust binary) runs all hot-path classification, relation extraction, and semantic proximity natively; this script is named in `sk watch` recovery hints and is NOT auto-called on the native watch path.
3. `query-session.py` / `briefing.py` / `mcp-server.py`: BM25 keyword search + optional semantic vector search (RRF blend) exposed via CLI and MCP
4. `watch-sessions.py` / `sk watch`: adaptive polling (5 s / 30 s / 300 s tiers), auto re-indexes on file changes. `sk watch` Rust binary: handles session indexing, extract, relations, semantic proximity, and first-run DB bootstrap natively; never spawns Python. On DB or extract failure, emits a structured recovery hint naming the exact manual command.
5. `learn.py`: manual knowledge entry; CLI interface for agents to record learnings during a session
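The "RRF blend" mentioned for the query phase can be pictured with a generic reciprocal rank fusion sketch. This is not the code in `query-session.py`; the function name `rrf_blend`, the constant `k=60`, and the sample IDs are assumptions chosen for illustration.

```python
from collections import defaultdict


def rrf_blend(rankings, k=60):
    """Fuse several ranked lists of entry IDs with reciprocal rank fusion.

    Hypothetical helper: k=60 is a commonly used default, not a value
    taken from this repo.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every ID it contains.
            scores[entry_id] += 1.0 / (k + rank)
    # Highest fused score first; tie-break on ID so the order is deterministic.
    return sorted(scores, key=lambda entry_id: (-scores[entry_id], entry_id))


bm25_hits = [11, 42, 7]       # keyword ranking (made-up IDs)
semantic_hits = [42, 99, 11]  # vector ranking (made-up IDs)
fused = rrf_blend([bm25_hits, semantic_hits])
print(fused)
```

Entries that appear in both rankings accumulate two contributions, which is why a blend of this kind tends to favor results that keyword and semantic search agree on.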
sk watch (Rust binary) never spawns Python — including on error paths. The following Python surfaces are permanent intentional architecture, not residual code pending deletion:
| Python surface | Role | Status | Tested by |
|---|---|---|---|
| `sk.py` shim | Thin launcher/dispatcher for non-binary installs; routes all `sk <cmd>` calls | Intentional: the no-binary install contract | `tests/test_py_rust_boundary.py::test_python_shim_dispatch_without_rust_binary`; `tests/test_py_rust_boundary.py::test_project_list_json_matches_python_shim_when_rust_available` |
| `hook_runner.py` | Python hook runner for `sk.py` shim and non-binary installs; owns all managed hook events when no Rust binary is present | Intentional: Python shim hook entry point | `tests/test_py_rust_boundary.py::test_hook_runner_empty_payload_matches_native_exit_when_rust_available` |
| `build-session-index.py` | Indexes session files → FTS5 DB; named in `sk watch` DB-failure recovery hints | Intentional: manual operator recovery tool | `tests/test_py_rust_boundary.py::test_watch_db_failure_recovery_hint_no_python_spawn_when_rust_available` |
| `extract-knowledge.py` | Knowledge classification, relation extraction, `--semantic-only` fallback; named in `sk watch` extract-failure recovery hints | Intentional: manual/fallback operator tool | `tests/test_py_rust_boundary.py::test_watch_db_failure_recovery_hint_no_python_spawn_when_rust_available` |
| `migrate.py` | Versioned schema migrations via `schema_version` table | Intentional: canonical schema upgrade owner; Rust native bootstrap does NOT replace this | `tests/test_py_rust_boundary.py::test_migrate_help_remains_manual_python_surface` |
| `sync-daemon.py` | Push/pull sync runtime for Python `sk.py` shim and non-binary installs | Intentional: shim sync path | `tests/test_py_rust_boundary.py::test_python_shim_sync_run_dispatches_sync_daemon_without_rust_binary` |
| `briefing.py`, `learn.py`, `query-session.py`, `project-registry.py`, etc. | Admin/operator CLI scripts | Intentional: these are the primary Python CLI surface; the Rust binary may bypass Python for measured hot read-only subcommands such as `sk project list`, while the direct scripts and mutating fallbacks remain supported | `tests/test_py_rust_boundary.py::test_python_shim_dispatch_without_rust_binary`; `tests/test_py_rust_boundary.py::test_project_list_json_matches_python_shim_when_rust_available` |
What wave20 removed: auto-spawning Python subprocess on sk watch error paths. The scripts
above remain on disk, are invoked by operators manually, and are referenced by name in sk watch
recovery hints. None of these scripts are candidates for deletion as a consequence of wave20.
sk-rust/src/browse/importers/ contains a native Rust parity port for the debug-log
importers documented in docs/DEBUG-LOG-CONTRACT.md: VS Code Agent Debug Log
JSONL and OTel ReadableSpan JSONL. The port includes shared path-safety,
bounded-line reading, SHA-1 synthetic span IDs, dedup hashing, and redaction
helpers that mirror the production Python importers.
This is parser-only infrastructure. It has no Browse DB persistence and no CLI
subcommand yet, so the Python modules remain the production callable importer
surface. The Rust port is kept compiled and tested in sk-rust so a later
operator command can wire it in without changing the data contract.
This is the canonical inventory for Python Ruff coverage. Keep it in sync with
.github/workflows/ci.yml, hooks/pre-commit, and tests/test_quality_gates.py.
| Surface | CI Ruff lint/format | Local pre-commit Ruff | Notes |
|---|---|---|---|
| Root standalone scripts | `*.py` | Any staged root `.py` path via `in_python_cleanliness_surface()` | Every root-level Python entry point is inside the blocking Ruff baseline. |
| Package/module directories | `browse/`, `hooks/`, `scripts/` | Any staged `.py` path under `browse/`, `hooks/`, or `scripts/` | Directory coverage applies to package-like surfaces where Ruff cleanup is already baselined. |
| Syntax gate | `python3 scripts/check_syntax.py` | All staged `.py` files via `scripts/check_syntax.py` when installed | Syntax coverage is broader than Ruff coverage. |
| Complexity advisory | Full suite path through `run_all_tests.py`; local hook runs `scripts/check_complexity.py --json` on staged `.py` snapshots | All staged `.py` files, non-blocking and fail-open | Advisory only; it prints findings but does not deny commits. |
| Ruff complexity/refactor advisory | `Complexity advisory (Ruff C90/PLR)` runs `ruff check --select C90,PLR0911,PLR0912,PLR0913,PLR0915 --statistics` on the same Ruff surface | Not run by local pre-commit | CI advisory only; `continue-on-error: true` records baseline counts before enforcement. |
| Out-of-surface Ruff advisory | Not run by CI | Staged `.py` files outside root scripts and covered package directories run `ruff check` as a non-blocking `[advisory]` scan | Advisory only; it is fail-open when Ruff is absent and does not change the blocking Ruff surface. |
Root *.py files are now inside the blocking Ruff lint surface by default. New
root scripts still must honor the standalone-script architecture contract: keep them
self-contained and run syntax/tests for the touched behavior. Python files outside root
scripts, browse/, hooks/, and scripts/ are outside CI Ruff coverage; the local
pre-commit hook may print non-blocking [advisory] Ruff findings for those files when
Ruff is installed.
| Script | Role |
|---|---|
| `build-session-index.py` | Indexes session files → FTS5 DB |
| `extract-knowledge.py` | Classifies + deduplicates knowledge entries; category-aware confidence floors; recurrence reward |
| `query-session.py` | FTS5 + semantic search; JSON/markdown export |
| `briefing.py` | Task-scoped recall; context packs for agent injection |
| `mcp-server.py` | Read-only MCP stdio JSON-RPC surface for `briefing` and `query_session` |
| `watch-sessions.py` | File watcher; triggers incremental re-indexing |
| `learn.py` | Manual knowledge entry |
tentacle.py |
Multi-agent orchestration (create → todo → bundle → swarm → complete) + orchestrator goal loop (goal init/validate/status/dispatch/link/eval/resume/criteria/gate/budget/next-iter/verify-loop/coverage/context). goal.json keeps both the legacy flat tentacles list and a structured per-iteration iterations map so status/history queries can answer which tentacles belonged to each iteration without breaking older readers. Goal writes are serialized with a goal.json.lock sidecar (`O_CREAT |
| `_tentacle_core.py` | Low-level `tentacle.py` helper module for core path constants, file-locking helpers, git-root discovery, tentacle directory resolution, Windows filesystem retries, PID liveness checks, and todo parsing/rendering. `tentacle.py` re-exports these symbols to preserve the long-standing CLI/module contract while later extraction waves split higher-level seams. |
| `_tentacle_goal.py` | Extracted `tentacle.py` goal subsystem: goal state locking, iteration/budget helpers, goal lifecycle commands, dispatch planning, continuation context, verify-loop, coverage, loop, and resilience-status behavior. `tentacle.py` injects the remaining orchestration-owned helpers at import time and re-exports goal symbols so existing CLI/module callers keep working. |
| `_tentacle_pr.py` | Extracted `tentacle.py` PR automation seam for collecting tentacle handoffs/verifications, generating commit messages and PR bodies, and implementing `sk tentacle pr`. `tentacle.py` injects patchable runtime wrappers and re-exports PR symbols so tests and module callers that patch `tentacle._pr_run_subprocess_safe` keep working. |
| `_tentacle_dispatch.py` | Extracted `tentacle.py` dispatch/profile/recall seam for runtime bundle materialization, live briefing recall packs, specialist agent profiles, prompt sizing, resume, swarm/dispatch, next-step, and bundle. `tentacle.py` injects patchable runtime wrappers and re-exports dispatch symbols so existing tests/callers that patch `tentacle.*` keep working. |
| `_tentacle_review.py` | Extracted `tentacle.py` fresh-context reviewer and review-loop seam for reviewer bundles/findings, review-loop verification classification, blocker-resolver tentacle creation, dispatch-reviewer, and review-loop. `tentacle.py` injects patchable runtime wrappers and re-exports review symbols so existing tests/callers that patch `tentacle.*` keep working. |
| `embed.py` | Optional semantic search via embedding APIs (OpenAI, Fireworks, etc.) with TF-IDF fallback |
| `claude-adapter.py` | Parses Claude Code JSONL sessions into the common DB format |
| `sync-knowledge.py` | Merges `knowledge.db` files across environments (Windows ↔ WSL); MAX confidence semantics |
| `sync-config.py` | Single `connection_string` config; `--setup`, `--setup-env`, `--status --json` |
| `sync-daemon.py` | Local-first push/pull runtime; backlog-aware adaptive limits and automatic sync queue compaction |
| `sync-status.py` | Local sync diagnostics; `--health-check`, `--audit`, `--json` |
| `auto-update-tools.py` | Smart git-diff–based update pipeline; `sk-update` alias |
| `migrate.py` | Versioned schema migrations via `schema_version` table |
| `install.py` | Deploy skills/hooks; inject global AI instructions |
| `setup-project.py` | Full project onboarding: skills + hooks + WORKFLOW.md |
| `project-registry.py` | `sk project add/remove/list`: manage the persistent project registry (`tools-managed-projects.json`); Rust `sk project list` is a measured native read-only hot path, while add/remove and direct-script use stay on this Python owner |
| `host_manifest.py` | Single source of truth for supported hosts + their filesystem paths |
| `index-status.py` | Row counts, FTS integrity, event-offset coverage |
| `knowledge-health.py` | Knowledge base health + recall telemetry |
| `error-analysis.py` | On-demand error pattern analysis: type distribution, severity, recurrence, root causes |
| `benchmark.py` | Commit-keyed benchmark ledger for retro + health snapshots |
| `checkpoint-save.py` | Save named checkpoint |
| `checkpoint-restore.py` | List/restore checkpoints |
| `checkpoint-diff.py` | Diff two checkpoints |
| `browse.py` | Local web UI (127.0.0.1, token auth) with read-only diagnostics plus the authenticated `/chat` operator console |
| `project-context.py` | Deterministic `project-context.md` generator |
| `codebase-map.py` | Repo structure snapshot (auto-refreshed at session start) |
| `trend-scout.py` | GitHub repo discovery via multi-lane search |
| `copilot-cli-healer.py` | Repairs stale Copilot CLI package state |
The browse UI exposes a browser-managed Copilot CLI execution console at /chat. It is the only browse surface that actively launches Copilot CLI; the rest of the UI remains read-only diagnostics and search.
Route prefix note: The Python browse server (`browse/core/server.py`) and the Firebase-hosted deployment now both serve the Next.js app at root (`/*`, e.g. `/chat`, `/settings`). Compatibility redirects from `/v2/*` → `/*` remain for old bookmarks and deep links.
| Component | Role |
|---|---|
| `browse/core/operator_console.py` | Secure execution/persistence adapter. Starts Copilot CLI runs, normalizes event streams, and persists operator state under `~/.copilot/session-state/operator-console/`. |
| `browse/api/operator.py` | Authenticated REST + SSE surface for session CRUD, prompt submission, run status/history, path suggestions, previews, and diffs. |
| `browse-ui/src/app/chat/` | Next.js route wrapper for the `/chat` operator console. |
| `browse-ui/src/components/chat/` | `ChatShell`, `Transcript`, `Composer`, `SessionCreateDialog`, `MetadataBar`, file review components, and CLI session adoption UX (`CliSessionPicker`, `CliAdoptedBadge`, `ConfirmAdoptionPanel`). |
| `browse-ui/src/components/chat/cli-session-picker.tsx` | Lists real CLI sessions and drives the adopt/confirm flow; composer is disabled until `confirmed_at` is set. |
| `browse-ui/src/lib/api/{types,schemas,hooks}.ts` | Stable frontend contract layer for `/api/operator/*`. |
All pages share a single host context. The two source-of-truth files are:
| File | Role |
|---|---|
| `browse-ui/src/providers/host-provider.tsx` | `HostProvider` React context, mounted at the root layout; exposes `{ host, diagnosticsEnabled }` to every page via `useHostState()`. Listens to cross-tab `storage` events and same-tab `BROWSE_HOST_CHANGE_EVENT` to stay current without a reload. |
| `browse-ui/src/lib/host-profiles.ts` | localStorage persistence helpers (`saveHostProfile`, `deleteHostProfile`, `setSelectedHostId`, `getEffectiveHost`, etc.) and the immutable `LOCAL_HOST` sentinel. All mutating helpers dispatch `BROWSE_HOST_CHANGE_EVENT` after writing so `HostProvider` re-evaluates immediately. |
Active host resolution order (documented in getEffectiveHost()):
1. Explicit selection (`browse_selected_host_id` in localStorage), if the referenced profile still exists.
2. First remote profile with `is_default === true`.
3. `LOCAL_HOST` sentinel: same-origin, no bearer token required.
The header renders a compact AWS-region-style global host dropdown that calls setSelectedHostId() on selection and links to /settings#hosts for management. The Settings page exposes the full HostManagement surface (list, add, remove, set-default, restore-local). SessionCreateDialog at /chat reads useHostState() to pre-populate the host picker when the dialog opens.
POST /api/operator/sessions → create session
GET /api/operator/sessions → list sessions
GET /api/operator/sessions/{id} → session detail
PATCH /api/operator/sessions/{id} → update session mutable fields
POST /api/operator/sessions/{id}/prompt → submit prompt → {run_id}
GET /api/operator/sessions/{id}/stream → SSE run output
GET /api/operator/sessions/{id}/status → session + active run status (?run=<run_id>)
GET /api/operator/sessions/{id}/runs → persisted run history
POST /api/operator/sessions/{id}/delete → delete session
POST /api/operator/sessions/adopt → adopt CLI session → operator session (201|200|409)
POST /api/operator/sessions/{id}/confirm → confirm adopted session → enable resume
GET /api/operator/suggest → path/workspace suggestions under ~/
GET /api/operator/preview → file preview under ~/
GET /api/operator/diff → unified diff for two files under ~/
GET /api/operator/browsers → read-only scan of allowlisted local browser candidates
GET /api/operator/cli-sessions → list real CLI sessions (Bearer/cookie; debug=True; read-only)
GET /api/operator/cli-sessions/{id} → single CLI session by UUID (Bearer/cookie; debug=True; read-only)
GET /api/session/{id}/debug-log → paginated session-scoped debug log; preserves CLI event hierarchy via span_id/parent_span_id (CLI id/parentId → SHA-1 16-hex), derives duration_ms for paired start/completion events; limit 100 (Bearer/cookie; debug=True; read-only)
GET /api/sessions/{id}/debug-log → plural alias for the route above
- Workspace confinement: every workspace or file path is normalized with `confine_path()` and rejected unless it resolves under `Path.home()`.
- Token auth: all `/api/operator/*` routes require the same per-launch browse token as the rest of the UI.
- Prompt cap: prompts longer than 4096 characters are rejected.
- Path cap: oversized path inputs are rejected before filesystem access.
- Separate persistence: operator run history is stored under `~/.copilot/session-state/operator-console/` and replayed from disk on reload.
- Same Copilot policy surface: operator-console runs still inherit the installed Copilot CLI's hooks, custom instructions, and permission system. Browser mediation does not bypass briefing, tentacle, or other active policy gates.
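The workspace-confinement and path-cap rules above can be sketched roughly as follows. This is not the repo's `confine_path()`; the helper name `confine_to_home`, the `home` parameter, and reusing the 256-character path limit as the cap are assumptions made for illustration.

```python
from pathlib import Path

MAX_PATH_CHARS = 256  # assumption: reuses the "Paths ≤ 256 chars" input limit


def confine_to_home(raw, home=None):
    """Resolve raw and reject it unless it stays under the home directory.

    Hypothetical stand-in for confine_path(); the real helper may differ.
    """
    if len(raw) > MAX_PATH_CHARS:
        # Reject oversized input before touching the filesystem.
        raise ValueError("path too long")
    home = (home or Path.home()).resolve()
    candidate = Path(raw).expanduser()
    if not candidate.is_absolute():
        candidate = home / candidate
    # resolve() collapses ".." segments and follows symlinks.
    resolved = candidate.resolve()
    if resolved != home and home not in resolved.parents:
        raise ValueError("path escapes the home directory")
    return resolved
```

Checking the resolved path, rather than the raw string, is what keeps `..` segments and symlinks from slipping past the confinement rule.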
The operator console supports resuming an existing Copilot CLI session from the browser via the From CLI history picker. Two distinct identifiers are always in play and must never be aliased:
| ID | Where it lives | Purpose |
|---|---|---|
| Operator session ID | Browse/backend route segment; `operator-console/<id>/` on disk | Identifies the operator-side session; used in every `/api/operator/sessions/<id>/*` URL |
| CLI session UUID | Backend field `resume_target` (UUID4); never surfaced as a route key | Identifies the real Copilot CLI session; passed to `copilot` as `--resume=<cli_uuid>` |
1. discover: `GET /api/operator/cli-sessions` reads real CLI session artifacts under `~/.copilot/session-state/` (read-only, path-confined, Bearer/cookie auth, `debug=True`).
2. adopt: `POST /api/operator/sessions/adopt` creates an operator session with `source="cli_adopt"`, stores the CLI UUID in `resume_target`, and returns `confirmed_at=None`. The returned `id` is the new operator session ID.
3. confirm: `POST /api/operator/sessions/<operator_id>/confirm` sets `confirmed_at` and `resume_ready=True`. Prompts are blocked until this step completes.
4. prompt/resume: `POST /api/operator/sessions/<operator_id>/prompt` launches the CLI subprocess with `--resume=<cli_uuid>` (from `resume_target`). `--name` is never passed for adopted sessions. The operator session ID never appears in the CLI argv.
- `resume_target` is validated as UUID4 at adopt time; malformed values are rejected.
- The confirmation gate in `operator_console.py · start_run()` returns `None` for any session where `confirmed_at` is not set; the UI disables the composer until confirmation.
- CLI tree discovery is read-only and path-confined; file hashes and `workspace.yaml` mtime are unchanged after discovery (`tests/test_browse_chat_resume.py` CR9).
- Child subprocess env is filtered by `_ENV_ALLOWLIST`; test state env vars do not leak (CR8).
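As an illustration of the UUID4 check on `resume_target`, a validator could look like the sketch below. The function name `is_uuid4` is hypothetical, and requiring the canonical hyphenated form is an assumption; the actual adopt-time validation may be stricter or looser.

```python
import uuid


def is_uuid4(value):
    """Return True only for canonical, hyphenated version-4 UUID strings."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        # Malformed strings and non-string inputs are rejected.
        return False
    # Comparing against str(parsed) rejects braces, urn: prefixes, etc.
    return parsed.version == 4 and str(parsed) == value.lower()
```

Validating the identifier before it is stored means the value later passed as `--resume=<cli_uuid>` can only ever be a well-formed UUID.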
If the CLI session referenced by resume_target no longer exists on disk:
- The `GET /api/operator/cli-sessions/{uuid}` probe returns 404.
- `CliSessionPicker` (browse-ui) shows a warning badge on stale entries.
- `POST /api/operator/sessions/{id}/confirm` calls `get_cli_session_by_id` and returns `CLI_SESSION_NOT_FOUND` / 404 when the CLI UUID is absent; there is no supported confirm-with-replacement-workspace fallback. Delete the unconfirmed operator session and adopt a different CLI session or start a fresh operator chat.
- Deleting an operator session never touches the CLI session tree.
POST /api/operator/sessions/adopt for a CLI UUID that has already been adopted returns one of
two responses depending on confirmation state:
- HTTP 200: duplicate exists but is still unconfirmed; response body includes the existing operator session object (idempotent re-adopt).
- HTTP 409 (`ALREADY_ADOPTED`): duplicate is already confirmed; response is error-only with no session object. To find the existing session, use `GET /api/operator/sessions`.
| Component | Role |
|---|---|
| `browse-ui/src/components/chat/cli-session-picker.tsx` | Picker that lists real CLI sessions from `/api/operator/cli-sessions` and triggers the adopt flow |
| `CliAdoptedBadge` (in `cli-session-picker.tsx`) | Badge shown when a session was adopted from CLI history |
| `ConfirmAdoptionPanel` (in `cli-session-picker.tsx` / `chat-shell.tsx`) | Workspace/add_dirs confirmation step before the composer is enabled |
Operator runbook for the adopt/confirm flow: docs/OPERATOR-PLAYBOOK.md — Chat Resume / CLI Session Adoption
- `watch-sessions.py` still tracks normal Copilot session artifacts under `~/.copilot/session-state/`; the operator console reads its own persisted run history directly from `operator-console/`.
- `auto-update-tools.py` can restart `watch-sessions.py`, but it does not restart the browse server or interfere with active operator runs.
- UI-only `browse-ui/src/` updates require regenerating the local `browse-ui/dist/` artifact before the running browse server can serve them; Python changes to `browse/api/operator.py` or `browse/core/operator_console.py` still require a manual browse server restart.
Two deployment modes are supported:
Mode 1 — Same-origin (Cloudflare Tunnel, default): The Python browse server serves both the static UI and the API behind a single Cloudflare Tunnel URL (e.g. browse.example.com). All API calls are same-origin; no CORS configuration needed. Key code-level constraint: browse/core/auth.py · check_origin() builds the expected CSRF origin as http://{Host}. Behind HTTPS the browser sends Origin: https://… — these do not match, causing POST mutations to return 403. Fix check_origin to accept https:// (check X-Forwarded-Proto) before enabling full remote operator console access.
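One possible shape for the fix described above is sketched here. It is not the code in `browse/core/auth.py`; the function name `origin_allowed`, the plain-dict `headers` argument, and the decision to trust `X-Forwarded-Proto` at all are assumptions. That header should only be trusted when the server is reachable solely through the tunnel or proxy.

```python
def origin_allowed(headers):
    """Compare the Origin header to the scheme and host the client reached.

    Hypothetical sketch: headers is a plain dict with canonical header
    names, and X-Forwarded-Proto is assumed to be set by a trusted proxy.
    """
    host = headers.get("Host", "")
    origin = headers.get("Origin", "")
    if not host or not origin:
        return False
    # Proxies may send a comma-separated list; the first entry is the client hop.
    proto = headers.get("X-Forwarded-Proto", "http").split(",")[0].strip().lower()
    if proto not in ("http", "https"):
        return False
    return origin == f"{proto}://{host}"
```

With this shape, a request that arrives through an HTTPS tunnel matches `https://{Host}`, while direct local access without the forwarded header still matches `http://{Host}`.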
Mode 2 — Firebase Hosting control plane: The static browse-ui is deployed to Firebase Hosting (operator's chosen custom domain). The operator's browse.py server is reached via a Cloudflare Tunnel. All API calls from the Firebase-hosted UI to the tunnel are cross-origin; the operator host exposes an explicit CORS allowlist, Bearer auth, and a GET /api/operator/capabilities endpoint. Host profiles let the UI target different operator machines. See docs/OPERATOR-PLAYBOOK.md — Firebase-hosted control plane for the full topology and manual console steps.
Full remote-access setup, Cloudflare Access guidance, and Firebase topology: docs/OPERATOR-PLAYBOOK.md
Hosted shell specs (localhost bootstrap, relay, version negotiation, token storage, stream reconnect, first-run UX): see docs/HOSTED-SHELL-ARCHITECTURE.md and docs/HOSTED-SHELL-RESEARCH.md.
Hooks live in hooks/ and are deployed to ~/.copilot/hooks/ (Copilot CLI only).
- Unified runner: `hook_runner.py` dispatches all hook events (1 Python process per event)
- Supported events: `sessionStart`, `sessionEnd`, `preToolUse`, `postToolUse`, `agentStop`, `subagentStop`, `errorOccurred`
- Fail-open: rule errors never block the agent
- HMAC-signed markers: tamper-resistant counter state
- Audit log: `~/.copilot/markers/audit.jsonl`
Full hook architecture, rule inventory, and dispatched-subagent git guard: docs/HOOKS.md
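To make the "HMAC-signed markers" idea concrete, here is a minimal sketch of signing and verifying a small counter state with the standard library. The marker layout (`payload` / `sig`), the function names, and the use of SHA-256 are assumptions; the real marker format is documented in docs/HOOKS.md.

```python
import hashlib
import hmac
import json


def sign_marker(state, key):
    """Serialize state deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


def verify_marker(marker, key):
    """Return the decoded state, or None when the tag does not match."""
    expected = hmac.new(key, marker["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(expected, marker.get("sig", "")):
        return None
    return json.loads(marker["payload"])
```

A caller that gets `None` back can treat the marker as tampered or stale and reset the counter instead of trusting the stored value.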
~/.copilot/session-state/knowledge.db — SQLite with FTS5, WAL journal mode, and optional vector embeddings.
Schema versions: v1–v6 (legacy) → v7 (two-phase indexing + event_offsets) → v8 (sessions_fts contentless FTS5 + BM25) → v9–v14 (eval, provenance, recall, sync, benchmark) → v15 (confidence_backfill_wave3: raises pattern confidence floor to 0.5 and applies recurrence reward to existing entries) → v16 (error lifecycle: error_type, root_cause, severity, is_resolved, fix_steps, prevention_hook, recurrence_after_briefing on knowledge_entries) → v17 (briefing_deliveries table for tracking which entries were briefed to each session). Run python3 ~/.copilot/tools/migrate.py to upgrade.
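The confidence rules referenced by v15 (pattern floor 0.5, other categories 0.4, +0.03 recurrence reward) can be illustrated as below. The function names and the cap value of 1.0 are assumptions; the source only says the reward is "capped".

```python
CONFIDENCE_FLOORS = {"pattern": 0.5}  # category-aware floor from the doc
DEFAULT_FLOOR = 0.4                   # floor for every other category
RECURRENCE_REWARD = 0.03              # added per upsert
CONFIDENCE_CAP = 1.0                  # assumption: the real cap is not stated


def apply_floor(category, confidence):
    """Raise confidence to the category's floor when it is below it."""
    return max(confidence, CONFIDENCE_FLOORS.get(category, DEFAULT_FLOOR))


def reward_recurrence(confidence):
    """Bump confidence for a repeated entry without exceeding the cap."""
    return min(CONFIDENCE_CAP, round(confidence + RECURRENCE_REWARD, 4))
```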
SessionProvider ABC defines iter_sessions() and iter_events_with_offset().
- `CopilotProvider`: handles `.md` session checkpoints
- `ClaudeProvider`: handles JSONL with real byte-offset seeks for Phase 2
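A rough picture of the provider interface, with a toy JSONL provider that reports byte offsets, is given below. Only the two method names come from this document; the argument and return types, and the `JsonlDirProvider` class, are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterator, Tuple


class SessionProvider(ABC):
    """Interface sketch; the real signatures may differ."""

    @abstractmethod
    def iter_sessions(self) -> Iterator[Path]:
        """Yield one path per session file."""

    @abstractmethod
    def iter_events_with_offset(self, session: Path) -> Iterator[Tuple[int, bytes]]:
        """Yield (byte_offset, raw_event) pairs for one session."""


class JsonlDirProvider(SessionProvider):
    """Hypothetical provider: every *.jsonl file in a directory is a session."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def iter_sessions(self) -> Iterator[Path]:
        return iter(sorted(self.root.glob("*.jsonl")))

    def iter_events_with_offset(self, session: Path) -> Iterator[Tuple[int, bytes]]:
        offset = 0
        with session.open("rb") as handle:
            for line in handle:
                if line.strip():
                    # The offset is where this line starts, so a later pass can seek() to it.
                    yield offset, line.rstrip(b"\r\n")
                offset += len(line)
```

Recording the starting byte offset of each event is what lets a second pass seek straight to an event instead of re-reading the file from the top.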
.octogent/ stores local tentacle state and is gitignored in this repo.
Runtime-bundle workflow: create → todo add → bundle (optional) → swarm → complete.
complete accepts an optional --auto-verify <cmd> flag (fail-open): runs the command, persists the result as verification evidence before closing. Use --auto-verify-timeout <seconds> (default: 120 s) if the command is long-running.
Sub-agents must write a structured handoff before stopping:
tentacle.py handoff <name> "<summary>" --status DONE --changed-file <path> [--changed-file ...] --learn
--status must be one of DONE, BLOCKED, TOO_BIG, AMBIGUOUS, or REGRESSED. Include one --changed-file receipt per modified file so the orchestrator can verify the handoff trail.
Quota-blocked handoffs — when a tentacle is BLOCKED due to a quota or rate-limit signal, add machine-readable metadata:
tentacle.py handoff <name> "<summary>" --status BLOCKED \
--quota-reason rate_limit \
--retry-hint 2026-05-14T00:00:00Z
--quota-reason is a short token (rate_limit, quota_exceeded, daily_quota, monthly_quota, token_quota, context_limit). --retry-hint is an optional ISO timestamp or human-readable hint. cmd_complete persists these fields into meta.json["quota_reason"] / meta.json["retry_hint"] and appends an entry to goal.json["quota_retry_queue"] for orchestrator tracking. Old BLOCKED handoffs without quota metadata are fully backward compatible.
_classify_quota_signal(text) is available to classify raw dispatch output into a quota_reason token. The pattern list is intentionally minimal pending the fuller failure-mode matrix (#183).
tentacle.py marker-cleanup (dry-run by default, --apply to act) inspects and removes stale
entries from the dispatched-subagent marker without completing a tentacle. Only entries whose
per-entry timestamp exceeds the declared TTL are eligible; live entries are never touched.
Each bundle directory contains a manifest.json listing all artifacts. The goal_context
artifact is optional — it is only present when the tentacle is linked to a goal:
| Artifact key | File | When present |
|---|---|---|
| `briefing` | `briefing.md` | Always (placeholder when empty) |
| `instructions` | `instructions.md` | Always |
| `session_metadata` | `session-metadata.md` | Always |
| `recall_pack` | `recall-pack.json` | Always (empty when no matches) |
| `goal_context` | `goal-context.md` | Only when tentacle is linked to a goal |
The goal_context artifact is the output of _goal_render_continuation_context: a compact
markdown block with objective, iteration counter, budget limits, criteria progress, remaining
criteria IDs + descriptions, and the last N prior handoff summaries. Sub-agents must read
goal-context.md (when manifest.json lists it as populated: true) to understand the
overarching goal before making changes.
Python-only implementation — _goal_render_continuation_context, _goal_write_context_artifact,
and _cmd_goal_context live entirely in _tentacle_goal.py (Python) and are re-exported from tentacle.py. The Rust sk binary routes
sk tentacle goal context … to tentacle.py via its standard pass-through mechanism — no
Rust changes are required when adding or modifying goal context behavior. The bundle
injection path (_build_runtime_bundle) similarly calls the Python renderer directly; the
Rust binary never constructs the goal-context.md content itself.
Full tentacle workflow reference: docs/USAGE.md
Tools are validated on Copilot CLI and Claude Code only.
| Feature | Copilot CLI | Claude Code |
|---|---|---|
| Skill deployment | ✅ `.github/skills/` | ✅ `.claude/skills/` |
| Hook deployment | ✅ `.copilot/hooks/` | ❌ not supported |
| Global instruction injection | ✅ `~/.github/copilot-instructions.md` | via `CLAUDE.md` |
| Session indexing | ✅ | ✅ via `claude-adapter.py` |
host_manifest.py is the single authoritative source for supported hosts and their filesystem paths. Do not add Codex, Cursor, or other hosts without documented session and hook formats.
Full skill deployment and host scope details: docs/SKILLS.md
Sync is local-first: knowledge.db is the authoritative read/query source; remote sync is optional transport/storage only.
- Single config key: `connection_string` in `~/.copilot/tools/sync-config.json`
- `sync-config.py --setup` accepts HTTP(S) gateway URLs only (not raw Postgres/libSQL DSNs)
- `sync-gateway.py` is reference/mock only, not a production authority
- Default provider recommendation: Neon (backing Postgres) + Railway (thin gateway host)
- Missing `connection_string` → daemon stays local-only/idle (not a fatal error)
Full sync diagnostics reference: docs/USAGE.md
These conventions apply to all scripts in this repo. Follow them in every change.
- Pure stdlib Python 3.10+: zero pip dependencies required. `scikit-learn` and embedding API keys are optional.
- Every script is standalone: no shared library or package imports between scripts. Each script duplicates its own DB path constants, encoding fix, etc.
- Windows encoding fix: every script starts with the same `os.name == "nt"` block to reconfigure stdout/stderr to UTF-8. Preserve this pattern in every new script.
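The Windows encoding fix can be pictured as the sketch below. The exact block used in the repo's scripts is not reproduced here; wrapping it in a function named `force_utf8_stdio` and using `errors="replace"` are assumptions made so the example is testable.

```python
import os
import sys


def force_utf8_stdio():
    """On Windows, switch stdout/stderr to UTF-8; do nothing elsewhere."""
    if os.name != "nt":
        return False
    for stream in (sys.stdout, sys.stderr):
        # Replaced or wrapped streams may lack reconfigure(); skip those.
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")
    return True


force_utf8_stdio()
```

Because every script is standalone, a block like this is duplicated at the top of each script rather than imported from a shared module.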
- Parameterized SQL only: all user input uses `?` placeholders. Never interpolate strings into SQL.
- FTS5 query sanitization: strip FTS5 operators (`OR`, `AND`, `NOT`, `NEAR`, `*`, `"`) before passing to `MATCH`. See `_sanitize_fts_query()` in `query-session.py`.
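The two rules above combine naturally: sanitize the text, then bind it with a `?` placeholder. The sketch below is not `_sanitize_fts_query()`; the function names, the `docs` table, and applying the 500-character query limit inside the sanitizer are assumptions. It strips only the operators listed here, so other FTS5 syntax characters (parentheses, `:`, `^`) would need extra handling in real use.

```python
import re
import sqlite3

_FTS_OPERATORS = re.compile(r"\b(?:OR|AND|NOT|NEAR)\b")
MAX_QUERY_CHARS = 500  # assumption: applies the "FTS queries ≤ 500 chars" limit here


def sanitize_fts_query(raw):
    """Drop the listed FTS5 operators and collapse whitespace."""
    text = raw[:MAX_QUERY_CHARS]
    text = _FTS_OPERATORS.sub(" ", text)
    text = text.replace("*", " ").replace('"', " ")
    return " ".join(text.split())


def search(conn, raw, limit=10):
    """Run a MATCH query against a hypothetical FTS5 table named docs."""
    query = sanitize_fts_query(raw)
    if not query:
        return []  # an empty MATCH expression would be a syntax error
    rows = conn.execute(
        "SELECT rowid FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),  # user text is bound, never interpolated
    )
    return [row[0] for row in rows]
```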
- JSON serialization only: never use pickle. Legacy pickle detection exists but new code must use JSON / `struct.pack`.
- Atomic lock files: use `O_CREAT | O_EXCL` for process locks (no TOCTOU races).
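A minimal sketch of an `O_CREAT | O_EXCL` lock file follows. The context-manager name `exclusive_lock` and writing the PID into the file are assumptions; the repo's own locking helpers may add retries or stale-lock handling.

```python
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold lock_path for the duration of the with-block.

    os.open with O_CREAT | O_EXCL creates the file atomically and raises
    FileExistsError if it already exists, so there is no check-then-create gap.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        # Recording the owner PID is an illustrative choice, not a repo contract.
        os.write(fd, str(os.getpid()).encode("ascii"))
    finally:
        os.close(fd)
    try:
        yield lock_path
    finally:
        os.unlink(lock_path)
```

A second process that tries to enter the same lock gets `FileExistsError` immediately and can decide whether to wait, retry, or give up.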
- Title ≤ 200 chars
- Content ≤ 10 K chars
- FTS queries ≤ 500 chars
- Paths ≤ 256 chars
- Use `Path.home()` and `pathlib` throughout. Handle WSL path differences explicitly.
- `query-session.py --task --export json` → `entries[]`
- `briefing.py --task --json` → `tagged_entries[]` / `related_entries[]`
- `briefing.py --pack` → `entries.<category>[]`
- `snippet_freshness` values: `fresh | drifted | missing | unknown`
- `related_entry_ids`: JSON ints, confidence-ranked, capped to top 3
The file ~/.copilot/session-state/tools-managed-projects.json is the persistent registry of
projects managed by session-knowledge tools. It is written by install.py, setup-project.py,
and project-registry.py.
Schema — backward-compatible mixed format:
{
"projects": [
"/legacy/string/path",
{"name": "myproject", "path": "/richer/dict/path", "created_at": "2025-01-01T00:00:00+00:00"}
]
}
- Legacy string entries (install.py, setup-project.py): plain path strings. Written by existing scripts; always preserved on any write.
- Rich dict entries (project-registry.py): {name, path, created_at}. Written by sk project add. Both formats co-exist in the same file.
All readers (_load_project_registry() in install.py, setup-project.py, and
auto-update-tools.py) extract the path string from either format.
project-registry.py is the CLI owner of this file: sk project add|remove|list.
The Rust binary handles sk project list natively as a measured read-only hot path and
preserves Python fallback/direct-script behavior for add, remove, help, and non-native
forms.
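Reading the mixed-format registry described above can be sketched as follows. The function name load_project_paths is hypothetical; the actual _load_project_registry() readers may handle errors differently.

```python
import json
from pathlib import Path


def load_project_paths(registry_file: Path) -> list[str]:
    """Return project paths from both legacy string and rich dict entries."""
    try:
        data = json.loads(registry_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return []  # assumption: a missing or corrupt file reads as empty
    if not isinstance(data, dict):
        return []
    paths: list[str] = []
    for entry in data.get("projects", []):
        if isinstance(entry, str):
            paths.append(entry)  # legacy format: plain path string
        elif isinstance(entry, dict) and isinstance(entry.get("path"), str):
            paths.append(entry["path"])  # rich format: {name, path, created_at}
    return paths
```

For the schema example above, this would yield the two paths in file order. A writer built the same way should round-trip legacy strings unchanged so they are preserved on every write.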
Add new migrations to the MIGRATIONS list in migrate.py with incrementing version numbers and a descriptive name.
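As an illustration of the versioned-list idea, here is a minimal sketch. The entry shape (version, name, SQL), the schema_migrations tracking table, and apply_migrations are all assumptions made for this example; check migrate.py for the real structure before adding an entry.

```python
import sqlite3

# Hypothetical shape: (version, name, SQL). The real MIGRATIONS entries may differ.
MIGRATIONS = [
    (1, "create_notes", "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT)"),
    (2, "add_notes_tag", "ALTER TABLE notes ADD COLUMN tag TEXT"),
]


def apply_migrations(conn: sqlite3.Connection) -> list[str]:
    """Apply migrations newer than the recorded version; return their names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations "
        "(version INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_migrations"
    ).fetchone()[0]
    applied: list[str] = []
    for version, name, sql in sorted(MIGRATIONS):
        if version <= current:
            continue  # already applied on an earlier run
        with conn:  # commit each migration together with its bookkeeping row
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_migrations (version, name) VALUES (?, ?)",
                (version, name),
            )
        applied.append(name)
    return applied
```

Running it a second time applies nothing, which is the property that incrementing version numbers are meant to guarantee.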
- if __name__ == "__main__": migrate.py and generate-summary.py are both guarded; they can be imported without side effects. New scripts that may be imported or tested should follow this pattern.
- TOOLS_DIR resolution: root scripts that need a reliable tools-directory path use Path(__file__).resolve().parent. hooks/rules/common.py intentionally keeps the installed-hook Path.home() / ".copilot" / "tools" form (hooks run from ~/.copilot/hooks/, not the source tree).
Agent-authored docs and operator/research outputs (tentacle handoffs, retro summaries, knowledge-health reports) must follow the four-layer QA rubric defined in docs/AGENT-RULES.md: facts, interpretation, actions, and verification evidence are kept distinct. Contributor-facing docs (CONTRIBUTING.md) use the existing concise tone and are not in scope for this rubric.
GitHub Actions runs these jobs on every push / PR:
- quality-gates: syntax check, scoped Ruff lint, and the Python test suites. The Ruff lint surface is all root *.py scripts plus browse/, hooks/, and scripts/; Python outside that surface is not linted by CI.
- remote-terminal: npm ci, npm test, lint baseline, clean-zone lint gate, and blocking high-severity dependency audit for remote-terminal/. Legacy complexity/size warnings stay advisory in npm run lint; clean files (pty-daemon.js, test/client.test.js) promote the same rules to errors through npm run lint:clean.
- browse-ui: pnpm typecheck, pnpm lint, pnpm format:check, pnpm test, pnpm build. browse-ui/eslint.config.mjs keeps the repo-wide @typescript-eslint/no-explicit-any baseline advisory as warn, then promotes clean zones such as src/lib/**/*.{ts,tsx} to error so strict rules can expand without breaking legacy areas.
- e2e-smoke: Playwright behavioral project (smoke.spec.ts, shortcuts.spec.ts, chat.spec.ts, diagnostics.spec.ts, and broker-mode.spec.ts) on Chromium.
For sk-rust/** changes, sk CI also runs Rust formatting, strict Clippy, tests across Ubuntu, Windows, and macOS, and a blocking startup benchmark regression gate. sk-rust/clippy.toml defines advisory complexity thresholds (cognitive-complexity-threshold, too-many-lines-threshold, too-many-arguments-threshold). The Complexity advisory (Rust Clippy) step runs before the strict Clippy gate with continue-on-error: true, so complexity warnings are measured before any future enforcement change. The startup benchmark gate runs benchmark.py startup --baseline-file .benchmarks/sk-startup-baseline.json --regression-threshold 20; the first run creates and uploads the baseline, and later runs fail only when median startup time exceeds the cached baseline by more than 20% and a 5ms absolute floor. The RustSec dependency audit (advisory) job installs cargo-audit in a blocking setup step, then runs cargo audit --file Cargo.lock with continue-on-error: true; the YAML TODO records the future path to remove advisory mode and make Rust dependency advisories blocking after the baseline is clean.
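The two-part startup regression rule above (more than 20% over baseline and more than a 5ms absolute increase) can be sketched like this. It is an illustration only: is_regression and its parameter names are hypothetical and are not taken from benchmark.py.

```python
from statistics import median


def is_regression(
    samples_ms: list[float],
    baseline_ms: float,
    threshold_pct: float = 20.0,
    floor_ms: float = 5.0,
) -> bool:
    """Fail only when the slowdown clears both the relative and absolute bars."""
    delta = median(samples_ms) - baseline_ms
    over_relative = delta > baseline_ms * threshold_pct / 100.0
    over_absolute = delta > floor_ms
    return over_relative and over_absolute
```

With a 10ms baseline, a 13ms median is 30% slower but only 3ms worse, so it passes; the absolute floor keeps timer noise on very fast startups from failing the gate.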
Playwright visual snapshot E2E is manual-dispatch only in e2e-visual because screenshot output differs across platforms. The always-on e2e-smoke job runs only the stable behavioral project and excludes visual.spec.ts.
- Trend Scout is scheduled/manual (trend-scout.yml or explicit CLI runs), NOT bound to preToolUse/postToolUse hooks (avoid output spam during sessions). Multi-lane discovery (lanes[] config) and --explain are CLI/workflow-only features.
- Sync browse diagnostics are read-only: /healthz advertises /api/sync/status; /api/sync/status reports local queue/failure/config/cursor state only.
- Cron DB maintenance (cron-tasks.py templates wal-checkpoint and vacuum): optional scheduled tasks that run PRAGMA wal_checkpoint(TRUNCATE) daily and VACUUM + PRAGMA quick_check weekly on knowledge.db; busy/missing states return a structured dict without raising; last_run_at is not advanced on busy, so the task retries automatically.
- Retrospective (retro.py) aggregates knowledge health, skill/tentacle outcomes, hook audit decisions, and git history into a composite operator score.
  - The browse server exposes GET /api/retro/summary (defaults to ?mode=repo; pass ?mode=local for full multi-source data) and a minimal HTML page at /retro.
  - The retro.yml workflow is workflow_dispatch-only; it runs retro.py --mode repo --json, writes a markdown summary artifact (including confidence, distortion flags, accuracy notes, and improvement actions when present), and appends to the job summary. No issues, commits, or DB writes are created.
  - A collapsible RetroSection panel in the browse insights dashboard consumes /api/retro/summary and renders the richer explanation fields (score_confidence, distortion_flags, accuracy_notes, improvement_actions, scout, toward_100) when present, failing gracefully when absent or when the API is unavailable.
  - Local mode (?mode=local) includes all sections (including behavior) but may emit score_confidence=low and distortion flags (e.g. hook_deny_dry_noise, skills_unverified) that indicate the score should be treated as a rough signal only; repo mode scores are typically cleaner but cover git signals only.
  - The optional top-level scout object provides read-only Trend Scout coverage health without affecting the composite score.
  - The optional top-level toward_100 array is an additive diagnostic: a list of sections scoring below 100, each with section, score, gap (100 − score), and metric-derived barriers; it explains the current gap but does not change the score formula or any subscore.
  - benchmark.py stores commit-keyed snapshots and exposes retro_gap / health_gap gap-to-target fields in compare output so measurable progress is explicit.
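The "structured dict without raising" behavior described for the wal-checkpoint task above might look roughly like this. It is a sketch under assumptions: wal_checkpoint and the dict keys are hypothetical and are not the cron-tasks.py implementation.

```python
import sqlite3
from pathlib import Path


def wal_checkpoint(db_path: Path) -> dict:
    """Run PRAGMA wal_checkpoint(TRUNCATE) and report the outcome as a dict."""
    if not db_path.exists():
        # Check first: sqlite3.connect() would silently create the file.
        return {"status": "missing"}
    conn = sqlite3.connect(str(db_path), timeout=1.0)
    try:
        busy, log_frames, checkpointed = conn.execute(
            "PRAGMA wal_checkpoint(TRUNCATE)"
        ).fetchone()
    except sqlite3.OperationalError as exc:
        # For example "database is locked": report it instead of raising.
        return {"status": "busy", "detail": str(exc)}
    finally:
        conn.close()
    if busy:
        return {"status": "busy"}
    return {"status": "ok", "log_frames": log_frames, "checkpointed": checkpointed}
```

A scheduler built on this would leave last_run_at untouched when the status is busy, so the next tick retries the checkpoint.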