diff --git a/.github/skills/tentacle-orchestration/SKILL.md b/.github/skills/tentacle-orchestration/SKILL.md
old mode 100755
new mode 100644
index db1991a2..1d458df0
--- a/.github/skills/tentacle-orchestration/SKILL.md
+++ b/.github/skills/tentacle-orchestration/SKILL.md
@@ -1,132 +1,29 @@
 ---
 name: tentacle-orchestration
-description: Break complex tasks into scoped parallel work units for multi-agent execution. Always use task-step-generator first as a reviewed planning scaffold, then adapt the reviewed steps into tentacles. Use when a task spans multiple modules or layers, needs agent delegation, or the user says "orchestrate", "multi-agent", "parallel agents", "tentacle", or "swarm". Each implementation/fix tentacle runs strict-tdd-workflow internally. Features Opus Leader Council for quality-first multi-platform work.
+description: Break complex tasks into scoped parallel work units for multi-agent execution. Always use task-step-generator first as a reviewed planning scaffold, then adapt the reviewed steps into tentacles. Use when a task spans multiple modules or layers, needs agent delegation, or the user says "orchestrate", "multi-agent", "parallel agents", "tentacle", or "swarm". Each implementation/fix tentacle runs strict-tdd-workflow internally.
 ---
 
-# Tentacle Orchestration — copilot-session-knowledge
+# Tentacle Orchestration
 
 Break a complex task into scoped work units ("tentacles"), enrich each with context, then dispatch agents in parallel. Results persist in files so nothing is lost between agent boundaries.
 
-Adapted from the [OctoGent](https://github.com/hesamsheikh/octogent) tentacle pattern. Customized for **copilot-session-knowledge**: a multi-platform hybrid of Python stdlib tools, Next.js/React browse-ui, and a Rust binary — where **quality outweighs speed**.
+Adapted from the [OctoGent](https://github.com/hesamsheikh/octogent) tentacle pattern.
 
 > **Relationship with strict-tdd-workflow**: Tentacle is the **orchestrator** (splits work), strict-tdd is the **executor** (runs inside each implementation/fix tentacle). For single-module tasks, skip tentacle and use strict-tdd directly.
 >
 > **Relationship with task-step-generator**: `task-step-generator` is the **planning scaffold**. Run it before creating tentacles, then review and edit the generated steps. Do not copy generated steps blindly.
 
----
-
-## ⚡ Opus Leader Council (Project-Specific Pattern)
-
-This project uses **Opus-class Leader agents** instead of flat swarming. Each leader owns a domain and cannot be bypassed. Leaders discuss before acting and escalate to peers when stuck.
-
-### Leader Roster
-
-| Leader | Model | Domain | Scope |
-|--------|-------|--------|-------|
-| **dev-leader** | `claude-opus-4.7` | Python tools, hooks, Rust binary | `*.py`, `hooks/**/*`, `crates/**/*` |
-| **test-leader** | `claude-opus-4.7` | All test surfaces (Python + TypeScript + Rust) | `test_*.py`, `run_all_tests.py`, `browse-ui/src/**/*.test.*`, `browse-ui/e2e/**/*` |
-| **qa-leader** | `claude-opus-4.7` | Verification gates, cross-surface sync, security | All changed surfaces |
-| **browse-leader** | `claude-opus-4.7` | Next.js/React browse-ui frontend | `browse-ui/src/**/*`, `browse-ui/e2e/**/*` |
-| **research-leader** | `claude-opus-4.7` | Any question with confidence < 1.0 | Read-only, all surfaces |
-
-### Leader Dispatch Pattern
-
-```bash
-# Each leader gets its own tentacle with --model claude-opus-4.7
-sk tentacle create dev-<feature> --scope "*.py hooks/**/*" --desc "Python implementation" --briefing
-sk tentacle create test-<feature> --scope "test_*.py run_all_tests.py" --desc "Test coverage" --briefing
-sk tentacle create qa-<feature> --scope "." --desc "Verification + cross-surface sync" --briefing
-
-# Dispatch leaders in parallel — all use opus
-sk tentacle swarm dev-<feature> --agent-type general-purpose --model claude-opus-4.7 --briefing
-sk tentacle swarm test-leader --agent-type general-purpose --model claude-opus-4.7 --briefing
-sk tentacle swarm qa-leader --agent-type verification-gate --model claude-opus-4.7 --briefing
-```
-
-### Leader Peer-Discussion Protocol
-
-When a leader is stuck or has confidence < 1.0, it does NOT stop — it requests peer input:
-
-1. **dev-leader stuck on architecture** → write question to `handoff.md` with `STATUS: PEER_DISCUSS`, dispatch `research-leader` tentacle with the question
-2. **test-leader unsure of coverage strategy** → write question, dispatch `dev-leader` review, loop until both agree
-3. **qa-leader finds gate failure** → write failure report, dispatch `dev-leader` fix tentacle, loop back through qa-leader
-
-```bash
-# Peer discussion: dev-leader writes question, research-leader answers
-sk tentacle handoff dev-<feature> "Architecture question: [question]" --status AMBIGUOUS
-sk tentacle create research-<question> --scope "." --desc "Research: [question]" --briefing
-sk tentacle swarm research-<question> --agent-type research-planner --model claude-opus-4.7 --briefing
-# After research-leader answers → continue dev-leader
-sk tentacle resume dev-<feature>
-```
-
----
-
-## ♾️ Infinite Confidence Loop Protocol
-
-**This project never marks BLOCKED when confidence < 1.0.** Instead, it loops until certainty.
-
-### The Loop
-
-```
-CONFIDENCE CHECK
-      │
-      ▼
-  ≥ 1.0? ──YES──► Proceed to execution
-      │
-      NO
-      │
-      ▼
-  Split ambiguity into atomic questions
-      │
-      ▼
-  Dispatch research-leader tentacle(s) on claude-opus-4.7
-      │
-      ▼
-  Research completes, evidence recorded
-      │
-      ▼
-  Re-evaluate confidence ───────────────► back to top
-      │
-  (loop forever until ≥ 1.0 or user explicitly overrides)
-```
-
-### Implementation
-
-```bash
-# Step 1: Spawn research tentacle for each ambiguous question
-sk tentacle create research-q1 --scope "." --desc "Research: <question 1>" --briefing
-sk tentacle swarm research-q1 --agent-type research-planner --model claude-opus-4.7 --briefing
-
-# Step 2: After research, record evidence and re-evaluate
-sk tentacle goal gate pass research-q1 --reason "Evidence: <summary>"
-
-# Step 3: Only proceed when ALL research gates pass
-sk tentacle goal criteria check
-# If not met: loop back to Step 1 with remaining questions
-# If met: proceed to Plan phase
-```
-
-**Override rule**: Only the human operator can exit the loop early. Record the override with rationale:
-```bash
-sk tentacle goal gate pass override --reason "Human override: <rationale>"
-```
-
----
-
 ## Planning Discipline
 
 Use this sequence before creating any tentacle:
 
-1. Generate a step file with `task-step-generator` (`.github/steps/<task-slug>.md`).
+1. Generate a step file with `task-step-generator` (`.github/steps/<task-slug>.md` when the project uses `.github/`, otherwise `STEPS.md` or the path requested by the user).
 2. Review the generated step file with `references/decomposition-review.md`.
 3. Record what was accepted, edited, and rejected before dispatching agents.
 4. Convert only the reviewed steps into non-overlapping tentacles and atomic todos.
 
 Why: decomposition and checklists reduce avoidable cognitive load, but generated plans can anchor on the first plausible split. Treat the step file as a draft planning artifact, not as authority.
 
----
-
 ## Decision Confidence Gate
 
 Before creating, dispatching, merging, deleting, or closing tentacles, verify the routing/plan
@@ -137,15 +34,18 @@ or validation agents first.
 Required behavior when confidence `< 1.0`:
 
 1. Stop implementation/deletion/merge decisions for the uncertain scope.
-2. Split the ambiguity into atomic questions: task type, scope boundaries, dependencies, acceptance evidence, and affected systems.
-3. Dispatch research/validation tentacles on **`claude-opus-4.7`** (opus-class mandatory, no exceptions).
-4. Record the evidence and rejected alternatives in the tentacle `handoff.md`.
-5. Continue only after the synthesized decision reaches confidence `1.0`, or after an explicit user override is recorded with its rationale.
-6. **Never mark BLOCKED** — instead create a `research-<topic>` tentacle, loop, and resume when certain.
-
-Why this gate exists: low-confidence orchestration creates the worst kind of parallelism — many agents confidently doing the wrong work. Research-first decomposition is cheaper than unwinding a bad swarm.
-
----
+2. Split the ambiguity into atomic questions: task type, scope boundaries, dependencies,
+   acceptance evidence, and affected systems.
+3. Dispatch research/validation tentacles or sub-agents on the strongest available model
+   (`claude-opus-4.7` when available; otherwise the newest opus-class model).
+4. Record the evidence and rejected alternatives in the tentacle `handoff.md` or the
+   conductor `research_gate` artifact.
+5. Continue only after the synthesized decision reaches confidence `1.0`, or after an
+   explicit user override is recorded with its rationale.
+
+Why this gate exists: low-confidence orchestration creates the worst kind of parallelism —
+many agents confidently doing the wrong work. Research-first decomposition is cheaper than
+unwinding a bad swarm.
 
 ## When to use
 
@@ -156,22 +56,44 @@ Why this gate exists: low-confidence orchestration creates the worst kind of par
 | 3+ files, multiple modules | **Tentacle required** — decompose into scoped units |
 | Multi-phase with agent delegation | **Tentacle required** — each delegated agent gets a tentacle |
 | Bug investigation, multiple hypotheses | Tentacle recommended — one tentacle per hypothesis |
-| Cross-surface (Python + browse-ui + Rust) | **Opus Leader Council required** — one leader per surface |
-| Confidence < 1.0 at any point | **Research tentacle required** — loop until certain |
 
 **Not a good fit:** strictly sequential single-file tasks, limited token budget, trivial edits.
 
----
-
 ## Sub-agent Guardrails
 
 These guardrails apply to dispatched sub-agents. The **commit restriction is enforced at the
-git level** when hooks are installed.
+git level** when hooks are installed; all other items are conventions reinforced by prompt
+context.
+
+**Why git hooks, not `preToolUse` alone:** When the orchestrator dispatches a sub-agent via the
+`task()` tool, the platform does not guarantee that `preToolUse` hooks from the parent
+`hooks.json` propagate into the sub-agent's session. Git hooks (`pre-commit`,
+`pre-push`) are filesystem-level and fire for any git process regardless of which agent spawned
+it, so they are the reliable enforcement surface.
 
 **How enforcement works:**
 1. `tentacle.py create` generates a UUID `tentacle_id` stored in the tentacle's `meta.json`.
-2. `hooks/pre-commit` and `hooks/pre-push` call `hooks/check_subagent_marker.py`, which blocks git operations while the marker is fresh.
-3. `sk tentacle complete <name>` removes the matching marker entry.
+   If the requested name directory already exists, `create` auto-resolves the collision by
+   creating `<name>-<uuid[:8]>` — the slug is printed and must be used for all subsequent
+   commands. `tentacle.py swarm` reads `tentacle_id` from `meta.json` and writes an HMAC-signed
+   marker file at `~/.copilot/markers/dispatched-subagent-active` containing `active_tentacles`
+   entries of the form `{"name": ..., "ts": ..., "git_root": ..., "tentacle_id": ...}`.
+   **Primary deduplication key: `tentacle_id`** (when present) — two instances with the same
+   logical name in the same repo each get a separate entry. Fallback for legacy entries without
+   `tentacle_id`: `(name, git_root)`.
+2. `hooks/pre-commit` and `hooks/pre-push` call `hooks/check_subagent_marker.py`, which blocks
+   the git operation when the marker is present, auth-valid, within its 4-hour TTL, **and the
+   entry's `git_root` matches the repo running the git command**. A marker from a different repo
+   does not block commits there — this prevents cross-repo false positives when tentacles are
+   active in other repos concurrently. Entries without `git_root` (old format) conservatively
+   block all repos.
+   > **Upgrade migration:** Cross-repo isolation is not retroactive. In-flight old-format marker
+   > entries (no `git_root`) continue to block all repos until completed, cleared, or expired (4h
+   > TTL). To get isolation immediately: `sk tentacle complete <name>` then re-dispatch.
+3. `hooks/rules/subagent_guard.py` provides a secondary `preToolUse` intercept for the
+   orchestrator session (defense-in-depth only — not the primary path).
+4. `sk tentacle complete <name>` reads `tentacle_id` from `meta.json` and removes only the
+   matching marker entry; the marker is deleted when `active_tentacles` becomes empty.
 
 **Install the git hooks** (once per repository):
 
@@ -180,39 +102,52 @@ sk install --install-git-hooks
 # fallback: python3 ~/.copilot/tools/install.py --install-git-hooks
 ```
 
+**Enforcement scope and known limitations:**
+
+- **Local-only.** Cloud-delegated or remote agent runs are not covered.
+- **`preToolUse` non-inheritance.** The `preToolUse` guard in the main session is
+  defense-in-depth — it does not replace git hooks. Whether `preToolUse` propagates into
+  `task()`-spawned sub-agents is undefined by the platform.
+- **Same-repo multi-session supported (phase 5) — with working-tree caveat.** `tentacle_id`
+  isolation at the marker/runtime layer means two instances with the same logical name in the
+  same repo each hold a separate entry and `complete` only clears the matching one. However,
+  the working tree and git index are shared — concurrent tentacles with overlapping file scopes
+  will produce conflicts. Keep scopes non-overlapping.
+- **Collision-resolved slug names:** When `create` auto-resolves a directory collision, the
+  printed `<name>-<uuid[:8]>` slug must be used for all subsequent commands.
+- **After tool updates,** `auto-update-tools.py` does NOT auto-reinstall git hooks. Re-run
+  `install.py --install-git-hooks` in each protected repo after relevant updates.
+
 | Convention | What to do |
 |------------|-----------|
-| **Commit restriction** | Do not run `git commit` or `git push`. Both are blocked at the filesystem level while `dispatched-subagent-active` marker is fresh. |
-| **Stay in scope** | Do not edit files outside your tentacle's declared `scope`. |
-| **Escalate, don't expand** | If scope is insufficient, write the gap to `handoff.md` and stop. |
-| **No over-implementation** | Implement only what your todos specify. |
-| **Handoff before stopping** | Always write a structured handoff: `sk tentacle handoff <name> "<summary>" --status DONE --changed-file <path> --learn` |
-| **No platform `create` for reports** | Use `tentacle.py handoff` to persist output to `handoff.md`. |
-
----
+| **Commit restriction** | Do not run `git commit` or `git push`. When git hooks are installed, both are blocked at the filesystem level while the `dispatched-subagent-active` marker is fresh. Even without hooks, committing from a sub-agent mid-run risks corrupting the orchestrator's merge flow. |
+| **Stay in scope** | Do not edit files outside your tentacle's declared `scope`. If you discover that more files are needed, escalate — do not expand unilaterally. |
+| **Escalate, don't expand** | If your scope is insufficient to complete the task, write the gap to `handoff.md` (e.g. "blocked: need changes in `src/db/` which is outside my scope") and stop. The orchestrator decides whether to create a new tentacle or adjust scope. |
+| **No over-implementation** | Implement only what your todos specify. Do not add features, refactors, or improvements that are not in your todo list — even if they seem obvious. |
+| **Handoff before stopping** | Always write a structured handoff before marking your work done — even if the session ends early. Use `tentacle.py handoff <name> "<prose summary>" --status <STATUS> --changed-file <path> --learn`. Required fields: a prose summary and `--status` (one of `DONE`, `BLOCKED`, `TOO_BIG`, `AMBIGUOUS`, `REGRESSED`). Add one `--changed-file` per modified file; omit it when no files changed (common for `BLOCKED`, `TOO_BIG`, or `AMBIGUOUS`). Old form `handoff <name> "<message>" --learn` still works when no structured status is needed. The orchestrator reads `STATUS:` and `Changed:` receipts to decide next steps and triage. |
+| **No platform `create` for reports** | Do **not** use the runtime platform's `create` file-creation tool to save research output, investigation findings, or final reports. The `create` tool is a platform capability that is **not available in all agent runtimes** (cloud agents, Copilot cloud runs, background tasks). Use `tentacle.py handoff` to persist agent output to `handoff.md` — this is always available when `tentacle.py` is on disk. If even `tentacle.py` is unavailable, print the report to chat so the orchestrator can capture it. Orchestrators must not assume sub-agents can create arbitrary files. |
 
 ## Anti-patterns
 
-- ❌ SQL/markdown todos only for multi-agent work → agents lose scope isolation
-- ❌ Launching sub-agents without `swarm` prompt → no scope, constraints, or key files
-- ❌ Skipping `--briefing` → past mistakes not injected into CONTEXT.md
+- ❌ SQL/markdown todos only for multi-agent work → agents lose scope isolation and CONTEXT.md
+- ❌ Launching sub-agents without `swarm` prompt → agent gets no scope, constraints, or key files
+- ❌ Skipping `--briefing` on create → past mistakes not injected into CONTEXT.md
 - ❌ Skipping `complete` before `delete` → learnings from handoff.md lost permanently
 - ❌ Overlapping tentacle scopes → agents overwrite each other's work
-- ❌ Creating tentacles from intuition without a generated-and-reviewed step file
-- ❌ Copying `task-step-generator` output blindly without checking dependencies
-- ❌ Skipping the runtime bundle on multi-agent work → no `recall-pack.json`
-- ❌ Sub-agent commits or pushes → blocked by git hooks when installed
-- ❌ Sub-agent edits files outside declared scope → silent conflicts
-- ❌ Treating confidence `< 1.0` as acceptable → use opus research loop, never skip
-- ❌ **Marking BLOCKED when stuck** → loop via peer leader discussion instead
-- ❌ **Using haiku/sonnet for leader agents** → all leaders must use `claude-opus-4.7`
-- ❌ **Skipping test-leader** → every feature/fix needs test coverage verified by test-leader
-- ❌ **qa-leader bypassed on cross-surface changes** → Python + browse-ui + Rust changes always need qa-leader
-- ❌ Accepting sub-agent claims of "tests pass" without running commands → unverified claims are not evidence
-- ❌ Closing a tentacle `DONE` with no verification evidence → treated as `AMBIGUOUS`
-- ❌ Sub-agent uses the platform `create` file-creation tool for research output → use `tentacle.py handoff`
-
----
+- ❌ Creating tentacles directly from intuition without a generated-and-reviewed step file
+- ❌ Copying `task-step-generator` output blindly without checking dependencies, ownership, work-in-progress limits, evidence, and agent fit
+- ❌ Skipping the runtime bundle on multi-agent work → agents lose file-backed context and `recall-pack.json`
+- ❌ Using `--briefing --output json --no-bundle` → briefing cannot be represented without the bundle
+- ❌ Sub-agent commits or pushes → blocked by git hooks when installed (and risky regardless: corrupts orchestrator's merge/verify flow)
+- ❌ Sub-agent edits files outside declared scope → silent conflicts with other parallel agents
+- ❌ Sub-agent silently expands scope instead of escalating → orchestrator loses visibility
+- ❌ Treating confidence `< 1.0` as acceptable → split ambiguity and run opus-class research
+  before implementation, deletion, merge, or routing decisions
+- ❌ Skipping `install.py --install-git-hooks` → git-level commit/push guard is inactive; enforcement falls back to `preToolUse` only (not guaranteed in sub-agent contexts)
+- ❌ Accepting sub-agent claims of "tests pass" / "lint clean" / "CI green" without running the commands → unverified claims are not evidence; always run the gates yourself and record output
+- ❌ Closing a tentacle `DONE` with no verification evidence → treated as `AMBIGUOUS`; requires triage before proceeding
+- ❌ Closing an issue without per-criterion evidence → acceptance criteria are unproven until commands run and output is recorded
+- ❌ Sub-agent uses the platform `create` file-creation tool to save research or investigation output → use `tentacle.py handoff` instead. The `create` tool is a runtime-platform capability and is **not guaranteed in all agent contexts** (cloud agents, Copilot cloud runs, background tasks). `tentacle.py handoff` writes to `handoff.md` in the tentacle directory and is always available as long as `tentacle.py` is on disk. If `tentacle.py` is also unavailable, fall back to printing the report to chat. Orchestrators should not assume sub-agents can create arbitrary files.
 
 ## Core concept
 
@@ -236,44 +171,33 @@ A **tentacle** is a scoped work unit stored as files:
 The octopus metaphor: one orchestrator (you), multiple tentacles (agents), each handling a distinct code region.
 
 <example>
-**Task:** Add token cost display to browse-ui (multi-surface: Python API + React component + Vitest tests)
-
-**Confidence check:** Python API shape? → confidence 0.8 → spawn research-leader tentacle first
+**Task:** Add dark mode support to a Next.js app
 
-**After research (confidence 1.0):**
+**Decomposition:**
+- `theme-tokens` tentacle — scope: `src/styles/tokens.css`, `tailwind.config.ts` — create CSS variables for dark/light palettes
+- `component-update` tentacle — scope: `src/components/**/*` — apply `dark:` Tailwind classes to all components
+- `test-suite` tentacle — scope: `tests/**/*` — write Playwright visual regression tests for dark mode
 
-**Opus Leader decomposition:**
-- `dev-leader-cost-api` — scope: `browse/routes/*.py`, `browse/api/*.py` — add /api/session/cost endpoint
-- `browse-leader-cost-ui` — scope: `browse-ui/src/components/**/*` — React cost display component
-- `test-leader-cost` — scope: `test_*.py`, `browse-ui/src/**/*.test.*` — Python unit tests + Vitest tests
-- `qa-leader-cost` — scope: `.` — verify cross-surface sync, run all gates
-
-**Dispatch order:** dev-leader + browse-leader in parallel → test-leader → qa-leader → commit
-
-Each leader uses `--model claude-opus-4.7`.
+Each tentacle is independent, non-overlapping, and completable in isolation. The orchestrator merges results after all three pass verification gates.
 </example>
 
----
-
 ## Internal workflow
 
 The workflow has 5 phases: **Clarify → Plan → Execute → Verify → Close**.
 
-Clarification is the most important phase. A bug found in spec costs 1x to fix. Found in code: 10x. Found in production: 100x. Never skip this phase.
+Clarification is the most important phase. As a rule of thumb, a bug caught in the spec costs 1x to fix, one caught in code costs 10x, and one caught in production costs 100x. Never skip this phase: time invested here prevents entire categories of downstream waste.
 
 ### Phase 0: Clarify Spec (Steps 0.0–0.5)
 
 This phase takes a raw specification and makes it implementation-ready through iterative Q&A. No planning or coding happens until the spec is CLEAN.
 
-- **Step 0.0** (optional): Co-author the spec when user has no written spec
+- **Step 0.0** (optional): Co-author the spec when user has no written spec — structured context gathering + iterative drafting
 - **Steps 0.1–0.4**: Analyze spec against 8 quality dimensions, generate Spec Health Report, iterative refinement until CLEAN
 - **Step 0.5**: Reader Testing — verify a fresh agent (no context) can correctly understand the spec
 
 For the full process, see `references/spec-clarification.md`.
 
-**Gate**: Never proceed to Phase 1 until the spec is CLEAN and reader-tested.
-
-**Multi-platform note**: For cross-surface tasks, clarification must identify which surfaces are affected (Python / browse-ui / Rust) so the leader mapping is correct before any planning.
+**Gate**: Planning on an unclear spec produces incorrect decomposition, wasted agent work, and rework. Never proceed to Phase 1 until the spec is CLEAN and reader-tested.
 
 ### Phase 1: Plan
 
@@ -284,103 +208,103 @@ Use the CLEAN spec and its Impact Analysis / Risk Assessment to inform decomposi
 Use `task-step-generator` before creating tentacles:
 
 ```text
-Generate a step file for this task. Include CLARIFY, RED evidence/test strategy, BUILD, TEST, REVIEW, LOOP-EVAL, and COMMIT/CLOSE.
+Generate a step file for this task. Include CLARIFY, RED evidence/test strategy for implementation or fixes, BUILD, TEST, REVIEW, LOOP-EVAL when iteration is likely, and COMMIT/CLOSE.
 ```
 
+The output may say the task is too large for a single step file. That is acceptable: use the step file as a top-level scaffold, then split reviewed steps into tentacles.
+
 #### Plan B: Review and edit the generated steps
 
-Apply `references/decomposition-review.md`. Verify:
+Apply `references/decomposition-review.md`. At minimum, verify:
+
 - acceptance signal is observable,
 - RED evidence/test strategy exists before implementation,
 - dependencies are ordered before parallel work,
 - steps are small enough to review in one context,
-- each step maps to the correct **Opus Leader** agent type.
+- only independent work is parallelized,
+- each evidence-producing step names logs/screenshots/hashes or equivalent artifacts,
+- each step maps to the correct agent type/model available in the project.
 
-#### Plan C: Confidence check before decomposition
+Do not proceed until the reviewed plan clearly states accepted, edited, and rejected steps.
 
-```bash
-# For each ambiguous point in the step file:
-sk tentacle create research-<topic> --scope "." --desc "Research: <topic>" --briefing
-sk tentacle swarm research-<topic> --agent-type research-planner --model claude-opus-4.7 --briefing
-# Wait for handoff, evaluate evidence, repeat until confidence = 1.0
-sk tentacle goal gate pass research-<topic> --reason "Evidence: <summary>"
-```
+#### Plan C: Decompose the reviewed task into modules
 
-#### Plan D: Map steps to Opus Leaders
+Read the task description and identify independent code regions. Each region becomes one tentacle.
 
-For each tentacle, assign an appropriate leader model:
+Each code tentacle must declare:
 
-| Tentacle type | agent_type | Model | Scope pattern |
-|--------------|-----------|-------|--------------|
-| Python dev (tools/hooks) | general-purpose | `claude-opus-4.7` | `*.py`, `hooks/**/*` |
-| Python dev (browse backend) | python-browse-backend | `claude-opus-4.7` | `browse/**/*.py` |
-| browse-ui frontend | browse-ui-host-state | `claude-opus-4.7` | `browse-ui/src/**/*` |
-| Browse-ui tests (Vitest/Playwright) | general-purpose | `claude-opus-4.7` | `browse-ui/src/**/*.test.*`, `browse-ui/e2e/**/*` |
-| Python tests | general-purpose | `claude-opus-4.7` | `test_*.py`, `run_all_tests.py` |
-| Security/auth review | browser-security-reviewer | `claude-opus-4.7` | `browse/**/*`, `browse-ui/src/**/*` |
-| Research / architecture decisions | research-planner | `claude-opus-4.7` | Read-only, all |
-| Cross-surface verification | verification-gate | `claude-opus-4.7` | All changed surfaces |
-| Whole-app impact audit | whole-app-impact-auditor | `claude-opus-4.7` | All |
-| Docs / skills / hooks | general-purpose | `claude-opus-4.7` | `docs/**/*`, `.github/**/*` |
+- source step file and accepted/edited/rejected step numbers,
+- decision confidence (`1.0` required; otherwise create a research/validation tentacle first),
+- dependency order,
+- test/evidence owner,
+- implementation owner,
+- acceptance signal,
+- verification/evidence paths.
 
-#### Plan E: Create tentacles
+#### Plan D: Create tentacles
 
 ```bash
-sk tentacle create <name> \
+sk tentacle create <module-name> \
   --scope "<file-patterns>" \
-  --profile "<agent-profile-id>" \
+  --profile "<profile-id>" \
   --desc "<short description>" \
   --briefing
+# fallback: python3 ~/.copilot/tools/tentacle.py create <module-name> ...
 ```
 
-Available profiles (from `.github/agents/`):
-- `browse-ui-host-state` — browse-ui host/provider work
-- `python-browse-backend` — Python backend routes
-- `browser-security-reviewer` — security review
-- `research-planner` — research and architecture
-- `verification-gate` — gate verification
-- `whole-app-impact-auditor` — cross-surface impact
+The `--profile` flag loads a specialist `.agent.md` contract (role, domain,
+quality gates, escalation rules, and evidence requirements) into CONTEXT.md and
+meta.json. The `--briefing` flag injects past mistakes and patterns from
+session-knowledge into CONTEXT.md — use both when a suitable profile exists.
 
-#### Plan F: Add todos
+#### Plan E: Add todos
 
 ```bash
 sk tentacle todo <name> add "<specific, atomic task>"
+# fallback: python3 ~/.copilot/tools/tentacle.py todo <name> add "<task>"
 ```
 
-#### Plan G: Enrich CONTEXT.md
+Each todo should be one deliverable — testable, reviewable, and completable in isolation.
 
-Add to each tentacle's CONTEXT.md:
-- **Step-plan review**: accepted/edited/rejected steps, confidence evidence
-- **Multi-platform context**: which surfaces are touched and why
-- **Key files**: full paths to reference files
-- **Constraints**: project rules (stdlib-only Python, parameterized SQL, no pickle, etc.)
-- **Verification requirement**: what commands prove this tentacle DONE
+#### Plan F: Enrich CONTEXT.md
 
----
+Read reference files with `view`, then edit CONTEXT.md to add:
+- **Step-plan review**: source step file, accepted/edited/rejected steps, dependency order, and evidence contract
+- **What exists**: describe the current code in the scope area
+- **Key files**: full paths to reference files the agent needs
+- **Constraints**: rules specific to this code region
+
+This is the most important step of the Plan phase: agent output quality depends heavily on CONTEXT.md quality.
 
 ### Phase 2: Execute (Steps 5–6)
 
-#### Step 5: Dispatch Opus Leaders (swarm)
+#### Step 5: Dispatch agents (swarm)
 
 ```bash
-# Always use claude-opus-4.7 for all leaders
-sk tentacle swarm <name> --agent-type <type> --model claude-opus-4.7 --briefing
+sk tentacle swarm <name> --agent-type <type> --model <model> --briefing
 sk tentacle swarm <name> --output parallel --briefing
+sk tentacle dispatch <name> --agent-type <type> --model <model> --briefing
+# fallback: python3 ~/.copilot/tools/tentacle.py swarm/dispatch <name> ...
+```
 
-# Example: multi-surface feature
-# Wave 1 (parallel): dev + browse
-sk tentacle swarm dev-<feature> --agent-type general-purpose --model claude-opus-4.7 --briefing &
-sk tentacle swarm browse-<feature> --agent-type browse-ui-host-state --model claude-opus-4.7 --briefing &
-wait
+`swarm` and `dispatch` materialize a runtime bundle by default. If the tentacle
+was created with `--profile`, dispatch also injects a compact Specialist Profile
+section before the task list so the worker sees the expert role, gates, escalation
+rules, and evidence requirements. The dispatch prompt stays token-lean and points
+agents at `.octogent/tentacles/<name>/bundle/manifest.json` first.
+The bundle carries the full `CONTEXT.md`, todos, latest checkpoint, instruction snippets,
+skills catalogue, and `recall-pack.json`.
 
-# Wave 2 (after Wave 1): tests
-sk tentacle swarm test-<feature> --agent-type general-purpose --model claude-opus-4.7 --briefing
+`--briefing` fetches live past knowledge from session-knowledge at dispatch time. With the
+default bundle, the briefing is stored in `briefing.md` and machine-readable recall is written
+to `recall-pack.json`: the tool first tries `briefing.py --task <id> --json`, then falls back to
+`briefing.py "<query>" --pack --limit 3`. Use `--no-bundle` only for tiny/manual prompts; if
+you combine `--output json --briefing`, keep the default bundle enabled so JSON output can
+surface `bundle_path`.
 
-# Wave 3 (after Wave 2): QA gate
-sk tentacle swarm qa-<feature> --agent-type verification-gate --model claude-opus-4.7 --briefing
-```
+Use the output as the prompt for `task()`. Launch independent tentacles in parallel.
 
-Every implementation tentacle must execute strict-TDD internally: define failing evidence first, make the smallest change, prove criterion turns green.
+Every implementation or bug-fix tentacle must execute the strict-TDD loop internally: define or reproduce the failing evidence first, make the smallest change, then prove the same criterion turns green. Research-only, documentation-only, and review-only tentacles still need explicit evidence gates, but they do not fabricate code tests just to satisfy the pattern.
 
 #### Step 6: Monitor progress
 
@@ -389,245 +313,159 @@ sk tentacle status
 sk tentacle show <name>
 ```
 
-If a leader reports `AMBIGUOUS` or `PEER_DISCUSS`, dispatch a research or peer tentacle immediately — do not wait:
+### Phase 3: Verify (Steps 7–12)
 
-```bash
-# Leader is stuck: dispatch research immediately
-sk tentacle create research-<blocker> --scope "." --desc "Research: <blocker>" --briefing
-sk tentacle swarm research-<blocker> --agent-type research-planner --model claude-opus-4.7 --briefing
-# After research resolves: resume the stuck leader
-sk tentacle resume <stuck-tentacle>
-```
+Every step here catches a different class of agent error. For detailed gate descriptions (build, lint, test, review, docs, QA audit), see `references/verification-gates.md`.
 
----
+Summary:
 
-### Phase 3: Verify (Steps 7–12)
+| Gate | What it catches | Skip when |
+|------|----------------|-----------|
+| **Build** | Syntax errors, type mismatches, import failures | Never skip |
+| **Lint** | Style violations, unused imports, formatting | Never skip |
+| **Test** | Logic bugs, regressions, broken contracts | Never skip |
+| **Review** | Security issues, design flaws, scope creep | Never skip |
+| **Docs** | Stale README, outdated JSDoc, missing CHANGELOG | Internal refactors only |
+| **QA audit** | Hallucinated tests, spec mismatches, blind spots | Low-risk changes only |
 
-Every step catches a different class of agent error.
-
-| Gate | What it catches | Command | Skip when |
-|------|----------------|---------|-----------|
-| **Python build** | Syntax, import failures | `python3 -c "import ast; ast.parse(open('file.py').read())"` | Never |
-| **Python lint** | Ruff violations | `ruff check *.py hooks/**/*.py` | Never |
-| **Python tests** | Logic bugs, regressions | `python3 test_security.py && python3 test_fixes.py` | Never |
-| **browse-ui typecheck** | TypeScript errors | `cd browse-ui && pnpm typecheck` | Never (if browse-ui touched) |
-| **browse-ui lint** | ESLint violations | `cd browse-ui && pnpm lint` | Never (if browse-ui touched) |
-| **browse-ui format** | Prettier violations | `cd browse-ui && pnpm format:check` | Never (if browse-ui touched) |
-| **browse-ui test** | Vitest failures | `cd browse-ui && pnpm test` | Never (if browse-ui touched) |
-| **browse-ui build** | Next.js build | `cd browse-ui && pnpm build` | Never (if browse-ui touched) |
-| **Rust** | Compile + clippy | `cargo fmt --check && cargo clippy -- -D warnings && cargo test` | If Rust untouched |
-| **Review** | Security, design flaws | `code-reviewer` agent | Never |
-| **Docs sync** | Stale docs | Check `docs/` matches changed behavior | Internal refactors only |
-| **QA audit** | Hallucinated tests, blind spots | `qa-leader` tentacle | Low-risk only |
-
-**Verification surface matrix** (from `copilot-instructions.md`):
-
-| Surface | Required evidence |
-|---------|-------------------|
-| Python | `python3 test_security.py && python3 test_fixes.py` |
-| Hooks/docs/skills | `python3 tests/test_quality_gates.py` |
-| browse-ui | `pnpm typecheck && pnpm lint && pnpm format:check && pnpm test && pnpm build` |
-| Rust | `cargo fmt --all -- --check && cargo clippy -- -D warnings && cargo test` |
-| remote-terminal | `npm test && npm run lint && npm run lint:clean` |
-
-**Evidence requirement:** Run commands yourself. Record output. "Tests pass" is not evidence — the command output is.
+The first four gates (Build, Lint, Test, Review) are mandatory. Skipping any of them leaves you unable to tell whether the agent output is correct.
 
----
+**Evidence requirement:** Each gate must produce concrete, recorded output before being marked as passed. Do not rely on agent claims that "lint is clean" or "tests pass" — run the commands yourself and attach or reference the output. A gate is only passed when you hold the proof, not when the sub-agent says it is. See Rule 9 (Claims Require Evidence) in `docs/AGENT-RULES.md`.
 
 ### Phase 3.5: Goal Evaluation Loop
 
-After all verification gates pass, evaluate whether the overarching goal is met. This is the **loop-until-verified** phase.
+After all verification gates pass, evaluate whether the overarching goal is met before proceeding to commit and close. This is the **loop-until-verified** phase — the orchestrator decides whether to iterate or close.
+
+**Step: Record goal-eval evidence**
 
 ```bash
-# Run success-criteria check and persist result
+# Run the goal's success-criteria check and persist the result
 sk tentacle verify <name> "<success-criteria-command>" --label "goal-eval"
-sk tentacle goal criteria check
-sk tentacle goal eval --decision continue|complete
+# fallback: python3 ~/.copilot/tools/tentacle.py verify <name> ...
 ```
 
+**Decision logic:**
+
 | Result | Action |
 |--------|--------|
-| Goal met — all success criteria satisfied | Proceed to Phase 4 |
-| Goal partially met — gaps identified | Return to Phase 1, create new leader tentacles for gaps |
-| Goal blocked by external dependency | Surface to user. **Do not exit loop without user decision.** |
+| Goal met — all success criteria satisfied | Proceed to Phase 4 (Commit + Close) |
+| Goal partially met — remaining gaps identified | Return to Phase 1 (Plan), create new tentacles for gaps |
+| Goal blocked — external dependency or scope issue | Write gap to handoff, surface to user, decide whether to continue |
 
-**Loop rules:**
-1. Success criteria defined in Phase 1, not invented during evaluation.
-2. The orchestrator owns the loop — sub-agents report via handoff and stop.
-3. Create **new tentacles** for remaining gaps; do not re-open completed tentacles.
-4. Record evidence for every evaluation using `tentacle.py verify`.
-5. Do not infer goal status from handoff prose — run the command.
+**Rules:**
+1. Success criteria must be defined **before** dispatching tentacles (in Phase 1), not invented during evaluation.
+2. Evaluation is the **orchestrator's responsibility** — sub-agents do not loop. They report via handoff and stop.
+3. When looping, create **new tentacles** for remaining gaps; do not re-open completed tentacles.
+4. Record evidence for every evaluation using `tentacle.py verify` so the decision is auditable. Closing without recorded evidence is an anti-pattern — it removes the audit trail.
+5. Do not infer goal status from handoff prose alone. Run the success-criteria command and record its output.
 
----
+**Example loop iteration:**
+
+```
+Goal: "All 137 tests pass and benchmark score ≥ 90"
+
+Wave 1 results: 130/137 tests pass, score = 85
+→ Eval: NOT MET. Gaps: 7 failing tests, score delta = 5 pts
+→ Create tentacle "fix-failing-tests" (scope: tests/), tentacle "benchmark-perf" (scope: embed.py)
+→ Dispatch Wave 2
+
+Wave 2 results: 137/137 pass, score = 92
+→ Eval: MET. Proceed to Phase 4.
+```
 
 ### Phase 4: Commit + Close (Steps 13–17)
 
-#### Step 13: Commit (orchestrator only)
+#### Step 13: Commit after each completed phase (orchestrator only)
 
-```bash
-git add -A && git commit -m "feat(<scope>): <description>
+Commit working code after completing each major phase — not just at the end.
+If a later phase fails or the session crashes, earlier work is preserved and rollback is possible.
 
-Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>"
+```bash
+git add -A && git commit -m "feat(<scope>): <phase description>"
 ```
 
-Commit cadence:
-- After Phase 1 foundation tentacles + build passes → commit
-- After each parallel batch + build passes → commit
-- After Phase 3 verification → commit
-- Final integration → commit
+**Commit cadence:**
+- After Phase 1 shared/foundation tentacles complete + build passes → commit
+- After each Phase 2 parallel batch completes + build passes → commit
+- After Phase 3 verification passes → commit
+- Final integration wiring → commit
+
+**Commit restriction:** Sub-agents must not run `git commit` or `git push`. When git hooks
+are installed (`install.py --install-git-hooks`), both operations are blocked at the git level
+while the `dispatched-subagent-active` marker is fresh. Even without hooks, this is a hard
+convention: the orchestrator commits after merging and verifying all tentacle results.
+This enforcement is **local-only** — cloud-delegated or remote agent runs are not covered.
 
 #### Step 14: Runtime verification
 
-```bash
-# Python browse backend
-python3 browse.py --dev &
-curl -s http://localhost:8765/healthz | jq .
+Build passing ≠ app works. After all tentacles are merged, run the app:
 
-# browse-ui
-cd browse-ui && pnpm dev &
-# Check browser: http://localhost:3000
+```bash
+# Desktop: ./gradlew :composeApp:jvmRun
+# Mobile: deploy to emulator/simulator
+# Web: npm run dev / python manage.py runserver
 ```
 
+Runtime-resolved DI frameworks (Koin, Spring) crash at launch if bindings are missing, and
+the compiler won't catch this. A 30-second launch test catches what build+test cannot.
+
 #### Step 15: Complete and learn
 
 ```bash
 sk tentacle complete <name>
 ```
 
-Always call `complete` before `delete`. `complete` auto-extracts learnings into long-term knowledge.
+Only call `complete` after all verification gates pass. This marks all todos done and auto-extracts learnings from handoff.md into long-term knowledge.
 
-#### Step 16: Resume an interrupted tentacle
+#### Step 16: Resume a tentacle (when picking up interrupted work)
 
 ```bash
 sk tentacle resume <name>             # Refresh briefing, mark active
-sk tentacle resume <name> --no-briefing  # Skip if briefing is fresh
+sk tentacle resume <name> --no-briefing  # Skip briefing injection
+# fallback: python3 ~/.copilot/tools/tentacle.py resume <name> [--no-briefing]
 ```
 
+`resume` refreshes the live briefing in CONTEXT.md and marks the tentacle active again. Use it when returning to a tentacle after an interruption or session boundary. Pass `--no-briefing` only when the briefing is already fresh and re-fetching would be wasteful.
+
 #### Step 17: Cleanup
 
 ```bash
 sk tentacle delete <name>
 ```
 
----
-
-## Verification summary
-
-Full gate table (mirrors Phase 3 gates):
-
-| Gate | Command | Surface |
-|------|---------|---------|
-| Python security | `python3 test_security.py` | Python |
-| Python fixes | `python3 test_fixes.py` | Python |
-| Python all | `python3 run_all_tests.py` | Python |
-| Quality gates | `python3 tests/test_quality_gates.py` | Hooks/docs/skills |
-| TS typecheck | `cd browse-ui && pnpm typecheck` | browse-ui |
-| TS lint | `cd browse-ui && pnpm lint` | browse-ui |
-| TS format | `cd browse-ui && pnpm format:check` | browse-ui |
-| Vitest | `cd browse-ui && pnpm test` | browse-ui |
-| Next.js build | `cd browse-ui && pnpm build` | browse-ui |
-| Rust | `cargo fmt --all -- --check && cargo clippy -- -D warnings && cargo test` | Rust |
-| Code review | `code-reviewer` agent | Any |
-
----
-
 ## CLI reference
 
-```bash
-# Planning
-sk tentacle create <name> --scope "<paths>" --desc "<desc>" --briefing [--profile <agent-id>]
-sk tentacle todo <name> add "<task>"
-
-# Dispatch (always use --model claude-opus-4.7 for leaders)
-sk tentacle swarm <name> --agent-type <type> --model claude-opus-4.7 --briefing
-sk tentacle swarm <name> --output parallel --briefing
-sk tentacle dispatch <name> --agent-type <type> --model claude-opus-4.7 --briefing
+See `references/cli-reference.md` for the full command reference, CONTEXT.md template, and agent selection guidance.
 
-# Monitoring
-sk tentacle status
-sk tentacle show <name>
+Quick reference:
 
-# Handoff
+```bash
+tentacle.py create <name> --scope "<paths>" --desc "<desc>" --briefing
+tentacle.py todo <name> add "<task>"
+tentacle.py swarm <name> --agent-type <type> --model <model> --briefing    # bundle-first default
+tentacle.py swarm <name> --output parallel --briefing                      # one worker per todo
+sk tentacle swarm <name> --output json --briefing                          # JSON + bundle_path
+sk tentacle dispatch <name> --agent-type <type> --briefing                 # single-agent dispatch
+sk tentacle swarm <name> --no-bundle                                       # rare opt-out for tiny prompts
 sk tentacle handoff <name> "<summary>" --status DONE --changed-file <path> --learn
-sk tentacle handoff <name> "<question>" --status AMBIGUOUS   # request peer input
-
-# Goal loop
-sk tentacle goal init --title "<goal>"
-sk tentacle goal link <name>
-sk tentacle goal criteria check
-sk tentacle goal eval --decision continue|complete
-sk tentacle goal gate pass <id> --reason "<evidence>"
-sk tentacle verify <name> "<command>" --label "goal-eval"
-sk tentacle goal verify-loop [--escalate]
-sk tentacle goal resume
-
-# Close
-sk tentacle resume <name>
+sk tentacle goal init --title "<goal title>" [--desc "<goal description>"]
+sk tentacle goal link <name>                                               # stamp goal metadata into meta.json
+sk tentacle goal eval --decision continue|pause|complete|abandon           # record orchestrator decision
+sk tentacle goal status [--format text|json]
+sk tentacle resume <name>                  # resume interrupted tentacle (refreshes briefing)
+sk tentacle resume <name> --no-briefing    # resume without re-fetching briefing
+sk tentacle status
 sk tentacle complete <name>
 sk tentacle delete <name>
-sk tentacle marker-cleanup [--apply]
-
 # fallback: python3 ~/.copilot/tools/tentacle.py <cmd> <args>
 ```
 
----
-
 ## Tips
 
-1. **Opus for all leaders** — never dispatch a leader with haiku or sonnet; quality-over-speed means `claude-opus-4.7` for every leader tentacle
-2. **Loop, never block** — when confidence < 1.0, create a research-leader tentacle; never write BLOCKED to a handoff without first trying the loop
-3. **Peer leaders as peers** — if dev-leader is stuck on a test strategy, handoff to test-leader for input; leaders collaborate, not silo
-4. **Invest in CONTEXT.md** — 2-3 minutes writing good context saves 10 minutes of agent confusion
-5. **Keep todos atomic** — each item = one testable deliverable
-6. **No scope overlap** — overlapping scopes cause agents to overwrite each other
-7. **Complete before delete** — `complete` saves learnings; `delete` alone loses them
-8. **Commit after each phase** — uncommitted code is lost if the session crashes
-9. **Run the app** — build+test ≠ works. Launch browse-ui + Python backend to verify E2E behavior
-10. **Multi-surface = multi-wave** — dev + browse-leader in parallel, then test-leader, then qa-leader; never send qa-leader before tests pass
-11. **⚠️ Commit restriction** — Sub-agents must not run `git commit`/`git push`; enforced by git hooks when installed
-
----
-
-## ⛔ Workflow Integration
-
-This project's verification workflow (from `copilot-instructions.md` and `AGENTS.md`) maps to tentacle phases as follows:
-
-| Outer Workflow Phase | Tentacle Phase |
-|---------------------|---------------|
-| **Preflight**: `sk briefing --auto --compact` | Phase 0: Clarify Spec |
-| **Edit**: minimal footprint, no SQL interpolation | Phase 2: Execute |
-| **Verification by surface**: run all gates | Phase 3: Verify |
-| **Closeout**: `sk learn`, `task_complete` | Phase 4: Close |
-
-**Key rule**: The tentacle's internal lifecycle (Clarify→Plan→Execute→Verify→Close) is NOT the entire workflow. The outer workflow gates (briefing, verification, learn) must run AROUND the tentacle lifecycle.
-
-```
-sk briefing --auto --compact          ← BEFORE first tentacle
-  │
-  ▼
-Tentacle Lifecycle (Clarify→Plan→Execute→Verify→Close)
-  │
-  ▼
-python3 test_security.py              ← AFTER all tentacles complete (Python surface)
-python3 test_fixes.py
-  │
-  ▼
-sk learn --pattern/--mistake          ← BEFORE task_complete
-  │
-  ▼
-task_complete / git commit
-```
-
----
-
-## Reference docs
-
-- `~/.copilot/tools/skills/tentacle-orchestration/references/` — canonical reference docs
-  - `cli-reference.md` — full command reference and CONTEXT.md template
-  - `decomposition-review.md` — step file review checklist
-  - `verification-gates.md` — gate descriptions
-  - `spec-clarification.md` — Phase 0 full process
-- `docs/AGENT-RULES.md` — all 11 agent rules including confidence gate and tentacle obligations
-- `docs/ARCHITECTURE.md` — Python/Rust boundary, script inventory
-- `docs/HOOKS.md` — hook enforcement table
-- `.github/agents/*.agent.md` — available specialist agent profiles
+1. **Invest in CONTEXT.md** — 2-3 minutes writing good context saves 10 minutes of agent confusion
+2. **Keep todos atomic** — each item = one testable deliverable
+3. **No scope overlap** — overlapping scopes cause agents to overwrite each other
+4. **Complete before delete** — `complete` saves learnings; `delete` alone loses them
+5. **Commit after each phase** — uncommitted code is lost if the session crashes or compacts
+6. **Run the app** — build+test ≠ works. Launch the app to verify DI resolution and runtime behavior
+7. **⚠️ Commit restriction** — Sub-agents must not run `git commit`/`git push`. When git hooks are installed (`sk install --install-git-hooks`), both are blocked at the git level for the repo where the tentacle was dispatched, while the `dispatched-subagent-active` marker is fresh. Commits in other repos are not affected. Even without hooks, a sub-agent commit mid-run corrupts the orchestrator's merge flow. Enforcement is local-only; cloud-delegated runs are not covered.
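
The Phase 3.5 decision table in the SKILL.md change above can be sketched in a few lines of Python. This is a minimal illustration, not part of `tentacle.py`: `Decision`, `EvalResult`, and `evaluate_goal` are hypothetical names, and the 137-test / score ≥ 90 thresholds are taken from the example loop iteration in the doc.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    # Hypothetical labels mirroring the three rows of the decision table.
    COMPLETE = "complete"  # goal met: proceed to Phase 4
    CONTINUE = "continue"  # gaps remain: return to Phase 1 with new tentacles
    ESCALATE = "escalate"  # blocked: write gap to handoff and surface to user


@dataclass
class EvalResult:
    # Evidence recorded from the success-criteria command (assumed shape).
    tests_passed: int
    tests_total: int
    score: float
    blocked: bool = False


def evaluate_goal(result: EvalResult, min_score: float = 90.0) -> tuple[Decision, list[str]]:
    """Map recorded evidence to a decision plus the list of remaining gaps."""
    if result.blocked:
        return Decision.ESCALATE, ["external dependency or scope issue"]
    gaps: list[str] = []
    failing = result.tests_total - result.tests_passed
    if failing > 0:
        gaps.append(f"{failing} failing tests")
    if result.score < min_score:
        gaps.append(f"score delta = {min_score - result.score:g} pts")
    return (Decision.COMPLETE if not gaps else Decision.CONTINUE), gaps


# Wave 1 and Wave 2 from the example loop iteration.
wave1 = evaluate_goal(EvalResult(tests_passed=130, tests_total=137, score=85))
wave2 = evaluate_goal(EvalResult(tests_passed=137, tests_total=137, score=92))
print(wave1)
print(wave2)
```

Each gap returned for a `CONTINUE` decision would become its own new tentacle, in line with rule 3 (do not re-open completed tentacles).
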
diff --git a/mcp-server.py b/mcp-server.py
index b55a7514..e3e23a4f 100644
--- a/mcp-server.py
+++ b/mcp-server.py
@@ -6,13 +6,20 @@
 - briefing(task, mode?, limit?, agent_tag?, msg_tag?)
 - query_session(query, semantic?, limit?, agent_tag?, msg_tag?)
 - query_memory(query?, category?, agent_tag?, msg_tag?, limit?, token?)  # issue #404
+
+Write tools (issue #717):
+- learn(category, title, description, tags?)
+- status()
+- session_list(limit?)
 """
 
 import importlib.util
 import io
 import json
 import os
+import re
 import sqlite3
+import subprocess
 import sys
 from contextlib import redirect_stderr, redirect_stdout
 from pathlib import Path
@@ -38,6 +45,7 @@
 JSONRPC_INTERNAL_ERROR = -32603
 
 VALID_BRIEFING_MODES = {"auto", "implement", "debug", "review", "plan", "test"}
+VALID_LEARN_CATEGORIES = {"mistake", "pattern", "decision", "tool", "feature", "refactor", "discovery"}
 
 # Auth error code for query_memory token failures (issue #404, fails closed)
 _MCP_AUTH_ERROR = -32600  # reuse INVALID_REQUEST for auth failures
@@ -173,14 +181,66 @@ def _load_script_module(module_name: str, filename: str):
             "additionalProperties": False,
         },
     },
+    {
+        "name": "learn",
+        "description": "Write a knowledge entry (mistake, pattern, feature, discovery, etc.) to the local knowledge base. Issue #717.",
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "category": {
+                    "type": "string",
+                    "enum": sorted(VALID_LEARN_CATEGORIES),
+                    "description": "Knowledge category.",
+                },
+                "title": {"type": "string", "description": "Short title for the knowledge entry."},
+                "description": {"type": "string", "description": "Content / body of the knowledge entry."},
+                "tags": {
+                    "type": "string",
+                    "description": "Comma-separated tags (optional).",
+                },
+            },
+            "required": ["category", "title", "description"],
+            "additionalProperties": False,
+        },
+    },
+    {
+        "name": "status",
+        "description": "Return a JSON health snapshot: session count, entry count, watcher status. Issue #717.",
+        "inputSchema": {
+            "type": "object",
+            "properties": {},
+            "required": [],
+            "additionalProperties": False,
+        },
+    },
+    {
+        "name": "session_list",
+        "description": "Return the most recent sessions from the local knowledge base. Issue #717.",
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "limit": {
+                    "type": "integer",
+                    "minimum": 1,
+                    "maximum": 100,
+                    "description": "Maximum sessions to return (default 20).",
+                },
+            },
+            "required": [],
+            "additionalProperties": False,
+        },
+    },
 ]
 
 
-def _require_string(arguments: dict[str, Any], key: str) -> str:
+def _require_string(arguments: dict[str, Any], key: str, max_length: int | None = None) -> str:
     value = arguments.get(key)
     if not isinstance(value, str) or not value.strip():
         raise JsonRpcError(JSONRPC_INVALID_PARAMS, f"'{key}' must be a non-empty string")
-    return value.strip()
+    stripped = value.strip()
+    if max_length is not None:
+        stripped = stripped[:max_length]
+    return stripped
 
 
 def _optional_string(arguments: dict[str, Any], key: str, max_length: int = 200) -> str:
@@ -408,6 +468,192 @@ def _run_query_memory(arguments: dict[str, Any]) -> dict[str, Any]:
     }
 
 
+# ---------------------------------------------------------------------------
+# learn — write a knowledge entry via learn.py subprocess (issue #717)
+# ---------------------------------------------------------------------------
+
+
+def _run_learn(arguments: dict[str, Any]) -> dict[str, Any]:
+    """Write a knowledge entry by calling learn.py as a subprocess."""
+    _check_auth(arguments)
+
+    category = _require_string(arguments, "category")
+    if category not in VALID_LEARN_CATEGORIES:
+        raise JsonRpcError(
+            JSONRPC_INVALID_PARAMS,
+            f"'category' must be one of: {', '.join(sorted(VALID_LEARN_CATEGORIES))}",
+        )
+    title = _require_string(arguments, "title", max_length=500)
+    description = _require_string(arguments, "description", max_length=10_000)
+    tags = _optional_string(arguments, "tags", max_length=500)
+
+    learn_py = TOOLS_DIR / "learn.py"
+    if not learn_py.exists():
+        raise JsonRpcError(JSONRPC_INTERNAL_ERROR, "learn.py not found")
+
+    flag = f"--{category}"
+    cmd = [sys.executable, str(learn_py), flag, title, description, "--json"]
+    if tags:
+        cmd += ["--tags", tags]
+
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
+    except subprocess.TimeoutExpired as exc:
+        raise JsonRpcError(JSONRPC_INTERNAL_ERROR, "learn.py timed out") from exc
+    except Exception as exc:
+        raise JsonRpcError(JSONRPC_INTERNAL_ERROR, f"learn.py subprocess error: {exc}") from exc
+
+    if result.returncode != 0:
+        msg = result.stderr.strip() or result.stdout.strip() or "learn.py failed"
+        raise JsonRpcError(JSONRPC_INTERNAL_ERROR, msg)
+
+    entry_id = None
+    try:
+        parsed = json.loads(result.stdout.strip())
+        raw_id = parsed.get("id")
+        if isinstance(raw_id, int) and raw_id > 0:
+            entry_id = raw_id
+    except (json.JSONDecodeError, AttributeError):
+        pass
+
+    status = "ok"
+    if entry_id is None:
+        # Fallback: scan for "Added new <category> #N" pattern in combined output
+        m = re.search(r"Added new \w+ #(\d+)", result.stdout + result.stderr)
+        if m:
+            entry_id = int(m.group(1))
+
+    body = {"status": status, "message": "Entry recorded", "id": entry_id}
+    return {
+        "content": [{"type": "text", "text": json.dumps(body, ensure_ascii=False)}],
+        "structuredContent": body,
+    }
+
+
+# ---------------------------------------------------------------------------
+# status — return DB health snapshot (issue #717)
+# ---------------------------------------------------------------------------
+
+
+def _is_watcher_pid_running(pid: int) -> bool:
+    try:
+        os.kill(pid, 0)
+        return True
+    except PermissionError:
+        return True  # EPERM: the process exists but belongs to another user
+    except Exception:
+        return False  # includes ProcessLookupError: no such process
+
+
+def _run_status(_arguments: dict[str, Any]) -> dict[str, Any]:
+    """Return a JSON health snapshot: session_count, entry_count, watcher."""
+    session_count = 0
+    entry_count = 0
+    watcher = "stopped"
+
+    # Check watcher via lock file
+    lock_file = _DB_PATH.parent / ".watcher.lock"
+    if lock_file.exists():
+        try:
+            pid = int(lock_file.read_text(encoding="utf-8").strip())
+            if _is_watcher_pid_running(pid):
+                watcher = "running"
+        except Exception:
+            pass
+
+    if not _DB_PATH.exists():
+        body = {
+            "session_count": 0,
+            "entry_count": 0,
+            "watcher": watcher,
+            "db_path": str(_DB_PATH),
+        }
+        return {
+            "content": [{"type": "text", "text": json.dumps(body, ensure_ascii=False)}],
+            "structuredContent": body,
+        }
+
+    try:
+        db_uri = _DB_PATH.as_uri() + "?mode=ro"
+        db = sqlite3.connect(db_uri, uri=True)
+        try:
+            try:
+                row = db.execute("SELECT COUNT(*) FROM sessions").fetchone()
+                session_count = row[0] if row else 0
+            except sqlite3.OperationalError:
+                session_count = 0
+            try:
+                row = db.execute("SELECT COUNT(*) FROM knowledge_entries").fetchone()
+                entry_count = row[0] if row else 0
+            except sqlite3.OperationalError:
+                entry_count = 0
+        finally:
+            db.close()
+    except Exception:
+        pass
+
+    body = {
+        "session_count": session_count,
+        "entry_count": entry_count,
+        "watcher": watcher,
+        "db_path": str(_DB_PATH),
+    }
+    return {
+        "content": [{"type": "text", "text": json.dumps(body, ensure_ascii=False)}],
+        "structuredContent": body,
+    }
+
+
+# ---------------------------------------------------------------------------
+# session_list — return last N sessions (issue #717)
+# ---------------------------------------------------------------------------
+
+
+def _run_session_list(arguments: dict[str, Any]) -> dict[str, Any]:
+    """Return the most recent sessions from the local DB."""
+    limit = _optional_int(arguments, "limit", default=20, minimum=1, maximum=100)
+
+    if not _DB_PATH.exists():
+        body: dict[str, Any] = {"sessions": [], "count": 0}
+        return {
+            "content": [{"type": "text", "text": json.dumps(body, ensure_ascii=False)}],
+            "structuredContent": body,
+        }
+
+    try:
+        db_uri = _DB_PATH.as_uri() + "?mode=ro"
+        db = sqlite3.connect(db_uri, uri=True)
+        db.row_factory = sqlite3.Row
+        try:
+            rows = db.execute(
+                "SELECT id, summary, source, indexed_at FROM sessions ORDER BY indexed_at DESC LIMIT ?",
+                (limit,),
+            ).fetchall()
+        except sqlite3.OperationalError as exc:
+            raise JsonRpcError(JSONRPC_INTERNAL_ERROR, f"Query error: {exc}") from exc
+        finally:
+            db.close()
+    except JsonRpcError:
+        raise
+    except Exception as exc:
+        raise JsonRpcError(JSONRPC_INTERNAL_ERROR, f"DB error: {exc}") from exc
+
+    sessions = [
+        {
+            "id": dict(r).get("id"),
+            "summary": dict(r).get("summary", ""),
+            "source": dict(r).get("source", ""),
+            "indexed_at": dict(r).get("indexed_at"),
+        }
+        for r in rows
+    ]
+    body = {"sessions": sessions, "count": len(sessions)}
+    return {
+        "content": [{"type": "text", "text": json.dumps(body, ensure_ascii=False)}],
+        "structuredContent": body,
+    }
+
+
 def _handle_tools_call(params: dict[str, Any]) -> dict[str, Any]:
     name = params.get("name")
     if not isinstance(name, str) or not name:
@@ -423,6 +669,12 @@ def _handle_tools_call(params: dict[str, Any]) -> dict[str, Any]:
         return _run_query_session(arguments)
     if name == "query_memory":
         return _run_query_memory(arguments)
+    if name == "learn":
+        return _run_learn(arguments)
+    if name == "status":
+        return _run_status(arguments)
+    if name == "session_list":
+        return _run_session_list(arguments)
     raise JsonRpcError(JSONRPC_INVALID_PARAMS, f"Unknown tool: {name}")
 
 
diff --git a/test_fixes.py b/test_fixes.py
index ed14c1ed..0b69ad12 100755
--- a/test_fixes.py
+++ b/test_fixes.py
@@ -10076,6 +10076,473 @@ def _make_i720_db(db_path: Path) -> None:
 except Exception as _e722:
     test("I722: bulk-tag test setup", False, str(_e722))
 
+# === I717: MCP Write Tools ===
+print("\n✍️  I717: MCP Write Tools")
+
+_mcp717_src = (REPO / "mcp-server.py").read_text(encoding="utf-8")
+
+# I717-01: TOOLS list includes 'learn'
+try:
+    test(
+        "I717-01: TOOLS list contains learn tool",
+        '"name": "learn"' in _mcp717_src or "'name': 'learn'" in _mcp717_src,
+        "learn tool not found in TOOLS list",
+    )
+except Exception as _e717_01:
+    test("I717-01: learn in TOOLS", False, str(_e717_01))
+
+# I717-02: TOOLS list includes 'status'
+try:
+    test(
+        "I717-02: TOOLS list contains status tool",
+        '"name": "status"' in _mcp717_src or "'name': 'status'" in _mcp717_src,
+        "status tool not found in TOOLS list",
+    )
+except Exception as _e717_02:
+    test("I717-02: status in TOOLS", False, str(_e717_02))
+
+# I717-03: TOOLS list includes 'session_list'
+try:
+    test(
+        "I717-03: TOOLS list contains session_list tool",
+        '"name": "session_list"' in _mcp717_src or "'name': 'session_list'" in _mcp717_src,
+        "session_list tool not found in TOOLS list",
+    )
+except Exception as _e717_03:
+    test("I717-03: session_list in TOOLS", False, str(_e717_03))
+
+# I717-04: _run_learn function exists
+try:
+    test("I717-04: _run_learn function defined", "def _run_learn(" in _mcp717_src, "_run_learn not found")
+except Exception as _e717_04:
+    test("I717-04: _run_learn defined", False, str(_e717_04))
+
+# I717-05: _run_status function exists
+try:
+    test("I717-05: _run_status function defined", "def _run_status(" in _mcp717_src, "_run_status not found")
+except Exception as _e717_05:
+    test("I717-05: _run_status defined", False, str(_e717_05))
+
+# I717-06: _run_session_list function exists
+try:
+    test(
+        "I717-06: _run_session_list function defined",
+        "def _run_session_list(" in _mcp717_src,
+        "_run_session_list not found",
+    )
+except Exception as _e717_06:
+    test("I717-06: _run_session_list defined", False, str(_e717_06))
+
+# I717-07: dispatch learn, status, session_list in _handle_tools_call
+try:
+    _dispatch_body = _mcp717_src.split("def _handle_tools_call(")[1].split("def _read_exact(")[0]
+    test("I717-07a: dispatch learn in _handle_tools_call", "_run_learn" in _dispatch_body, "_run_learn not dispatched")
+    test(
+        "I717-07b: dispatch status in _handle_tools_call", "_run_status" in _dispatch_body, "_run_status not dispatched"
+    )
+    test(
+        "I717-07c: dispatch session_list in _handle_tools_call",
+        "_run_session_list" in _dispatch_body,
+        "_run_session_list not dispatched",
+    )
+except Exception as _e717_07:
+    test("I717-07: dispatch check", False, str(_e717_07))
+
+# I717-08: learn tool has category enum in schema
+try:
+    test(
+        "I717-08: learn schema has category enum",
+        "VALID_LEARN_CATEGORIES" in _mcp717_src or '"mistake"' in _mcp717_src,
+        "category enum missing from learn schema",
+    )
+except Exception as _e717_08:
+    test("I717-08: learn category enum", False, str(_e717_08))
+
+# I717-09: learn uses subprocess (not importlib) for calling learn.py
+try:
+    _learn_fn = _mcp717_src.split("def _run_learn(")[1].split("def _run_status(")[0]
+    test(
+        "I717-09: _run_learn uses subprocess.run",
+        "subprocess.run(" in _learn_fn,
+        "subprocess.run not used in _run_learn",
+    )
+except Exception as _e717_09:
+    test("I717-09: _run_learn subprocess", False, str(_e717_09))
+
+# I717-10: no SQL injection in session_list (uses ? placeholder)
+try:
+    _sl_fn = _mcp717_src.split("def _run_session_list(")[1].split("def _handle_tools_call(")[0]
+    test(
+        "I717-10: session_list uses ? placeholder",
+        "?" in _sl_fn and 'f"SELECT' not in _sl_fn and "f'SELECT" not in _sl_fn,
+        "session_list may use string-interpolated SQL",
+    )
+except Exception as _e717_10:
+    test("I717-10: session_list SQL safety", False, str(_e717_10))
+
+# I717-11: status returns watcher field
+try:
+    _status_fn = _mcp717_src.split("def _run_status(")[1].split("def _run_session_list(")[0]
+    test(
+        "I717-11: status returns watcher field",
+        '"watcher"' in _status_fn or "'watcher'" in _status_fn,
+        "watcher field missing from status output",
+    )
+except Exception as _e717_11:
+    test("I717-11: status watcher field", False, str(_e717_11))
+
+# I717-12: status returns db_path field
+try:
+    _status_fn12 = _mcp717_src.split("def _run_status(")[1].split("def _run_session_list(")[0]
+    test(
+        "I717-12: status returns db_path field",
+        '"db_path"' in _status_fn12 or "'db_path'" in _status_fn12,
+        "db_path missing from status output",
+    )
+except Exception as _e717_12:
+    test("I717-12: status db_path field", False, str(_e717_12))
+
+# I717-13: VALID_LEARN_CATEGORIES constant defined
+try:
+    test(
+        "I717-13: VALID_LEARN_CATEGORIES constant defined",
+        "VALID_LEARN_CATEGORIES" in _mcp717_src,
+        "VALID_LEARN_CATEGORIES not found",
+    )
+except Exception as _e717_13:
+    test("I717-13: VALID_LEARN_CATEGORIES", False, str(_e717_13))
+
+# I717-14: learn schema requires category, title, description
+try:
+    _learn_schema = _mcp717_src[_mcp717_src.find('"name": "learn"') : _mcp717_src.find('"name": "status"')]
+    test(
+        "I717-14: learn schema includes category, title, and description",
+        '"category"' in _learn_schema and '"title"' in _learn_schema and '"description"' in _learn_schema,
+        "learn schema missing required fields",
+    )
+except Exception as _e717_14:
+    test("I717-14: learn required fields", False, str(_e717_14))
+
+# I717-15..22: Integration tests via MCP subprocess
+try:
+    import json as _json717
+    import pathlib as _pl717
+    import sqlite3 as _sq717
+    import subprocess as _sp717
+    import tempfile as _tmp717
+
+    with _tmp717.TemporaryDirectory(prefix="mcp717-test-") as _tmp717_dir:
+        _home717 = _pl717.Path(_tmp717_dir)
+        _state717 = _home717 / ".copilot" / "session-state"
+        _state717.mkdir(parents=True, exist_ok=True)
+        _db717 = _sq717.connect(_state717 / "knowledge.db")
+        _db717.executescript("""
+            CREATE TABLE sessions (
+                id TEXT PRIMARY KEY,
+                path TEXT NOT NULL,
+                summary TEXT DEFAULT '',
+                source TEXT DEFAULT 'copilot',
+                indexed_at TEXT
+            );
+            CREATE TABLE knowledge_entries (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                session_id TEXT NOT NULL DEFAULT '',
+                document_id INTEGER,
+                category TEXT NOT NULL,
+                title TEXT NOT NULL,
+                stable_id TEXT,
+                content TEXT NOT NULL DEFAULT '',
+                tags TEXT DEFAULT '',
+                confidence REAL DEFAULT 1.0,
+                occurrence_count INTEGER DEFAULT 1,
+                first_seen TEXT,
+                last_seen TEXT,
+                source TEXT DEFAULT 'copilot',
+                topic_key TEXT,
+                revision_count INTEGER DEFAULT 1,
+                content_hash TEXT,
+                wing TEXT DEFAULT '',
+                room TEXT DEFAULT '',
+                facts TEXT DEFAULT '[]',
+                est_tokens INTEGER DEFAULT 0,
+                task_id TEXT DEFAULT '',
+                affected_files TEXT DEFAULT '[]',
+                source_section TEXT DEFAULT '',
+                source_file TEXT DEFAULT '',
+                start_line INTEGER DEFAULT 0,
+                end_line INTEGER DEFAULT 0,
+                code_language TEXT DEFAULT '',
+                code_snippet TEXT DEFAULT '',
+                error_type TEXT DEFAULT '',
+                root_cause TEXT DEFAULT '',
+                severity TEXT DEFAULT 'medium',
+                is_resolved INTEGER DEFAULT 0,
+                fix_steps TEXT DEFAULT '',
+                prevention_hook TEXT DEFAULT '',
+                recurrence_after_briefing INTEGER DEFAULT 0,
+                valence TEXT DEFAULT '',
+                intensity REAL DEFAULT 0.5,
+                priority TEXT DEFAULT 'P2',
+                project_id TEXT DEFAULT '',
+                agent_id TEXT DEFAULT ''
+            );
+            CREATE VIRTUAL TABLE knowledge_fts USING fts5(title, section_name, content, doc_type UNINDEXED, session_id UNINDEXED, document_id UNINDEXED);
+            CREATE VIRTUAL TABLE ke_fts USING fts5(title, content);
+            CREATE VIRTUAL TABLE sessions_fts USING fts5(session_id UNINDEXED, title, user_messages, assistant_messages, tool_names);
+            CREATE TABLE documents (id INTEGER PRIMARY KEY, session_id TEXT NOT NULL, doc_type TEXT NOT NULL, title TEXT NOT NULL, file_path TEXT DEFAULT '', seq INTEGER DEFAULT 1, size_bytes INTEGER DEFAULT 0, source TEXT DEFAULT 'copilot');
+            CREATE TABLE sections (id INTEGER PRIMARY KEY, document_id INTEGER NOT NULL, section_name TEXT DEFAULT '', content TEXT DEFAULT '');
+        """)
+        _db717.execute(
+            "INSERT INTO sessions (id, path, summary, source, indexed_at) VALUES (?, ?, ?, ?, ?)",
+            ("sess-i717-1", "/a/b", "Session I717 Alpha", "copilot", "2025-01-01T10:00:00"),
+        )
+        _db717.execute(
+            "INSERT INTO sessions (id, path, summary, source, indexed_at) VALUES (?, ?, ?, ?, ?)",
+            ("sess-i717-2", "/c/d", "Session I717 Beta", "copilot", "2025-01-02T12:00:00"),
+        )
+        _db717.commit()
+        _db717.close()
+
+        _env717 = os.environ.copy()
+        _env717["HOME"] = str(_home717)
+        _env717["USERPROFILE"] = str(_home717)
+
+        def _mcp717_roundtrip(method, params):
+            proc = _sp717.Popen(
+                [sys.executable, str(REPO / "mcp-server.py")],
+                stdin=_sp717.PIPE,
+                stdout=_sp717.PIPE,
+                stderr=_sp717.PIPE,
+                env=_env717,
+            )
+            try:
+                # initialize
+                init_msg = _json717.dumps(
+                    {
+                        "jsonrpc": "2.0",
+                        "id": 1,
+                        "method": "initialize",
+                        "params": {"protocolVersion": "2024-11-05", "capabilities": {}},
+                    }
+                ).encode()
+                proc.stdin.write(f"Content-Length: {len(init_msg)}\r\n\r\n".encode() + init_msg)
+                notif = _json717.dumps({"jsonrpc": "2.0", "method": "notifications/initialized"}).encode()
+                proc.stdin.write(f"Content-Length: {len(notif)}\r\n\r\n".encode() + notif)
+                # send actual request
+                req = _json717.dumps({"jsonrpc": "2.0", "id": 2, "method": method, "params": params}).encode()
+                proc.stdin.write(f"Content-Length: {len(req)}\r\n\r\n".encode() + req)
+                # shutdown
+                shutdown_msg = _json717.dumps({"jsonrpc": "2.0", "id": 3, "method": "shutdown"}).encode()
+                proc.stdin.write(f"Content-Length: {len(shutdown_msg)}\r\n\r\n".encode() + shutdown_msg)
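+                # communicate() flushes and closes stdin, then drains stdout and
+                # stderr together, so a full stderr pipe cannot stall the read;
+                # the timeout now also bounds the read, not just the final wait.
+                out, _ = proc.communicate(timeout=15)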
+            finally:
+                try:
+                    proc.kill()
+                except Exception:
+                    pass
+            # Parse all JSON-RPC messages from output
+            responses = []
+            remaining = out
+            while remaining:
+                if b"Content-Length:" not in remaining:
+                    break
+                hdr_end = remaining.find(b"\r\n\r\n")
+                if hdr_end == -1:
+                    break
+                hdr = remaining[:hdr_end].decode("ascii", errors="replace")
+                cl = int(next(ln.split(":")[1] for ln in hdr.split("\r\n") if "content-length" in ln.lower()))
+                body_start = hdr_end + 4
+                body = remaining[body_start : body_start + cl]
+                remaining = remaining[body_start + cl :]
+                try:
+                    responses.append(_json717.loads(body))
+                except Exception:
+                    pass
+            return [r for r in responses if r.get("id") == 2]
+
+        # I717-15..17: tools/list includes the three new tools (learn, status, session_list)
+        try:
+            _r717_list = _mcp717_roundtrip("tools/list", {})
+            _names717 = [
+                t.get("name") for t in (_r717_list[0].get("result", {}).get("tools", []) if _r717_list else [])
+            ]
+            test("I717-15: tools/list includes learn", "learn" in _names717, str(_names717))
+            test("I717-16: tools/list includes status", "status" in _names717, str(_names717))
+            test("I717-17: tools/list includes session_list", "session_list" in _names717, str(_names717))
+        except Exception as _e717_15:
+            test("I717-15: tools/list learn", False, str(_e717_15))
+            test("I717-16: tools/list status", False, str(_e717_15))
+            test("I717-17: tools/list session_list", False, str(_e717_15))
+
+        # I717-18: status tool returns JSON with expected fields
+        try:
+            _r717_status = _mcp717_roundtrip("tools/call", {"name": "status", "arguments": {}})
+            if _r717_status:
+                _body717_s = _json717.loads(_r717_status[0].get("result", {}).get("content", [{}])[0].get("text", "{}"))
+                test("I717-18a: status has session_count key", "session_count" in _body717_s, str(_body717_s))
+                test("I717-18b: status has entry_count key", "entry_count" in _body717_s, str(_body717_s))
+                test("I717-18c: status has watcher key", "watcher" in _body717_s, str(_body717_s))
+                test("I717-18d: status has db_path key", "db_path" in _body717_s, str(_body717_s))
+                test(
+                    "I717-18e: status session_count is int",
+                    isinstance(_body717_s.get("session_count"), int),
+                    str(_body717_s),
+                )
+                test(
+                    "I717-18f: status watcher is running or stopped",
+                    _body717_s.get("watcher") in ("running", "stopped"),
+                    str(_body717_s),
+                )
+            else:
+                for sfx in ["a", "b", "c", "d", "e", "f"]:
+                    test(f"I717-18{sfx}: status (no response)", False, "no MCP response")
+        except Exception as _e717_18:
+            for sfx in ["a", "b", "c", "d", "e", "f"]:
+                test(f"I717-18{sfx}: status tool", False, str(_e717_18))
+
+        # I717-19: session_list returns sessions array
+        try:
+            _r717_sl = _mcp717_roundtrip("tools/call", {"name": "session_list", "arguments": {"limit": 10}})
+            if _r717_sl:
+                _body717_sl = _json717.loads(_r717_sl[0].get("result", {}).get("content", [{}])[0].get("text", "{}"))
+                test("I717-19a: session_list has sessions key", "sessions" in _body717_sl, str(_body717_sl))
+                test("I717-19b: session_list has count key", "count" in _body717_sl, str(_body717_sl))
+                test(
+                    "I717-19c: session_list count matches len",
+                    _body717_sl.get("count") == len(_body717_sl.get("sessions", [])),
+                    str(_body717_sl),
+                )
+                test(
+                    "I717-19d: session_list returns 2 sessions",
+                    _body717_sl.get("count") == 2,
+                    f"count={_body717_sl.get('count')}",
+                )
+                _sess0 = _body717_sl.get("sessions", [{}])[0] if _body717_sl.get("sessions") else {}
+                test("I717-19e: session entry has id field", "id" in _sess0, str(_sess0))
+                test("I717-19f: session entry has summary field", "summary" in _sess0, str(_sess0))
+            else:
+                for sfx in ["a", "b", "c", "d", "e", "f"]:
+                    test(f"I717-19{sfx}: session_list (no response)", False, "no MCP response")
+        except Exception as _e717_19:
+            for sfx in ["a", "b", "c", "d", "e", "f"]:
+                test(f"I717-19{sfx}: session_list tool", False, str(_e717_19))
+
+        # I717-20: session_list default limit (no args)
+        try:
+            _r717_sl20 = _mcp717_roundtrip("tools/call", {"name": "session_list", "arguments": {}})
+            if _r717_sl20:
+                _body717_sl20 = _json717.loads(
+                    _r717_sl20[0].get("result", {}).get("content", [{}])[0].get("text", "{}")
+                )
+                test("I717-20: session_list default limit works", "sessions" in _body717_sl20, str(_body717_sl20))
+            else:
+                test("I717-20: session_list default limit", False, "no response")
+        except Exception as _e717_20:
+            test("I717-20: session_list default limit", False, str(_e717_20))
+
+        # I717-21: learn tool records entry (subprocess roundtrip)
+        try:
+            _r717_learn = _mcp717_roundtrip(
+                "tools/call",
+                {
+                    "name": "learn",
+                    "arguments": {
+                        "category": "pattern",
+                        "title": "I717 MCP test pattern",
+                        "description": "Test pattern recorded via MCP learn tool",
+                        "tags": "mcp,i717",
+                    },
+                },
+            )
+            if _r717_learn:
+                _r717_lr = _r717_learn[0]
+                if "error" in _r717_lr:
+                    test(
+                        "I717-21: learn tool records entry",
+                        False,
+                        _r717_lr["error"].get("message", str(_r717_lr["error"])),
+                    )
+                else:
+                    _body717_l = _json717.loads(_r717_lr.get("result", {}).get("content", [{}])[0].get("text", "{}"))
+                    test("I717-21a: learn returns status ok", _body717_l.get("status") == "ok", str(_body717_l))
+                    test("I717-21b: learn returns message field", "message" in _body717_l, str(_body717_l))
+                    test("I717-21c: learn returns id field", "id" in _body717_l, str(_body717_l))
+            else:
+                for sfx in ["a", "b", "c"]:
+                    test(f"I717-21{sfx}: learn (no response)", False, "no MCP response")
+        except Exception as _e717_21:
+            for sfx in ["a", "b", "c"]:
+                test(f"I717-21{sfx}: learn tool", False, str(_e717_21))
+
+        # I717-22: learn rejects invalid category
+        try:
+            _r717_inv = _mcp717_roundtrip(
+                "tools/call",
+                {
+                    "name": "learn",
+                    "arguments": {
+                        "category": "invalid_cat",
+                        "title": "Test",
+                        "description": "Should fail",
+                    },
+                },
+            )
+            if _r717_inv:
+                _r717_iv = _r717_inv[0]
+                test(
+                    "I717-22: learn rejects invalid category",
+                    "error" in _r717_iv or _r717_iv.get("result", {}).get("isError"),
+                    str(_r717_inv),
+                )
+            else:
+                test("I717-22: learn invalid category", False, "no response")
+        except Exception as _e717_22:
+            test("I717-22: learn invalid category", False, str(_e717_22))
+
+except Exception as _e717_outer:
+    for _sfx in [
+        "15",
+        "16",
+        "17",
+        "18a",
+        "18b",
+        "18c",
+        "18d",
+        "18e",
+        "18f",
+        "19a",
+        "19b",
+        "19c",
+        "19d",
+        "19e",
+        "19f",
+        "20",
+        "21a",
+        "21b",
+        "21c",
+        "22",
+    ]:
+        test(f"I717-{_sfx}: MCP write tools (setup error)", False, str(_e717_outer))
+
+# I718-17: briefing.py 📌 badge source check
+try:
+    _br_src718 = (REPO / "briefing.py").read_text(encoding="utf-8")
+    test(
+        "I718-17a: briefing.py has pinned_badge for P0",
+        "pinned_badge" in _br_src718 or "📌" in _br_src718,
+        "pinned_badge or 📌 not found in briefing.py",
+    )
+    test(
+        "I718-17b: briefing.py checks priority P0 for badge",
+        "P0" in _br_src718 and ("pinned_badge" in _br_src718 or "📌" in _br_src718),
+        "P0 badge logic not found in briefing.py",
+    )
+except Exception as _e718_br:
+    test("I718-17: briefing.py badge source check", False, str(_e718_br))
+
 # ---------------------------------------------------------------------------
 if FAIL == 0:
     print("🎉 All tests passed!")
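
Reviewer note: the `_mcp717_roundtrip` helper above frames each JSON-RPC message with a `Content-Length` header and then re-parses the server's output by hand. The sketch below isolates that framing logic so it can be read and checked separately. The names `frame` and `parse_frames` are hypothetical and do not exist in the repository; this is an illustration of the technique under those assumptions, not the code `mcp-server.py` uses.

```python
import json


def frame(message):
    """Encode one JSON-RPC message with a Content-Length header (hypothetical helper)."""
    body = json.dumps(message).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


def parse_frames(buffer):
    """Decode every complete Content-Length framed message found in buffer."""
    messages = []
    while buffer:
        header_end = buffer.find(b"\r\n\r\n")
        if header_end == -1:
            break  # no complete header block left
        length = None
        header_text = buffer[:header_end].decode("ascii", errors="replace")
        for line in header_text.split("\r\n"):
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        if length is None:
            break  # header block without a Content-Length field
        start = header_end + 4  # skip the blank line that ends the headers
        body = buffer[start : start + length]
        if len(body) < length:
            break  # truncated body: stop rather than parse partial JSON
        buffer = buffer[start + length :]
        messages.append(json.loads(body))
    return messages
```

Unlike the inline parser in the test, this version stops on a truncated body instead of silently dropping it after a failed `json.loads`, which makes a short read easier to tell apart from a malformed response.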
diff --git a/tests/test_mcp_server.py b/tests/test_mcp_server.py
index 5aa3a562..699bdab0 100644
--- a/tests/test_mcp_server.py
+++ b/tests/test_mcp_server.py
@@ -102,8 +102,8 @@ def setUp(self):
         self.tools = {t["name"]: t for t in mcp.TOOLS}
 
     def test_exactly_two_tools(self):
-        # Updated: query_memory added in issue #404; now 3 tools total
-        self.assertEqual(len(mcp.TOOLS), 3)
+        # Updated: wave 8 added learn, status, session_list; now 6 tools total
+        self.assertEqual(len(mcp.TOOLS), 6)
 
     def test_briefing_tool_present(self):
         self.assertIn("briefing", self.tools)
@@ -1059,10 +1059,11 @@ def test_query_memory_auth_error_propagates(self):
                 mcp._run_query_memory({"token": "bad"})
         self.assertEqual(ctx.exception.code, mcp._MCP_AUTH_ERROR)
 
-    # -- three tools total ---
+    # -- six tools total ---
 
     def test_exactly_three_tools(self):
-        self.assertEqual(len(mcp.TOOLS), 3)
+        # Updated: wave 8 added learn, status, session_list; now 6 tools total
+        self.assertEqual(len(mcp.TOOLS), 6)
 
 
 # ---------------------------------------------------------------------------