From e281248270a10f894826b42f1c3b8aa1fbc0cc7b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 9 Mar 2026 14:10:29 +0000
Subject: [PATCH 1/8] Initial plan


From deec5b689bfc6d263b07ee1f5225d9e271d04739 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 9 Mar 2026 14:15:13 +0000
Subject: [PATCH 2/8] docs: add path-scoped Copilot instructions for context/
 and routing/

Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>
---
 .github/instructions/context.instructions.md | 77 +++++++++++++++++
 .github/instructions/routing.instructions.md | 87 ++++++++++++++++++++
 2 files changed, 164 insertions(+)
 create mode 100644 .github/instructions/context.instructions.md
 create mode 100644 .github/instructions/routing.instructions.md

diff --git a/.github/instructions/context.instructions.md b/.github/instructions/context.instructions.md
new file mode 100644
index 0000000..b5e6570
--- /dev/null
+++ b/.github/instructions/context.instructions.md
@@ -0,0 +1,77 @@
+---
+applyTo: src/contextweaver/context/**
+---
+
+# Context Engine — Agent Instructions
+
+Path-scoped guidance for `src/contextweaver/context/`. Read before modifying any file here.
+
+## Pipeline stage ordering (must not be reordered)
+
+`ContextManager.build()` executes exactly these 8 stages in order:
+
+1. `generate_candidates` (`candidates.py`) — phase + policy filter over event log
+2. `resolve_dependency_closure` (`candidates.py`) — pull in parent items via `parent_id`
+3. `apply_sensitivity_filter` (`sensitivity.py`) — drop/redact by sensitivity level
+4. `apply_firewall_to_batch` (`firewall.py`) — intercept raw `tool_result` text
+5. `score_candidates` (`scoring.py`) — TF-IDF relevance scoring
+6. `deduplicate_candidates` (`dedup.py`) — near-duplicate removal
+7. `select_and_pack` (`selection.py`) — budget-aware token selection
+8. `render_context` (`prompt.py`) — final prompt assembly
+
+**Never reorder these stages.** Stages 2 and 4 have hard ordering constraints:
+dependency closure must run before scoring (ancestors must be scoreable), and the
+firewall must run before scoring (summaries, not raw text, must be scored).
+
+## Firewall invariants
+
+- Raw `tool_result` text **never** reaches the prompt. `apply_firewall` replaces
+  `item.text` with a summary and stores the raw bytes in `ArtifactStore`.
+- The artifact handle is always `f"artifact:{item.id}"`.
+- `item.artifact_ref` is set on every firewall-processed item.
+- Do not bypass `apply_firewall_to_batch` or move raw text past stage 4.
+- See `firewall.py` and `docs/agent-context/invariants.md` for full rationale.
+
+## Async-first pattern
+
+- `build()` is `async`. The sync entry point is `build_sync()`, which is a thin
+  `asyncio.run(self.build(...))` wrapper.
+- All new pipeline stages must be `async` with `_sync` wrappers where needed.
+- Do not introduce blocking I/O inside `async` pipeline functions.
+
+## Dependency closure
+
+- `resolve_dependency_closure()` (stage 2) walks `item.parent_id` chains and
+  adds missing ancestors to the candidate list.
+- **Must run before scoring and deduplication.** Removing or skipping it produces
+  incoherent context: tool results appear without their tool calls.
+- Closure count is tracked in `BuildStats.closures_added`.
+
+## `manager.py` size and decomposition
+
+- `manager.py` is currently ~876 lines, which exceeds the ≤300-line module
+  guideline. Decomposition is tracked in dgenio/contextweaver#73 and
+  dgenio/contextweaver#69.
+- Do not add new methods to `ContextManager` until the decomposition is complete.
+- Prefer adding new logic to an existing focused module (e.g. `candidates.py`,
+  `scoring.py`) and calling it from the manager.
+
+## Sensitivity enforcement
+
+- `sensitivity.py` is security-grade code. Changes require extra review scrutiny.
+- Never weaken the default sensitivity floor or default drop action.
+- See `.github/instructions/sensitivity.instructions.md` for full rules.
+
+## Import rules
+
+- Raise custom exceptions from `contextweaver.exceptions`, not bare `ValueError`
+  or `RuntimeError`.
+- Text similarity utilities (`tokenize`, `jaccard`, `TfIdfScorer`) must be
+  imported from `contextweaver._utils` — never duplicated here.
+- Use `from __future__ import annotations` in every source file.
+
+## Related issues
+
+- dgenio/contextweaver#73 — `manager.py` decomposition (large file)
+- dgenio/contextweaver#69 — context pipeline refactor
+- dgenio/contextweaver#63 — context firewall design
diff --git a/.github/instructions/routing.instructions.md b/.github/instructions/routing.instructions.md
new file mode 100644
index 0000000..deffa9d
--- /dev/null
+++ b/.github/instructions/routing.instructions.md
@@ -0,0 +1,87 @@
+---
+applyTo: src/contextweaver/routing/**
+---
+
+# Routing Engine — Agent Instructions
+
+Path-scoped guidance for `src/contextweaver/routing/`. Read before modifying any file here.
+
+## ChoiceGraph validation invariants (`graph.py`)
+
+`ChoiceGraph._validate()` enforces four rules — all must hold at all times:
+
+1. **Root exists** — `root_id` must be present in `_nodes`.
+2. **Children resolve** — every edge destination must exist in `_nodes | _items`.
+3. **No cycles** — `topological_order()` must succeed (raises `GraphBuildError` if not).
+4. **All items reachable** — every item in `_items` must be reachable from `root_id`.
+
+Cycle detection is **eager**: `add_edge()` calls `_creates_cycle()` immediately and
+raises `GraphBuildError` before the edge is persisted. Do not bypass this check.
+
+Serialisation via `from_dict()` rebuilds `children` / `child_types` from `_edges`
+to guarantee consistency — never rely on serialised node metadata for child lists.
+
+## TreeBuilder grouping strategies (`tree.py`)
+
+`TreeBuilder.build()` tries three strategies in priority order:
+
+1. **Namespace grouping** — group by first dot-segment of `item.namespace`;
+   requires ≥ 50 % of items to have a namespace and ≥ 2 groups.
+2. **Jaccard clustering** — farthest-first seeding over `tokenize(_text_repr(item))`;
+   falls back if clustering yields < 2 groups.
+3. **Alphabetical fallback** — sort by `item.name.lower()`, split into even chunks.
+
+The builder is **deterministic**: it sorts items by `item.id` before processing.
+Do not introduce randomness or non-deterministic ordering inside `_build_subtree`.
+
+Every node has at most `max_children` children (default 20). Oversized groups are
+coalesced via `_coalesce_groups()` or re-split before adding edges.
+
+## Router beam-search constraints (`router.py`)
+
+- **Deterministic tie-breaking**: children are sorted `(-score, id)` — descending
+  score, alphabetical ID for ties. Never change this sort key.
+- `confidence_gap` (default 0.15) widens the beam by 1 when rank-1 and rank-2
+  scores differ by less than the gap. Must stay in `[0.0, 1.0]`.
+- Results are ranked `(-score, item_id)` — same determinism guarantee end-to-end.
+- The TF-IDF index is lazily built on first `route()` call via `_ensure_index()`.
+  Items are indexed by sorted `item_id` before non-leaf nodes; do not change order.
+- Fallback scoring (nodes not in TF-IDF index) uses `jaccard()` from
+  `contextweaver._utils` — never duplicate this logic here.
+
+## Catalog invariants (`catalog.py`)
+
+- Item IDs must be unique within a `Catalog`; `register()` raises `CatalogError`
+  on duplicates.
+- `generate_sample_catalog(n, seed=42)` is seeded for reproducibility. The default
+  seed **must not change** — demos and tests depend on deterministic output.
+- `Catalog.hydrate()` returns **shallow copies** of `args_schema`, `examples`, and
+  `constraints`. Callers must not mutate the returned dicts; use `copy.deepcopy`
+  if mutation is needed.
+
+## ChoiceCard constraints (`cards.py`)
+
+- `ChoiceCard` must **never** include a full argument schema. It is a compact,
+  LLM-friendly summary; full schemas are hydrated on demand via `Catalog.hydrate()`.
+- Keep card text representation minimal to avoid consuming prompt tokens.
+
+## Synchronous-only routing
+
+- The entire routing engine is **synchronous** (pure computation, DAG traversal,
+  beam search). Do not introduce `async`/`await` anywhere in `routing/`.
+- The engine has zero runtime dependencies on the context engine — do not import
+  from `contextweaver.context.*` inside `routing/`.
+
+## Import rules
+
+- Raise custom exceptions from `contextweaver.exceptions` (`GraphBuildError`,
+  `RouteError`, `CatalogError`, `ItemNotFoundError`), not bare exceptions.
+- Text similarity (`tokenize`, `jaccard`, `TfIdfScorer`) must come from
+  `contextweaver._utils`.
+- Use `from __future__ import annotations` in every source file.
+
+## Related issues
+
+- dgenio/contextweaver#73 — module size tracking (routing/graph.py is 316 lines)
+- dgenio/contextweaver#69 — routing refactor work
+- dgenio/contextweaver#63 — ChoiceGraph design and validation

From 52c21cd994170675eb63bb497562e5127db7fa33 Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:08:01 +0000
Subject: [PATCH 3/8] fix(docs): correct scoring description from TF-IDF to
 actual factors

Stage 5 in context.instructions.md incorrectly described scoring as
'TF-IDF relevance scoring'. The actual implementation in scoring.py
uses recency + Jaccard token overlap + kind priority + token penalty.
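
For reviewers, a minimal sketch of how four such factors might combine into
one score. This is not the code in scoring.py: the function name, the
weights, and the normalisation choices below are all assumptions made for
illustration; only the four factor names come from the description above.

```python
from typing import AbstractSet


def jaccard(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Jaccard overlap |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sketch_score(
    query_tokens: AbstractSet[str],
    item_tokens: AbstractSet[str],
    age_rank: int,
    total_items: int,
    kind_priority: float,
    token_count: int,
    token_budget: int,
) -> float:
    """Hypothetical weighted sum of the four factors named above."""
    # Assumed weights, chosen only for illustration.
    w_recency, w_overlap, w_kind, w_penalty = 0.3, 0.4, 0.2, 0.1
    # age_rank 0 is the newest item, so it gets the highest recency value.
    recency = 1.0 - age_rank / max(total_items, 1)
    overlap = jaccard(query_tokens, item_tokens)
    # Larger items are penalised, capped at the full budget.
    penalty = min(token_count / max(token_budget, 1), 1.0)
    return (
        w_recency * recency
        + w_overlap * overlap
        + w_kind * kind_priority
        - w_penalty * penalty
    )
```

No TF-IDF index appears anywhere in this sketch: the relevance signal is
plain set overlap between query tokens and item tokens.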

Addresses review comment on PR #146.
---
 .github/instructions/context.instructions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/instructions/context.instructions.md b/.github/instructions/context.instructions.md
index b5e6570..85b3859 100644
--- a/.github/instructions/context.instructions.md
+++ b/.github/instructions/context.instructions.md
@@ -14,7 +14,7 @@ Path-scoped guidance for `src/contextweaver/context/`. Read before modifying any
 2. `resolve_dependency_closure` (`candidates.py`) — pull in parent items via `parent_id`
 3. `apply_sensitivity_filter` (`sensitivity.py`) — drop/redact by sensitivity level
 4. `apply_firewall_to_batch` (`firewall.py`) — intercept raw `tool_result` text
-5. `score_candidates` (`scoring.py`) — TF-IDF relevance scoring
+5. `score_candidates` (`scoring.py`) — recency + Jaccard token overlap + kind priority + token penalty
 6. `deduplicate_candidates` (`dedup.py`) — near-duplicate removal
 7. `select_and_pack` (`selection.py`) — budget-aware token selection
 8. `render_context` (`prompt.py`) — final prompt assembly

From 3a37e19e02364798453e5ae6f01c3bc6fef0baae Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:09:57 +0000
Subject: [PATCH 4/8] fix(docs): correct async/sync structure description in
 context instructions

_build() is the synchronous core; both build() (async) and build_sync()
delegate directly to it. There is no asyncio.run() wrapper. Updated the
async-first pattern section to match the actual implementation.
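
The delegation shape described above can be sketched as follows.
SketchManager is a hypothetical stand-in, not the real ContextManager; only
the three method names and the direct-delegation structure are taken from
this message, and the method body is a placeholder.

```python
import asyncio


class SketchManager:
    """Illustrative stand-in; not the real ContextManager."""

    def _build(self, query: str) -> str:
        # Synchronous core: all pipeline work would happen here.
        return f"context for {query!r}"

    async def build(self, query: str) -> str:
        # Async entry point: delegates directly; nothing is awaited yet.
        return self._build(query)

    def build_sync(self, query: str) -> str:
        # Sync entry point: calls the core directly, without asyncio.run().
        return self._build(query)


manager = SketchManager()
sync_result = manager.build_sync("q")
# asyncio.run() is used here only by the caller, to drive the coroutine.
async_result = asyncio.run(manager.build("q"))
```

One consequence of this shape: a sync wrapper that calls the core directly
can be used from code already running inside an event loop, whereas
asyncio.run() raises RuntimeError when a loop is already running.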

Addresses review comment on PR #146.
---
 .github/instructions/context.instructions.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/.github/instructions/context.instructions.md b/.github/instructions/context.instructions.md
index 85b3859..7b005f5 100644
--- a/.github/instructions/context.instructions.md
+++ b/.github/instructions/context.instructions.md
@@ -34,10 +34,11 @@ firewall must run before scoring (summaries, not raw text, must be scored).
 
 ## Async-first pattern
 
-- `build()` is `async`. The sync entry point is `build_sync()`, which is a thin
-  `asyncio.run(self.build(...))` wrapper.
-- All new pipeline stages must be `async` with `_sync` wrappers where needed.
-- Do not introduce blocking I/O inside `async` pipeline functions.
+- The core pipeline runs in `_build()`, which is **synchronous**. Both `build()`
+  (async) and `build_sync()` (sync) delegate directly to `_build()`.
+- `build()` is `async def` so callers can `await` it today; true async I/O will
+  be added if pipeline stages gain `await`-able steps in the future.
+- Do not wrap `_build()` in `asyncio.run()` — `build_sync()` calls it directly.
 
 ## Dependency closure
 

From 1e1eecce1f4e1f4bdaf7b016984acd459c0cc2da Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:11:42 +0000
Subject: [PATCH 5/8] fix(docs): remove stale graph.py line count from routing
 instructions

Exact line counts go stale with every edit. The issue reference
(#73) already tracks module size, so there is no need to duplicate the count
in the instruction file.

Addresses review comment on PR #146.
---
 .github/instructions/routing.instructions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/instructions/routing.instructions.md b/.github/instructions/routing.instructions.md
index deffa9d..52f264a 100644
--- a/.github/instructions/routing.instructions.md
+++ b/.github/instructions/routing.instructions.md
@@ -82,6 +82,6 @@ coalesced via `_coalesce_groups()` or re-split before adding edges.
 
 ## Related issues
 
-- dgenio/contextweaver#73 — module size tracking (routing/graph.py is 316 lines)
+- dgenio/contextweaver#73 — module size tracking
 - dgenio/contextweaver#69 — routing refactor work
 - dgenio/contextweaver#63 — ChoiceGraph design and validation

From b0b48bc3418cc3b313c135b817dd69e1bce8fb3c Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:14:15 +0000
Subject: [PATCH 6/8] fix(docs): use 'non-item (navigation) nodes' in router
 indexing note

The _ensure_index() filter is 'node_id not in self._items', which
selects non-item nodes (navigation/structural), not non-leaf nodes.
A navigation node can be a leaf, so 'non-leaf' was imprecise.
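
A toy example of the distinction. The graph, node names, and variable names
are hypothetical; only the 'not in items' filter mirrors the expression
quoted above. Sorting the non-item nodes is an assumption of this sketch,
since the instruction text only states that items are sorted by item_id.

```python
# Hypothetical toy graph: node id -> child ids.
children = {
    "root": ["nav_a", "nav_empty"],
    "nav_a": ["item_2", "item_1"],
    "nav_empty": [],  # navigation node without children: a leaf, not an item
    "item_1": [],
    "item_2": [],
}
items = {"item_1", "item_2"}

# The filter quoted above: every node that is not an item.
non_item = sorted(n for n in children if n not in items)

# What 'non-leaf' would select instead: nodes with at least one child.
non_leaf = sorted(n for n in children if children[n])

# Items in sorted id order first, then the non-item (navigation) nodes.
index_order = sorted(items) + non_item
```

Here "nav_empty" is picked up by the non-item filter but would be missed by
a non-leaf filter, which is the imprecision the old wording introduced.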

Addresses review comment on PR #146.
---
 .github/instructions/routing.instructions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/instructions/routing.instructions.md b/.github/instructions/routing.instructions.md
index 52f264a..fbf77dc 100644
--- a/.github/instructions/routing.instructions.md
+++ b/.github/instructions/routing.instructions.md
@@ -45,7 +45,7 @@ coalesced via `_coalesce_groups()` or re-split before adding edges.
   scores differ by less than the gap. Must stay in `[0.0, 1.0]`.
 - Results are ranked `(-score, item_id)` — same determinism guarantee end-to-end.
 - The TF-IDF index is lazily built on first `route()` call via `_ensure_index()`.
-  Items are indexed by sorted `item_id` before non-leaf nodes; do not change order.
+  Items are indexed by sorted `item_id` before non-item (navigation) nodes; do not change order.
 - Fallback scoring (nodes not in TF-IDF index) uses `jaccard()` from
   `contextweaver._utils` — never duplicate this logic here.
 

From be178c58505f7d84ac3e1cbe3e6a70c707f48a25 Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:20:03 +0000
Subject: [PATCH 7/8] fix(docs): note build_call_prompt pair in async-first
 pattern section

Both _build() and _build_call_prompt() follow the same sync-core /
async-wrapper / sync-wrapper delegation pattern. Added a bullet so
agents editing call-prompt methods see the pattern applies there too.

Addresses nit review comment on PR #146.
---
 .github/instructions/context.instructions.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/instructions/context.instructions.md b/.github/instructions/context.instructions.md
index 7b005f5..50cdc85 100644
--- a/.github/instructions/context.instructions.md
+++ b/.github/instructions/context.instructions.md
@@ -39,6 +39,8 @@ firewall must run before scoring (summaries, not raw text, must be scored).
 - `build()` is `async def` so callers can `await` it today; true async I/O will
   be added if pipeline stages gain `await`-able steps in the future.
 - Do not wrap `_build()` in `asyncio.run()` — `build_sync()` calls it directly.
+- The same pattern applies to `_build_call_prompt()` → `build_call_prompt()` /
+  `build_call_prompt_sync()`.
 
 ## Dependency closure
 

From a2671598fa3a9fa0b1a42fbf20dcefc7fb8333ca Mon Sep 17 00:00:00 2001
From: dgenio <diogo.ansantos@nos.pt>
Date: Tue, 10 Mar 2026 06:23:04 +0000
Subject: [PATCH 8/8] docs: add CHANGELOG entry for path-scoped instructions
 (#95)

Adds an entry under [Unreleased] / ### Added for the new context/
and routing/ instruction files.

Addresses PR #146 checklist item.
---
 CHANGELOG.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e4168f3..dd888ac 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- Path-scoped Copilot instructions for `context/` and `routing/` (#95)
+
 ## [0.1.5] - 2026-03-07
 
 ### Added