diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..e7eb004
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,127 @@
+# Agents Guide
+
+> Onboarding for future agents (and humans new to the repo). Read this before touching code so you don't accidentally re-litigate decisions that already shipped.
+
+## Project north star
+
+OpenWorkers is **the multi-agent system that refuses to make things up**. Every claim the system emits is either tied to a verifiable primary source or marked as unsupported. Two domains live in the codebase:
+
+1. **Thesis assistant** (the legacy flagship). Audits literature claims against arXiv / Semantic Scholar / CrossRef. Producing prose is explicitly out of scope — every output is structured JSON.
+2. **Code audit** (the new flagship, in progress). Audits factual claims in technical artefacts (READMEs first, then PRs, compliance docs, architecture docs) against the actual codebase, language specs, and dependencies.
+
+The two domains share the same DNA: planner → researcher → checker → critic pipelines, structured output everywhere, a hard trust gate that *refuses* to verdict without evidence.
+
+## Where things are
+
+```
+core/
+  blackboard/        # Redis-backed shared state (thesis-only for now)
+  orchestrator/
+    thesis_flow.py   # ThesisOrchestrator — legacy, do not break
+    readme_flow.py   # ReadmeAuditOrchestrator — new code-audit slice
+    compiler.py      # PromptCompiler for thesis (blackboard → prompt vars)
+  router/            # Provider tier routing (quality/balanced/cheap)
+  memory/episodic.py # Qdrant episodic memory (thesis)
+  schemas.py         # Thesis Pydantic models
+  schemas_audit.py   # Code-audit Pydantic models (kept separate on purpose)
+  sources/
+    base.py          # SourceAdapter ABC — the new evidence-backend contract
+    local_repo.py    # LocalRepoAdapter — grep over a local repo
+providers/
+  unified.py         # UnifiedLLM: provider fallback, breakers, DRY_RUN path
+  thesis_agents.py   # Thesis agent suite — untouched, keep passing
+  code_audit_agents.py # README planner, checker, critic + trust gate
+  budget.py          # BudgetGuard (contextvars-scoped session ceiling)
+  resilience.py      # Tenacity + pybreaker glue
+prompts/
+  *.md               # Thesis templates (head_planner, specialist_*, ...)
+  code_audit/*.md    # Audit templates (readme_planner, readme_checker, ...)
+tools/mcp/           # Literature MCP tools; will migrate behind SourceAdapter
+apps/
+  cli/main.py        # Single argparse CLI for both `thesis ...` and `audit ...`
+  api/               # FastAPI surface
+  mcp_server/        # MCP stdio server
+  worker/            # Async worker stub
+tests/
+  fixtures/sample_repo/   # Synthetic widgetlib repo for audit tests
+  code_audit/             # New audit tests
+  test_*.py               # Thesis tests — DO NOT regress
+```
+
+## The trust gate (read this twice)
+
+For code audit, the invariant **"no verdict without evidence"** is enforced in code, not in prompts. In `providers/code_audit_agents.py::_enforce_trust_gate`:
+
+```
+for each claim:
+    if retrieved evidence is empty:
+        verdict = "unsupported"
+        confidence = 0.0
+        evidence_paths = []
+        notes = "No supporting evidence found in the repository."
+```
+
+This overwrites whatever the LLM said. A confidently hallucinating checker that returns `verified` for a claim with zero evidence gets corrected before the user ever sees the report. Do **not** move this logic into a prompt. The test `test_readme_audit_end_to_end` in `tests/code_audit/test_readme_flow.py` explicitly seeds a hallucinating checker stub and asserts the override fires.
+
+Mirror this pattern when you add new auditors (PR auditor, compliance auditor, etc.): keep the LLM creative, but enforce the trust invariant in Python.
+
+## The README-audit flow (current slice)
+
+1. **Planner (LLM)** — reads the README, extracts atomic factual claims with verbatim quotes + grep-friendly search hints. Schema: `ReadmeClaimList`.
+2. **Researcher (deterministic Python)** — uses `LocalRepoAdapter.search_any(hints)` to retrieve evidence snippets from the repo. **No LLM call here** — it's just a filesystem grep with safety rails (path traversal guard, file-size cap, dir excludes).
+3. **Checker (LLM + trust gate)** — judges each `(claim, evidence)` pair into `verified | drifted | unsupported | contradicted`. Trust gate runs after.
+4. **Critic (LLM)** — adversarial pass: weak verdicts, missed claims, suggestions.
+
+The audited README is **excluded** from its own evidence pool — otherwise every fabricated claim could "verify itself" against the README quote. See `ReadmeAuditOrchestrator.audit` for the exclusion logic.
+
+## Coexistence rules
+
+- **Do not break the thesis path.** The full thesis test suite (`tests/test_*.py` minus `tests/code_audit/`) must stay green. Thesis is being deprecated *gradually*, not yanked.
+- **Do not modify `core/schemas.py` to add audit fields.** `core/schemas_audit.py` is the audit-domain home. The two domains evolve independently until a real reason to merge appears.
+- **Blackboard is thesis-only for now.** README audit deliberately skips it — claim/evidence flow is plain Python. When a second audit type ships and shared state actually buys something, fold the blackboard in.
+
+## LLM routing & DRY_RUN
+
+- `UnifiedLLM` (in `providers/unified.py`) routes to Anthropic / OpenAI / DeepSeek by tier (quality / balanced / cheap), with per-provider circuit breakers and a fallback chain.
+- Tests run with `DRY_RUN=true` by default (from `Settings`). Under DRY_RUN, `generate()` returns a placeholder JSON shaped from the `response_schema`. **Caveat:** array fields come back empty — useful for "did the wiring work?" smoke checks, useless for end-to-end behaviour.
+- For end-to-end tests of new agent flows, do **not** rely on DRY_RUN. Set `DRY_RUN=false`, set the `THESIS_<TIER>_PROVIDER` / `_MODEL` env vars, and stub LLM responses via `UnifiedLLM.set_generate_fn(...)`. Make the stub content-aware: route on which agent's system prompt is in play. See `tests/code_audit/test_readme_flow.py::_make_stub_unified` for the pattern.
+
+## Conventions
+
+- **No prose generation.** Every agent emits structured JSON validated against a Pydantic model. If you find yourself prompting for "a paragraph that summarises…", stop and write a schema instead.
+- **Structured output schemas** are derived from Pydantic models via `_schema_for()` (one per file: see `providers/thesis_agents.py` and `providers/code_audit_agents.py`).
+- **`from __future__ import annotations`** at the top of every new file. The project targets Python 3.9 but uses 3.10+ annotation syntax (for example `X | None`) via deferred evaluation.
+- **Lint stack:** `ruff` + `black --line-length=100`, both run in CI. New files must pass both. `mypy` strict is enforced on `core/` and `providers/`.
+- **Comments:** non-obvious *why* only. No "this function reads a file"–style narration. Comments explaining hidden invariants, past incidents, or workarounds are welcome.
+- **Commit hygiene:** no `--no-verify`, no skipping hooks. Pre-commit hook failures are diagnostic signals, not nuisances to bypass.
+
+## Where the project is going (1.0 trajectory)
+
+See `ROADMAP.md` for the full picture. Short version:
+
+- ✅ Slice 1 (shipped): README auditor.
+- 🚧 Next slices: PR auditor (`audit pr <url>`), compliance auditor, architecture auditor. All slot in behind the same `SourceAdapter` + agent-suite + trust-gate pattern.
+- 🚧 Layered source adapters: repo (highest trust) → language specs / RFCs → dependency source. The literature MCP tools will migrate behind the same contract.
+- 🚧 Cherry-picked from the v1.0 plan: tool/source registry, light provider-registry abstraction (Ollama later for local inference on private repos), structlog audit trail.
+- ⏸️ Deferred: PyPI packaging, Typer CLI rewrite, OTel, smart truncation, Ollama. Not blocking the audit-track expansion.
+
+The thesis pipeline stays first-class through the transition, then is gradually deprecated as code-audit reaches feature parity.
+
+## How to add a new auditor (recipe)
+
+1. **Schema** — add audit-specific Pydantic models to `core/schemas_audit.py` (claim shape, verdict shape, report shape).
+2. **Source adapter** — if a new evidence backend is needed (e.g., GitHub PR adapter), add a class to `core/sources/` implementing `SourceAdapter`. Keep the path-traversal / scope guard at the adapter boundary.
+3. **Agents** — add `<Domain>PlannerAgent`, `<Domain>CheckerAgent`, `AuditCriticAgent` (or reuse) to `providers/code_audit_agents.py` or a sibling module. The checker's post-LLM step **must** call a trust gate equivalent to `_enforce_trust_gate`.
+4. **Orchestrator** — add `core/orchestrator/<domain>_flow.py` following `readme_flow.py`. Stage order is planner → deterministic researcher → checker (+ gate) → critic. Exclude the audited artefact from its own evidence pool if applicable.
+5. **Prompts** — add `prompts/code_audit/<domain>_*.md` templates with explicit JSON schemas in the body and "no prose, no markdown fences" rules.
+6. **CLI** — register a new subcommand under `audit` in `apps/cli/main.py`.
+7. **Fixture + test** — add a `tests/fixtures/<domain>_*/` repo and a `tests/code_audit/test_<domain>_flow.py` with at least: an adapter-level test, an end-to-end test asserting verdict distribution, and a trust-gate test asserting that a hallucinating checker stub is overridden.
+8. **Docs** — update README's "Code audit" section and `ROADMAP.md`. Add a `CHANGELOG.md` entry under `[Unreleased]`.
+
+## Things that look like good ideas but aren't
+
+- **Letting the LLM verdict without evidence.** No matter how good the model, the trust gate stays. The whole product premise rides on this.
+- **Premature shared base class for orchestrators.** Wait until the *third* auditor lands before extracting `BaseAuditFlow`. Two examples aren't a pattern; three are.
+- **Sharing `core/schemas.py` between domains.** Keep `schemas_audit.py` separate. Merging the two later is cheap; untangling cross-domain field coupling is expensive.
+- **Adding the README to its own evidence pool.** Already burned us once. Self-evidence makes hallucinations verify themselves.
+- **Skipping `from __future__ import annotations` because "we're on 3.13 locally".** CI runs on 3.9.
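Reviewer note on the trust-gate section of `AGENTS.md` above: the pseudocode there can be sketched as runnable Python. This is a minimal illustration under stated assumptions, not the actual `_enforce_trust_gate` from `providers/code_audit_agents.py`. The `Verdict` dataclass and the `evidence_by_claim` mapping are hypothetical stand-ins for `ClaimVerdict` and `ClaimEvidence`, and the real function's signature may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Verdict:
    """Hypothetical stand-in for the project's ``ClaimVerdict`` model."""

    claim_id: str
    verdict: str
    confidence: float = 0.0
    evidence_paths: list[str] = field(default_factory=list)
    notes: str = ""


def enforce_trust_gate(
    verdicts: list[Verdict],
    evidence_by_claim: dict[str, list[str]],
) -> list[Verdict]:
    """Force ``unsupported`` on every claim with no retrieved evidence."""
    gated: list[Verdict] = []
    for v in verdicts:
        if not evidence_by_claim.get(v.claim_id):
            # Overwrite whatever the LLM said: no evidence, no verdict.
            v = Verdict(
                claim_id=v.claim_id,
                verdict="unsupported",
                confidence=0.0,
                evidence_paths=[],
                notes="No supporting evidence found in the repository.",
            )
        gated.append(v)
    return gated


# A checker that "verifies" c2 even though the researcher found nothing for it.
llm_output = [
    Verdict("c1", "verified", 0.95, ["src/a.py"]),
    Verdict("c2", "verified", 0.99, ["made/up.py"]),
]
gated = enforce_trust_gate(llm_output, {"c1": ["src/a.py:10"], "c2": []})
```

The gate runs after the LLM call and keys only on whether evidence was retrieved, so a prompt change cannot weaken it.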
diff --git a/CHANGELOG.md b/CHANGELOG.md
index a1e78cb..5e708e4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,8 @@ All notable changes to OpenWorkers are documented here. The format is loosely ba
 ## [Unreleased]
 
 ### Added
+- **Code-audit track: README auditor (first slice).** New `openworkers audit readme <repo>` CLI subcommand verifies every factual claim in a README against the actual repository, emitting `verified | drifted | unsupported | contradicted` verdicts with cited file paths. Pipeline: planner (LLM) → researcher (deterministic grep via new `LocalRepoAdapter`) → checker (LLM + post-LLM trust gate) → critic (LLM adversarial pass). New modules: `core/sources/` (`SourceAdapter` ABC + `LocalRepoAdapter`), `core/schemas_audit.py` (Pydantic audit models), `core/orchestrator/readme_flow.py` (`ReadmeAuditOrchestrator`), `providers/code_audit_agents.py` (planner / checker / critic + `_enforce_trust_gate` invariant), `prompts/code_audit/*.md` (audit templates). The trust gate is enforced in code, not delegated to prompts: any claim with no retrieved evidence is forced to `unsupported` regardless of LLM output. The audited README is excluded from its own evidence pool so fabricated claims cannot self-verify. `tests/code_audit/test_readme_flow.py` exercises the full flow with a stubbed `UnifiedLLM.generate_fn` and a `tests/fixtures/sample_repo/` fixture containing a deliberate mix of verified / drifted / contradicted / fabricated claims. Thesis pipeline untouched.
+- **Contributor onboarding doc** `AGENTS.md` capturing project DNA, code-audit slice design, trust-gate invariant, conventions, and the recipe for adding new auditors.
 - **RAG over user PDFs** (first incremental v1.0 slice). New `tools/mcp/rag.py` with sentence-aware chunker, `RAGIndexer` (PDF/text → Qdrant via PyMuPDF + FastEmbed `BAAI/bge-small-en-v1.5`), and `RAGSearchTool` (registered as `rag_search` in `ToolRegistry`). Collections namespaced under `rag_*` so they cannot collide with `thesis_corpus` or `episodes`. New CLI: `thesis ingest add|list|delete`. New flag: `thesis research ... --rag-collection <name>` makes the researcher pull from the user collection alongside arXiv/SS. New field: `ResearchContext.rag_collection`. `tests/test_rag.py` covers chunking edge cases, BOM/text extraction, collection naming, indexer round-trip, privacy gating, and idempotent re-ingest.
 
 ### Documentation
diff --git a/README.md b/README.md
index 5261b80..1864570 100644
--- a/README.md
+++ b/README.md
@@ -6,9 +6,29 @@
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![Lint: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
 
-A multi-agent thesis assistant that searches real literature, audits citations, and produces structured critiques. **It does not write prose.** It runs a hierarchical pipeline (HEAD planner → researcher → checker → synthesizer → critic → HEAD supervisor) over a Redis blackboard, with provider-agnostic LLM routing across Anthropic, OpenAI, and DeepSeek and verified citations via arXiv, Semantic Scholar, and CrossRef.
+A multi-agent system that **refuses to make things up**. Two domains live here today:
 
-> **Project status:** 0.1.0 (pre-release). The pipeline runs end-to-end and ships an MCP server, a CLI, and a FastAPI app, but APIs may shift before 1.0. See [ROADMAP.md](ROADMAP.md) for the planned 1.0 direction.
+- **Thesis assistant** — audits literature claims against arXiv / Semantic Scholar / CrossRef.
+- **Code audit** *(new flagship, in progress)* — audits factual claims in technical artefacts (READMEs first, then PRs / compliance docs / architecture docs) against the actual codebase, language specs, and dependencies.
+
+Both domains share the same DNA: a hierarchical pipeline (planner → researcher → checker → critic) producing structured JSON, with provider-agnostic LLM routing across Anthropic, OpenAI, and DeepSeek and a hard trust gate that **refuses to verdict without evidence**.
+
+> **Project status:** 0.1.0 (pre-release). The thesis pipeline runs end-to-end; the code-audit track has landed its first slice (`openworkers audit readme <repo>`). APIs may shift before 1.0. See [ROADMAP.md](ROADMAP.md) for direction, [AGENTS.md](AGENTS.md) for contributor context.
+
+## Code audit *(new track)*
+
+`openworkers audit readme <repo>` extracts every factual claim from a README and verdicts each one against the actual repository:
+
+| Verdict | Meaning |
+|---|---|
+| `verified` | Code clearly demonstrates the claim is true |
+| `drifted` | A related but divergent implementation exists (renamed flag, changed default, etc.) |
+| `contradicted` | Code directly disproves the claim |
+| `unsupported` | No evidence in the repo — enforced in code, not delegated to the LLM |
+
+The pipeline is planner (LLM extracts claims) → researcher (deterministic grep via `LocalRepoAdapter`) → checker (LLM judges + trust gate forces `unsupported` when evidence is empty) → critic (adversarial pass). The audited README is excluded from its own evidence pool, so fabricated claims cannot verify themselves.
+
+Roadmap for this track: PR auditor (PR description vs. diff), compliance auditor (security/policy claims vs. code), architecture auditor (design doc vs. implementation). See [AGENTS.md](AGENTS.md) for the contributor recipe.
 
 ## What it does
 
diff --git a/ROADMAP.md b/ROADMAP.md
index 178840f..1a6ef3d 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -19,6 +19,19 @@
 - ✅ Docker Compose stack (Redis, Qdrant, CLI, MCP) and CI matrix on Python 3.9 / 3.12
 - ✅ **RAG over user PDFs** — `thesis ingest add paper.pdf --collection my_papers` chunks + embeds via FastEmbed (BAAI/bge-small-en-v1.5) into Qdrant; the researcher transparently retrieves from the user collection when `thesis research ... --rag-collection my_papers` is set. Collections are namespaced under `rag_*` so they cannot collide with the thesis corpus or episodic memory.
 
+## Code-audit track (new flagship)
+
+A second domain alongside the thesis assistant: audit factual claims in technical artefacts against the codebase. Same DNA — multi-agent, structured JSON, never fabricates, trust gate refuses verdicts without evidence — applied to a domain where trustworthy automated review matters to OSS maintainers and contributors. See [AGENTS.md](AGENTS.md) for the contributor recipe and trust-gate invariant.
+
+- ✅ **README auditor** *(first slice, shipped)*. `openworkers audit readme <repo>` extracts atomic claims from a README and verdicts each one against the actual codebase as `verified | drifted | unsupported | contradicted`. Trust gate is enforced in `providers/code_audit_agents.py::_enforce_trust_gate`, not in prompts. The audited README is excluded from its own evidence pool. New `SourceAdapter` abstraction (`core/sources/`) with `LocalRepoAdapter`.
+- 🚧 **PR auditor** — `openworkers audit pr <url>`. Verify the PR description against the actual diff; flag scope creep, missing tests, undocumented changes. Needs a `GitHubAdapter` implementing `SourceAdapter`.
+- 🚧 **Compliance auditor** — `openworkers audit compliance <repo>`. Verify security/policy claims ("inputs sanitized", "no secrets", "auth required on X") against the code.
+- 🚧 **Architecture auditor** — verify RFC / design-doc claims against implementation, language specs, and dependency source.
+- 🚧 **Layered source adapters** — repo (highest trust) → language specs / RFCs (`SpecAdapter`) → dependency source (`DependencyAdapter`). The existing literature MCP tools (arXiv / Semantic Scholar / CrossRef) will migrate behind the same `SourceAdapter` contract.
+- 📋 **Tool/source registry** — `@register_source(...)` decorator + entry-point discovery so users add their own evidence backends without forking.
+- 📋 **Local-inference provider** — Ollama path so users can audit private/proprietary repos without sending code to a cloud LLM.
+- 🚧 **Gradual thesis deprecation** — the thesis pipeline stays first-class through the transition; deprecation happens after the audit track reaches feature parity.
+
 ## Proposed for 1.0
 
 The 1.0 line targets a polished, packaged release on PyPI. The themes:
diff --git a/apps/cli/main.py b/apps/cli/main.py
index b1dcf46..5998922 100644
--- a/apps/cli/main.py
+++ b/apps/cli/main.py
@@ -9,6 +9,7 @@
     format_session_text,
 )
 from core.memory.episodic import EpisodicMemory
+from core.orchestrator.readme_flow import ReadmeAuditOrchestrator, format_report_text
 from core.orchestrator.thesis_flow import ThesisOrchestrator
 from core.router.engine import Router
 from core.schemas import ResearchContext
@@ -230,6 +231,29 @@ async def cmd_corpus(args):
     return summary
 
 
+async def cmd_audit_dispatch(args):
+    """Route `audit <subcommand>` to its handler."""
+    if args.audit_action == "readme":
+        return await cmd_audit_readme(args)
+    raise SystemExit(f"Unknown audit action: {args.audit_action}")
+
+
+async def cmd_audit_readme(args):
+    """Run the README auditor on a local repo."""
+    unified = create_unified_llm()
+    orch = ReadmeAuditOrchestrator(unified=unified)
+    report, critique = await orch.audit(repo_path=args.repo, readme_path=args.readme)
+    if args.format == "json":
+        payload = {
+            "report": report.model_dump(),
+            "critique": critique.model_dump(),
+        }
+        _output(payload, "json", args.output)
+    else:
+        _output(format_report_text(report, critique), "text", args.output)
+    return report
+
+
 async def cmd_ingest(args):
     from tools.mcp.rag import RAGIndexer
 
@@ -355,6 +379,26 @@ def build_parser() -> argparse.ArgumentParser:
     p_corpus.add_argument("--year", type=int, default=0, help="Year of publication")
     add_output_args(p_corpus)
 
+    p_audit = sub.add_parser("audit", help="Audit technical artefacts against the codebase")
+    audit_sub = p_audit.add_subparsers(dest="audit_action", required=True)
+
+    p_audit_readme = audit_sub.add_parser(
+        "readme",
+        help="Verify every factual claim in a README against the repository",
+    )
+    p_audit_readme.add_argument(
+        "repo",
+        type=str,
+        help="Path to the repository to audit",
+    )
+    p_audit_readme.add_argument(
+        "--readme",
+        type=str,
+        default=None,
+        help="Explicit README path (default: auto-discover under <repo>)",
+    )
+    add_output_args(p_audit_readme)
+
     p_ingest = sub.add_parser("ingest", help="Manage user RAG collections (PDF/text -> Qdrant)")
     ingest_sub = p_ingest.add_subparsers(dest="ingest_action", required=True)
 
@@ -402,6 +446,7 @@ def main():
         "sessions": cmd_sessions,
         "corpus": cmd_corpus,
         "ingest": cmd_ingest,
+        "audit": cmd_audit_dispatch,
     }
 
     handler = command_map.get(args.command)
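Reviewer note on the `apps/cli/main.py` changes above: the nested `audit readme` parser can be exercised in isolation with the sketch below. It mirrors the `add_subparsers(dest="audit_action", required=True)` wiring from the diff. The `--format` and `--output` flags are assumptions standing in for whatever `add_output_args` registers, since that helper is not shown in this diff.

```python
import argparse


def build_audit_parser() -> argparse.ArgumentParser:
    """Standalone sketch of the `audit readme` subcommand wiring."""
    parser = argparse.ArgumentParser(prog="openworkers")
    sub = parser.add_subparsers(dest="command", required=True)

    p_audit = sub.add_parser("audit", help="Audit technical artefacts against the codebase")
    audit_sub = p_audit.add_subparsers(dest="audit_action", required=True)

    p_readme = audit_sub.add_parser("readme", help="Verify README claims against the repo")
    p_readme.add_argument("repo", type=str, help="Path to the repository to audit")
    p_readme.add_argument("--readme", type=str, default=None, help="Explicit README path")
    # Assumed flags: stand-ins for the project's add_output_args helper.
    p_readme.add_argument("--format", choices=["text", "json"], default="text")
    p_readme.add_argument("--output", default=None)
    return parser


args = build_audit_parser().parse_args(["audit", "readme", "./my-repo", "--format", "json"])
```

Because both subparser levels are `required=True`, a bare `openworkers audit` exits with a usage error from argparse, so the `raise SystemExit` branch in `cmd_audit_dispatch` only fires for an action that is registered in the parser but has no handler yet.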
diff --git a/core/orchestrator/readme_flow.py b/core/orchestrator/readme_flow.py
new file mode 100644
index 0000000..503419c
--- /dev/null
+++ b/core/orchestrator/readme_flow.py
@@ -0,0 +1,268 @@
+"""README audit orchestrator.
+
+Parallels ``ThesisOrchestrator`` in spirit (planner → researcher →
+checker → critic) but with three deliberate differences:
+
+1. The researcher is **deterministic Python**, not an LLM. Evidence
+   retrieval is a grep over the local repo via ``LocalRepoAdapter`` —
+   no fabrication risk, no per-claim API cost.
+
+2. The trustworthiness gate is enforced in code (``_enforce_trust_gate``
+   in ``providers/code_audit_agents.py``), not entrusted to a prompt.
+
+3. There is no shared blackboard yet — claim/evidence state flows as
+   plain Python between agents. The blackboard layer will fold in
+   when a second audit type (PR / compliance) is added and the
+   shared state actually buys something.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import logging
+import os
+import time
+from collections import Counter
+from pathlib import Path
+from typing import Any, Callable
+
+from core.schemas_audit import (
+    ALL_VERDICTS,
+    VERDICT_UNSUPPORTED,
+    AuditCritique,
+    AuditReport,
+    ClaimEvidence,
+    ClaimVerdict,
+    EvidenceRef,
+    ReadmeClaim,
+    ReadmeClaimList,
+)
+from core.sources.local_repo import LocalRepoAdapter
+from providers.code_audit_agents import (
+    AuditCriticAgent,
+    ReadmeCheckerAgent,
+    ReadmePlannerAgent,
+)
+from providers.unified import UnifiedLLM
+
+logger = logging.getLogger(__name__)
+
+
+_AUDIT_PROMPT_DIR = os.path.join(
+    os.path.dirname(os.path.dirname(os.path.dirname(__file__))),
+    "prompts",
+    "code_audit",
+)
+
+_TEMPLATE_FILES = {
+    "readme_planner": "readme_planner.md",
+    "readme_checker": "readme_checker.md",
+    "audit_critic": "audit_critic.md",
+}
+
+
+def _render_audit_prompt(name: str, variables: dict[str, Any]) -> str:
+    """Tiny placeholder-substitution renderer for audit prompts.
+
+    Deliberately not reusing PromptCompiler: that compiler is wired to
+    extract blackboard state, which this slice doesn't use. Audit
+    templates only need ``{{ var }}`` substitution.
+    """
+    filename = _TEMPLATE_FILES.get(name)
+    if not filename:
+        raise ValueError(f"Unknown audit template: {name}")
+    path = os.path.join(_AUDIT_PROMPT_DIR, filename)
+    try:
+        with open(path, encoding="utf-8") as f:
+            template = f.read()
+    except OSError:
+        return f"[Template {name} not found at {path}]"
+    for key, value in variables.items():
+        template = template.replace("{{ " + key + " }}", str(value))
+    return template
+
+
+class ReadmeAuditOrchestrator:
+    """Run a README audit end-to-end against a local repo."""
+
+    def __init__(
+        self,
+        unified: UnifiedLLM,
+        adapter: LocalRepoAdapter | None = None,
+        prompt_renderer: Callable[[str, dict[str, Any]], str] | None = None,
+        max_evidence_per_claim: int = 5,
+    ) -> None:
+        self.unified = unified
+        self.adapter = adapter
+        self.render = prompt_renderer or _render_audit_prompt
+        self.max_evidence_per_claim = max_evidence_per_claim
+        self.planner = ReadmePlannerAgent(unified=unified, prompt_renderer=self.render)
+        self.checker = ReadmeCheckerAgent(unified=unified, prompt_renderer=self.render)
+        self.critic = AuditCriticAgent(unified=unified, prompt_renderer=self.render)
+
+    async def audit(
+        self,
+        repo_path: Path | str,
+        readme_path: Path | str | None = None,
+    ) -> tuple[AuditReport, AuditCritique]:
+        start = time.time()
+        # Use the injected adapter when one was passed to ``__init__``;
+        # otherwise build a fresh adapter for this call. The default
+        # adapter is deliberately not cached on ``self``: caching it
+        # would pin the orchestrator to the first repo it audited, and
+        # a later ``audit()`` call with a different ``repo_path`` would
+        # silently search the wrong tree.
+        adapter = self.adapter or LocalRepoAdapter(repo_path)
+
+        readme_file = Path(readme_path) if readme_path else adapter.find_readme()
+        errors: list[str] = []
+        if readme_file is None or not Path(readme_file).is_file():
+            empty = AuditReport(
+                repo_path=str(adapter.root),
+                readme_path="",
+                verdicts=[],
+                summary=dict.fromkeys(ALL_VERDICTS, 0),
+                errors=["No README found in repo."],
+            )
+            return empty, AuditCritique()
+
+        readme_path_str = str(Path(readme_file).resolve())
+        readme_text = Path(readme_file).read_text(encoding="utf-8", errors="replace")
+        # The README under audit must not count as evidence for its own
+        # claims — otherwise every fabricated claim 'verifies' itself.
+        try:
+            readme_rel = str(Path(readme_path_str).resolve().relative_to(adapter.root))
+        except ValueError:
+            readme_rel = ""
+
+        # ── Stage 1: Planner extracts claims ──
+        try:
+            planner_result = await self.planner.execute(readme_text, readme_path_str)
+            claim_list: ReadmeClaimList = planner_result["output"]
+        except Exception as e:
+            errors.append(f"planner: {e}")
+            claim_list = ReadmeClaimList(claims=[], readme_path=readme_path_str)
+
+        # ── Stage 2: Researcher retrieves evidence (deterministic) ──
+        evidence = await asyncio.to_thread(
+            self._retrieve_all_evidence, claim_list.claims, adapter, readme_rel
+        )
+
+        # ── Stage 3: Checker renders verdicts (with trust gate) ──
+        if claim_list.claims:
+            try:
+                checker_result = await self.checker.execute(claim_list.claims, evidence)
+                verdicts: list[ClaimVerdict] = list(checker_result["output"].verdicts)
+            except Exception as e:
+                errors.append(f"checker: {e}")
+                verdicts = [
+                    ClaimVerdict(
+                        claim_id=c.claim_id,
+                        claim_text=c.claim_text,
+                        verdict=VERDICT_UNSUPPORTED,
+                        confidence=0.0,
+                        evidence_paths=[],
+                        notes=f"Checker failed: {e}",
+                    )
+                    for c in claim_list.claims
+                ]
+        else:
+            verdicts = []
+
+        # ── Stage 4: Critic adversarial pass ──
+        try:
+            critic_result = await self.critic.execute(verdicts, readme_text)
+            critique: AuditCritique = critic_result["output"]
+        except Exception as e:
+            errors.append(f"critic: {e}")
+            critique = AuditCritique()
+
+        summary = Counter(v.verdict for v in verdicts)
+        report = AuditReport(
+            repo_path=str(adapter.root),
+            readme_path=readme_path_str,
+            verdicts=verdicts,
+            summary={v: int(summary.get(v, 0)) for v in ALL_VERDICTS},
+            errors=errors,
+        )
+        elapsed_ms = int((time.time() - start) * 1000)
+        logger.info(
+            "readme_audit completed repo=%s claims=%d elapsed_ms=%d errors=%d",
+            adapter.root,
+            len(verdicts),
+            elapsed_ms,
+            len(errors),
+        )
+        return report, critique
+
+    def _retrieve_all_evidence(
+        self,
+        claims: list[ReadmeClaim],
+        adapter: LocalRepoAdapter,
+        exclude_path: str = "",
+    ) -> list[ClaimEvidence]:
+        out: list[ClaimEvidence] = []
+        for claim in claims:
+            # Over-fetch then post-filter so excluding the README doesn't
+            # silently shrink the result set below ``max_evidence_per_claim``.
+            raw = adapter.search_any(
+                claim.search_hints,
+                limit=(
+                    self.max_evidence_per_claim * 2 if exclude_path else self.max_evidence_per_claim
+                ),
+            )
+            filtered = [s for s in raw if s.path != exclude_path][: self.max_evidence_per_claim]
+            refs = [
+                EvidenceRef(
+                    path=s.path,
+                    line_start=s.line_start,
+                    line_end=s.line_end,
+                    text=s.text,
+                    source=s.source,
+                )
+                for s in filtered
+            ]
+            out.append(ClaimEvidence(claim_id=claim.claim_id, snippets=refs))
+        return out
+
+
+def format_report_text(report: AuditReport, critique: AuditCritique | None = None) -> str:
+    """Pretty-print an audit report for terminal output."""
+    lines: list[str] = []
+    lines.append(f"README audit — {report.repo_path}")
+    lines.append(f"README: {report.readme_path or '(not found)'}")
+    lines.append("")
+    if report.summary:
+        summary_line = "  ".join(f"{k}={v}" for k, v in report.summary.items())
+        lines.append(f"Summary: {summary_line}")
+        lines.append("")
+    for v in report.verdicts:
+        marker = {
+            "verified": "✓",
+            "drifted": "≠",
+            "contradicted": "✗",
+            "unsupported": "?",
+        }.get(v.verdict, "·")
+        lines.append(f"{marker} [{v.verdict.upper():12s}] {v.claim_id}: {v.claim_text}")
+        if v.evidence_paths:
+            for p in v.evidence_paths:
+                lines.append(f"      ↳ {p}")
+        if v.notes:
+            lines.append(f"      note: {v.notes}")
+    if report.errors:
+        lines.append("")
+        lines.append("Errors:")
+        for e in report.errors:
+            lines.append(f"  - {e}")
+    if critique is not None:
+        lines.append("")
+        lines.append("Critic pass:")
+        for wv in critique.weak_verdicts:
+            lines.append(f"  weak:    {wv}")
+        for mc in critique.missed_claims:
+            lines.append(f"  missed:  {mc}")
+        for sg in critique.suggestions:
+            lines.append(f"  suggest: {sg}")
+        if critique.overall_assessment:
+            lines.append(f"  → {critique.overall_assessment}")
+    return "\n".join(lines)
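Reviewer note on `_retrieve_all_evidence` above: the over-fetch-then-filter step is easier to reason about in isolation. The sketch below is a simplified illustration, not the project's code. `Snippet` is a hypothetical stand-in for the adapter's snippet type, and a plain list slice stands in for `adapter.search_any(hints, limit=...)`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Snippet:
    """Hypothetical stand-in for the adapter's evidence snippet."""

    path: str
    text: str


def retrieve_excluding(
    ranked_hits: list[Snippet],
    exclude_path: str,
    limit: int,
) -> list[Snippet]:
    """Fetch up to 2x ``limit`` hits, drop the excluded file, then cap at ``limit``."""
    fetch = limit * 2 if exclude_path else limit
    raw = ranked_hits[:fetch]  # stands in for adapter.search_any(hints, limit=fetch)
    return [s for s in raw if s.path != exclude_path][:limit]


hits = [
    Snippet("README.md", "supports --fast mode"),
    Snippet("README.md", "pass --fast to enable"),
    Snippet("src/cli.py", "parser.add_argument('--fast')"),
    Snippet("src/core.py", "if opts.fast:"),
    Snippet("tests/test_cli.py", "run(['--fast'])"),
    Snippet("docs/usage.md", "--fast"),
]
evidence = retrieve_excluding(hits, exclude_path="README.md", limit=3)
paths = [s.path for s in evidence]
```

Without the over-fetch, the first three hits would include two README snippets and only one usable result would remain after filtering. The 2x factor is a heuristic: if the excluded file accounts for more than `limit` of the raw hits, the result can still come up short.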
diff --git a/core/schemas_audit.py b/core/schemas_audit.py
new file mode 100644
index 0000000..577b195
--- /dev/null
+++ b/core/schemas_audit.py
@@ -0,0 +1,89 @@
+"""Pydantic schemas for the code-audit domain.
+
+Kept separate from ``core/schemas.py`` so the legacy thesis types and the
+new audit types evolve independently while the two domains coexist.
+"""
+
+from __future__ import annotations
+
+from pydantic import BaseModel, Field
+
+VERDICT_VERIFIED = "verified"
+VERDICT_DRIFTED = "drifted"
+VERDICT_UNSUPPORTED = "unsupported"
+VERDICT_CONTRADICTED = "contradicted"
+
+ALL_VERDICTS = (
+    VERDICT_VERIFIED,
+    VERDICT_DRIFTED,
+    VERDICT_UNSUPPORTED,
+    VERDICT_CONTRADICTED,
+)
+
+
+class ReadmeClaim(BaseModel):
+    """A single atomic factual claim extracted from a README."""
+
+    claim_id: str
+    claim_text: str = Field(description="Verbatim quote from the README")
+    claim_type: str = Field(
+        default="other",
+        description="feature | install | usage | requirement | metric | api | other",
+    )
+    search_hints: list[str] = Field(
+        default_factory=list,
+        description="Tokens/identifiers the researcher should grep for",
+    )
+
+
+class ReadmeClaimList(BaseModel):
+    claims: list[ReadmeClaim] = Field(default_factory=list)
+    readme_path: str = ""
+
+
+class EvidenceRef(BaseModel):
+    """Adapter-agnostic citation handle. Mirrors ``EvidenceSnippet``
+    in shape but is the JSON-serialisable form passed across the
+    LLM boundary.
+    """
+
+    path: str
+    line_start: int = 0
+    line_end: int = 0
+    text: str = ""
+    source: str = ""
+
+
+class ClaimEvidence(BaseModel):
+    claim_id: str
+    snippets: list[EvidenceRef] = Field(default_factory=list)
+
+
+class ClaimVerdict(BaseModel):
+    claim_id: str
+    claim_text: str = ""
+    verdict: str = Field(description="verified | drifted | unsupported | contradicted")
+    confidence: float = 0.0
+    evidence_paths: list[str] = Field(default_factory=list)
+    notes: str = ""
+
+
+class ClaimVerdictList(BaseModel):
+    """LLM-side wrapper so the checker emits a single JSON object."""
+
+    verdicts: list[ClaimVerdict] = Field(default_factory=list)
+
+
+class AuditReport(BaseModel):
+    repo_path: str
+    readme_path: str = ""
+    verdicts: list[ClaimVerdict] = Field(default_factory=list)
+    summary: dict[str, int] = Field(default_factory=dict)
+    errors: list[str] = Field(default_factory=list)
+
+
+class AuditCritique(BaseModel):
+    weak_verdicts: list[str] = Field(default_factory=list)
+    missed_claims: list[str] = Field(default_factory=list)
+    suggestions: list[str] = Field(default_factory=list)
+    overall_assessment: str = ""
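Reviewer note: `AuditReport.summary` is typed as `dict[str, int]`, but nothing in this file shows how it gets filled. Below is a minimal sketch, assuming it is meant to hold per-verdict counts. `summarise` and the `Verdict` stand-in are hypothetical names used only for illustration; they are not part of this diff.

```python
from collections import Counter
from dataclasses import dataclass

# Verdict labels copied from core/schemas_audit.py.
ALL_VERDICTS = ("verified", "drifted", "unsupported", "contradicted")


@dataclass
class Verdict:
    """Hypothetical stand-in for ClaimVerdict, so the sketch has no dependencies."""

    claim_id: str
    verdict: str


def summarise(verdicts: list[Verdict]) -> dict[str, int]:
    """Count verdicts per label, emitting every known label (zeros included)."""
    counts = Counter(v.verdict for v in verdicts)
    return {label: counts.get(label, 0) for label in ALL_VERDICTS}
```

Emitting zero counts for unused labels keeps the `Summary:` line in the text report a fixed shape from run to run, which may make diffs between audits easier to read.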
diff --git a/core/sources/__init__.py b/core/sources/__init__.py
new file mode 100644
index 0000000..c384f96
--- /dev/null
+++ b/core/sources/__init__.py
@@ -0,0 +1,13 @@
+"""SourceAdapter layer: pluggable evidence retrieval for audit pipelines.
+
+Each adapter implements a uniform contract (``search`` and ``fetch``, both
+returning citable ``EvidenceSnippet`` objects) so domain flows can compose
+evidence from heterogeneous sources without hardcoding their backend. The
+literature-domain tools under ``tools/mcp/`` will migrate behind this contract
+in a later slice; the README-audit slice ships only the local-repo adapter.
+"""
+
+from core.sources.base import EvidenceSnippet, SourceAdapter
+from core.sources.local_repo import LocalRepoAdapter
+
+__all__ = ["EvidenceSnippet", "LocalRepoAdapter", "SourceAdapter"]
diff --git a/core/sources/base.py b/core/sources/base.py
new file mode 100644
index 0000000..001519f
--- /dev/null
+++ b/core/sources/base.py
@@ -0,0 +1,51 @@
+from __future__ import annotations
+
+from abc import ABC, abstractmethod
+from dataclasses import dataclass
+
+
+@dataclass
+class EvidenceSnippet:
+    """A single piece of evidence retrieved from a source.
+
+    ``path`` is opaque to the adapter contract — it might be a file path,
+    a URL, a DOI, or a chunk id. The synthesizer treats it as a citation
+    handle: anything a human can navigate back to.
+    """
+
+    path: str
+    line_start: int
+    line_end: int
+    text: str
+    source: str = ""
+
+    def cite(self) -> str:
+        if self.line_start <= 0:
+            return self.path
+        if self.line_end and self.line_end != self.line_start:
+            return f"{self.path}:{self.line_start}-{self.line_end}"
+        return f"{self.path}:{self.line_start}"
+
+
+class SourceAdapter(ABC):
+    """Abstract contract every evidence backend implements."""
+
+    name: str = "unknown"
+
+    @abstractmethod
+    def search(self, query: str, limit: int = 5) -> list[EvidenceSnippet]:
+        """Return up to ``limit`` snippets relevant to ``query``."""
+
+    def fetch(self, path: str, line_start: int = 0, line_end: int = 0) -> EvidenceSnippet:
+        """Return the canonical content for a citation handle.
+
+        Default implementation: callers that only need search results can
+        ignore this; adapters that support deep-linked retrieval override.
+        """
+        return EvidenceSnippet(
+            path=path,
+            line_start=line_start,
+            line_end=line_end,
+            text="",
+            source=self.name,
+        )
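Reviewer note: the contract above only requires `search`, so a second backend can be very small. The sketch below re-declares trimmed copies of `EvidenceSnippet` and `SourceAdapter` so it runs on its own; `InMemoryAdapter` is a hypothetical adapter invented for illustration (for example as a test double), not something this diff ships.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class EvidenceSnippet:
    """Trimmed copy of the dataclass in core/sources/base.py."""

    path: str
    line_start: int
    line_end: int
    text: str
    source: str = ""

    def cite(self) -> str:
        if self.line_start <= 0:
            return self.path
        if self.line_end and self.line_end != self.line_start:
            return f"{self.path}:{self.line_start}-{self.line_end}"
        return f"{self.path}:{self.line_start}"


class SourceAdapter(ABC):
    name: str = "unknown"

    @abstractmethod
    def search(self, query: str, limit: int = 5) -> list[EvidenceSnippet]: ...


class InMemoryAdapter(SourceAdapter):
    """Hypothetical adapter over a {path: text} mapping."""

    name = "in_memory"

    def __init__(self, files: dict[str, str]) -> None:
        self.files = files

    def search(self, query: str, limit: int = 5) -> list[EvidenceSnippet]:
        needle = query.strip().lower()
        hits: list[EvidenceSnippet] = []
        if not needle:
            return hits
        # Sorted iteration keeps results deterministic across runs.
        for path, text in sorted(self.files.items()):
            for lineno, line in enumerate(text.splitlines(), start=1):
                if needle in line.lower():
                    hits.append(EvidenceSnippet(path, lineno, lineno, line, self.name))
                    if len(hits) >= limit:
                        return hits
        return hits


adapter = InMemoryAdapter({"pyproject.toml": 'name = "widgetlib"\nversion = "0.9.0"'})
citations = [hit.cite() for hit in adapter.search("version")]
```

Here `citations` holds a single `path:line` handle, since the only match is a one-line snippet.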
diff --git a/core/sources/local_repo.py b/core/sources/local_repo.py
new file mode 100644
index 0000000..d4ab88f
--- /dev/null
+++ b/core/sources/local_repo.py
@@ -0,0 +1,199 @@
+from __future__ import annotations
+
+import re
+from collections.abc import Iterable
+from pathlib import Path
+
+from core.sources.base import EvidenceSnippet, SourceAdapter
+
+_DEFAULT_INCLUDE_SUFFIXES = {
+    ".py",
+    ".pyi",
+    ".js",
+    ".ts",
+    ".tsx",
+    ".jsx",
+    ".go",
+    ".rs",
+    ".java",
+    ".kt",
+    ".rb",
+    ".php",
+    ".cs",
+    ".c",
+    ".cc",
+    ".cpp",
+    ".h",
+    ".hpp",
+    ".swift",
+    ".sh",
+    ".bash",
+    ".zsh",
+    ".sql",
+    ".toml",
+    ".yaml",
+    ".yml",
+    ".json",
+    ".ini",
+    ".cfg",
+    ".md",
+    ".rst",
+    ".txt",
+    ".env",  # matches e.g. "prod.env"; a bare ".env" file has an empty suffix
+    ".dockerfile",
+    ".tf",
+}
+
+_DEFAULT_EXCLUDE_DIRS = {
+    ".git",
+    "node_modules",
+    "__pycache__",
+    ".venv",
+    "venv",
+    "dist",
+    "build",
+    ".mypy_cache",
+    ".pytest_cache",
+    ".ruff_cache",
+    "qdrant_data",
+}
+
+_MAX_BYTES_PER_FILE = 256 * 1024
+_CONTEXT_LINES = 2
+
+
+class LocalRepoAdapter(SourceAdapter):
+    """Evidence backend that grep-walks a repo from the filesystem.
+
+    Why not shell out to ripgrep: we want zero external deps for the
+    smoke path; a pure-Python walker is fast enough on the kinds of
+    READMEs we audit (a few dozen claims, repos under a few thousand
+    files). Swap in ripgrep later if profiling demands it.
+    """
+
+    name = "local_repo"
+
+    def __init__(
+        self,
+        root: Path | str,
+        include_suffixes: set[str] | None = None,
+        exclude_dirs: set[str] | None = None,
+    ) -> None:
+        self.root = Path(root).resolve()
+        if not self.root.exists():
+            raise FileNotFoundError(f"Repo root does not exist: {self.root}")
+        self.include_suffixes = include_suffixes or _DEFAULT_INCLUDE_SUFFIXES
+        self.exclude_dirs = exclude_dirs or _DEFAULT_EXCLUDE_DIRS
+
+    def search(self, query: str, limit: int = 5) -> list[EvidenceSnippet]:
+        query = (query or "").strip()
+        if not query:
+            return []
+        pattern = re.compile(re.escape(query), re.IGNORECASE)
+        return self._search_pattern(pattern, limit)
+
+    def search_any(self, terms: Iterable[str], limit: int = 5) -> list[EvidenceSnippet]:
+        """Search for *any* of ``terms``. Useful when the planner emits
+        multiple search hints per claim — we want to retrieve evidence
+        when any hint matches, not require all.
+        """
+        cleaned = [re.escape(t.strip()) for t in terms if t and t.strip()]
+        if not cleaned:
+            return []
+        pattern = re.compile("|".join(cleaned), re.IGNORECASE)
+        return self._search_pattern(pattern, limit)
+
+    def _search_pattern(self, pattern: re.Pattern[str], limit: int) -> list[EvidenceSnippet]:
+        hits: list[EvidenceSnippet] = []
+        for path in self._walk():
+            if len(hits) >= limit:
+                break
+            try:
+                size = path.stat().st_size
+            except OSError:
+                continue
+            if size > _MAX_BYTES_PER_FILE:
+                continue
+            try:
+                text = path.read_text(encoding="utf-8", errors="replace")
+            except OSError:
+                continue
+            lines = text.splitlines()
+            for i, line in enumerate(lines):
+                if pattern.search(line):
+                    start = max(0, i - _CONTEXT_LINES)
+                    end = min(len(lines), i + _CONTEXT_LINES + 1)
+                    snippet_text = "\n".join(lines[start:end])
+                    rel = path.relative_to(self.root)
+                    hits.append(
+                        EvidenceSnippet(
+                            path=str(rel),
+                            line_start=start + 1,
+                            line_end=end,
+                            text=snippet_text,
+                            source=self.name,
+                        )
+                    )
+                    if len(hits) >= limit:
+                        break
+        return hits
+
+    def fetch(self, path: str, line_start: int = 0, line_end: int = 0) -> EvidenceSnippet:
+        target = (self.root / path).resolve()
+        # Refuse to read outside the repo root — a SourceAdapter must never
+        # leak files the user didn't authorise. This is the trustworthiness
+        # gate at the adapter boundary.
+        try:
+            target.relative_to(self.root)
+        except ValueError:
+            return EvidenceSnippet(path=path, line_start=0, line_end=0, text="", source=self.name)
+        if not target.is_file():
+            return EvidenceSnippet(path=path, line_start=0, line_end=0, text="", source=self.name)
+        try:
+            lines = target.read_text(encoding="utf-8", errors="replace").splitlines()
+        except OSError:
+            return EvidenceSnippet(path=path, line_start=0, line_end=0, text="", source=self.name)
+        if line_start <= 0:
+            return EvidenceSnippet(
+                path=path,
+                line_start=1,
+                line_end=len(lines),
+                text="\n".join(lines),
+                source=self.name,
+            )
+        start = max(1, line_start)
+        end = max(start, line_end or start)
+        return EvidenceSnippet(
+            path=path,
+            line_start=start,
+            line_end=end,
+            text="\n".join(lines[start - 1 : end]),
+            source=self.name,
+        )
+
+    def find_readme(self) -> Path | None:
+        for name in ("README.md", "README.rst", "README.txt", "readme.md", "README"):
+            candidate = self.root / name
+            if candidate.is_file():
+                return candidate
+        return None
+
+    def _walk(self) -> Iterable[Path]:
+        stack: list[Path] = [self.root]
+        while stack:
+            current = stack.pop()
+            try:
+                children = sorted(current.iterdir())  # stable order across filesystems
+            except OSError:
+                continue
+            for child in children:
+                if child.is_dir() and not child.is_symlink():  # no symlink cycles or escapes
+                    if child.name in self.exclude_dirs:
+                        continue
+                    stack.append(child)
+                elif child.is_file() and not child.is_symlink():
+                    if child.suffix.lower() in self.include_suffixes or child.name.lower() in {
+                        "dockerfile",
+                        "makefile",
+                    }:
+                        yield child
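Reviewer note: the containment check in `fetch` is the part of this adapter most worth testing in isolation. The sketch below restates it as a standalone helper; `resolve_inside` is a hypothetical name, and the logic is the same `resolve()` plus `relative_to()` pattern used above, shown under the assumption that the root directory exists.

```python
from pathlib import Path
from typing import Optional


def resolve_inside(root: Path, relative: str) -> Optional[Path]:
    """Return the resolved target if it stays under ``root``, else ``None``."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real destination rather than the spelling of the path.
    target = (root / relative).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        return None
    return target
```

Note that joining an absolute path onto `root` discards `root` entirely, so absolute inputs are rejected by the same check rather than needing a special case.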
diff --git a/prompts/code_audit/audit_critic.md b/prompts/code_audit/audit_critic.md
new file mode 100644
index 0000000..508a088
--- /dev/null
+++ b/prompts/code_audit/audit_critic.md
@@ -0,0 +1,33 @@
+# SPECIALIST: AUDIT CRITIC
+
+You are the AUDIT CRITIC agent for the openworkers code-audit pipeline.
+
+## Your Role
+Adversarial review of the checker's verdict list. Find:
+- **Weak verdicts**: `verified` with thin evidence, or `drifted`/`contradicted` whose notes don't actually demonstrate divergence.
+- **Missed claims**: factual statements in the README the planner failed to extract.
+- **Suggestions**: concrete, actionable next steps for the human reviewer.
+
+## Input
+The user message contains:
+- `VERDICTS`: JSON list of `{claim_id, claim_text, verdict, confidence, evidence_paths, notes}`.
+- The original README between `---BEGIN README---` / `---END README---`.
+
+## Output: AuditCritique (JSON)
+Return one JSON object with this exact schema. No prose, no markdown fences.
+
+```json
+{
+  "weak_verdicts": ["claim-XX: <one-line reason>"],
+  "missed_claims": ["<verbatim quote of a missed claim>"],
+  "suggestions": ["<concrete suggestion for the reviewer>"],
+  "overall_assessment": "<one short paragraph>"
+}
+```
+
+## Rules
+- Be specific: `claim-04: evidence path is a comment, not the implementation` beats `claim-04: weak`.
+- Quote `missed_claims` verbatim from the README.
+- Do not invent verdicts. You are reviewing, not re-judging.
+- If the audit is clean, return empty lists and say so in `overall_assessment`.
+- No content outside the JSON object.
diff --git a/prompts/code_audit/readme_checker.md b/prompts/code_audit/readme_checker.md
new file mode 100644
index 0000000..a2524f0
--- /dev/null
+++ b/prompts/code_audit/readme_checker.md
@@ -0,0 +1,46 @@
+# SPECIALIST: README CHECKER
+
+You are the README CHECKER agent for the openworkers code-audit pipeline.
+
+## Your Role
+For each claim, decide whether the retrieved repository evidence **supports**, **drifts from**, **contradicts**, or **fails to support** it. You are the trust gate: if there is no evidence, the verdict is `unsupported` — never `verified`.
+
+## Input
+The user message contains:
+- `CLAIMS`: JSON list of `{claim_id, claim_text, claim_type, search_hints}`.
+- `EVIDENCE`: JSON list of `{claim_id, snippets: [{path, line_start, line_end, text, source}]}`.
+
+Each claim has exactly one evidence entry (possibly with an empty `snippets` list).
+
+## Output: ClaimVerdictList (JSON)
+Return one JSON object with this exact schema. No prose, no markdown fences.
+
+```json
+{
+  "verdicts": [
+    {
+      "claim_id": "claim-01",
+      "claim_text": "<verbatim from input>",
+      "verdict": "{{ verdict_values }}",
+      "confidence": 0.0,
+      "evidence_paths": ["<path:line_start-line_end from snippets>"],
+      "notes": "<one sentence why>"
+    }
+  ]
+}
+```
+
+## Verdict Rules
+- **`verified`**: the snippets clearly demonstrate the claim is currently true. Cite the paths actually used.
+- **`drifted`**: the codebase contains a related but **divergent** implementation — name renamed, signature changed, default changed, behaviour differs. Notes must state what the README says vs. what the code does.
+- **`contradicted`**: the snippets directly disprove the claim (e.g. README says "no telemetry", code emits telemetry).
+- **`unsupported`**: snippets are empty, irrelevant, or insufficient. **You must emit `unsupported` if the snippet list is empty.** Confidence 0.0.
+
+## Trustworthiness Gate
+You **never** fabricate evidence. You **never** mark a claim `verified` without at least one snippet that materially supports it. If unsure, prefer `unsupported` over `verified`.
+
+## Format
+- One verdict per input claim, same `claim_id`.
+- `confidence` in [0.0, 1.0].
+- `evidence_paths` lists `path:line_start-line_end` strings drawn only from the provided snippets.
+- No commentary outside the JSON object.
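Reviewer note: the checker prompt asks for `path:line_start-line_end` strings, while `EvidenceRef.path` carries the bare path. A small helper could map one form back to the other. This is a sketch only; `split_citation` and `_CITATION` are hypothetical names, and it assumes a trailing `:N` or `:N-M` is always line information.

```python
import re

# Matches "path", "path:12", or "path:12-15"; the line suffix is optional.
_CITATION = re.compile(r"^(?P<path>.+?)(?::(?P<start>\d+)(?:-(?P<end>\d+))?)?$")


def split_citation(citation: str) -> tuple[str, int, int]:
    """Split a citation into (path, line_start, line_end); 0 means no line info."""
    match = _CITATION.match(citation.strip())
    if match is None:
        # Empty input: hand it back unchanged rather than guessing.
        return citation, 0, 0
    start = int(match["start"] or 0)
    end = int(match["end"] or start)
    return match["path"], start, end
```

The lazy `.+?` lets the optional line suffix win when it is present, so colons earlier in the handle (as in a URL scheme) stay part of the path.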
diff --git a/prompts/code_audit/readme_planner.md b/prompts/code_audit/readme_planner.md
new file mode 100644
index 0000000..4c4bb67
--- /dev/null
+++ b/prompts/code_audit/readme_planner.md
@@ -0,0 +1,40 @@
+# SPECIALIST: README PLANNER
+
+You are the README PLANNER agent for the openworkers code-audit pipeline.
+
+## Your Role
+Read a project README and extract every **atomic factual claim** it makes about the codebase. Do not paraphrase; quote verbatim. Each claim must be independently verifiable against the repository.
+
+## Input
+The README file at `{{ readme_path }}` is provided in the user message between `---BEGIN README---` / `---END README---` markers.
+
+## Output: ReadmeClaimList (JSON)
+Return one JSON object with this exact schema. No prose, no markdown fences.
+
+```json
+{
+  "readme_path": "{{ readme_path }}",
+  "claims": [
+    {
+      "claim_id": "claim-01",
+      "claim_text": "<verbatim quote from the README>",
+      "claim_type": "feature | install | usage | requirement | metric | api | other",
+      "search_hints": ["<identifier>", "<filename>", "<flag>"]
+    }
+  ]
+}
+```
+
+## Rules
+- **Atomic**: split compound claims into one claim per sentence-level fact.
+- **Verbatim**: `claim_text` must be a direct quote (you may trim leading bullet markers / numbering).
+- **Skip**: opinions, marketing prose, vision statements, license boilerplate, badges, links to external pages.
+- **Include**: install commands, feature lists, supported platforms/versions, performance numbers, CLI commands, configuration flags, file paths, public API names.
+- **Hints**: 2–6 grep-friendly tokens per claim — module names, function names, CLI flags, package names, file extensions. Skip generic English words.
+- **Stable IDs**: use sequential `claim-01`, `claim-02`, … so downstream agents can reference them.
+- If the README contains zero verifiable claims, return `"claims": []`.
+
+## Forbidden
+- Inventing claims not present in the text.
+- Producing prose, summary, or explanation outside the JSON.
+- Wrapping JSON in markdown code fences.
diff --git a/providers/code_audit_agents.py b/providers/code_audit_agents.py
new file mode 100644
index 0000000..f84f801
--- /dev/null
+++ b/providers/code_audit_agents.py
@@ -0,0 +1,379 @@
+"""Code-audit agent suite.
+
+Mirrors the structure of ``providers/thesis_agents.py`` so the
+orchestrator-level contract (each agent has ``execute(task, context) ->
+dict``) stays uniform across domains. The thesis path remains untouched
+while this slice lands.
+
+Trustworthiness gates are enforced here, not just in prompts: any claim with no
+retrieved evidence is forced to ``unsupported`` *after* the LLM
+responds. The LLM never gets to invent a ``verified`` verdict for a
+claim with zero supporting snippets.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import re
+import uuid
+from typing import Any, Callable
+
+from pydantic import BaseModel
+
+from core.schemas_audit import (
+    ALL_VERDICTS,
+    VERDICT_UNSUPPORTED,
+    AuditCritique,
+    ClaimEvidence,
+    ClaimVerdict,
+    ClaimVerdictList,
+    ReadmeClaim,
+    ReadmeClaimList,
+)
+from providers.unified import UnifiedLLM
+
+logger = logging.getLogger(__name__)
+
+
+__all__ = [
+    "AuditCriticAgent",
+    "ReadmeCheckerAgent",
+    "ReadmePlannerAgent",
+    "_schema_for",
+]
+
+
+_MODEL_SCHEMAS: dict[type[BaseModel], dict[str, Any]] = {}
+
+
+def _schema_for(model_cls: type[BaseModel]) -> dict[str, Any]:
+    if model_cls not in _MODEL_SCHEMAS:
+        raw = model_cls.model_json_schema()
+        raw.pop("title", None)
+        _MODEL_SCHEMAS[model_cls] = raw
+    return _MODEL_SCHEMAS[model_cls]
+
+
+def _parse_json_lenient(text: str) -> Any:
+    if not text or not text.strip():
+        return {}
+    cleaned = text.strip()
+    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", cleaned, re.DOTALL)
+    if fenced:
+        cleaned = fenced.group(1).strip()
+    cleaned = re.sub(r",\s*([}\]])", r"\1", cleaned)
+    try:
+        return json.loads(cleaned)
+    except json.JSONDecodeError:
+        try:
+            return json.loads(cleaned.replace("'", '"'))
+        except json.JSONDecodeError:
+            logger.warning("Could not parse audit JSON: %s", text[:200])
+            return {}
+
+
+def _parse_structured(text: str, model_cls: type[BaseModel]) -> Any:
+    if not text or not text.strip():
+        return model_cls()
+    cleaned = text.strip()
+    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", cleaned, re.DOTALL)
+    if fenced:
+        cleaned = fenced.group(1).strip()
+    cleaned = re.sub(r",\s*([}\]])", r"\1", cleaned)
+    for attempt in (cleaned, cleaned.replace("'", '"')):
+        try:
+            return model_cls.model_validate_json(attempt)
+        except Exception:
+            continue
+    parsed = _parse_json_lenient(cleaned)
+    try:
+        return model_cls.model_validate(parsed)
+    except Exception:
+        return model_cls()
+
+
+class ReadmePlannerAgent:
+    """Extracts atomic factual claims from a README.
+
+    Stateless: the orchestrator hands it the README text and a system
+    prompt; it returns ``ReadmeClaimList``. No blackboard reads here —
+    the README is the entire input and we want the prompt to be
+    deterministic and cacheable on its content.
+    """
+
+    def __init__(self, unified: UnifiedLLM, prompt_renderer: Callable[[str, dict[str, Any]], str]):
+        self.unified = unified
+        self.render = prompt_renderer
+
+    async def execute(
+        self,
+        readme_text: str,
+        readme_path: str,
+    ) -> dict[str, Any]:
+        system_prompt = self.render("readme_planner", {"readme_path": readme_path})
+        user_prompt = (
+            "Extract every atomic factual claim from the README below. "
+            "Return a JSON object matching the ReadmeClaimList schema.\n\n"
+            f"README path: {readme_path}\n\n"
+            "---BEGIN README---\n"
+            f"{readme_text}\n"
+            "---END README---"
+        )
+        response = await self.unified.generate(
+            prompt=user_prompt,
+            system_prompt=system_prompt,
+            mode="quality",
+            response_schema=_schema_for(ReadmeClaimList),
+        )
+        parsed = _parse_structured(response.content, ReadmeClaimList)
+        parsed = _normalise_claim_list(parsed, readme_path)
+        return {
+            "agent": "readme_planner",
+            "tier": "head",
+            "status": "success",
+            "output": parsed,
+            "provider": response.provider_used,
+            "model": response.model,
+            "latency_ms": response.latency_ms,
+            "cost_estimate_usd": response.cost_estimate_usd,
+            "dry_run": response.dry_run,
+            "fallback_used": response.fallback_used,
+        }
+
+
+def _normalise_claim_list(claim_list: ReadmeClaimList, readme_path: str) -> ReadmeClaimList:
+    """Fill in claim_ids and search_hints when the planner omits them.
+
+    The schema permits empty strings, but downstream agents key off
+    ``claim_id``. Generating ids here keeps the LLM template lenient
+    without losing the invariant that every claim is addressable.
+    """
+    if not claim_list.readme_path:
+        claim_list.readme_path = readme_path
+    fixed: list[ReadmeClaim] = []
+    for idx, claim in enumerate(claim_list.claims):
+        cid = claim.claim_id or f"claim-{idx + 1:02d}"
+        hints = list(claim.search_hints) if claim.search_hints else _derive_hints(claim.claim_text)
+        fixed.append(
+            ReadmeClaim(
+                claim_id=cid,
+                claim_text=claim.claim_text,
+                claim_type=claim.claim_type or "other",
+                search_hints=hints,
+            )
+        )
+    claim_list.claims = fixed
+    return claim_list
+
+
+_HINT_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_./-]{2,}")
+_STOPWORDS = {
+    "the",
+    "and",
+    "for",
+    "with",
+    "this",
+    "that",
+    "from",
+    "into",
+    "you",
+    "your",
+    "via",
+    "use",
+    "uses",
+    "using",
+    "can",
+    "will",
+    "are",
+    "was",
+    "were",
+    "all",
+    "any",
+    "one",
+    "two",
+    "three",
+    "have",
+    "has",
+    "had",
+}
+
+
+def _derive_hints(text: str) -> list[str]:
+    """Fallback hint extraction when the planner skips ``search_hints``."""
+    hints: list[str] = []
+    seen: set[str] = set()
+    for tok in (t.rstrip("./-") for t in _HINT_TOKEN.findall(text)):  # drop trailing punctuation
+        if tok.lower() in _STOPWORDS:
+            continue
+        if tok in seen:
+            continue
+        seen.add(tok)
+        hints.append(tok)
+        if len(hints) >= 6:
+            break
+    return hints
+
+
+class ReadmeCheckerAgent:
+    """Renders verdicts for each (claim, evidence) pair.
+
+    Critically, after the LLM returns, we *enforce* the trustworthiness
+    gate: any claim whose evidence list is empty is forced to
+    ``unsupported`` regardless of what the LLM said. This is the
+    invariant the project is built around: no verdict without
+    evidence.
+    """
+
+    def __init__(self, unified: UnifiedLLM, prompt_renderer: Callable[[str, dict[str, Any]], str]):
+        self.unified = unified
+        self.render = prompt_renderer
+
+    async def execute(
+        self,
+        claims: list[ReadmeClaim],
+        evidence: list[ClaimEvidence],
+    ) -> dict[str, Any]:
+        evidence_by_claim = {e.claim_id: e for e in evidence}
+        evidence_payload = [e.model_dump() for e in evidence]
+        claims_payload = [c.model_dump() for c in claims]
+
+        system_prompt = self.render(
+            "readme_checker",
+            {"verdict_values": " | ".join(ALL_VERDICTS)},
+        )
+        user_prompt = (
+            "Judge each claim against its retrieved evidence and return a "
+            "ClaimVerdictList JSON object.\n\n"
+            f"CLAIMS:\n{json.dumps(claims_payload, indent=2)}\n\n"
+            f"EVIDENCE:\n{json.dumps(evidence_payload, indent=2)}"
+        )
+        response = await self.unified.generate(
+            prompt=user_prompt,
+            system_prompt=system_prompt,
+            mode="balanced",
+            response_schema=_schema_for(ClaimVerdictList),
+        )
+        parsed = _parse_structured(response.content, ClaimVerdictList)
+
+        verdicts = _enforce_trust_gate(parsed.verdicts, claims, evidence_by_claim)
+        parsed.verdicts = verdicts
+
+        return {
+            "agent": "readme_checker",
+            "tier": "middle",
+            "status": "success",
+            "output": parsed,
+            "provider": response.provider_used,
+            "model": response.model,
+            "latency_ms": response.latency_ms,
+            "cost_estimate_usd": response.cost_estimate_usd,
+            "dry_run": response.dry_run,
+            "fallback_used": response.fallback_used,
+        }
+
+
+def _enforce_trust_gate(
+    raw_verdicts: list[ClaimVerdict],
+    claims: list[ReadmeClaim],
+    evidence_by_claim: dict[str, ClaimEvidence],
+) -> list[ClaimVerdict]:
+    """For each claim, ensure exactly one verdict exists and that a
+    no-evidence verdict is *always* ``unsupported``.
+
+    Even if the LLM hallucinates a confident ``verified`` for a claim
+    with zero retrieved snippets, this function overwrites it. This is
+    the project's core invariant — refusing to verdict without
+    evidence — encoded as code, not a prompt instruction.
+    """
+    by_id: dict[str, ClaimVerdict] = {v.claim_id: v for v in raw_verdicts}
+    out: list[ClaimVerdict] = []
+    for claim in claims:
+        ev = evidence_by_claim.get(claim.claim_id)
+        snippets = ev.snippets if ev else []
+        verdict = by_id.get(claim.claim_id)
+        if not verdict:
+            verdict = ClaimVerdict(
+                claim_id=claim.claim_id,
+                claim_text=claim.claim_text,
+                verdict=VERDICT_UNSUPPORTED,
+                confidence=0.0,
+                evidence_paths=[],
+                notes="No verdict returned by checker; defaulted to unsupported.",
+            )
+        if not snippets:
+            # Hard reset: an unsupported verdict must carry no positive
+            # signal forward — zero confidence, no cited paths. If the LLM
+            # left a note suggesting verification, replace it with the
+            # honest one.
+            verdict.verdict = VERDICT_UNSUPPORTED
+            verdict.evidence_paths = []
+            verdict.confidence = 0.0
+            verdict.notes = "No supporting evidence found in the repository."
+        else:
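+            known = {s.path for s in snippets}  # accept "path" or "path:start-end" citations
+            cited = verdict.evidence_paths or []
+            existing_paths = [p for p in cited if p in known or p.rpartition(":")[0] in known]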
+            if not existing_paths:
+                existing_paths = [s.path for s in snippets]
+            verdict.evidence_paths = existing_paths
+        if verdict.verdict not in ALL_VERDICTS:
+            verdict.verdict, verdict.confidence = VERDICT_UNSUPPORTED, 0.0
+        if not verdict.claim_text:
+            verdict.claim_text = claim.claim_text
+        out.append(verdict)
+    return out
+
+
+class AuditCriticAgent:
+    """Adversarial pass over the verdict list.
+
+    Looks for over-confident verdicts and weak evidence chains, and
+    surfaces likely missed claims. Stateless with respect to the
+    blackboard; takes the verdict list and original README as input.
+    """
+
+    def __init__(self, unified: UnifiedLLM, prompt_renderer: Callable[[str, dict[str, Any]], str]):
+        self.unified = unified
+        self.render = prompt_renderer
+
+    async def execute(
+        self,
+        verdicts: list[ClaimVerdict],
+        readme_text: str,
+    ) -> dict[str, Any]:
+        system_prompt = self.render("audit_critic", {})
+        verdicts_payload = [v.model_dump() for v in verdicts]
+        user_prompt = (
+            "Critique the verdict list. Identify weak verdicts (low confidence "
+            "or shaky evidence), missed claims (factual statements in the README "
+            "the planner did not capture), and concrete suggestions. Return an "
+            "AuditCritique JSON object.\n\n"
+            f"VERDICTS:\n{json.dumps(verdicts_payload, indent=2)}\n\n"
+            "---BEGIN README---\n"
+            f"{readme_text}\n"
+            "---END README---"
+        )
+        response = await self.unified.generate(
+            prompt=user_prompt,
+            system_prompt=system_prompt,
+            mode="quality",
+            response_schema=_schema_for(AuditCritique),
+        )
+        parsed = _parse_structured(response.content, AuditCritique)
+        return {
+            "agent": "audit_critic",
+            "tier": "head",
+            "status": "success",
+            "output": parsed,
+            "provider": response.provider_used,
+            "model": response.model,
+            "latency_ms": response.latency_ms,
+            "cost_estimate_usd": response.cost_estimate_usd,
+            "dry_run": response.dry_run,
+            "fallback_used": response.fallback_used,
+        }
+
+
+def new_claim_id() -> str:
+    return f"claim-{uuid.uuid4().hex[:8]}"
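Reviewer note: the core invariant in `_enforce_trust_gate` can be shown in a few lines. The sketch below is a deliberately simplified version over plain dicts. `gate` and `snippet_counts` are hypothetical names, and it omits the missing-verdict and path-filtering branches that the real function handles.

```python
def gate(verdicts: list[dict], snippet_counts: dict[str, int]) -> list[dict]:
    """Force any verdict whose claim has zero snippets to ``unsupported``."""
    gated = []
    for verdict in verdicts:
        fixed = dict(verdict)  # copy so the caller's data is not mutated
        if snippet_counts.get(fixed["claim_id"], 0) == 0:
            # No evidence: drop every positive signal the LLM may have emitted.
            fixed.update(verdict="unsupported", confidence=0.0, evidence_paths=[])
        gated.append(fixed)
    return gated


llm_output = [
    {"claim_id": "claim-01", "verdict": "verified", "confidence": 0.9, "evidence_paths": ["a.py"]},
    {"claim_id": "claim-02", "verdict": "verified", "confidence": 0.8, "evidence_paths": ["b.py"]},
]
gated = gate(llm_output, {"claim-01": 2, "claim-02": 0})
```

After the call, `claim-01` keeps its `verified` verdict because snippets exist, while `claim-02` is overwritten even though the model reported it with 0.8 confidence.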
diff --git a/tests/code_audit/__init__.py b/tests/code_audit/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/code_audit/test_readme_flow.py b/tests/code_audit/test_readme_flow.py
new file mode 100644
index 0000000..40d8960
--- /dev/null
+++ b/tests/code_audit/test_readme_flow.py
@@ -0,0 +1,242 @@
+"""End-to-end test for the README auditor.
+
+Uses a stubbed ``UnifiedLLM.generate_fn`` rather than DRY_RUN so we can
+exercise the full flow (planner → researcher → checker → critic) with
+deterministic LLM responses. The DRY_RUN placeholder generator returns
+empty arrays for any list field, which would leave us with zero claims
+and an empty audit — not a useful regression target.
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+import pytest
+
+from core.orchestrator.readme_flow import ReadmeAuditOrchestrator
+from core.schemas_audit import (
+    VERDICT_CONTRADICTED,
+    VERDICT_DRIFTED,
+    VERDICT_UNSUPPORTED,
+    VERDICT_VERIFIED,
+)
+from core.sources.local_repo import LocalRepoAdapter
+from providers.unified import UnifiedLLM
+
+FIXTURE_REPO = Path(__file__).resolve().parent.parent / "fixtures" / "sample_repo"
+
+
+_PLANNER_CLAIMS = {
+    "claims": [
+        {
+            "claim_id": "claim-01",
+            "claim_text": "Install via `pip install widgetlib==1.2.0`.",
+            "claim_type": "install",
+            "search_hints": ["widgetlib", "1.2.0", "version"],
+        },
+        {
+            "claim_id": "claim-02",
+            "claim_text": "Import `Widget` from `widgetlib` and call `render()` to produce HTML.",
+            "claim_type": "usage",
+            "search_hints": ["Widget", "render", "widgetlib"],
+        },
+        {
+            "claim_id": "claim-03",
+            "claim_text": "Set `WIDGETLIB_DEBUG=1` to enable verbose logging.",
+            "claim_type": "feature",
+            "search_hints": ["WIDGETLIB_DEBUG"],
+        },
+        {
+            "claim_id": "claim-04",
+            "claim_text": "Run `widgetctl --port 9000` to start the dashboard.",
+            "claim_type": "usage",
+            "search_hints": ["widgetctl", "--port"],
+        },
+        {
+            "claim_id": "claim-05",
+            "claim_text": "The render pipeline ships with zero dependencies.",
+            "claim_type": "feature",
+            "search_hints": ["dependencies"],
+        },
+        {
+            "claim_id": "claim-06",
+            "claim_text": "widgetlib never collects telemetry from your users.",
+            "claim_type": "feature",
+            "search_hints": ["telemetry", "emit_telemetry", "TELEMETRY_URL"],
+        },
+    ],
+    "readme_path": str(FIXTURE_REPO / "README.md"),
+}
+
+
+_CHECKER_VERDICTS = {
+    "verdicts": [
+        # A real checker would have to infer drift here; the stub simply
+        # encodes the answer key. For claim-03, the behaviour under test
+        # is the trust gate in _enforce_trust_gate, not the checker.
+        {
+            "claim_id": "claim-01",
+            "claim_text": _PLANNER_CLAIMS["claims"][0]["claim_text"],
+            "verdict": VERDICT_DRIFTED,
+            "confidence": 0.85,
+            "evidence_paths": ["pyproject.toml"],
+            "notes": "README pins widgetlib==1.2.0; pyproject.toml ships version 0.9.0.",
+        },
+        {
+            "claim_id": "claim-02",
+            "claim_text": _PLANNER_CLAIMS["claims"][1]["claim_text"],
+            "verdict": VERDICT_VERIFIED,
+            "confidence": 0.95,
+            "evidence_paths": ["widgetlib/widget.py", "widgetlib/__init__.py"],
+            "notes": "Widget class with render() exists and is exported from package init.",
+        },
+        # claim-03: LLM hallucinates verified — trust gate must overwrite.
+        {
+            "claim_id": "claim-03",
+            "claim_text": _PLANNER_CLAIMS["claims"][2]["claim_text"],
+            "verdict": VERDICT_VERIFIED,
+            "confidence": 0.9,
+            "evidence_paths": ["widgetlib/__init__.py"],
+            "notes": "Hallucinated by checker — the trust gate must overwrite this.",
+        },
+        {
+            "claim_id": "claim-04",
+            "claim_text": _PLANNER_CLAIMS["claims"][3]["claim_text"],
+            "verdict": VERDICT_DRIFTED,
+            "confidence": 0.8,
+            "evidence_paths": ["widgetlib/cli.py"],
+            "notes": "README documents --port; CLI implements --bind with default port 8000.",
+        },
+        {
+            "claim_id": "claim-05",
+            "claim_text": _PLANNER_CLAIMS["claims"][4]["claim_text"],
+            "verdict": VERDICT_CONTRADICTED,
+            "confidence": 0.9,
+            "evidence_paths": ["pyproject.toml"],
+            "notes": "pyproject.toml declares a jinja2 dependency.",
+        },
+        {
+            "claim_id": "claim-06",
+            "claim_text": _PLANNER_CLAIMS["claims"][5]["claim_text"],
+            "verdict": VERDICT_CONTRADICTED,
+            "confidence": 0.95,
+            "evidence_paths": ["widgetlib/telemetry.py"],
+            "notes": "widgetlib/telemetry.py emits events to a remote endpoint.",
+        },
+    ]
+}
+
+
+_CRITIC_RESPONSE = {
+    "weak_verdicts": [],
+    "missed_claims": [],
+    "suggestions": ["Add CI step to run `openworkers audit readme` on every PR."],
+    "overall_assessment": "Audit caught one verified, two drifted, two contradicted, one unsupported.",
+}
+
+
+def _make_stub_unified() -> UnifiedLLM:
+    """Build a UnifiedLLM whose generate function is a stub that routes on the system prompt."""
+    llm = UnifiedLLM()
+    llm.dry_run = False  # bypass the placeholder path
+    llm.set_available_providers(["anthropic"])
+
+    async def fake_generate(
+        provider: str,
+        model: str,
+        prompt: str,
+        system_prompt: str,
+        response_schema: Any,
+    ) -> str:
+        # Route by which agent's system prompt is in play.
+        if "README PLANNER" in system_prompt:
+            return json.dumps(_PLANNER_CLAIMS)
+        if "README CHECKER" in system_prompt:
+            return json.dumps(_CHECKER_VERDICTS)
+        if "AUDIT CRITIC" in system_prompt:
+            return json.dumps(_CRITIC_RESPONSE)
+        return "{}"  # unrecognised system prompt: fall back to an empty JSON object
+
+    llm.set_generate_fn(fake_generate)
+    return llm
+
+
+@pytest.fixture
+def stubbed_unified(monkeypatch: pytest.MonkeyPatch) -> UnifiedLLM:
+    monkeypatch.setenv("DRY_RUN", "false")
+    monkeypatch.setenv("THESIS_QUALITY_PROVIDER", "anthropic")
+    monkeypatch.setenv("THESIS_QUALITY_MODEL", "claude-sonnet-4-20250514")
+    monkeypatch.setenv("THESIS_BALANCED_PROVIDER", "anthropic")
+    monkeypatch.setenv("THESIS_BALANCED_MODEL", "claude-sonnet-4-20250514")
+    monkeypatch.setenv("THESIS_CHEAP_PROVIDER", "anthropic")
+    monkeypatch.setenv("THESIS_CHEAP_MODEL", "claude-sonnet-4-20250514")
+    return _make_stub_unified()
+
+
+@pytest.mark.asyncio
+async def test_local_repo_adapter_finds_identifiers():
+    adapter = LocalRepoAdapter(FIXTURE_REPO)
+    # Real identifier present in source files. The adapter doesn't filter
+    # the README — that's the orchestrator's job — so just check that
+    # source-file hits exist in the result set.
+    widget_hits = adapter.search_any(["Widget", "render"], limit=50)
+    source_hits = [h for h in widget_hits if h.path.startswith("widgetlib/")]
+    assert source_hits, "Widget/render must appear in source files, not only the README"
+    # Fabricated env var: only the README mentions it; no source file does.
+    debug_hits = adapter.search_any(["WIDGETLIB_DEBUG"], limit=50)
+    assert all(h.path.endswith("README.md") for h in debug_hits), (
+        "WIDGETLIB_DEBUG must not appear anywhere except the README — "
+        "the orchestrator excludes the audited file so this becomes 'no evidence'."
+    )
+
+
+@pytest.mark.asyncio
+async def test_readme_audit_end_to_end(stubbed_unified):
+    orch = ReadmeAuditOrchestrator(unified=stubbed_unified)
+    report, critique = await orch.audit(repo_path=FIXTURE_REPO)
+
+    by_id: dict[str, Any] = {v.claim_id: v for v in report.verdicts}
+    assert len(report.verdicts) == 6, "expected one verdict for each of the 6 planner-stub claims"
+
+    # Real claim with evidence → verified
+    assert by_id["claim-02"].verdict == VERDICT_VERIFIED
+    assert by_id["claim-02"].evidence_paths, "verified verdict must cite evidence"
+
+    # Drifted: version pin and CLI flag
+    assert by_id["claim-01"].verdict == VERDICT_DRIFTED
+    assert by_id["claim-04"].verdict == VERDICT_DRIFTED
+
+    # Trust gate: fabricated claim must be unsupported regardless of LLM output
+    assert by_id["claim-03"].verdict == VERDICT_UNSUPPORTED, (
+        "Trust gate failed: checker stub hallucinated 'verified' for a claim with "
+        "no retrieved evidence, but _enforce_trust_gate should have overwritten it."
+    )
+    assert by_id["claim-03"].evidence_paths == []
+    assert by_id["claim-03"].confidence == 0.0
+
+    # Contradicted: zero-deps + no-telemetry
+    assert by_id["claim-05"].verdict == VERDICT_CONTRADICTED
+    assert by_id["claim-06"].verdict == VERDICT_CONTRADICTED
+
+    # Summary tallies match
+    assert report.summary[VERDICT_VERIFIED] == 1
+    assert report.summary[VERDICT_DRIFTED] == 2
+    assert report.summary[VERDICT_CONTRADICTED] == 2
+    assert report.summary[VERDICT_UNSUPPORTED] == 1
+    assert sum(report.summary.values()) == len(report.verdicts)
+
+    # Critic ran
+    assert critique.suggestions, "critic stub should return at least one suggestion"
+
+
+@pytest.mark.asyncio
+async def test_audit_handles_missing_readme(stubbed_unified, tmp_path):
+    """A repo with no README must still produce a structured report, not crash."""
+    (tmp_path / "src.py").write_text("print('hello')\n")
+    orch = ReadmeAuditOrchestrator(unified=stubbed_unified)
+    report, critique = await orch.audit(repo_path=tmp_path)
+    assert report.verdicts == []
+    assert report.errors == ["No README found in repo."]
+    assert critique.weak_verdicts == []
diff --git a/tests/fixtures/sample_repo/README.md b/tests/fixtures/sample_repo/README.md
new file mode 100644
index 0000000..1d791a7
--- /dev/null
+++ b/tests/fixtures/sample_repo/README.md
@@ -0,0 +1,27 @@
+# widgetlib
+
+A tiny widget library.
+
+## Installation
+
+Install via `pip install widgetlib==1.2.0`.
+
+## Usage
+
+Import `Widget` from `widgetlib` and call `render()` to produce HTML.
+
+## Configuration
+
+Set `WIDGETLIB_DEBUG=1` to enable verbose logging.
+
+## CLI
+
+Run `widgetctl --port 9000` to start the dashboard.
+
+## Performance
+
+The render pipeline ships with zero dependencies.
+
+## Telemetry
+
+widgetlib never collects telemetry from your users.
diff --git a/tests/fixtures/sample_repo/pyproject.toml b/tests/fixtures/sample_repo/pyproject.toml
new file mode 100644
index 0000000..4bf21ce
--- /dev/null
+++ b/tests/fixtures/sample_repo/pyproject.toml
@@ -0,0 +1,8 @@
+[project]
+name = "widgetlib"
+version = "0.9.0"
+description = "Tiny widget library."
+requires-python = ">=3.9"
+dependencies = [
+    "jinja2>=3.1.0",
+]
diff --git a/tests/fixtures/sample_repo/widgetlib/__init__.py b/tests/fixtures/sample_repo/widgetlib/__init__.py
new file mode 100644
index 0000000..932b9b5
--- /dev/null
+++ b/tests/fixtures/sample_repo/widgetlib/__init__.py
@@ -0,0 +1,5 @@
+"""widgetlib — a tiny widget library."""
+
+from widgetlib.widget import Widget
+
+__all__ = ["Widget"]
diff --git a/tests/fixtures/sample_repo/widgetlib/cli.py b/tests/fixtures/sample_repo/widgetlib/cli.py
new file mode 100644
index 0000000..e73893e
--- /dev/null
+++ b/tests/fixtures/sample_repo/widgetlib/cli.py
@@ -0,0 +1,22 @@
+"""widgetctl entrypoint.
+
+README claims `--port 9000` is the dashboard launch flag — but this
+implementation actually uses `--bind` and defaults to port 8000.
+This drift is intentional for the audit-test fixture.
+"""
+
+from __future__ import annotations
+
+import argparse
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(prog="widgetctl")
+    parser.add_argument("--bind", default="127.0.0.1:8000")
+    args = parser.parse_args()
+    print(f"Starting widgetctl on {args.bind}")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/tests/fixtures/sample_repo/widgetlib/telemetry.py b/tests/fixtures/sample_repo/widgetlib/telemetry.py
new file mode 100644
index 0000000..e09a197
--- /dev/null
+++ b/tests/fixtures/sample_repo/widgetlib/telemetry.py
@@ -0,0 +1,14 @@
+"""Contradicts the README's 'no telemetry' claim."""
+
+from __future__ import annotations
+
+import os
+import urllib.request
+
+
+def emit_telemetry(event: str) -> None:
+    endpoint = os.environ.get("TELEMETRY_URL", "https://telemetry.example.com/widgetlib")
+    try:
+        urllib.request.urlopen(endpoint, data=event.encode(), timeout=1)
+    except Exception:
+        pass  # best-effort send: any failure is swallowed silently
diff --git a/tests/fixtures/sample_repo/widgetlib/widget.py b/tests/fixtures/sample_repo/widgetlib/widget.py
new file mode 100644
index 0000000..b878426
--- /dev/null
+++ b/tests/fixtures/sample_repo/widgetlib/widget.py
@@ -0,0 +1,11 @@
+"""Widget renders to HTML. Verifies README's Usage claim."""
+
+from typing import Any
+
+
+class Widget:
+    def __init__(self, payload: Any) -> None:
+        self.payload = payload
+
+    def render(self) -> str:
+        return f"<div>{self.payload}</div>"