diff --git a/RELEASING.md b/RELEASING.md
new file mode 100644
index 00000000..605dca36
--- /dev/null
+++ b/RELEASING.md
@@ -0,0 +1,69 @@
+# Releasing `ordvec`
+
+> **Publish is held.** A real `cargo publish` / PyPI publish happens only
+> on the maintainer's explicit go. CI never publishes for real: the crate job
+> runs `cargo publish -p ordvec --dry-run --locked`, and the wheel's crate is
+> marked `publish = false`, so it never reaches crates.io and ships separately.
+
+`ordvec` (the Rust crate) and `ordvec` on PyPI (the PyO3 wheel built from
+`ordvec-python/`) are released by **manually dispatching** the release
+workflows. Nothing ships on a tag push or a merge.
+
+## Release pipeline controls
+
+Both `release-crate.yml` and `release-python.yml`:
+
+- are **`workflow_dispatch`-only** (no `push` / tag trigger);
+- run a **`require-ci-green`** gate confirming `ci.yml` (and, for the wheel,
+  `python.yml`) are green for the target commit on `main`;
+- publish via **OIDC trusted publishing** (no long-lived crates.io / PyPI
+  tokens in the repo);
+- emit **SLSA build provenance** (`actions/attest-build-provenance`) **before**
+  publishing — a failed attestation fails the release closed, so nothing ships
+  without provenance recorded first;
+- pin every third-party action by **commit SHA**, set
+  `persist-credentials: false`, and default to `permissions: contents: read`.
+
+`release-python.yml` additionally produces **PEP 740** attestations via the PyPI
+Trusted Publishing step.
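+
+As a rough illustration of the controls listed above, a dispatch-only publish
+job might be shaped like the following. This is a hypothetical sketch, not the
+contents of `release-crate.yml`: the workflow name, job name, runner, and the
+`<full-commit-sha>` placeholder are assumptions, and the `require-ci-green`
+gate, the attestation step, and the real publish step are omitted.
+
+```yaml
+# Hypothetical sketch only; not the repository's release-crate.yml.
+name: release-crate (sketch)
+on:
+  workflow_dispatch: {}          # manual dispatch only; no push or tag trigger
+permissions:
+  contents: read                 # least-privilege default for every job
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    environment: crates-io       # reviewer and main-only rules live in repo settings
+    permissions:
+      contents: read
+      id-token: write            # OIDC token for trusted publishing and attestation
+      attestations: write        # required by actions/attest-build-provenance
+    steps:
+      - uses: actions/checkout@<full-commit-sha>   # placeholder: pin a real SHA
+        with:
+          persist-credentials: false
+      # Dry run shown here; the real publish step is deliberately left out.
+      - run: cargo publish -p ordvec --dry-run --locked
+```
+
+Binding the job to an `environment` is what lets the repository-level settings
+in the next section (required reviewer, deployment branch) apply to it.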
+
+### Environment protection (configured in repo settings, not in code)
+
+- **Required reviewer** — each environment (`crates-io`, `pypi`) requires
+  maintainer (`Fieldnote-Echo`) approval before the publish job runs.
+- **Deployment branch** — each environment is restricted to **`main`**, the
+  only ref a release may be dispatched from. This makes "only `main` can
+  publish" a configuration invariant rather than a manual check at approval
+  time.
+
+> These two settings are the supply-chain backstop the workflow code cannot
+> express on its own (THREAT-SUPPLY-001 in [THREAT_MODEL.md](THREAT_MODEL.md)).
+
+### Recommended (open)
+
+- A **`v*` tag-protection ruleset** (block update + deletion) and a basic
+  `main` ruleset, so a release tag cannot be force-moved and `main` cannot be
+  force-pushed/deleted (THREAT-SUPPLY-002). Registries are already immutable
+  (crates.io is yank-only; PyPI burns a version on delete), so this closes the
+  remaining GitHub-side mutability surface.
+
+## Checklist
+
+1. Land everything on `main`; confirm the working tree and `Cargo.lock` are in
+   sync (`cargo build --locked`).
+2. Bump the version (crate `Cargo.toml`, and `ordvec-python` if the wheel
+   changed) and update `CHANGELOG.md`. Commit on `main`.
+3. Confirm CI is **green for that exact `main` SHA** (the dispatch ref must be
+   `main` — the environment will refuse any other branch).
+4. Get the maintainer's explicit go to publish.
+5. Dispatch `release-crate.yml` (crate) and/or `release-python.yml` (wheel)
+   from **`main`**.
+6. Approve the environment deployment when prompted (required reviewer).
+7. Verify the published artifact (crates.io / docs.rs / PyPI) and its
+   provenance, and — for a coordinated release — the Zenodo deposit.
+
+## Coordinated release note
+
+The crate publish, the PyPI wheel, and the paper's Zenodo deposit are
+coordinated (the paper consumes the bindings for a final cold-repro run). Do
+not ship one leg in isolation without the maintainer's go.
diff --git a/SECURITY.md b/SECURITY.md
index cb71eee6..c2cba27b 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -18,4 +18,12 @@ We aim to acknowledge reports within a few business days.
 `ordvec` parses serialized index files (`.tvr` / `.tvrq` / `.tvbm` /
 `.tvsb`); the loaders are fuzzed (`cargo +nightly fuzz`), so
 parsing-robustness reports against the deserialization paths are especially
-welcome.
+welcome. Reports are also welcome against the `unsafe` SIMD kernels (shape /
+bounds invariants), the Python FFI contract (buffer handling, GIL discipline),
+and the release pipeline.
+
+## Threat model
+
+See [`THREAT_MODEL.md`](THREAT_MODEL.md) for the full attack-surface analysis —
+existing defenses, known residual risks, and the library-owned vs
+deployment-owned split.
diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
new file mode 100644
index 00000000..3786f416
--- /dev/null
+++ b/THREAT_MODEL.md
@@ -0,0 +1,450 @@
+# Threat Model — `ordvec`
+
+> **Status:** v0.2.0 (pre-1.0), 2026-05-25. This is the maintained threat model
+> for the `ordvec` Rust crate and the `ordvec` PyO3/maturin Python bindings. It
+> is reviewed when the attack surface changes (new persistence formats, new
+> `unsafe` kernels, new FFI surface, or release-pipeline changes).
+>
+> Scope discipline: `ordvec` is a **pure computational library** — no network
+> surface, no authentication/authorization, no secrets handling, no
+> multi-tenancy of its own. This document deliberately does **not** enumerate
+> web-application threats (SQLi/XSS/CSRF/session) that do not apply. It covers
+> the surfaces that actually exist: untrusted-input parsing, `unsafe` SIMD, the
+> Python FFI boundary, the supply chain, and resource use under untrusted
+> callers. Deployment-owned risks (corpus trust, co-tenancy, admission control)
+> are documented as *context* for integrators, not as library action items.
+
+See also: [`SECURITY.md`](SECURITY.md) (reporting), [`RELEASING.md`](RELEASING.md)
+(release controls), [`docs/INDEX_PROVENANCE.md`](docs/INDEX_PROVENANCE.md)
+(what the loaders do and do not guarantee).
+
+---
+
+## Scope and security ownership
+
+**`ordvec` owns:**
+
+- Memory safety of all safe public APIs.
+- Robust rejection of malformed serialized index files — no panic, no OOM
+  abort, no silent data corruption, no trailing-data acceptance.
+- Deterministic behavior on valid, finite-valued embeddings.
+- Clear, documented failure contracts for invalid caller input (non-finite
+  floats, dimension mismatches, shape errors) — panic in Rust, `ValueError`
+  in Python.
+- Supply-chain hygiene for the published crate and Python wheels.
+
+**`ordvec` does not own:**
+
+- Trustworthiness of the upstream embedding model.
+- Corpus provenance or document-level poisoning.
+- Authorization over which documents may be indexed or retrieved.
+- Tenant isolation or microarchitectural isolation on a hosting platform.
+- Cryptographic verification of index-file origin (callers add this externally
+  — see [`docs/INDEX_PROVENANCE.md`](docs/INDEX_PROVENANCE.md)).
+
+> A structurally valid index file can still be semantically malicious. The
+> loaders validate format invariants — not truth, authorization, or corpus
+> integrity.
+
+## Maintenance budget
+
+`ordvec` is maintained by a single primary contributor. Mitigations are
+prioritized when they are (1) low-maintenance once merged, (2) enforceable by
+tests or CI, (3) local to the library boundary, and (4) unlikely to add
+operational burden downstream. Heavyweight controls (mandatory index signing,
+long-running fuzz farms, service-level admission control) are documented as
+**deployment guidance** until there is maintainer capacity to own them. The
+absence of a second maintainer is itself a tracked supply-chain residual
+(see THREAT-SUPPLY-001).
+
+---
+
+## 1. Architecture and trust boundaries
+
+### 1.1 Component map
+
+| Layer | Components | Trust boundary |
+|---|---|---|
+| **Deserialization** | `rank_io.rs` — `.tvr` / `.tvrq` / `.tvbm` / `.tvsb` loaders | Untrusted filesystem / network byte stream |
+| **Compute kernels** | `fastscan.rs`, `quant_kernels.rs`, `bitmap.rs`, `sign_bitmap.rs` | Trust established after format validation |
+| **Index API** | `rank.rs`, `quant.rs`, `bitmap.rs`, `sign_bitmap.rs` | Caller-controlled query embeddings |
+| **Python FFI** | `ordvec-python` (PyO3 / maturin) | Python ↔ Rust boundary; NumPy buffers |
+| **CI / supply chain** | 12 GitHub Actions workflows; `Cargo.lock`; crates.io + PyPI | GitHub OIDC, crates.io, PyPI trust chains |
+
+The `fuzz/` directory holds **seven** cargo-fuzz targets: `load_rank`,
+`load_rankquant`, `load_bitmap`, `load_sign_bitmap` (deserialization);
+`roundtrip_rankquant` (write→load round-trip); `search_rankquant` (the
+single-rate ingest + asymmetric-search compute path); and `fastscan_b2` (the
+FastScan b=2 block-32 kernel — the one `unsafe`-heavy scan path the others do
+not reach).
+
+### 1.2 Deployment contexts (for integrators)
+
+- **Offline / batch indexing** — a trusted operator encodes a corpus and writes
+  index files. Low risk unless files later cross a trust boundary.
+- **Serving pipeline** — an index loaded at startup, then queried by
+  user-controlled embeddings. Query vectors cross the trust boundary on every
+  search call (see §6).
+- **RAG substrate** — `ordvec` retrieves the *k* nearest documents fed to an
+  LLM. The retrieval layer becomes a target for corpus-level poisoning; this is
+  a **deployment risk**, not a parser risk (see §7).
+- **Multi-tenant / cloud** — tenants sharing one process share SIMD execution
+  units. Microarchitectural isolation is a hosting-platform responsibility
+  (see THREAT-SIMD-002).
+
+---
+
+## 2. Deserialization threats (THREAT-DESER) — library-owned
+
+### 2.1 Existing defenses (code-verified)
+
+`rank_io.rs` implements layered parser hardening:
+
+- Magic + version checks before any allocation.
+- Fallible allocation via `try_reserve_exact` — an attacker-controlled length
+  field returns `InvalidData`, never an OOM abort.
+- All payload sizes computed with `usize::checked_mul`; overflow returns `Err`.
+- A 128 GiB `MAX_PAYLOAD` cap and `MAX_VECTORS` (64 Mi) / `MAX_DIM` caps,
+  enforced on **both** the load and write paths (the write-side cap runs
+  *before* `File::create`, so a rejected write cannot truncate an existing
+  file).
+- Exact file-length match (`check_payload_matches_file`): trailing bytes or
+  short files are rejected.
+- Per-row **structural** invariants: `Rank` rows must be a true permutation of
+  `[0, dim)` (verified by bound + duplicate checks ⇒ pigeonhole);
+  `RankQuant` rows must satisfy constant composition (uniform per-bucket
+  histogram); `Bitmap` rows must have exactly `n_top` bits set.
+- No `panic!` on malformed data — all validation returns
+  `io::Error(InvalidData)`.
+- The raw `rank_io` read/write functions are `pub(crate)`; the only public
+  persistence API is the index types' `write()` / `load()`, making the
+  write→load round-trip a type-level guarantee.
+
+The four loaders are covered by cargo-fuzz targets (the `load_*` targets).
+
+### 2.2 Index-file risk classes
+
+**THREAT-DESER-001 (library-owned, P4): Malformed index file.**
+The loader must reject corrupt/invalid files without panic, OOM, or
+trailing-data acceptance. The current implementation satisfies this for all
+four formats. *Residual:* `file.metadata()?.len()` is sampled at open time;
+on NFS/FUSE mounts with concurrent writers a TOCTOU window exists between
+`metadata()` and the reads. On writable shared mounts the practical outcome is
+a read error or `InvalidData`, not an exploit. *Likelihood:* Very Low.
+*Impact:* Low (the load fails with an error).
+
+**THREAT-DESER-002 (deployment-owned, P3 docs): Malicious-but-valid index.**
+A structurally valid index with semantically poisoned contents passes every
+parser check and returns attacker-influenced results. This is a *provenance*
+problem, not a parser problem. *Mitigation (no format change):*
+[`docs/INDEX_PROVENANCE.md`](docs/INDEX_PROVENANCE.md) documents that `ordvec`
+validates structure, not origin, and lists verification options (checksum
+manifest, artifact-store integrity, Sigstore / GitHub artifact attestation)
+for deployments where index files cross trust boundaries. An optional sidecar
+verifier (HMAC / BLAKE3) can be added later without a format bump; it is
+deliberately **not** shipped now (no concrete deployment requires it, and an
+in-format crypto layer would add unowned key management).
+
+---
+
+## 3. Unsafe SIMD and memory-safety threats (THREAT-SIMD) — library-owned
+
+### 3.1 What the FastScan kernel does
+
+`scan_b2_fastscan_avx512` uses unaligned loads (`_mm256_loadu_si256`),
+byte-shuffle LUT lookups (`_mm256_shuffle_epi8` / VPSHUFB), broadcast, widen
+(`_mm256_cvtepu8_epi16`, `_mm512_cvtepu16_epi32`), and accumulate
+(`_mm512_add_epi16/epi32`, `_mm512_storeu_si512`). It is a load/shuffle/widen/
+accumulate sequence with **no gather instructions**. The Intel DOWNFALL (GDS)
+vulnerability is specific to gather-based data sampling and does **not** apply
+to this kernel.
+
+### 3.2 Risks
+
+**THREAT-SIMD-001 (P1, mitigated this cycle; crate-wide rollout tracked):
+Unsafe-kernel invariant preservation under future refactors.**
+`scan_b2_fastscan_avx512` safety depends on caller-established invariants —
+`packed_fs.len() == n_blocks * pairs * 32` (formed via `checked_mul`, overflow
+⇒ caller panics) and `lut_u8.len() == pairs * 16`. These are asserted by the
+`pub(crate)` entry point `search_asymmetric_fastscan_b2` before dispatch, and
+`RankQuantFastscan::search` is the type-level safe wrapper that owns the shape
+by construction. A future refactor calling the inner function directly could
+bypass the asserts. *Mitigations:* the runtime asserts + the type wrapper are
+the primary boundary; the scalar-vs-SIMD equivalence test
+(`fastscan_b2_top10_matches_avx512_kernel`) guards behavior; and
+**`#![deny(unsafe_op_in_unsafe_fn)]` is now enforced in `fastscan.rs`**, so
+every unsafe operation in the kernel sits in an explicit `unsafe {}` block and
+stays visible to future edits. *Open:* roll the lint out crate-wide to the
+other SIMD modules (`bitmap.rs`, `sign_bitmap.rs`, `quant_kernels.rs`,
+`util.rs` NEON) — tracked as a follow-up.
+
+**THREAT-SIMD-002 (P4, deployment note): Microarchitectural side channels in
+co-tenancy.** `ordvec` does not claim protection against microarchitectural
+side channels under hostile multi-tenant co-residency. The kernel uses no
+gather instructions (ruling out DOWNFALL/GDS), but SIMD execution units are
+shared across SMT threads, and port-contention timing channels remain
+theoretically possible on vulnerable hardware. Sensitive deployments should
+avoid sharing physical cores across trust domains and rely on the
+OS/hypervisor side-channel posture. Not a library action item.
+
+**THREAT-SIMD-003 (P3): FastScan approximation is not CPU-dependent
+divergence.** The 8-bit global-affine LUT in `build_fastscan_b2_query`
+introduces `O(span/255)` per-pair approximation error — an intentional
+trade-off matching FAISS FastScan semantics, documented in the code. The
+scalar and AVX-512 paths agree on the same quantized inputs (equivalence test),
+and `TopK` uses `total_cmp` for deterministic tie-breaking across all paths.
+This is approximate *scoring*, not a CPU oracle. FastScan is a `#[doc(hidden)]`
+pre-ranker; callers needing exact scores use `RankQuant::search_asymmetric`.
+
+---
+
+## 4. Python FFI threats (THREAT-FFI) — binding-owned
+
+### 4.1 Existing defenses (code-verified)
+
+The binding takes `PyReadonlyArray`, rejects non-C-contiguous arrays with a
+clear `ValueError`, validates finiteness (`ensure_finite`), maps shape errors
+to `ValueError`, and releases the GIL (`py.detach`) around the pure-Rust
+(Rayon-parallel) compute in every heavy method while reading the input arrays
+in place. PyO3's `&mut self` borrow tracking means a second thread re-entering
+the **same** index object during a released-GIL call gets a clean
+`Already borrowed` `RuntimeError`, never concurrent mutation.
+
+### 4.2 Risks (documented contracts, implemented)
+
+**THREAT-FFI-001 (P2, documented): Concurrent input-array mutation during a
+released-GIL call.** `PyReadonlyArray` keeps the input buffer alive and blocks
+`rust-numpy`-mediated writes for the call's duration, but it cannot stop
+another thread or native extension from mutating the *same backing memory*
+through a reference obtained before the call. This can yield numerically
+inconsistent results: a numeric-extension contract issue, not a use-after-free.
+*Status:* documented in the module docstring and per-method docs ("do not mutate an
+input array from another thread while an `ordvec` call is in progress"),
+matching the standard contract for GIL-releasing NumPy extensions. An optional
+`safe_copy=True` hard-isolation parameter remains a possible future ergonomic.
+
+**THREAT-FFI-002 (P2, documented): Unsanitized filesystem-path forwarding.**
+`write()` / `load()` forward the path to the filesystem unmodified (no `..` /
+traversal sanitization). A service exposing these path arguments to user input
+could enable traversal or arbitrary-file overwrite. This is a **caller
+responsibility**. *Status:* documented in the module docstring and on every
+`write`/`load` method ("treat the path as trusted input; web/multi-user
+applications must validate paths before calling").
+
+---
+
+## 5. Supply-chain threats (THREAT-SUPPLY)
+
+### 5.1 Existing controls (verified)
+
+**Workflow code (all 12 workflows):** third-party actions pinned by commit
+SHA; `persist-credentials: false` on every checkout; `permissions: contents:
+read` default. **Release workflows** (`release-crate.yml`, `release-python.yml`)
+are `workflow_dispatch`-only (no tag/push trigger), run a `require-ci-green`
+gate against `main`, publish via **OIDC trusted publishing** (no long-lived
+registry tokens), and emit **SLSA build provenance**
+(`actions/attest-build-provenance`) **before** publish — a failed attestation
+fails the release closed. `release-python.yml` additionally gets **PEP 740**
+attestations via Trusted Publishing.
+
+**Static / supply-chain analysis:** **CodeQL** scans Rust, Python, and Actions
+(no-build databases); **OpenSSF Scorecard** publishes SARIF to code scanning
+and the score badge; **zizmor** audits workflow hardening (pinned); a
+`cargo-deny` / audit job gates advisories and licenses. The core crate has near
+zero non-Rust dependencies by design (the `deps` gate greps `cargo tree -p
+ordvec`); the Python binding's larger tree (numpy → ndarray) is intentional and
+scoped to the wheel.
+
+### 5.2 Risks
+
+**THREAT-SUPPLY-001 (mitigated; residual = single-maintainer account
+compromise): Release configuration and ownership.** The release **environments**
+(`pypi`, `crates-io`) now require **approval by the maintainer** and restrict
+deployment to the **`main`** branch only — so a release cannot be dispatched
+from an unmerged or attacker branch, and no publish runs without an explicit
+human approval. The remaining residual is *maintainer-account compromise*: a
+single owner is both dispatcher and approver, so account takeover (or social
+engineering) is not caught by a second human. *Mitigations:* strong 2FA /
+passkeys on the maintainer account; recruiting a **second owner/maintainer**
+(also an open OpenSSF Best-Practices item) — which would additionally make a
+deployment **wait timer** worthwhile (a second party able to cancel a bad
+release during the window). See [`RELEASING.md`](RELEASING.md).
+
+**THREAT-SUPPLY-002 (P3): Release immutability and tag integrity.** Published
+artifacts are **immutable by registry design** — crates.io is yank-only (a
+published version's bytes can never be overwritten) and PyPI burns a version on
+delete (no different artifact may be re-uploaded under the same version). So
+post-publish "silent replacement" of a version is not possible on either
+registry, and consumers can verify artifacts against the SLSA / PEP 740
+provenance above. *Residual (GitHub-side):* `changelog.yml` cuts tagged GitHub
+Releases, but the repo currently has **no tag-protection ruleset and no `main`
+ruleset**, so a tag could be force-moved or a release asset replaced.
+*Mitigation:* add a `v*` **tag ruleset** (block update + deletion) and a basic
+`main` ruleset; optionally enable GitHub immutable releases.
+
+**THREAT-SUPPLY-003 (P3): Typosquatting adjacent names.** Namespace-adjacent
+crate/package names (`ord-vec`, `ordvecs`, `order-vec`) could be registered to
+typosquat dependents. *Mitigation:* publish the first functional release
+promptly; optionally register adjacent names.
+
+---
+
+## 6. Query and resource-exhaustion threats (THREAT-QUERY) — library-adjacent
+
+These arise from correct behavior on large-but-valid inputs from untrusted
+callers, not from parser or unsafe bugs.
+
+**THREAT-QUERY-001 (P2, deployment docs): Caller-controlled batch / `k`
+exhaustion.** `result_buffer_len(nq, k)` checks `nq * k` overflow and panics
+loudly rather than under-allocating; `k` is clamped to `n_vectors`. But a
+serving application can still be CPU/memory-exhausted by large query batches
+(`nq`), large `k`, or concurrent scans over a large corpus. `ordvec` does not
+enforce service-level quotas — by design (it is a library, not a server).
+*Mitigation:* callers exposing search over a network must independently bound
+batch size, `k`, request rate, and corpus size; a configurable `max_nq` /
+`max_k` at the binding level is a possible future convenience.
+
+**THREAT-QUERY-002 (P3): Panic on contract violation in Rust server contexts.**
+Rust APIs fail fast on invalid contract input (non-finite floats, dimension /
+shape violations) via `assert!` / `expect`. In a Rust-native server an
+unhandled panic crashes the thread/process; the Python bindings convert these
+to typed `ValueError`. *Mitigation:* Rust service callers must validate
+untrusted input before calling, or catch panics at the request boundary.
+
+---
+
+## 7. Corpus and embedding poisoning (THREAT-POISON) — deployment-owned
+
+These sit **outside** the library's security perimeter; they are documented as
+context for integrators using `ordvec` as a RAG substrate. Corpus poisoning of
+embedding retrievers is a documented attack class (see PoisonedRAG and OWASP
+LLM08:2025 in the references); the mitigations are corpus provenance, ingestion
+access control, and (where applicable) hybrid lexical + vector retrieval — all
+deployment concerns. The points below are the `ordvec`-specific shape of that
+class.
+
+**THREAT-POISON-001: Ordinal rank inversion.** Because `ordvec` is
+training-free, the rank transform is deterministic and invertible. An attacker
+who controls the embedding pipeline can engineer an embedding whose ordinal
+(Spearman) correlation with target queries is maximized — the ordinal analogue
+of embedding-inversion attacks. `ordvec` has no codebook to protect and cannot
+prevent construction of maximally correlated embeddings; mitigation requires
+access control and provenance on the embedding source.
+
+**THREAT-POISON-002: Top-`n_top` overlap poisoning.** `Bitmap` scores documents
+by `popcount(Q AND D)`. The loader enforces exactly `n_top` bits per row, so an
+injected document cannot set arbitrary bits — the realistic attack is crafting a
+document whose top-`n_top` coordinates maximally overlap the most-queried
+coordinates. Requires knowledge of the query distribution and corpus write
+access.
+
+**THREAT-POISON-003: RankQuant boundary exploitation.** `RankQuant` uses
+equal-width bucket quantization; documents near bucket boundaries can be crafted
+to score highly under the coarse pre-filter yet differ under exact reranking,
+exploiting quantization information loss to pass the coarse stage. Requires
+knowledge of quantization parameters and the document distribution.
+
+---
+
+## 8. Fuzzing coverage (THREAT-FUZZ)
+
+Seven targets cover the four loaders, the write→load round-trip, the
+single-rate compute path, and (new) the FastScan kernel.
+
+**THREAT-FUZZ-001 (closed this cycle): FastScan path was unfuzzed.** The
+`fastscan_b2` target now drives `RankQuantFastscan` (`pack_fastscan_b2` +
+`search_asymmetric_fastscan_b2` + the scalar/AVX-512 kernel), crossing the
+32-doc block boundary so tail-padding blocks are exercised. On
+non-AVX-512 CI runners it exercises the scalar reference kernel; under Intel SDE
+it exercises the AVX-512 kernel.
+
+**THREAT-FUZZ-002 (P3): No CI-bound fuzzing for continuous regression.** Fuzzing
+is run manually; there is no CI gate. A bounded weekly smoke job (e.g.
+`-runs=50000` on `load_rank`, `load_rankquant`, and `fastscan_b2`) would catch
+regressions between manual runs. (Low overhead; weighed against maintenance
+budget.)
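+
+One possible shape for such a smoke job is sketched below. Everything in it is
+an assumption for illustration (the file name, cron time, job name, toolchain
+install steps, and the `<full-commit-sha>` placeholder); only the three target
+names and the `-runs=50000` bound come from the text above.
+
+```yaml
+# Hypothetical fuzz-smoke.yml; not an existing workflow in this repository.
+name: fuzz-smoke (sketch)
+on:
+  schedule:
+    - cron: "17 3 * * 1"         # weekly, Mondays at 03:17 UTC
+  workflow_dispatch: {}          # allow a manual run as well
+permissions:
+  contents: read
+jobs:
+  smoke:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false           # let each target report independently
+      matrix:
+        target: [load_rank, load_rankquant, fastscan_b2]
+    steps:
+      - uses: actions/checkout@<full-commit-sha>   # placeholder: pin a real SHA
+        with:
+          persist-credentials: false
+      - run: rustup toolchain install nightly --profile minimal
+      - run: cargo install cargo-fuzz --locked
+      # Bounded run: libFuzzer stops after 50000 executions per target.
+      - env:
+          TARGET: ${{ matrix.target }}
+        run: cargo +nightly fuzz run "$TARGET" -- -runs=50000
+```
+
+On a hosted runner without AVX-512, the `fastscan_b2` leg of a job like this
+would exercise the scalar reference kernel only, as noted under
+THREAT-FUZZ-001.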
+
+*Note on `load_sign_bitmap`:* all bit patterns are structurally valid for sign
+bitmaps (no per-row invariant), so that target is correctly scoped to parser
+robustness — no OOM, no panic, no trailing-data acceptance.
+
+---
+
+## 9. CI/CD pipeline threats (THREAT-CICD)
+
+**THREAT-CICD-001 (P3, mitigated by control): Workflow injection via PR
+metadata.** If a `run:` step interpolated user-controlled context (PR title,
+branch name) into a shell expression via `${{ ... }}` without an `env:` hop, a
+script-injection could run in the runner. *Mitigation:* `zizmor` audits exactly
+this class of issue and runs in CI; pass user-controlled context through `env:`
+rather than inline `${{ }}` in `run:` blocks. SHA-pinned actions bound the
+blast radius of a compromised dependency separately.
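+
+The `env:` hop described above can be sketched as follows. The step name and
+the `PR_TITLE` variable are illustrative and not taken from this repository's
+workflows.
+
+```yaml
+# Sketch only; step name and PR_TITLE are hypothetical.
+steps:
+  # Unsafe pattern (kept as a comment): the expression is expanded into the
+  # script text before the shell runs, so a crafted PR title becomes shell code.
+  # - run: echo ${{ github.event.pull_request.title }}
+
+  # Safer pattern: the value reaches the shell as data in an environment variable.
+  - name: Print PR title
+    env:
+      PR_TITLE: ${{ github.event.pull_request.title }}
+    run: printf '%s\n' "$PR_TITLE"
+```
+
+The quoting around `"$PR_TITLE"` matters as much as the hop itself: unquoted
+expansion would still be subject to word splitting and globbing.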
+
+---
+
+## 10. Threat register
+
+| ID | Category | Owner | Description | Likelihood | Impact | Status / priority |
+|---|---|---|---|---|---|---|
+| THREAT-SIMD-001 | Memory safety | Library | Unsafe-kernel invariant bypass on refactor | Medium | High | **P1** — lint enforced in `fastscan.rs`; crate-wide rollout tracked |
+| THREAT-FFI-001 | FFI | Binding | Concurrent input mutation during released-GIL call | Medium | Medium | **P2** — documented contract |
+| THREAT-FFI-002 | FFI | Binding | Unsanitized path forwarding | Medium | Medium | **P2** — documented contract |
+| THREAT-SUPPLY-001 | Supply chain | Config | Release config / single-owner | Low | Critical | **Mitigated** (reviewer + main-only); residual = account compromise / 2nd owner |
+| THREAT-SUPPLY-002 | Supply chain | Config | Release immutability / tag integrity | Low | High | **P3** — registries immutable; add tag ruleset |
+| THREAT-SUPPLY-003 | Supply chain | Config | Typosquatting adjacent names | Medium | Medium | P3 |
+| THREAT-QUERY-001 | Resource | Deployment | Batch / `k` exhaustion in serving | Medium | Medium | **P2** — deployment docs |
+| THREAT-QUERY-002 | Resource | Deployment | Panic on contract violation (Rust servers) | Low | Medium | P3 |
+| THREAT-FUZZ-001 | Fuzzing | Library | FastScan path unfuzzed | Medium | High | **Closed** (`fastscan_b2` added) |
+| THREAT-FUZZ-002 | Fuzzing | Library | No CI-bound fuzzing | Medium | Medium | P3 |
+| THREAT-DESER-001 | Deserialization | Library | TOCTOU on shared mounts | Very Low | Low | P4 |
+| THREAT-DESER-002 | Provenance | Deployment | Malicious-but-valid index | Medium | High | P3 (docs — `INDEX_PROVENANCE.md`) |
+| THREAT-CICD-001 | CI/CD | Library | Workflow injection via PR metadata | Low | High | P3 — mitigated by `zizmor` |
+| THREAT-SIMD-002 | Side channel | Deployment | Microarchitectural co-tenancy (no gather) | Low | Medium | P4 |
+| THREAT-SIMD-003 | Semantic | Library | FastScan approximation (doc clarity) | Low | Low | P3 |
+| THREAT-POISON-001 | Index poisoning | Deployment | Ordinal rank inversion | Medium | High | Deployment |
+| THREAT-POISON-002 | Index poisoning | Deployment | Top-`n_top` overlap poisoning | Low | Medium | Deployment |
+| THREAT-POISON-003 | Index poisoning | Deployment | RankQuant boundary exploitation | Low | Low | Deployment |
+
+---
+
+## 11. Open mitigations
+
+**Done this cycle:** `#![deny(unsafe_op_in_unsafe_fn)]` in `fastscan.rs`
+(SIMD-001); `fastscan_b2` fuzz target (FUZZ-001); release-environment reviewers
++ main-only deployment (SUPPLY-001); [`docs/INDEX_PROVENANCE.md`](docs/INDEX_PROVENANCE.md)
+(DESER-002); [`RELEASING.md`](RELEASING.md) (SUPPLY-001).
+
+**Open, low cost:**
+
+1. Add a `v*` tag-protection ruleset (+ basic `main` ruleset) and optionally
+   enable GitHub immutable releases (THREAT-SUPPLY-002).
+2. Roll `#![deny(unsafe_op_in_unsafe_fn)]` out crate-wide across the remaining
+   SIMD modules (THREAT-SIMD-001).
+3. Add a bounded weekly CI fuzz smoke job (THREAT-FUZZ-002).
+4. Document recommended `nq` / `k` / corpus bounds for single-process serving
+   in the Rust and Python API docs (THREAT-QUERY-001).
+
+**Later (not release blockers):** a second maintainer/owner (then a release
+wait timer becomes meaningful); an optional sidecar index verifier
+(`ordvec verify` / external HMAC/BLAKE3 manifest) if a deployment requires
+tamper-evidence (DESER-002); a `safe_copy=True` FFI isolation option
+(FFI-001).
+
+---
+
+## References
+
+Only load-bearing, verifiable sources are listed.
+
+- **PoisonedRAG** — *Knowledge Corruption Attacks to Retrieval-Augmented
+  Generation of Large Language Models* (arXiv:2402.07867). Establishes that
+  injecting a small number of poisoned passages into a retriever corpus
+  achieves high attack-success rates — context for §7.
+- **OWASP LLM08:2025 — Vector and Embedding Weaknesses.** Retrieval-layer risk
+  class (poisoning, embedding inversion, access-control bypass) — context for
+  §7 / scope.
+- **"Memory-Safety Challenge Considered Solved? An In-Depth Study with All Rust
+  CVEs"** (arXiv:2003.03296). Real-world Rust memory-safety bugs require
+  `unsafe` code — the rationale for the §3 focus on the SIMD kernels.
+- **GitHub Security Lab — preventing pwn-requests.** Expression-injection in
+  `run:` steps and untrusted-context handling — basis for THREAT-CICD-001.
diff --git a/codecov.yml b/codecov.yml
new file mode 100644
index 00000000..05f4ee71
--- /dev/null
+++ b/codecov.yml
@@ -0,0 +1,25 @@
+# Codecov is a dashboard + README badge for this repo. The *enforced* coverage
+# gate is the cargo-llvm-cov `--fail-under-lines 78` floor in
+# .github/workflows/coverage.yml, set below what the AVX-512-free runner reports:
+# the hosted coverage runner has no AVX-512, so the runtime SIMD dispatch never
+# reaches the AVX-512 kernels (they are exercised by the separate `avx512` job
+# under Intel SDE). See issue #68.
+coverage:
+  status:
+    project:
+      default:
+        target: 78%       # mirror the enforced cargo-llvm-cov floor
+        threshold: 1%
+    patch:
+      default:
+        # The AVX-512 kernels cannot be covered on the no-AVX-512 coverage
+        # runner, so patch coverage on any SIMD-kernel change is a false signal
+        # (touching a kernel re-indents lines the runner never executes — see
+        # #68). Keep patch advisory rather than blocking PRs on it; real
+        # coverage enforcement lives in the workflow floor above.
+        informational: true
+
+# The cargo-fuzz workspace is excluded from the crate build and is not part of
+# the tested surface measured by cargo-llvm-cov.
+ignore:
+  - "fuzz"
diff --git a/docs/INDEX_PROVENANCE.md b/docs/INDEX_PROVENANCE.md
new file mode 100644
index 00000000..ec8d3fce
--- /dev/null
+++ b/docs/INDEX_PROVENANCE.md
@@ -0,0 +1,55 @@
+# Index file provenance
+
+`ordvec` persists indexes as `.tvr` / `.tvrq` / `.tvbm` / `.tvsb` files and
+reloads them through `Rank::load`, `RankQuant::load`, `Bitmap::load`, and
+`SignBitmap::load`. This note states exactly **what the loaders guarantee and
+what they do not**, so you can decide whether an index file needs out-of-band
+verification before you load it.
+
+## What the loaders validate
+
+The loaders treat the byte stream as **untrusted input** and reject malformed
+files without panicking, aborting, or silently accepting garbage:
+
+- magic + version checks before any allocation;
+- fallible allocation (`try_reserve_exact`) — an attacker-controlled length
+  field returns `InvalidData`, never an OOM abort;
+- all payload sizes computed with `checked_mul`; overflow is an error;
+- a 128 GiB `MAX_PAYLOAD` cap plus `MAX_VECTORS` / `MAX_DIM` caps;
+- an exact file-length match (trailing bytes or short files are rejected);
+- per-row **structural** invariants: `Rank` rows must be a true permutation of
+  `[0, dim)`, `RankQuant` rows must satisfy constant composition, `Bitmap` rows
+  must have exactly `n_top` bits set.
+
+A file that survives all of this is **structurally well-formed**. The four
+loaders are exercised by `cargo fuzz` (the `load_*` targets).
+
+## What the loaders do NOT validate
+
+The loaders validate **structure, not origin or truth**:
+
+- They do **not** authenticate who produced the file or whether it was modified
+  in transit or at rest. There is no signature, MAC, or checksum in the format.
+- A **structurally valid but semantically poisoned** index — one whose ranks,
+  buckets, or bitmaps were crafted to bias retrieval — passes every check and
+  returns attacker-influenced results. This is a *provenance* problem, not a
+  parser problem (THREAT-DESER-002 / THREAT-POISON-\* in
+  [../THREAT_MODEL.md](../THREAT_MODEL.md)).
+
+## Guidance for deployments where index files cross a trust boundary
+
+If you load index files that were produced elsewhere, transferred over a
+network, or stored on shared/mutable infrastructure, verify them **before**
+loading using whatever your deployment already trusts:
+
+- a checksum manifest (e.g. SHA-256) recorded by the build that produced the
+  index, stored apart from the index files, and verified before each load;
+- your artifact store's integrity controls;
+- a signature / attestation layer (e.g. Sigstore, GitHub artifact attestations)
+  over the index files.
+
+`ordvec` deliberately ships **no** built-in signing/MAC layer today: without a
+concrete deployment requiring it, an in-format crypto layer would add key
+management with no clear owner. A sidecar verifier (e.g. an `ordvec verify`
+utility, or an external HMAC/BLAKE3 manifest) can be added later **without a
+file-format change** if a real deployment needs tamper-evidence.
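The structural checks listed under "What the loaders validate" can be sketched in plain Rust. This is a minimal illustration, not `ordvec`'s loader: the header layout (two little-endian `u64` fields, `n_vectors` then `dim`, followed by `u16` ranks), the names `load_rank_rows` and `read_u64`, and the cap values are all assumptions made for the example, and the magic/version check is omitted.

```rust
use std::io::{Error, ErrorKind, Result};

// Hypothetical caps for this sketch; the real MAX_* values may differ.
const MAX_VECTORS: u64 = 1 << 32;
const MAX_DIM: u64 = u16::MAX as u64;
const HEADER_LEN: u64 = 16;

fn invalid(msg: &'static str) -> Error {
    Error::new(ErrorKind::InvalidData, msg)
}

// Read a little-endian u64 at `at`; the caller guarantees the bounds.
fn read_u64(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Parse `[n_vectors: u64][dim: u64][n_vectors * dim ranks: u16]` (assumed layout).
fn load_rank_rows(bytes: &[u8]) -> Result<Vec<u16>> {
    if (bytes.len() as u64) < HEADER_LEN {
        return Err(invalid("short header"));
    }
    let n = read_u64(bytes, 0);
    let dim = read_u64(bytes, 8);
    if n > MAX_VECTORS || dim == 0 || dim > MAX_DIM {
        return Err(invalid("n_vectors or dim out of range"));
    }
    // Sizes come from untrusted fields: overflow is an error, never a wrap.
    let count = n.checked_mul(dim).ok_or_else(|| invalid("size overflow"))?;
    let expected = count
        .checked_mul(2)
        .and_then(|payload| payload.checked_add(HEADER_LEN))
        .ok_or_else(|| invalid("size overflow"))?;
    // Exact length match: short files and trailing bytes are both rejected.
    if bytes.len() as u64 != expected {
        return Err(invalid("file length mismatch"));
    }
    // The length check above bounds `count` by the input size, so it fits usize.
    let (count, dim) = (count as usize, dim as usize);
    // Fallible allocation: failure becomes an error, not an abort.
    let mut ranks: Vec<u16> = Vec::new();
    ranks
        .try_reserve_exact(count)
        .map_err(|_| invalid("allocation failed"))?;
    for pair in bytes[HEADER_LEN as usize..].chunks_exact(2) {
        ranks.push(u16::from_le_bytes([pair[0], pair[1]]));
    }
    // Structural invariant: every row is a permutation of [0, dim).
    let mut seen = vec![false; dim]; // bounded by MAX_DIM
    for row in ranks.chunks_exact(dim) {
        seen.fill(false);
        for &r in row {
            let r = r as usize;
            if r >= dim || seen[r] {
                return Err(invalid("row is not a permutation"));
            }
            seen[r] = true;
        }
    }
    Ok(ranks)
}
```

Every rejection path returns `InvalidData`, and none of them depends on who produced the bytes. That is the point of the section that follows the validation list: a file that passes these checks is well-formed, not authenticated.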
diff --git a/fuzz/Cargo.lock b/fuzz/Cargo.lock
index 8e0abbb1..14cf0d4a 100644
--- a/fuzz/Cargo.lock
+++ b/fuzz/Cargo.lock
@@ -246,9 +246,9 @@ checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50"
 
 [[package]]
 name = "ordered-float"
-version = "4.6.0"
+version = "5.3.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "7bb71e1b3fa6ca1c61f383464aaf2bb0e2f8e772a1f01d486832464de363b951"
+checksum = "b7d950ca161dc355eaf28f82b11345ed76c6e1f6eb1f4f4479e0323b9e2fbd0e"
 dependencies = [
  "num-traits",
 ]
diff --git a/fuzz/Cargo.toml b/fuzz/Cargo.toml
index 1929471c..f94324be 100644
--- a/fuzz/Cargo.toml
+++ b/fuzz/Cargo.toml
@@ -64,3 +64,12 @@ path = "fuzz_targets/roundtrip_rankquant.rs"
 test = false
 doc = false
 bench = false
+
+# FastScan b=2 compute path (`RankQuantFastscan`): the one unsafe-heavy scan
+# kernel the `search_rankquant` target does not reach.
+[[bin]]
+name = "fastscan_b2"
+path = "fuzz_targets/fastscan_b2.rs"
+test = false
+doc = false
+bench = false
diff --git a/fuzz/fuzz_targets/fastscan_b2.rs b/fuzz/fuzz_targets/fastscan_b2.rs
new file mode 100644
index 00000000..5f7b5178
--- /dev/null
+++ b/fuzz/fuzz_targets/fastscan_b2.rs
@@ -0,0 +1,52 @@
+//! libFuzzer target for the FastScan b=2 compute path (`RankQuantFastscan`):
+//! `add` (rank_transform -> bucket -> block-32 re-pack via `pack_fastscan_b2`)
+//! then `search` (`search_asymmetric_fastscan_b2` -> the scalar / AVX-512
+//! VPSHUFB-LUT kernel -> TopK). This is the one `unsafe`-heavy scan path the
+//! `search_rankquant` target does NOT reach: `RankQuant::search_asymmetric`
+//! dispatches the single-rate kernels, never the FastScan block-32 kernel.
+//!
+//! `dim` is fixed at 64 — `RankQuantFastscan::new` requires `dim % 4 == 0`
+//! (b=2 constant composition) and `dim <= u16::MAX`; 64 also gives a
+//! `dim / 2 = 32`-pair inner loop. The fuzzer shapes the doc count (crossing
+//! the 32-doc block boundary so tail-padding blocks are exercised), the
+//! embedding/query values, and `k` (including `k == 0`). Values map to finite
+//! f32: the public API rejects NaN / ±Inf by contract, so raw float bit
+//! patterns would only re-exercise that guard, not the kernel.
+//!
+//! On CI runners without AVX-512 this drives the scalar reference kernel
+//! (`scan_b2_fastscan_scalar`); under Intel SDE it drives the AVX-512 kernel.
+//!
+//! Contract: no panic, abort, or out-of-bounds access on any input.
+#![no_main]
+
+use libfuzzer_sys::fuzz_target;
+use ordvec::RankQuantFastscan;
+
+fuzz_target!(|data: &[u8]| {
+    if data.len() < 3 {
+        return;
+    }
+    // dim % 4 == 0 and dim <= u16::MAX (RankQuantFastscan::new contract).
+    const DIM: usize = 64;
+    // 1..=100 docs — crosses the 32-doc block boundary (1..=4 blocks) so the
+    // tail-padding path (`n % 32 != 0`) is exercised.
+    let n = (data[0] as usize % 100) + 1;
+    let k = data[1] as usize % (n + 1); // 0..=n
+
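+    // `data.len() >= 3` was checked above, so `payload` holds at least one
+    // byte and the cyclic index below never takes `i % 0`.
+    let payload = &data[2..];
+    let total = (n + 1) * DIM;
+    // Finite f32 values in [-128.0, 127.0]: `n` embeddings, then one query.
+    let floats: Vec<f32> = (0..total)
+        .map(|i| {
+            let byte = payload[i % payload.len()];
+            f32::from(byte) - 128.0
+        })
+        .collect();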
+    let (vecs, query) = floats.split_at(n * DIM);
+
+    let mut idx = RankQuantFastscan::new(DIM);
+    idx.add(vecs);
+    let _ = idx.search(query, k);
+});
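The input shaping in this target can be checked in isolation, without linking `ordvec` or libFuzzer. The sketch below copies the same arithmetic into a hypothetical helper, `shape_input` (a name introduced here only for illustration), so the derived ranges can be asserted directly.

```rust
const DIM: usize = 64;

/// Hypothetical helper mirroring the fuzz target's input shaping.
/// Returns (n, k, embeddings, query), or None when the input is too short.
fn shape_input(data: &[u8]) -> Option<(usize, usize, Vec<f32>, Vec<f32>)> {
    if data.len() < 3 {
        return None;
    }
    let n = (data[0] as usize % 100) + 1; // 1..=100 docs
    let k = data[1] as usize % (n + 1); // 0..=n
    let payload = &data[2..]; // non-empty because data.len() >= 3
    // Cycle the payload bytes into finite values in [-128.0, 127.0].
    let floats: Vec<f32> = (0..(n + 1) * DIM)
        .map(|i| f32::from(payload[i % payload.len()]) - 128.0)
        .collect();
    let (vecs, query) = floats.split_at(n * DIM);
    Some((n, k, vecs.to_vec(), query.to_vec()))
}

/// Number of 32-doc blocks needed for `n` docs (the last one may be padded).
fn block_count(n: usize) -> usize {
    (n + 31) / 32
}
```

With `n` capped at 100, `block_count` ranges over 1..=4, and any `n` that is not a multiple of 32 leaves a partially filled final block.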
diff --git a/src/fastscan.rs b/src/fastscan.rs
index d2210082..7a757aed 100644
--- a/src/fastscan.rs
+++ b/src/fastscan.rs
@@ -35,6 +35,13 @@
 //! [`l2_normalise`](crate::util::l2_normalise), and `k` is clamped to
 //! `n_vectors` exactly as the sibling search methods do.
 
+// Make every unsafe operation inside an `unsafe fn` require an explicit
+// `unsafe {}` block rather than leaning on the fn-level `unsafe`. This is
+// defense-in-depth for the AVX-512 FastScan kernel below: it keeps the kernel's
+// unsafe surface visible to future edits. Crate-wide rollout to the other SIMD
+// modules is tracked separately (see THREAT_MODEL.md, THREAT-SIMD-001).
+#![deny(unsafe_op_in_unsafe_fn)]
+
 use rayon::prelude::*;
 
 use crate::rank::{bucket_ranks, rank_transform, rankquant_norm};
@@ -212,27 +219,37 @@ unsafe fn scan_b2_fastscan_avx512(
     // is 255, so FLUSH × 255 must fit in u16: FLUSH ≤ 257. Pick 256.
     const FLUSH: usize = 256;
 
-    for b in 0..n_blocks {
-        let block_ptr = packed_fs.as_ptr().add(b * bytes_per_block);
-
-        // 32-lane u32 accumulators (split across two __m512i, lo/hi 16).
-        let mut acc32_lo = _mm512_setzero_si512();
-        let mut acc32_hi = _mm512_setzero_si512();
-
-        let mut p = 0usize;
-        while p < pairs {
-            let chunk = (pairs - p).min(FLUSH);
-
-            // 32-lane u16 accumulator split: each holds 16 u16 values
-            // in its low 256 bits.
-            let mut acc16_lo = _mm512_setzero_si512(); // lanes 0..16
-            let mut acc16_hi = _mm512_setzero_si512(); // lanes 16..32
-
-            let inner_end = p + chunk;
-            let inner_chunks_4 = chunk / 4;
-            let mut pp = p;
-
-            for _ in 0..inner_chunks_4 {
+    // SAFETY: every raw load/store and AVX-512 intrinsic in this loop is
+    // in-bounds and feature-gated per the function-level SAFETY comment above.
+    // The explicit block is required by `#![deny(unsafe_op_in_unsafe_fn)]`.
+    unsafe {
+        for b in 0..n_blocks {
+            let block_ptr = packed_fs.as_ptr().add(b * bytes_per_block);
+
+            // 32-lane u32 accumulators (split across two __m512i, lo/hi 16).
+            let mut acc32_lo = _mm512_setzero_si512();
+            let mut acc32_hi = _mm512_setzero_si512();
+
+            let mut p = 0usize;
+            while p < pairs {
+                let chunk = (pairs - p).min(FLUSH);
+
+                // 32-lane u16 accumulator split: each holds 16 u16 values
+                // in its low 256 bits.
+                let mut acc16_lo = _mm512_setzero_si512(); // lanes 0..16
+                let mut acc16_hi = _mm512_setzero_si512(); // lanes 16..32
+
+                let inner_end = p + chunk;
+                let inner_chunks_4 = chunk / 4;
+                let mut pp = p;
+
+                // Score one coord-pair across all 32 lanes: VPSHUFB the per-pair
+                // 16-byte LUT (broadcast into both 128-bit halves) by the packed
+                // nibble codes, widen u8 -> u16, accumulate. `pp` / `block_ptr` /
+                // `lut_u8` / `acc16_*` are locals resolved at the macro's
+                // definition site, so it must be defined after they are declared.
+                // (macro_rules is expanded at compile time, so there is no runtime
+                // cost; one definition serves the unrolled and remainder loops.)
                 macro_rules! step {
                     ($off:expr) => {{
                         let codes256 =
@@ -250,53 +267,48 @@ unsafe fn scan_b2_fastscan_avx512(
                         acc16_hi = _mm512_add_epi16(acc16_hi, _mm512_castsi256_si512(hi256));
                     }};
                 }
-                step!(0);
-                step!(1);
-                step!(2);
-                step!(3);
-                pp += 4;
-            }
 
-            while pp < inner_end {
-                let codes256 = _mm256_loadu_si256(block_ptr.add(pp * 32) as *const __m256i);
-                let lut128 = _mm_loadu_si128(lut_u8.as_ptr().add(pp * 16) as *const __m128i);
-                let lut256 = _mm256_broadcastsi128_si256(lut128);
-                let contrib = _mm256_shuffle_epi8(lut256, codes256);
-                let lo128 = _mm256_castsi256_si128(contrib);
-                let hi128 = _mm256_extracti128_si256(contrib, 1);
-                let lo256 = _mm256_cvtepu8_epi16(lo128);
-                let hi256 = _mm256_cvtepu8_epi16(hi128);
-                acc16_lo = _mm512_add_epi16(acc16_lo, _mm512_castsi256_si512(lo256));
-                acc16_hi = _mm512_add_epi16(acc16_hi, _mm512_castsi256_si512(hi256));
-                pp += 1;
-            }
+                // 4-wide unroll, then the remainder one pair at a time.
+                for _ in 0..inner_chunks_4 {
+                    step!(0);
+                    step!(1);
+                    step!(2);
+                    step!(3);
+                    pp += 4;
+                }
 
-            // Widen u16 → u32. Meaningful u16s sit in the low 256 bits.
-            let lo256_u16 = _mm512_castsi512_si256(acc16_lo);
-            let hi256_u16 = _mm512_castsi512_si256(acc16_hi);
-            let lo32 = _mm512_cvtepu16_epi32(lo256_u16);
-            let hi32 = _mm512_cvtepu16_epi32(hi256_u16);
-            acc32_lo = _mm512_add_epi32(acc32_lo, lo32);
-            acc32_hi = _mm512_add_epi32(acc32_hi, hi32);
+                while pp < inner_end {
+                    step!(0);
+                    pp += 1;
+                }
 
-            p = inner_end;
-        }
+                // Widen u16 → u32. Meaningful u16s sit in the low 256 bits.
+                let lo256_u16 = _mm512_castsi512_si256(acc16_lo);
+                let hi256_u16 = _mm512_castsi512_si256(acc16_hi);
+                let lo32 = _mm512_cvtepu16_epi32(lo256_u16);
+                let hi32 = _mm512_cvtepu16_epi32(hi256_u16);
+                acc32_lo = _mm512_add_epi32(acc32_lo, lo32);
+                acc32_hi = _mm512_add_epi32(acc32_hi, hi32);
 
-        let mut tmp_lo = [0u32; 16];
-        let mut tmp_hi = [0u32; 16];
-        _mm512_storeu_si512(tmp_lo.as_mut_ptr() as *mut _, acc32_lo);
-        _mm512_storeu_si512(tmp_hi.as_mut_ptr() as *mut _, acc32_hi);
+                p = inner_end;
+            }
 
-        let doc_base = b * 32;
-        let docs_in_block = (n - doc_base).min(32);
-        for lane in 0..docs_in_block {
-            let acc = if lane < 16 {
-                tmp_lo[lane]
-            } else {
-                tmp_hi[lane - 16]
-            };
-            let raw = bias_sum + (acc as f32) * inv_q;
-            top.maybe_insert(raw * scale, doc_base + lane);
+            let mut tmp_lo = [0u32; 16];
+            let mut tmp_hi = [0u32; 16];
+            _mm512_storeu_si512(tmp_lo.as_mut_ptr() as *mut _, acc32_lo);
+            _mm512_storeu_si512(tmp_hi.as_mut_ptr() as *mut _, acc32_hi);
+
+            let doc_base = b * 32;
+            let docs_in_block = (n - doc_base).min(32);
+            for lane in 0..docs_in_block {
+                let acc = if lane < 16 {
+                    tmp_lo[lane]
+                } else {
+                    tmp_hi[lane - 16]
+                };
+                let raw = bias_sum + (acc as f32) * inv_q;
+                top.maybe_insert(raw * scale, doc_base + lane);
+            }
         }
     }
 }