fix(security): clear OSV/Scorecard advisories on dev-only benchmark deps #240

Merged
Navi Bot (project-navi-bot) merged 1 commit into main from fix/bench-dep-advisories on Jun 15, 2026

@Fieldnote-Echo (Member)

OpenSSF Scorecard / OSV flagged ~20 advisories on main after the BEIR benchmark landed (#237). Triaged: every one is dev/benchmark tooling — none reach the published ordvec crate or the ordvec PyPI wheel.

Diagnosis (reproduced with cargo audit + the OSV API)

| Source | Advisories | Package | Why flagged |
|---|---|---|---|
| benchmarks/beir/requirements.txt | ~19 | requests, numpy, hnswlib (+ precautionary others) | deps were unpinned → OSV flags an unconstrained version against the package's entire historical CVE list |
| Cargo.lock (via benchmarks/beir-bench) | 1 | bincode 1.x | RUSTSEC-2025-0141 = unmaintained (informational, not a CVE), transitive via hnsw_rs |

The actual resolved-latest Python versions are already patched (the flags are unpinned-conservatism, not real exposure), and cargo tree -p ordvec is clean of bincode.

Fix

Python — pin lower-bound floors at the first patched release (clears the flags because OSV excludes a >=fixed range; >= keeps installs on the latest compatible wheel incl. recent CPython):

  • requests>=2.32.4 (GHSA-9hjg-9r4m-mvj7 .netrc leak + all older requests CVEs)
  • hnswlib>=0.8.0 (GHSA-xwc8-rf6m-xr86 double free)
  • numpy>=1.26.0 (symlink-write + incorrect-comparison)
  • safe floors for scipy / pandas / tqdm / tabulate / huggingface-hub / faiss-cpu / pytrec-eval-terrier / matplotlib
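
The floor-pinning rationale above can be illustrated with a minimal Python sketch. It is a simplified model of the reasoning in this PR, not OSV's actual matching logic: the function names are hypothetical, and only plain dotted versions with an optional `>=` floor are handled.

```python
def release_tuple(version):
    """Parse a plain dotted release such as '2.32.4' into (2, 32, 4)."""
    return tuple(int(part) for part in version.strip().split("."))


def floor_of(requirement_line):
    """Return (name, floor) for 'name>=X.Y.Z'; floor is None when unpinned."""
    spec = requirement_line.split("#", 1)[0].strip()  # drop trailing comment
    name, sep, floor = spec.partition(">=")
    return name.strip(), (release_tuple(floor) if sep else None)


def still_flaggable(requirement_line, first_patched):
    """True if the allowed range can still include a pre-fix release."""
    _, floor = floor_of(requirement_line)
    if floor is None:
        return True  # unconstrained: no historical release can be ruled out
    return floor < release_tuple(first_patched)


print(still_flaggable("requests", "2.32.4"))                                # True
print(still_flaggable("requests>=2.32.4  # GHSA-9hjg-9r4m-mvj7", "2.32.4"))  # False
```

A floor below the first patched release (for example a hypothetical `requests>=2.31.0`) would still be reported as flaggable, which is why each floor is set at the first patched release and not at an arbitrary recent version.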

Rust — bincode (RUSTSEC-2025-0141, unmaintained): documented deny.toml ignore (cargo-deny is configured to error on unmaintained, so the next run would fail without it). It is not in the published crate; revisit if a maintained HNSW crate without bincode 1.x is adopted. (Deliberately did NOT hide it by de-committing the bench lockfile — that would conceal rather than fix.)

Verification

  • cargo tree -p ordvec --all-features --edges normal,build,dev — clean of bincode (shipped crate unaffected)
  • cargo deny check advisories → advisories ok
  • local benchmark venv (CPython 3.14) versions satisfy every new floor (requests 2.34.2, numpy 2.4.6, hnswlib 0.8.0, …)
  • next Scorecard run on main reflects the cleared count
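
The venv check in the list above could be scripted roughly as follows. This is a sketch and not the procedure the author actually ran: `FLOORS` lists only the three floors quoted in this description, `unmet_floors` is a hypothetical helper, and the version parser assumes plain dotted releases (no pre-release or local tags).

```python
from importlib import metadata

# Floors quoted from the PR description; the remaining packages are omitted here.
FLOORS = {"requests": "2.32.4", "hnswlib": "0.8.0", "numpy": "1.26.0"}


def release_tuple(version):
    """Parse a plain dotted release such as '2.34.2' into (2, 34, 2)."""
    return tuple(int(part) for part in version.split("."))


def unmet_floors(floors, lookup=metadata.version):
    """Return human-readable problems; an empty list means every floor is met."""
    problems = []
    for name, floor in floors.items():
        try:
            installed = lookup(name)  # installed distribution version string
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if release_tuple(installed) < release_tuple(floor):
            problems.append(f"{name}: {installed} < {floor}")
    return problems
```

Calling `unmet_floors(FLOORS)` inside the benchmark venv queries the installed distributions through `importlib.metadata.version`; passing a custom `lookup` makes the check testable without installing anything.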

No changes to the published crate or wheel; benchmarks/ is excluded from the packaged crate.

OpenSSF Scorecard / OSV flagged ~20 advisories on main after the BEIR benchmark
landed (#237). ALL are dev/benchmark tooling — none reach the published `ordvec`
crate or the `ordvec` PyPI wheel.

Python (benchmarks/beir/requirements.txt): the deps were UNPINNED, so OSV flagged
each against its entire historical CVE list (an unconstrained version cannot be
ruled non-vulnerable). The actual resolved-latest versions are already patched.
Lower-bound-pin every package at its first patched release — clears the flags
(OSV excludes a `>=fixed` range) while `>=` keeps installs on the latest
compatible wheel, incl. recent CPython:
  - requests>=2.32.4   (GHSA-9hjg-9r4m-mvj7 .netrc leak + all older requests CVEs)
  - hnswlib>=0.8.0     (GHSA-xwc8-rf6m-xr86 double free)
  - numpy>=1.26.0      (symlink-write + incorrect-comparison CVEs)
  - safe floors for scipy/pandas/tqdm/tabulate/huggingface-hub/faiss-cpu/
    pytrec-eval-terrier/matplotlib. Verified the local cp314 venv satisfies all.

Rust (RUSTSEC-2025-0141): bincode 1.x is UNMAINTAINED (informational advisory,
not a vulnerability), pulled only transitively via hnsw_rs in the dev-only
benchmarks/beir-bench harness. `cargo tree -p ordvec` is clean of bincode, so it
does not reach the shipped crate. Add a documented deny.toml ignore so cargo-deny
(configured to error on unmaintained crates) stays green; revisit if a maintained
HNSW crate that does not pull bincode 1.x is adopted.

Verified: `cargo tree -p ordvec` clean of bincode; `cargo deny check advisories`
ok; benchmark venv versions satisfy the new floors.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request addresses security advisories in the dev-only BEIR benchmark tooling by lower-bound-pinning dependencies in benchmarks/beir/requirements.txt to their first patched releases. It also adds an ignore rule in deny.toml for the informational RUSTSEC-2025-0141 advisory regarding bincode 1.x, which is only transitively used in benchmarks and does not affect the published crate. These changes are documented in CHANGELOG.md. There are no review comments, and I have no feedback to provide.

@project-navi-bot Navi Bot (project-navi-bot) merged commit 24038b7 into main Jun 15, 2026
30 checks passed
@project-navi-bot Navi Bot (project-navi-bot) deleted the fix/bench-dep-advisories branch June 15, 2026 15:49

qodo-code-review Bot commented Jun 15, 2026


Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)


Remediation recommended

1. Global advisory ignore drift 🐞 Bug ⛨ Security
Description
deny.toml now ignores RUSTSEC-2025-0141 globally, so cargo deny check will not fail if
bincode 1.x later appears in the shipped ordvec dependency tree. This weakens the supply-chain
gate by relying on human vigilance rather than CI to catch future dependency drift.
Code

deny.toml[25]

+ignore = ["RUSTSEC-2025-0141"]
Evidence
The ignore is configured at the cargo-deny config level (applies to the entire graph that CI
checks). CI’s cargo-deny job explicitly checks the whole workspace, while the only core-crate scoped
dependency check currently greps for other forbidden crates, not bincode—so the ignore could mask
future introduction into ordvec without an additional guard.

deny.toml[15-25]
Cargo.toml[98-101]
.github/workflows/ci.yml[328-349]
.github/workflows/ci.yml[244-261]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`deny.toml` ignores `RUSTSEC-2025-0141` workspace-wide, which is fine for the current benchmark-only exposure, but it also means CI will not catch a future regression where `bincode` 1.x enters the shipped `ordvec` crate.

### Issue Context
CI runs `cargo deny check` for the whole workspace, and the existing core-crate dependency guard only checks for BLAS/faer/ndarray/statrs.

### Fix Focus Areas
- .github/workflows/ci.yml[244-261]

### Suggested fix
Extend the existing `deps` job’s `tree="$(cargo tree -p ordvec ...)"` check to also fail if `bincode` appears in the `-p ordvec` tree (e.g., a case-insensitive boundary grep for `bincode`). This keeps the benchmark-only ignore while preserving a hard guardrail for the shipped crate.
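
One way to picture the suggested guard, written in Python for illustration (the repository's CI job is described as a shell `grep`, so this is not its actual implementation). `crate_lines` is a hypothetical helper, and the sample tree text and version numbers are made up; real input would be the output of `cargo tree -p ordvec`.

```python
import re


def crate_lines(tree_text, crate):
    """Return the lines of `cargo tree` output that name `crate` exactly."""
    # Match "<crate> v<digit>" without a preceding word character or hyphen,
    # so "bincode-derive" or "my-bincode" do not count as "bincode".
    pattern = re.compile(rf"(?<![\w-]){re.escape(crate)} v\d", re.IGNORECASE)
    return [line for line in tree_text.splitlines() if pattern.search(line)]


# Hypothetical tree output, for illustration only.
clean_tree = "ordvec v0.0.0\n└── some-dep v1.2.3\n"
drifted_tree = "ordvec v0.0.0\n└── hnsw_rs v0.0.0\n    └── bincode v1.3.3\n"

print(crate_lines(clean_tree, "bincode"))    # []
print(len(crate_lines(drifted_tree, "bincode")))  # 1
```

A CI step built on this idea would exit non-zero whenever the returned list is non-empty, keeping the workspace-wide ignore in `deny.toml` while still failing if `bincode` reaches the shipped crate's tree.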



Informational

2. Benchmark Python floor undocumented 🐞 Bug ☼ Reliability
Description
The BEIR benchmark requirements.txt now sets relatively new minimum versions
(numpy/scipy/pandas/etc.), but make bench-beir-setup still uses whatever python3 is and the
benchmark README doesn’t state a minimum Python version. This can cause benchmark setup failures on
environments where python3 is older than what the new dependency floors effectively require.
Code

benchmarks/beir/requirements.txt[R21-26]

+numpy>=1.26.0
+scipy>=1.11.0
+requests>=2.32.4          # GHSA-9hjg-9r4m-mvj7 (.netrc leak) + all older requests CVEs
+tqdm>=4.66.3              # CVE-2024-34062
+pandas>=2.2.0
+tabulate>=0.9.0
Evidence
Benchmark setup installs the requirements via python3 without enforcing a minimum version, while
the benchmark README quick start also omits a Python floor. Separately, the project’s published
Python package declares >=3.10, suggesting a modern Python baseline is expected but not
communicated for benchmarks.

benchmarks/beir/requirements.txt[14-26]
Makefile[22-24]
Makefile[102-109]
benchmarks/beir/README.md[56-66]
ordvec-python/pyproject.toml[5-12]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The benchmark setup path installs `benchmarks/beir/requirements.txt` using `PY ?= python3`, but the docs don’t specify a required Python version. With the new minimum dependency versions, users with an older system `python3` can hit resolution/install errors.

### Issue Context
The repo’s published Python wheel already declares `requires-python = ">=3.10"`, and CI tests the bindings on 3.10+.

### Fix Focus Areas
- Makefile[22-24]
- Makefile[102-109]
- benchmarks/beir/README.md[56-66]

### Suggested fix
Add a clearly stated minimum Python version (e.g., `>=3.10`) to `benchmarks/beir/README.md` (and optionally root README’s benchmark section). Consider adding a small preflight check in `bench-beir-setup` that errors with a clear message if `$(PY)` is below the required version.
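
A preflight check along these lines might look like the sketch below. The `(3, 10)` minimum is an assumption taken from the "e.g." in the suggestion and the wheel's `requires-python`; the benchmark's true minimum is not stated in this PR, and `preflight_message` is a hypothetical helper.

```python
import sys

# Assumption: mirrors the published wheel's requires-python = ">=3.10".
MIN_PYTHON = (3, 10)


def preflight_message(current=None, minimum=MIN_PYTHON):
    """Return an error message if the interpreter is too old, else None."""
    if current is None:
        current = tuple(sys.version_info[:2])
    if current >= minimum:
        return None
    return (
        f"benchmark setup needs Python >= {minimum[0]}.{minimum[1]}, "
        f"found {current[0]}.{current[1]}"
    )
```

A setup target could run this with the interpreter it is about to use and pass any non-`None` message to `sys.exit`, so an outdated `python3` fails with a clear explanation before dependency resolution starts.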


@qodo-code-review


PR Summary by Qodo

Clear OSV/Scorecard advisories for dev-only BEIR benchmark dependencies
⚙️ Configuration changes 📝 Documentation 🕐 10-20 Minutes

Walkthroughs

Description
• Add lower-bound pins for BEIR benchmark Python dependencies to satisfy OSV/Scorecard.
• Document and ignore the dev-only bincode unmaintained advisory in cargo-deny.
• Record the security-triage rationale in the changelog for reviewer/auditor clarity.
Diagram
graph TD
  S{{"OSV / Scorecard"}} --> R["BEIR requirements.txt"] --> P["Pinned floors (>= fixed)" ]
  C["cargo-deny" ] --> D["deny.toml" ] --> I["Ignore RUSTSEC-2025-0141" ]
  B["beir-bench (dev)" ] --> L["Cargo.lock deps" ]
  L --> I
  subgraph Legend
    direction LR
    _ext{{"External scanner"}} ~~~ _tool["Tooling" ] ~~~ _file["Repo file" ]
  end
High-Level Assessment

The following are alternative approaches to this PR:

1. Pin exact versions (fully locked requirements)
  • ➕ Maximizes reproducibility for benchmark runs across machines/CI
  • ➕ Eliminates ambiguity in scanner evaluation (no range interpretation)
  • ➖ Requires frequent updates to keep pace with new CPython wheels/security fixes
  • ➖ Harder for users to install on diverse platforms due to strict pins
2. Use pip-tools (constraints.txt) + hashed lock for benchmarks
  • ➕ Keeps a human-edited input file while generating a reproducible lock
  • ➕ Allows controlled upgrades with clear diffs of transitive changes
  • ➖ Adds workflow/tooling complexity for a dev-only harness
  • ➖ Still needs periodic regeneration and review
3. Remove/avoid committing benchmark lockfiles to hide transitive advisories
  • ➕ May reduce advisory surface area detected via repo scanning
  • ➖ Conceals the true resolved dependency state rather than triaging it
  • ➖ Reduces benchmark reproducibility and hampers debugging

Recommendation: The chosen approach (lower-bound floors for Python + a documented cargo-deny ignore for a dev-only informational Rust advisory) is the best trade-off for this repo: it clears false-positive/noise advisories while preserving install flexibility for modern CPython and keeping the triage explicit. Full locking via pip-tools is a reasonable future enhancement if benchmark reproducibility becomes a higher priority, but it is not necessary to address the current Scorecard/OSV findings.

File Changes

Documentation (1)
CHANGELOG.md (+12/-0): Document security triage for benchmark-only advisory cleanup

• Adds an Unreleased Security entry explaining why OSV/Scorecard advisories appeared and how they were cleared. Explicitly notes that the affected dependencies are dev-only and do not reach the shipped 'ordvec' crate or PyPI wheel.


Other (2)
benchmarks/beir/requirements.txt (+20/-14): Lower-bound pin BEIR benchmark Python dependencies at first patched releases

• Replaces unpinned benchmark dependencies with '>=' minimum versions that start at known fixed releases (e.g., requests/hnswlib/numpy). Adds comments explaining the OSV/Scorecard rationale and why lower-bound pins keep compatibility while excluding vulnerable ranges.


deny.toml (+9/-3): Triaged ignore for dev-only bincode unmaintained advisory (RUSTSEC-2025-0141)

• Updates the advisories section to allow explicit triaged ignores with rationale. Adds an ignore entry for RUSTSEC-2025-0141 with documentation that it is informational, benchmark-only, and not part of the published crate dependency tree.

