
chore(deps): pin transitive deps to fix June-2026 CVE batch#70

Merged
szjanikowski merged 1 commit into main from chore/cve-bump-2026-06 on Jun 22, 2026

Conversation

@szjanikowski

Contributor

pip-audit started failing on main as of the June-2026 CVE batch — 13 advisories across 5 transitive packages, published after the last green main run (so this is a fresh-advisory failure, not a code regression). Surfaced while verifying the CI on an unrelated docs PR (#69).

Fix

Added lower-bound floors in pyproject.toml, mirroring the existing idna/urllib3 pattern, and re-locked:

| Package | Was | Now (locked) | Advisory |
| --- | --- | --- | --- |
| aiohttp | 3.14.0 | 3.14.1 | CVE-2026-54273..54280 (8) |
| cryptography | 48.0.0 | 49.0.0 | GHSA-537c-gmf6-5ccf |
| python-multipart | 0.0.30 | 0.0.32 | CVE-2026-53540 |
| pydantic-settings | 2.14.1 | 2.14.2 | GHSA-4xgf-cpjx-pc3j |
| starlette | 1.2.1 | 1.3.1 | CVE-2026-54282 / CVE-2026-54283 |

All transitive via harbor/opik/litellm/supabase.
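
As a rough illustration of what a lower-bound floor guarantees, the sketch below checks a set of locked versions against the floors named in the commit message for this PR. The `FLOORS` mapping, `parse`, and `below_floor` are hypothetical names for this example only. The parser assumes plain dotted numeric versions, so real tooling would want a proper version parser instead.

```python
# Floors as stated in the commit message; everything else here is illustrative.
FLOORS = {
    "aiohttp": "3.14.1",
    "cryptography": "48.0.1",
    "python-multipart": "0.0.31",
    "pydantic-settings": "2.14.2",
    "starlette": "1.3.1",
}


def parse(version: str) -> tuple[int, ...]:
    """Turn '3.14.1' into (3, 14, 1). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def below_floor(locked: dict[str, str]) -> list[str]:
    """Return the sorted names of packages locked below their floor."""
    return sorted(
        name
        for name, floor in FLOORS.items()
        if name in locked and parse(locked[name]) < parse(floor)
    )


# The "Was" column fails every floor; the "Now (locked)" column passes them all.
was = {
    "aiohttp": "3.14.0",
    "cryptography": "48.0.0",
    "python-multipart": "0.0.30",
    "pydantic-settings": "2.14.1",
    "starlette": "1.2.1",
}
now = {
    "aiohttp": "3.14.1",
    "cryptography": "49.0.0",
    "python-multipart": "0.0.32",
    "pydantic-settings": "2.14.2",
    "starlette": "1.3.1",
}
print(below_floor(was))
print(below_floor(now))
```

Comparing tuples rather than strings matters here: as strings, "0.0.9" would sort above "0.0.31".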

Verification

  • uv lock resolves cleanly (169 packages, no provider conflicts)
  • pip-audit local: No known vulnerabilities found
  • uv run pytest: 397 passed

Unblocks the Quality Gate ahead of the next release.

🤖 Generated with Claude Code

pip-audit flagged 13 advisories across 5 transitive packages (published
after the last main run, so this is not a regression):

- aiohttp >=3.14.1   — CVE-2026-54273..54280 (8 CVEs)
- cryptography >=48.0.1 (locks to 49.0.0) — GHSA-537c-gmf6-5ccf
- python-multipart >=0.0.31 (locks to 0.0.32) — CVE-2026-53540
- pydantic-settings >=2.14.2 — GHSA-4xgf-cpjx-pc3j
- starlette >=1.3.1  — CVE-2026-54282 / CVE-2026-54283

Floors added in pyproject.toml mirroring the existing idna/urllib3
pattern; re-locked. pip-audit clean, 397 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@szjanikowski szjanikowski merged commit 3681cc3 into main Jun 22, 2026
9 checks passed
@szjanikowski szjanikowski deleted the chore/cve-bump-2026-06 branch June 22, 2026 11:52
szjanikowski pushed a commit that referenced this pull request Jun 24, 2026
The Unreleased section was missing everything merged after the ADR-010/011/012
batch. Added entries for: layered pricing override + visibility (#71, ADR-013),
results-export + repeated-evaluation accumulation (#57), the NASDE→Nasde rebrand
and Starlight docs migration (#64/#69), the parallel-run job_dir race fix (#62),
and the June-2026 CVE dependency pins (#70, new ### Security section). All
[#NN] references now have link targets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
szjanikowski added a commit that referenced this pull request Jun 24, 2026
* feat(pricing): convention-based layered override (ADR-013)

Model prices are now overridable post-install without hacking the wheel.
A file literally named pricing.toml at a known location is auto-detected
and merged onto the bundled catalog — no [pricing] config key:

  <project>/pricing.toml > ~/.nasde/pricing.toml > bundled

Per-model whole-entry merge (higher wins): an override lists only the
models it changes/adds; the rest fall through. User layer is a HOME
dotfolder (~/.nasde/, like ~/.claude, ~/.codex, ~/.gemini), deliberately
NOT platformdirs (maps to ~/Library/Application Support on macOS = app
state, not user config). An applied layer prints a dim transparency line.
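
The per-model whole-entry merge described above can be sketched as follows. This is a minimal illustration and not the project's `load_pricing_layered`: the `merge_layers` helper and the model names are hypothetical, and each layer is assumed to be already parsed into a plain dict keyed by model name.

```python
def merge_layers(*layers: dict[str, dict]) -> dict[str, dict]:
    """Merge layers given lowest precedence first.

    A higher layer replaces a model's whole entry; fields are never
    combined across layers. Models a layer does not mention fall through.
    """
    merged: dict[str, dict] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical catalogs, ordered bundled < user (~/.nasde) < project.
bundled = {
    "model-a": {"input_per_1m": 3.0, "output_per_1m": 15.0},
    "model-b": {"input_per_1m": 1.0, "output_per_1m": 5.0},
}
user = {"model-a": {"input_per_1m": 2.5, "output_per_1m": 12.0}}
project = {
    "model-a": {"input_per_1m": 2.0, "output_per_1m": 10.0},
    "model-c": {"input_per_1m": 0.5, "output_per_1m": 1.5},
}

effective = merge_layers(bundled, user, project)
print(effective["model-a"])  # the project entry wins over user and bundled
print(effective["model-b"])  # untouched by overrides, falls through to bundled
```

Replacing whole entries keeps each override self-describing: a model's price always comes from exactly one file.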

Both write paths thread project_dir so run (assessment_summary.json) and
export (metrics.json) agree on cost — the ADR-011 single-extractor
invariant. New load_pricing_layered(project_dir) in pricing.py; wired
through evaluator/runner, results_exporter/cli, calibration_publisher,
and eval_migration. Bundled lru_cache and load_pricing(path) unchanged;
merged catalog is not cached (cheap per-job re-read).

14 new tests incl. a three-layer compose case with user↔project overlap
(project wins) and run+export e2e. ADR-013 + docs (token-cost.md,
configuration.md) + CLAUDE.md. 411 tests, ruff, mypy green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* style: ruff format test_pricing.py

CI runs `ruff format --check` in addition to `ruff check`; the new
three-layer test had a long line the formatter wraps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(pricing): address cloud-review findings on layered override

- bug_001 (normal): migrate-evals dropped project_dir, so its
  assessment_summary.json cost used bundled+user only, disagreeing with
  the other three write paths (ADR-013 invariant). Thread project_dir
  through cli → migrate_job_evals → migrate_trial_evals → load_pricing_layered.
- bug_006 (normal): switching the exporter to load_pricing_layered made
  pre-existing tests read the developer's real ~/.nasde/pricing.toml.
  Move empty_user_layer to an autouse fixture in tests/conftest.py so the
  whole suite is hermetic by default; layered tests opt in by name.
- bug_004 (nit): calibration publish re-read layered pricing per-trial
  (N×L transparency lines). Hoist load_pricing_layered above the loop,
  thread pricing through _publish_one_trial → _open_pr_for_trial.
- bug_003 (nit): move the orphaned cost_efficiency/token_efficiency
  hasattr guards back into test_assessment_summary_includes_economics.

+1 regression test (migrate-evals threads project pricing). 412 tests,
ruff check + format, mypy green. Verified with a HOME-override hermeticity
check and a migrate-evals project-override smoke.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(pricing): surface the effective merged catalog (show / run / export)

Layered overrides (ADR-013) had no way to inspect the merged result. Add
layer provenance and three read surfaces:

- pricing.py: resolve_pricing_layers(project_dir) → ordered PricingLayer
  stack; effective_pricing_with_source(project_dir) → {model: (price, layer)}.
  load_pricing_layered reimplemented on the same core (DRY, unchanged behaviour
  + transparency line). load_pricing(path) and bundled lru_cache untouched.
- nasde pricing show [--show-source]: new `pricing` sub-app printing the
  effective catalog (Model / In / Out / as_of, +Layer with --show-source).
  Sub-app leaves room for future pricing validate/path.
- nasde run: "Pricing used (effective)" table at the end of the summary,
  filtered to the models actually in the run, with source layer.
- results-export: pricing_used.json next to the trials — effective rate +
  source layer per priced model, so a report is a self-contained cost audit.
- pricing_report.py: shared Rich table renderer (show + run).
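
One way to picture the provenance surfaces listed above: walk an ordered layer stack, remember which layer last set each model, then filter to the models a run actually used. The names below (`Layer`, `effective_with_source`, `pricing_used`) are hypothetical stand-ins, not the real `PricingLayer` or `effective_pricing_with_source`. Prices are reduced to a single float for brevity.

```python
from typing import NamedTuple


class Layer(NamedTuple):
    """Illustrative stand-in for one pricing layer."""
    name: str
    models: dict[str, float]


def effective_with_source(stack: list[Layer]) -> dict[str, tuple[float, str]]:
    """Map each model to (price, layer name); stack is lowest precedence first."""
    effective: dict[str, tuple[float, str]] = {}
    for layer in stack:
        for model, price in layer.models.items():
            effective[model] = (price, layer.name)
    return effective


def pricing_used(
    effective: dict[str, tuple[float, str]], models_in_run: set[str]
) -> dict[str, tuple[float, str]]:
    """Keep only the priced models that appear in the run."""
    return {m: effective[m] for m in sorted(models_in_run) if m in effective}


stack = [
    Layer("bundled", {"model-a": 3.0, "model-b": 1.0}),
    Layer("user", {"model-a": 2.5}),
    Layer("project", {"model-a": 2.0}),
]
effective = effective_with_source(stack)
print(effective["model-a"])
print(pricing_used(effective, {"model-b", "model-x"}))
```

In this sketch a model missing from every layer is simply left out of the filtered result rather than raising.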

Docs: token-cost.md (verifying the catalog), CLAUDE.md CLI reference +
architecture note, ADR-013 provenance note. Website builds clean.
423 tests, ruff check + format, mypy green; smoke-verified pricing show
and pricing_used.json on a real trial.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(pricing): freeze ModelPrice + non-scientific rate formatting

Code-review follow-up (two findings from the /code-review pass):

- ModelPrice is now @dataclass(frozen=True). It was a mutable dataclass
  shared via the bundled lru_cache (resolve_pricing_layers does
  dict(load_pricing()) — a shallow copy sharing the ModelPrice objects),
  so an in-place field mutation would silently corrupt every later lookup
  in the process. A rate is an immutable fact; freezing turns a latent
  cache-corruption footgun into an immediate FrozenInstanceError at the
  mutation site. No code mutates ModelPrice fields (verified), so this is
  safe; build a new instance via dataclasses.replace to adjust a rate.

- pricing_report._fmt_rate no longer uses ${rate:g}, which renders
  scientific notation at the extremes ($1e+06, $5e-05). It now trims a
  fixed 4-decimal format, giving $3 / $2.5 in the normal range and
  $0.0001 / $1000000 at the edges. (.2f was rejected — it drops sub-cent
  cached rates to $0.00.)
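
The frozen-dataclass point above is easy to reproduce in isolation. The sketch below assumes a two-field `ModelPrice`, which may not match the real class's fields, and a plain dict standing in for the cached bundled catalog. It shows that a shallow copy shares the same objects, that mutation now raises, and that `dataclasses.replace` builds an adjusted copy.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ModelPrice:
    # Field names are assumptions for this sketch.
    input_per_1m: float
    output_per_1m: float


# Stand-in for the cached bundled catalog and a shallow copy of it.
cache = {"model-a": ModelPrice(3.0, 15.0)}
view = dict(cache)
assert view["model-a"] is cache["model-a"]  # same object, not a copy

try:
    view["model-a"].input_per_1m = 0.0  # would have corrupted the cache before
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# The supported way to adjust a rate: build a new instance.
adjusted = replace(view["model-a"], input_per_1m=2.5)
print(mutated, cache["model-a"], adjusted)
```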

+2 regression tests (frozen guard, _fmt_rate edge table). 433 tests,
ruff check + format, mypy green.
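
The formatting change described in this commit can be approximated as below. `fmt_rate` is an illustrative reimplementation based only on the behaviour stated in the commit message, not a copy of `pricing_report._fmt_rate`.

```python
def fmt_rate(rate: float) -> str:
    """Format with 4 fixed decimals, then trim trailing zeros and a bare dot."""
    text = f"{rate:.4f}".rstrip("0").rstrip(".")
    return f"${text}"


for rate in (3.0, 2.5, 0.0001, 1_000_000.0):
    # Compare the trimmed fixed format with the old :g rendering.
    print(fmt_rate(rate), f"${rate:g}")
```

Stripping zeros before the dot is deliberate: doing it in the other order would leave "3.0000" untouched, since the string does not end in a dot.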

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(pricing): drop redundant trial walk + unwind comprehension

Final code-review nits (#1/#2/#3):

- _print_pricing_used now derives the used-models set from the economics
  `rows` it is handed, instead of re-walking every trial dir and re-parsing
  every assessment_summary.json a second time in the same _print_job_summary
  call (the rows already carry model_name). _models_used_in_job is deleted.
  Removes a redundant per-trial I/O pass and the caller-after-callee ordering.
- _finalize_economics_row now exposes "model" (already destructured) so the
  pricing-used table can read it without a second source.
- _write_pricing_used: replaced the walrus + `for price, layer in [entry]`
  dict-comprehension with a plain readable loop.

432 tests, ruff check + format, mypy green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(pricing): readable override errors + onboarding (example, name caveat)

Improve the experience of authoring a pricing.toml override:

- Malformed override files now fail fast with a clear Rich message naming
  the file (so you know project vs ~/.nasde layer), the cause, and a hint —
  instead of a raw TOMLDecodeError/KeyError traceback. A decimal comma
  (2,5) and a missing input_per_1m/output_per_1m are the common cases.
  _load_override_models wraps the per-layer load; SystemExit(1), no crash.
- nasde init now scaffolds a fully-commented pricing.toml.example (a real
  bundled model name to copy, the decimal-point hint, and the model-name
  caveat). Named .example so it's inert until copied to pricing.toml.
- token-cost.md: a :::caution that the model name MUST match variant.toml's
  `model` or the override is SILENTLY ignored — verify with
  `pricing show --show-source` (model under `bundled` not `project` = typo).
  Plus a note that malformed files fail loudly.

Out of scope (deliberately later): `nasde pricing validate` (check all
entries + flag unknown model names up front) and `nasde pricing set` (add a
single override via CLI). 436 tests, ruff check + format, mypy, website build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: show real pricing-verification output (not just prose)

Rewrite the "Verifying the effective catalog" section of token-cost.md to
include actual command output: a --show-source table for a working override,
a side-by-side example of the silent model-name-typo failure (the real model
stays `bundled` while the typo'd key sits as a dead `project` row), the loud
malformed-file errors (decimal comma, missing field), and a sample
pricing_used.json. Theory alone didn't make the silent-miss case obvious.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(changelog): bring [Unreleased] up to date for the next release

The Unreleased section was missing everything merged after the ADR-010/011/012
batch. Added entries for: layered pricing override + visibility (#71, ADR-013),
results-export + repeated-evaluation accumulation (#57), the NASDE→Nasde rebrand
and Starlight docs migration (#64/#69), the parallel-run job_dir race fix (#62),
and the June-2026 CVE dependency pins (#70, new ### Security section). All
[#NN] references now have link targets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: make "update CHANGELOG per feature" an explicit definition-of-done

The Unreleased section drifted 5 PRs behind before v0.5.0 even though the
nasde-dev skill already mentioned updating it — the rule was buried mid-list.
Surface it: a new "Development workflow" section in CLAUDE.md (which had no
release guidance), and a prominent "Definition of done — CHANGELOG first"
callout at the top of the skill's doc-consistency step. A user-visible change
is not done until it has an [Unreleased] entry with a [#NN] ref, in the same PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Szymon Janikowski <szymon.janikowski@itlibrium.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>