
Release v2.0.0 #31

Open
dmichael-fastly wants to merge 2 commits into main from release/v2.0.0


@dmichael-fastly (Collaborator) commented Jun 21, 2026

Release v2.0.0

A major architecture-cleanup release plus the feature work that landed
alongside it. The largest backend modules were carved into per-concern
packages (with re-export shims), telemetry moved to OpenTelemetry + structlog,
tenancy gained a typed RequestContext boundary that can't be constructed
without enforcing service access, and the frontend's hydration/navigation
warm-up was replaced with policy. Composite analytics endpoints land as a hard
cutover — frontend and backend ship together.
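
To illustrate the "can't be constructed without enforcing service access" idea, here is a minimal sketch in plain Python. It is not the project's implementation: the names `for_request`, `ServiceAccessDenied`, and the `grants` mapping are hypothetical, and the real `RequestContext` is presumably wired up as a framework dependency. The sketch only shows the general pattern of routing all construction through one factory that performs the access check.

```python
class ServiceAccessDenied(Exception):
    """Raised when a user asks for a service they have no grant for."""


# Module-private sentinel: only the factory below passes it to __init__.
_FACTORY_TOKEN = object()


class RequestContext:
    """Hypothetical context object; every instance has passed the access check."""

    __slots__ = ("user_id", "service_id")

    def __init__(self, user_id: str, service_id: str, *, _token: object = None) -> None:
        # Reject direct construction so the check in for_request() cannot be skipped.
        if _token is not _FACTORY_TOKEN:
            raise TypeError("build RequestContext via RequestContext.for_request()")
        self.user_id = user_id
        self.service_id = service_id

    @classmethod
    def for_request(
        cls, user_id: str, service_id: str, grants: dict[str, set[str]]
    ) -> "RequestContext":
        # Assumption: grants maps a user id to the service ids they may read.
        if service_id not in grants.get(user_id, set()):
            raise ServiceAccessDenied(f"{user_id} may not access {service_id}")
        return cls(user_id, service_id, _token=_FACTORY_TOKEN)
```

An endpoint that accepts a `RequestContext` parameter can then rely on the check having run, because no other code path produces an instance. This is a runtime guard rather than a compile-time guarantee; a determined caller could still reach the sentinel.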

Full, categorized notes in CHANGELOG.md.

Highlights

  • Package carve-ups: metadata_db, share_db, rollups, admin,
    session_scoring, duckdb, iceberg, and scheduler split into
    per-concern packages with re-export shims so existing imports keep working.
  • Typed tenancy boundary — a single RequestContext dependency that can't
    be built without enforcing service access (23 analytics endpoints across 8
    routers).
  • Telemetry — OpenTelemetry + structlog; uvicorn access logs bridged
    through the structured root handler.
  • Session scoring — in-UI redeploy + edge-drift warning, fail-open
    breakdown card, and explicit operator opt-in for edge Layer-2 enforcement
    (no clock-driven monitoring→blocking ramp; deployment age is advisory only).
    NGWAF skip-inspection on the internal scoring sub-fetch.
  • Observability + ops hardening — every request mints a correlation id
    (rid) that threads through the access log and a persistent slow-query
    history; richer admin health snapshot and a deeper /api/health probe.
  • Human-readable PoP/ASN labels across the network, shielding, and origin
    views, from one shared component seeded by /api/bootstrap.
  • Inline backend-failure surfacing — no more silent spinners or fabricated
    zeros; analytics reads typed through the generated OpenAPI schema.
  • Reliability — opt-in DuckDB instance-recycle to bound the object-cache
    leak, now with backpressure so reads queue rather than fail during the brief
    recycle drain; self-healing reclaim of raw files stranded by an interrupted
    delete; orphaned-sync-row reaper.
  • Consolidation + gates — three SQLite pools collapse into one
    ThreadLocalPool; a shared per-hour rollup writer; cron-tail helpers. New CI
    gates: frontend ESLint ceiling, Rust scorer cargo test, import contracts,
    and an OTEL console-exporter guard.
  • Dependencies — freshness sweep across Python, frontend, and the scorer.
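
The correlation id (rid) item above can be pictured with a small standard-library sketch. The release itself uses structlog and OpenTelemetry, so this is a swapped-in illustration using `contextvars` and `logging`, not the shipped code; `rid_var`, `RidFilter`, `with_request_id`, and `build_logger` are hypothetical names.

```python
import contextvars
import logging
import uuid
from typing import Callable, TextIO, TypeVar

T = TypeVar("T")

# Holds the id of the request currently being handled; "-" outside any request.
rid_var: contextvars.ContextVar[str] = contextvars.ContextVar("rid", default="-")


class RidFilter(logging.Filter):
    """Copies the current rid onto every log record so formatters can print it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.rid = rid_var.get()
        return True


def with_request_id(handler: Callable[[], T]) -> T:
    """Mint a fresh rid, run the handler under it, then restore the old value."""
    token = rid_var.set(uuid.uuid4().hex)
    try:
        return handler()
    finally:
        rid_var.reset(token)


def build_logger(stream: TextIO) -> logging.Logger:
    """Logger whose lines all start with the rid of the active request."""
    logger = logging.getLogger("rid_sketch")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.addFilter(RidFilter())
    handler.setFormatter(logging.Formatter("rid=%(rid)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Because the id lives in a context variable, an access-log line and a slow-query record emitted during the same request carry the same rid without passing it through every function signature.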

Docs & release prep

  • README, AGENTS.md, and CHANGELOG.md refreshed to match the shipped code.
  • Comment hygiene pass across the tree: stale, redundant, and duplicate-divider
    comments removed and embedded changelog blocks (the CI coverage gates, the
    ESLint ceiling) condensed to their conventions. Load-bearing rationale,
    incident references, and functional directives left intact.
  • Release hygiene: Docker images pinned to the same Python as CI and mypy
    (3.13); dropped an orphaned server-only dependency and its stale knip
    waivers; stopped tracking the regenerated tests/perf/latest.json (the
    committed baseline.json stays the gate input); removed a dead logging
    helper and corrected a stale metric_snapshots docstring.
  • App version is 2.0.0 across pyproject.toml, frontend/package.json,
    backend/main.py, and the committed OpenAPI snapshots; corrected a stale
    1.2.0 reference in docs/adr/12-api-versioning.md.
  • make dev is now a real Makefile target.
  • A bare ./run.sh honors the documented default ports (3000/8000) for fresh
    clones while still guarding an explicitly-chosen tunnel port.
  • Removed the retired localhost.run sharing mode from the share UI (the
    backend dropped it in v2.0; it was still the default radio).

Validation

  • Backend pytest: green. Frontend vitest: 971 passed; tsc --noEmit clean;
    ESLint at ceiling.
  • Full pre-push gate passed: ruff, mypy, OpenAPI regen (no drift), frontend
    typecheck, security-regression count, and the Rust scorer cargo test.

@dmichael-fastly force-pushed the release/v2.0.0 branch 8 times, most recently from a2aed5b to 67cbb97 on June 21, 2026 19:00
Architecture cleanup + feature release. The largest backend modules were
carved into per-concern packages (with re-export shims), telemetry moved
to OpenTelemetry + structlog, tenancy got a typed RequestContext boundary
that can't be constructed without enforcing service access, and the
frontend's hydration/navigation warm-up was replaced with policy.
Composite analytics endpoints land as a hard cutover — frontend and
backend ship together.

Highlights (see CHANGELOG.md for the complete list):

- Session scoring: in-UI redeploy + edge-drift warning, fail-open
  breakdown card, and explicit operator opt-in for edge Layer-2
  enforcement (no clock-driven monitoring-to-blocking ramp; deployment
  age is advisory only).
- Observability: every request mints a correlation id that threads
  through the access log (now with latency) and a persistent slow-query
  history; richer admin health snapshot and a deeper /api/health probe.
- Human-readable PoP and ASN labels across the network, shielding, and
  origin views, sourced from one shared component seeded by /api/bootstrap.
- Backend failures surface inline (no more silent spinners or fabricated
  zeros) and analytics reads are typed through the generated OpenAPI
  schema so a rename is a compile error.
- Opt-in RUM Web Vitals; a timeout-guarded DuckDB instance-recycle job to
  bound the object-cache leak; self-healing reclaim of raw files stranded
  by an interrupted delete.
- Consolidation: three SQLite pools collapse into one ThreadLocalPool,
  per-hour rollup writers share one path, and cron tails funnel through
  shared helpers. New CI gates: frontend ESLint ceiling, Rust scorer
  cargo-test, and import contracts.
- Dependency freshness sweep across Python, frontend, and the scorer.
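
As a rough picture of the consolidation item, a thread-local SQLite pool might look like the sketch below. Only the class name `ThreadLocalPool` comes from these notes; the methods `get` and `close_current` and the one-connection-per-thread policy are assumptions made for illustration.

```python
import sqlite3
import threading


class ThreadLocalPool:
    """Sketch: one lazily opened SQLite connection per thread."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._local = threading.local()

    def get(self) -> sqlite3.Connection:
        # Each thread sees its own attribute namespace on self._local.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(self._path)
            self._local.conn = conn
        return conn

    def close_current(self) -> None:
        # Close only the calling thread's connection, if it has one.
        conn = getattr(self._local, "conn", None)
        if conn is not None:
            conn.close()
            self._local.conn = None
```

Keeping a connection on the thread that opened it fits sqlite3's default `check_same_thread=True` behavior, which is one plausible reason to prefer this shape over a shared queue of connections.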

Release prep: refreshed README, AGENTS, and CHANGELOG; corrected the
ADR-12 version reference to 2.0.0; made `make dev` a real target; fixed a
bare `./run.sh` so it honors the documented default ports (3000/8000) for
fresh clones while still guarding explicitly-chosen tunnel ports; and
removed the retired localhost.run mode from the share UI.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dmichael-fastly force-pushed the release/v2.0.0 branch 2 times, most recently from 1c91766 to 931fd2a on June 21, 2026 20:56
…tions

- Provisioning wizard: allow the first-run flow (no active service yet)
  by exempting /api/provision from the API client's serviceless-request
  guard, which otherwise aborted step 1 with "No active service —
  request aborted". Pinned with a no-active-service regression test.

- Caddy image build: give the custom (ratelimit) build its own image
  tag so it no longer overwrites its `FROM caddy:2-alpine` base. Reusing
  the base tag made later rebuilds resolve FROM to our own non-root image
  and every privileged build step failed. Also drop the redundant
  `apk add libcap` — setcap/addgroup/adduser are already in the base.

- DuckDB memory: cap the connection pool at 4 (matches the 4-core host)
  and lower the recycle RSS threshold to 6000MB. Partial mitigation.

- OOM stopgap: a process-level memory guard that triggers a clean
  self-restart (SIGTERM -> uvicorn drains -> docker restart:unless-stopped)
  when RSS crosses BACKEND_GRACEFUL_RESTART_RSS_MB, converting the
  destructive 12g cgroup OOM-SIGKILL into a graceful ~15s restart. The
  dominant allocation source is still under investigation; this is a
  stopgap, not a root-cause fix.
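
The OOM stopgap can be sketched roughly as follows. Only the environment variable name and the SIGTERM-to-self mechanism come from the notes above; the function names, the Linux-only `/proc/self/status` read, and treating an unset, zero, or unparseable threshold as "disabled" are assumptions made for this illustration.

```python
import os
import signal
from typing import Callable, Mapping


def read_threshold_mb(env: Mapping[str, str] = os.environ) -> float:
    """Parse the threshold; assumption: unset or invalid means the guard is off."""
    raw = env.get("BACKEND_GRACEFUL_RESTART_RSS_MB", "0")
    try:
        return float(raw)
    except ValueError:
        return 0.0


def current_rss_mb() -> float:
    """Resident set size of this process in MiB (Linux only, via /proc)."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # the field is reported in kB
    return 0.0


def check_once(
    rss_mb: float,
    threshold_mb: float,
    send_signal: Callable[[int, int], None] = os.kill,
) -> bool:
    """Send SIGTERM to ourselves once RSS reaches the threshold."""
    if threshold_mb > 0 and rss_mb >= threshold_mb:
        # The server drains on SIGTERM and the container restart policy
        # brings the process back, as described in the notes above.
        send_signal(os.getpid(), signal.SIGTERM)
        return True
    return False
```

A periodic task would call `check_once(current_rss_mb(), read_threshold_mb())`. Passing `send_signal` as a parameter keeps the decision logic testable without actually terminating the process.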

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>