feat: scope stats (#899) + multibyte capability map overlay (#903)#904
feat: scope stats (#899) + multibyte capability map overlay (#903)#904efiten wants to merge 32 commits intoKpa-clawbot:masterfrom
Conversation
Code reviewFound 1 issue:
Both frontend files are modified in this PR but 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…edback) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pa-clawbot#899) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#899) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…clawbot#899) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Register GET /api/scope-stats route with per-window 30s cache; fix COALESCE on SUM in GetScopeStats summary query to handle empty result sets. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pi-spec 400 message (Kpa-clawbot#899)
…clawbot#899) Add scope_name to transmissionBaseSQL/scanTransmissionRow when the column exists, and render a Scope row in the detail-meta dl for scoped transport packets (matched region = "#name", unmatched = "unknown scope"). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…deploy The backfill processed all historical transport-route rows in a tight loop at startup with no throttling. scope_name is populated at ingest for new packets going forward; historical rows remain NULL. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ot#903) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fb3db67 to
59101ec
Compare
…old start Load() previously loaded all transmissions regardless of retentionHours, causing buildSubpathIndex() to process the full DB history on every startup. On a DB with 277K paths this produces 13.5M subpath index entries, OOM-killing the process before it ever starts listening. Apply the same retentionHours cutoff to Load()'s SQL that Evict() already uses at runtime. Startup now builds indexes only over the retention window, matching steady-state behaviour and keeping index size proportional to recent activity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The server opens SQLite with mode=ro so persistMultiByteCapability() silently failed on every UPDATE, leaving all nodes at multibyte_sup=0. Replace DB persist with an in-memory cache: GetHashSizes() stores the computeMultiByteCapability() result in mbCapSnapshot (under cacheMu), GetMultibyteCapMap() exposes a pubkey→entry snapshot, and routes enrich node responses from that map alongside EnrichNodeWithHashSize — the same pattern already used for hash_size. Data refreshes each analytics cycle (~15s); no DB writes needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, mbCapSnapshot stays empty until the first request to /api/analytics/hash-sizes, leaving all nodes at multibyte_sup=0 on the map overlay until a user visits the analytics page. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
loadNodes() is called on every incoming ADVERT packet, which called renderMarkers() and destroyed any open popup. Track popup visibility via Leaflet's popupopen/popupclose map events; skip renderMarkers() during data refresh while a popup is open so the user can read it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…yte suspected detection A node whose own adverts confirm hash_size=1 cannot forward multibyte packets. If it appeared as a 'suspected' hop it was due to a prefix collision (1-byte hash prefix shared with a multibyte-capable node), not actual multibyte capability. Mark it 'unknown' instead of 'suspected'. Adds TestMultiByteCapability_SuspectedGuard_OwnHashSize1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A node can send hash_size=1 adverts but still have multibyte firmware and forward multibyte packets. The guard incorrectly suppressed legitimate suspected classifications. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ction Pre-Kpa-clawbot#886 ingestor data stored path bytes individually (1-byte entries per hop) even for hs=2 packets. This caused 1-byte node prefixes to match single-byte path fragments from hs=2 packets, incorrectly marking nodes as suspected. Fix: require hop string length (len(pfx)/2) to equal the packet hash_size. A 1-byte hop in an hs=2 packet is stale/malformed data and must not trigger a suspected classification. Adds TestMultiByteCapability_HopLengthMismatch; updates PrefixCollision test to use proper 2-byte hops instead of the now-filtered 1-byte case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kpa-clawbot
left a comment
There was a problem hiding this comment.
Independent Review — PR #904
This PR bundles three orthogonal changes (one explicitly admitted in the title, one not in the title at all). The combined diff is 3,784 added / 117 removed across 27 files, of which 2,576 lines are net-new design/plan markdown under docs/superpowers/ that nobody outside the implementer will read. Stripping that internal scratch-pad still leaves ~1,200 lines of production code touching the ingestor, server, store, routes, and frontend across two unrelated features and a startup-OOM hotfix.
This is more PR than reviewable. Below are the structural concerns plus per-feature must-fix items.
Structural concerns (P0 — must address before merge)
1. Scope explosion / unrelated bundling
The PR couples:
- #899 — Scope stats (new ingestor config, new column, migration, ingest-side HMAC matching, new API endpoint, new analytics tab)
- #903 — Multibyte capability map overlay (new ingestor columns, migration, server enrichment, new map toggle, new evidence labels)
- An OOM-on-cold-start hotfix to
Load()not mentioned in the PR title
These should be three separate PRs. Each is independently reviewable, independently revertable, independently bisectable. As bundled, you cannot bisect a regression to one feature without reverting the whole thing — and the OOM hotfix in particular is the kind of change that should land standalone with a clear paper trail (it's the kind of change that gets cherry-picked back to a release branch later).
Must-fix: split into PRs #904a (OOM hotfix), #904b (scope stats), #904c (multibyte overlay). The OOM fix can land today.
2. Multibyte-capability code duplication / not reusing existing computeMultiByteCapability
The repo already has PacketStore.computeMultiByteCapability() in cmd/server/store.go:6161 with full confirmed/suspected/unknown logic, evidence labels, and prefix-collision handling — including the recent #886-aware hop-length-must-match-hash_size guard.
This PR adds:
- New
multibyte_supandmultibyte_evidencecolumns tonodesandinactive_nodeson the ingestor side - A new ingestor migration
multibyte_sup_v1 - A new server-side
persistMultiByteCapability()running on the ~15s analytics cycle - A new
GetMultibyteCapMap()snapshot accessor for routes - A new
enrichNodeWithMultibyte()shim
…all of which appears to shadow the existing code path rather than extend it. The existing MultiByteCapEntry struct + status/evidence labels are exactly what the overlay needs to render. Why is there a second persisted column instead of just exposing the existing computed field through the existing /api/nodes enrichment hook?
Concrete questions for the author:
- What does the persisted
multibyte_supcolumn give us thatcomputeMultiByteCapability()doesn't already deliver per call (cached for 15s)? - The schema comment says "no downgrade possible" — but the existing in-memory computation is downgrade-aware (a confirmed 2-byte advert stays confirmed). Is the persisted column intended to survive DB-only restart-state? If so, the migration needs to also seed from the existing computation; otherwise on first startup the column is meaningless until the analytics cycle runs.
- The frontend already reads
hash_sizeandhash_size_inconsistentfrom/api/nodes. Why doesmultibyte_supneed a separate field instead of being derived from those + the existing capability snapshot?
Must-fix: justify why this isn't a 1-line frontend wire-up of the existing MultiByteCapEntry snapshot. If there's a real reason (e.g., persistence across restart before first analytics cycle completes), document it. Otherwise: delete the new column + migration and consume the existing capability snapshot.
3. Test coverage — specific gaps
The PR adds tests for:
TestGetMultibyteCapMap_Confirmed/_EmptyWhenNoSnapshot(snapshot mechanics)TestEnrichNodeWithMultibyte_Confirmed/_Suspected/_ZeroEntryNoChange(helper unit tests)TestMultiByteCapability_HopLengthMismatch(good — covers the #886 guard)TestLoadRegionKeys(string trimming)- Schema migration smoke tests
What is not tested:
- HMAC scope matching end-to-end — there is
TestLoadRegionKeysbut no test that constructs a transport-route packet with a known region key, runs it throughmatchScope(), and assertsscope_nameis set on the persisted row. The whole feature hinges on this matching being correct, and HMAC truncation bugs are easy to introduce. GET /api/scope-statsshape — no test that hits the route and asserts the returned JSON structure (totals, scoped/unscoped split, per-region breakdown, time-series shape).- 30s TTL invalidation — no test that the cache expires.
Load()retention cutoff — claims to fix the OOM, but there is no test that verifies a DB containing pre-cutoff transmissions yields aLoad()that ignores them. This is the single most important behavior change in the PR (it's a startup-correctness fix); it must have a test.- Migration idempotency — what happens on the second startup after the column already exists? (The check exists in code, but a test that runs migration twice would be cheap insurance.)
- Frontend Scopes-tab rendering — 128 added lines in
analytics.js, zero tests. - Map overlay toggle persistence — claimed in PR body ("survives reload"), no test.
Must-fix: add the four backend tests above (HMAC end-to-end, route shape, TTL, retention). Frontend tests would be nice but the existing repo doesn't have great patterns for that, so accept as follow-up if not.
4. Documentation
docs/superpowers/plans/*.md and docs/superpowers/specs/*.md (2,576 lines of design notes) are author scratch-pad files, not user docs. They don't belong in docs/ at all — they belong as PR description content, GitHub Discussion artifacts, or in a separate internal/ directory.
What's missing:
- No
docs/user-guide/configuration.mdupdate for the newhashRegionsconfig field — operators cannot enable the feature without reading the source. - No
docs/user-guide/map.mdupdate for the new "Show multibyte capability" toggle. - No
docs/user-guide/analytics.mdupdate for the new Scopes tab. - No release notes entry.
- The
docs/api-spec.mdchange is +56 lines — verify it covers both the new/api/scope-statsendpoint AND the new fields on/api/nodes.
Must-fix:
- Move
docs/superpowers/**files out of the public docs tree (or intointernal/specs/if they must be retained — but really, they belong attached to the PR or in a wiki). - Update
docs/user-guide/configuration.md,map.md,analytics.mdto cover all three user-facing surfaces.
Per-feature must-fix items
Scope stats (#899)
- HMAC truncation correctness:
code := uint16(hmacBytes[0]) | uint16(hmacBytes[1])<<8— verify endianness matches firmware. Cite the firmware source line in the PR body. - Region key empty/whitespace filtering —
TestLoadRegionKeyscovers it, good. - No backfill goroutine is documented as intentional, but operators with existing DBs will see zero data in the Scopes tab until 24h of new packets accumulate. At minimum, the Analytics → Scopes UI should display "No scoped data yet — feature applies only to packets received after enabling
hashRegions" when the totals are empty.
Multibyte overlay (#903)
- Beyond the duplication concern (P0 #2 above), the visual encoding solid green = confirmed, light green dashed = suspected, dimmed = unknown needs a CSS-variable check (per
AGENTS.md, all colors via CSS variables). Verifypublic/map.jsdoesn't hardcode hex values. - The popup "evidence label" is marker-state-dependent. What does the popup say when the overlay is OFF? Does the new field still render? It should be hidden when the overlay isn't active to avoid leaking implementation detail to non-power-users.
OOM hotfix (Load() retention)
- The fix shape is right (mirror EvictStale's cutoff), but verify the SQL is safe against
retentionHours == 0(no retention configured). Looking at the diff, it is — guarded byif s.retentionHours > 0. Good. - Verify the OOM-reproducer DB (~277K paths) actually starts post-fix. The PR body claims it does but no proof attached. A regression test that constructs a synthetic DB with N pre-cutoff + M post-cutoff transmissions and asserts
Load()reports M loaded would lock this in.
Summary
| Category | Count |
|---|---|
| Must-fix structural | 4 |
| Must-fix per-feature | 5 |
| Documentation gaps | 5 |
This PR cannot merge as-is. The right path forward is split into three PRs, land the OOM hotfix immediately, and re-review the two features independently — at which point the multibyte feature should also be examined for whether it needs new code at all.
|
Splitting into two focused PRs as requested. Closing this one.
|
Summary
#899 — Scoped/unscoped transport-route statistics
hashRegionsconfig to the ingestor — region names whose HMAC-SHA256 keys are derived at startup, mirroring the existinghashChannelspattern0000are HMAC-matched against configured region keys; result stored in newscope_namecolumnscope_name_v1adds the column; scope matching runs only on new packets (no backfill goroutine)GET /api/scope-stats?window=endpoint (1h/24h/7d, 30s TTL) returning transport totals, scoped/unscoped counts, per-region breakdown, and time-series#903 — Multibyte capability map overlay
multibyte_sup/multibyte_evidencecolumns tonodes+inactive_nodesvia ingestor migration (multibyte_sup_v1)/api/nodesresponses:multibyte_sup= 0/1/2,multibyte_evidence="advert"/"path"/nullpersistMultiByteCapability()upserts status from the existing analytics cycle (~15s) — no startup scan, no per-packet cost, no downgrade possiblefix — OOM crash on cold start with large DB
Load()previously loaded all transmissions from the DB regardless ofretentionHours, sobuildSubpathIndex()processed the full DB history on every startup. On a DB with 277K paths this produces 13.5M subpath index entries and OOM-kills the process before it ever starts listening — causing a supervisord crash loop.Fix: apply the same
retentionHourscutoff toLoad()'s SQL thatEvictStale()already uses at runtime. Startup now builds indexes only over the configured retention window, making startup time proportional to recent activity rather than total DB history.Test plan
cd cmd/ingestor && go test ./...— covers schema migrations for bothscope_name_v1andmultibyte_sup_v1cd cmd/server && go test ./...— covers scope stats, node enrichment,persistMultiByteCapability, and retention-filteredLoad()"hashRegions": ["#yourregion"]to ingestor config, restart, verifyscope_namecolumn gets populatedCloses #899
Closes #903
🤖 Generated with Claude Code