
fix: bimodal clock hysteresis — don't flip to 'No Clock' on transient bad bursts#894

Closed
Kpa-clawbot wants to merge 1 commit into master from fix/845-no-clock-hysteresis

Conversation

@Kpa-clawbot
Owner

Symptom

Operator on staging: /#/nodes/c0dedad4… (Kpa Roof Solar) shows 🚫 No Clock even though the most recent advert decoded with a valid 2026 timestamp. Same shape on other historically-bimodal nodes when their last few adverts happen to all be bad.

Root cause

The bimodal classifier from #845 looks at the last 5 samples within the past hour. When a bimodal node hits a transient burst of bad RTC timestamps, recent goodFraction = 0 and severity flips to no_clock — regardless of the long-term picture (16k+ samples, ~38% historically good in this case).

Fix (option C)

  1. Widen the recent window: 5 → 20 samples, 1h → 6h time bound. More data, less jitter, still meaningfully recent.
  2. Add hysteresis: only drop to no_clock when BOTH recent goodFraction AND long-term goodFraction are < 10%. A node with historical good samples stays bimodal_clock even when the recent window is 100% bad.
  3. Fallback skew display: when recent has zero good samples but long-term does, use the long-term good median as the displayed skew so the operator sees a meaningful number instead of stale poison.
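A minimal sketch of the hysteresis rule in steps 1–2, using the thresholds from this description (recent = last 20 samples within 6h, cutoff = 10%). Type and function names here are hypothetical, not the actual clock_skew implementation:

```go
package main

import "fmt"

// Cutoff below which a window counts as "bad" (10%, per the fix description).
const noClockCutoff = 0.10

type Windows struct {
	RecentGoodFraction   float64 // good fraction over the last 20 samples (≤ 6h old)
	LongTermGoodFraction float64 // good fraction over all samples on record
}

// Classify applies the hysteresis rule: only drop to no_clock when BOTH
// windows are below the cutoff; a history of good samples keeps the node
// at bimodal_clock even when the recent window is 100% bad.
func Classify(w Windows) string {
	switch {
	case w.RecentGoodFraction < noClockCutoff && w.LongTermGoodFraction < noClockCutoff:
		return "no_clock"
	case w.RecentGoodFraction < 1.0 || w.LongTermGoodFraction < 1.0:
		return "bimodal_clock"
	default:
		return "ok"
	}
}

func main() {
	// Staging symptom: recent burst all bad, ~38% historically good.
	fmt.Println(Classify(Windows{RecentGoodFraction: 0.0, LongTermGoodFraction: 0.38})) // bimodal_clock
	// Genuinely dead RTC: bad in both windows.
	fmt.Println(Classify(Windows{RecentGoodFraction: 0.0, LongTermGoodFraction: 0.02})) // no_clock
}
```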

API change

Add longTermGoodFraction to /api/nodes/{pk}/clock-skew so operators can see the hysteresis input directly.
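For illustration only — every field and value below other than longTermGoodFraction is hypothetical — the extended response could look like:

```json
{
  "severity": "bimodal_clock",
  "recentGoodFraction": 0.0,
  "longTermGoodFraction": 0.38,
  "skewSeconds": 1.4
}
```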

Tests

  • TestBimodalHysteresis — recent all-bad + long-term mixed → bimodal_clock (NEW; reproduces the staging symptom)
  • TestNoClock_BothWindowsBad — recent all-bad + long-term all-bad → no_clock (NEW; verifies real no-clock nodes still classify correctly)
  • Updated TestSeverityUsesRecentNotMedian + TestReporterScenario_789 to match the new wider window and accept bimodal_clock for nodes with massive historical poison — consistent with #845's whole premise: bimodal deserves a flag, not OK status.

All clock_skew tests pass; full go test ./... green.

@Kpa-clawbot
Owner Author

Superseded by PR #907 (default-detection redesign). Closing.

KpaBap pushed a commit that referenced this pull request May 2, 2026
## Summary

UI completion of #690 — surfaces observer clock skew and per-hash
evidence that the backend already computes but wasn't exposed in the
frontend.

**Not related to #845/PR #894** (bimodal detection) — this is the UI
surface for the original #690 scope.

## Changes

### Backend: per-hash evidence in node clock-skew API (commit 1)
- Extended `GET /api/nodes/{pubkey}/clock-skew` to return
`recentHashEvidence` (most recent 10 hashes with per-observer
raw/corrected skew and observer offset) and `calibrationSummary`
(total/calibrated/uncalibrated counts).
- Evidence is cached during `ClockSkewEngine.Recompute()` — route
handler is cheap.
- Fleet endpoint omits evidence to keep payload small.
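A sketch of what the extended payload shape might look like. The JSON field names follow the description above (`recentHashEvidence`, `calibrationSummary`); the Go struct names and example values are hypothetical, not the real types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ObserverSkew is one observer's view of a single hash.
type ObserverSkew struct {
	ObserverID    string  `json:"observerId"`
	RawSkew       float64 `json:"rawSkew"`       // seconds, as reported
	Offset        float64 `json:"offset"`        // per-observer clock offset
	CorrectedSkew float64 `json:"correctedSkew"` // rawSkew + offset
}

// HashEvidence is the per-hash breakdown (most recent 10 hashes).
type HashEvidence struct {
	Hash      string         `json:"hash"`
	Observers []ObserverSkew `json:"observers"`
}

type CalibrationSummary struct {
	Total        int `json:"total"`
	Calibrated   int `json:"calibrated"`
	Uncalibrated int `json:"uncalibrated"`
}

type ClockSkewResponse struct {
	RecentHashEvidence []HashEvidence     `json:"recentHashEvidence"` // capped at 10 entries
	CalibrationSummary CalibrationSummary `json:"calibrationSummary"`
}

// buildResponse assembles a tiny example payload.
func buildResponse() ClockSkewResponse {
	raw, offset := 1.2, 0.3
	return ClockSkewResponse{
		RecentHashEvidence: []HashEvidence{{
			Hash: "ab12cd34",
			Observers: []ObserverSkew{{
				ObserverID:    "obs-1",
				RawSkew:       raw,
				Offset:        offset,
				CorrectedSkew: raw + offset, // corrected = raw + offset
			}},
		}},
		CalibrationSummary: CalibrationSummary{Total: 3, Calibrated: 2, Uncalibrated: 1},
	}
}

func main() {
	b, _ := json.MarshalIndent(buildResponse(), "", "  ")
	fmt.Println(string(b))
}
```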

### Frontend: observer list page — clock offset column (commit 2)
- Added "Clock Offset" column to observers table.
- Fetches `/api/observers/clock-skew` once on page load, joins by
ObserverID.
- Color-coded severity badge + sample count tooltip.
- Singleton observers show "—" not "0".

### Frontend: observer-detail clock card (commit 3)
- Added clock offset card mirroring node clock card style.
- Shows: offset value, sample count, severity badge.
- Inline explainer describing how offset is computed from multi-observer
packets.

### Frontend: node clock card evidence panel (commit 4)
- Collapsible "Evidence" section in existing node clock skew card.
- Per-hash breakdown: observer count, median corrected skew,
per-observer raw/corrected/offset.
- Calibration summary line and plain-English severity reason at top.

## Test Results

```
go test ./... (cmd/server) — PASS (19.3s)
go test ./... (cmd/ingestor) — PASS (31.6s)
Frontend helpers: 610 passed, 0 failed
```

New test: `TestNodeClockSkew_EvidencePayload` — 3-observer scenario
verifying per-hash array shape, corrected = raw + offset math, and
median.

No frontend JS smoke test added — no existing test harness for
clock/observer rendering. Noted for future.

## Screenshots

Screenshots TBD

## Perf justification

Evidence is computed inside the existing `Recompute()` cycle (already
O(n) on samples). The `hashEvidence` map adds ~32 bytes per sample of
memory. Evidence is stripped from fleet responses. Per-node endpoint
returns at most 10 evidence entries — bounded payload.

---------

Co-authored-by: you <you@example.com>
