scanner: listener.rs hardcodes charon_blocks_received_total outside the names module — dashboard uses charon_scanner_blocks_total #328

@obchain

Description

Problem

Two block-counter metrics ship with conflicting names and neither side knows about the other:

  • `crates/charon-scanner/src/listener.rs:184` emits `metrics::counter!("charon_blocks_received_total", ...)` — a raw string literal, not routed through `charon_metrics::names::*`.
  • `crates/charon-metrics/src/lib.rs:443` exposes `record_block_scanned()` which emits `charon_scanner_blocks_total` (the value of `names::SCANNER_BLOCKS_TOTAL`). Called from `crates/charon-cli/src/main.rs:1017` once per pipeline-processed block.
  • `deploy/grafana/charon.json` — the "Scanner — blocks per second" panel + the `chain` template variable both query `charon_scanner_blocks_total`.

Impact

P2. Operators have no visibility into listener throughput by default:

  1. The listener-side counter (`charon_blocks_received_total`) is never read by any panel or alert; it does not show up in the dashboard, the README, or `alerts.yaml`. It sits in the TSDB and adds series cardinality without anyone observing it.
  2. The pipeline-side counter (`charon_scanner_blocks_total`) is bumped per pipeline pass, which only happens once the WS subscription has connected, has fired its first `new_heads` event, and the pipeline has run a per-block tick. If any of those steps stalls (e.g. WS reconnect storm, pipeline panic), the dashboard panel goes flat, and nothing on the dashboard shows whether the listener is still receiving blocks.
  3. Worse: the chain template variable's `label_values(charon_scanner_blocks_total, chain)` only resolves once the pipeline has run at least once on that chain. Until then, every chain-scoped panel shows "No Data" on a fresh dashboard import — even though prom is scraping happily.

Found while bringing up the local Prometheus + Grafana stack — bot was emitting metrics, prom UP, but Grafana panels were empty for the first ~5 minutes until pipeline ticks landed.
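
The ordering problem in items 2 and 3 can be modelled without a metrics backend. The sketch below is a std-only stand-in, not code from `charon_metrics` or the `metrics` crate: the `Registry` type and its methods are hypothetical, and it assumes only that a labelled series comes into existence on its first increment.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory stand-in for a Prometheus-style registry:
/// a (metric, chain) series exists only after its first increment.
#[derive(Default)]
pub struct Registry {
    series: BTreeMap<(String, String), u64>,
}

impl Registry {
    /// Increment the counter `metric` for `chain`, creating the series if needed.
    pub fn increment(&mut self, metric: &str, chain: &str) {
        *self
            .series
            .entry((metric.to_owned(), chain.to_owned()))
            .or_insert(0) += 1;
    }

    /// Rough analogue of Grafana's `label_values(metric, chain)`:
    /// every chain label that has at least one series for `metric`.
    pub fn label_values(&self, metric: &str) -> BTreeSet<String> {
        self.series
            .keys()
            .filter(|(m, _)| m.as_str() == metric)
            .map(|(_, chain)| chain.clone())
            .collect()
    }
}
```

In this model, incrementing the proposed listener counter for a chain makes that chain visible to `label_values` right away, while a lookup on `charon_scanner_blocks_total` stays empty until something increments it, which mirrors the "No Data" window before the first pipeline tick.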

Fix

  1. In `crates/charon-metrics/src/lib.rs`, add a typed helper for listener-level block ingress:
    ```rust
    pub fn record_block_received(chain: &str) {
        counter!(names::LISTENER_BLOCKS_RECEIVED_TOTAL, "chain" => chain.to_owned()).increment(1);
    }
    ```
    with `pub const LISTENER_BLOCKS_RECEIVED_TOTAL: &str = "charon_listener_blocks_received_total";` in `names::`.
  2. In `crates/charon-scanner/src/listener.rs:184`, replace the raw `metrics::counter!(...)` call with `charon_metrics::record_block_received(&self.name);`.
  3. Add a `describe_counter!` for the new name in `describe_all()`.
  4. Add the listener counter to the dashboard as a sibling panel ("Listener — blocks received per second"), and keep the existing `charon_scanner_blocks_total` panel for pipeline throughput. Use the listener counter as the source of `label_values(...)` for the `chain` template var so panels populate immediately on first connect, not on first pipeline tick.
  5. Update `alerts.yaml` if any rule currently fires on `charon_blocks_received_total` (none do today, but grep to confirm).
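
Since the fix depends on every metric name living in `names::`, a small guard over those constants may help keep it that way. The following is a minimal std-only sketch under stated assumptions: the `ALL` slice and the `check_names` function are hypothetical additions, and the module shows only the two constants discussed in this issue rather than the real contents of `charon_metrics::names`.

```rust
use std::collections::HashSet;

/// Sketch of a `names` module limited to the two constants from this issue.
/// `ALL` is a hypothetical list that would enumerate every exported name.
pub mod names {
    pub const SCANNER_BLOCKS_TOTAL: &str = "charon_scanner_blocks_total";
    pub const LISTENER_BLOCKS_RECEIVED_TOTAL: &str = "charon_listener_blocks_received_total";

    pub const ALL: &[&str] = &[SCANNER_BLOCKS_TOTAL, LISTENER_BLOCKS_RECEIVED_TOTAL];
}

/// Returns one message per problem found: a repeated name or a name
/// that lacks the `charon_` prefix. An empty result means the list is clean.
pub fn check_names(all: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut problems = Vec::new();
    for name in all {
        // `insert` returns false when the name was already present.
        if !seen.insert(*name) {
            problems.push(format!("duplicate metric name: {name}"));
        }
        if !name.starts_with("charon_") {
            problems.push(format!("missing charon_ prefix: {name}"));
        }
    }
    problems
}
```

A unit test could assert that `check_names(names::ALL)` is empty. This does not catch raw string literals at call sites such as the one in `listener.rs`; those still need a grep or a code review rule.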

Severity

P2-polish. Not a runtime correctness bug, but a real DX / observability gap that bites every fresh demo run.

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), layer:rust (Rust crates: core / scanner / protocols / executor / cli), priority:p2-polish (Nice-to-have / polish), status:ready (Scoped and ready to pick up)
