An eBPF agent that monitors Ethereum execution and consensus layer processes at the kernel level. It captures syscalls, disk I/O, network I/O, scheduler events, memory faults, and file descriptor activity, aggregates them per slot, and exports them to ClickHouse. Zero client modifications required. Linux only.
Execution Layer: Geth, Reth, Besu, Nethermind, Erigon
Consensus Layer: Prysm, Lighthouse, Teku, Lodestar, Nimbus
| Type | Description |
|---|---|
| `syscall_read` | read() syscall with latency |
| `syscall_write` | write() syscall with latency |
| `syscall_futex` | futex() syscall with latency |
| `syscall_mmap` | mmap() syscall with latency |
| `syscall_epoll_wait` | epoll_wait() syscall with latency |
| `syscall_fsync` | fsync() syscall with latency |
| `syscall_fdatasync` | fdatasync() syscall with latency |
| `syscall_pwrite` | pwrite64() syscall with latency |
| `disk_io` | Block I/O read/write with latency and byte count |
| `block_merge` | Block I/O request merge |
| `net_tx` | TCP send with byte count, ports, and inline RTT/cwnd metrics |
| `net_rx` | TCP receive with byte count and ports |
| `tcp_retransmit` | TCP retransmission with byte count and ports |
| `tcp_state` | TCP state transition with ports |
| `sched_switch` | Context switch with on-CPU time |
| `sched_runqueue` | Runqueue/off-CPU latency for scheduled threads |
| `page_fault` | Page fault (major/minor) |
| `fd_open` | File descriptor opened |
| `fd_close` | File descriptor closed |
| `mem_reclaim` | Direct reclaim latency |
| `mem_compaction` | Compaction latency |
| `swap_in` | Swap-in event |
| `swap_out` | Swap-out event |
| `oom_kill` | OOM kill event |
| `process_exit` | Process exit with exit code |
| `memory_usage` | Per-process memory snapshot (VmSize, VmRSS, RssAnon, RssFile, RssShmem, VmSwap) |
| `process_io_usage` | Per-process I/O snapshot (rchar, wchar, syscall counts, read/write bytes) |
| `process_fd_usage` | Per-process FD snapshot (open FDs, soft/hard FD limits) |
| `process_sched_usage` | Per-process scheduler snapshot (threads and context-switch counters) |
| `host_specs` | Periodic host hardware snapshot (hashed host id, CPU topology, DIMMs, disks) |
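
The `memory_usage` field names match keys that Linux exposes in `/proc/<pid>/status`. As a rough illustration of what such a snapshot contains, the sketch below parses those keys from status text. Whether Observoor reads them from that file is an assumption, and `parse_memory_status` and `SAMPLE` are hypothetical names used only for this example.

```python
# Hypothetical helper for illustration; not taken from Observoor's source.
MEMORY_FIELDS = ("VmSize", "VmRSS", "RssAnon", "RssFile", "RssShmem", "VmSwap")


def parse_memory_status(status_text: str) -> dict[str, int]:
    """Return the memory_usage fields found in /proc/<pid>/status text, in bytes."""
    snapshot = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or key not in MEMORY_FIELDS:
            continue
        parts = rest.split()
        # Lines look like "VmRSS:      1024 kB"; the kernel reports kibibytes.
        if len(parts) == 2 and parts[1] == "kB":
            snapshot[key] = int(parts[0]) * 1024
    return snapshot


# Made-up sample in the /proc/<pid>/status layout.
SAMPLE = (
    "Name:\tgeth\n"
    "VmSize:\t    2048 kB\n"
    "VmRSS:\t    1024 kB\n"
    "RssAnon:\t     512 kB\n"
    "RssFile:\t     384 kB\n"
    "RssShmem:\t     128 kB\n"
    "VmSwap:\t       0 kB\n"
    "Threads:\t4\n"
)

snapshot = parse_memory_status(SAMPLE)
```

On a live Linux host the same function could be fed the contents of `/proc/self/status`.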
See example.config.yaml for a complete configuration reference.
Tiered per-metric aggregation intervals can be configured with `sinks.aggregated.resolution.overrides`:

    sinks:
      aggregated:
        resolution:
          interval: 100ms
          host_specs_poll_interval: 24h
          overrides:
            - metrics: [syscall_futex, sched_runqueue, mem_reclaim, mem_compaction]
              interval: 500ms
            - metrics: [page_fault_major, page_fault_minor, swap_in, swap_out, oom_kill, fd_open, fd_close, process_exit, tcp_state_change]
              interval: 1s

If `overrides` is omitted, all metrics use `resolution.interval`.
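
The fallback rule can be sketched as a simple lookup. This is a minimal illustration, not Observoor's implementation: `interval_for` and the `RESOLUTION` dictionary are hypothetical, and "first matching override wins" is an assumption about how a metric listed in several overrides would be handled.

```python
# Hypothetical sketch of override lookup; not taken from Observoor's source.
RESOLUTION = {
    "interval": "100ms",
    "overrides": [
        {
            "metrics": ["syscall_futex", "sched_runqueue", "mem_reclaim", "mem_compaction"],
            "interval": "500ms",
        },
        {
            "metrics": ["swap_in", "swap_out", "oom_kill"],
            "interval": "1s",
        },
    ],
}


def interval_for(metric: str, resolution: dict) -> str:
    """Return the aggregation interval that applies to a metric."""
    # Assumption: the first override that lists the metric wins.
    for override in resolution.get("overrides", []):
        if metric in override["metrics"]:
            return override["interval"]
    # No override matched (or none configured): use the default interval.
    return resolution["interval"]
```

With this sketch, `interval_for("syscall_futex", RESOLUTION)` yields `"500ms"`, while a metric that no override lists, such as `syscall_read`, falls back to `"100ms"`.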
By default, Observoor retains 100% of events (sampling.mode=none, sampling.rate=1.0).
This section applies when you explicitly enable sampling.
When sampling is enabled, stored rows contain sampled aggregates, not pre-scaled estimates.
Use sampling_rate at query time to reconstruct additive totals.

    estimated_sum = sum / sampling_rate
    estimated_count = count / sampling_rate
    estimated_mean = sum(sum / sampling_rate) / sum(count / sampling_rate)

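
The formulas above can be checked with a small worked example. The rows below are made-up values, and `Row` and `estimate` are hypothetical names for this sketch; only the arithmetic mirrors the formulas.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One stored aggregate row (made-up values for illustration)."""
    sampled_sum: float
    sampled_count: int
    sampling_rate: float


def estimate(rows):
    """Scale each row by its own sampling_rate, then combine."""
    est_sum = sum(r.sampled_sum / r.sampling_rate for r in rows)
    est_count = sum(r.sampled_count / r.sampling_rate for r in rows)
    # Mirror nullIf(..., 0): no mean when nothing was observed.
    est_mean = est_sum / est_count if est_count else None
    return est_sum, est_count, est_mean


rows = [
    Row(sampled_sum=1000.0, sampled_count=10, sampling_rate=0.5),
    Row(sampled_sum=300.0, sampled_count=5, sampling_rate=0.25),
]
est_sum, est_count, est_mean = estimate(rows)
# est_sum   = 1000/0.5 + 300/0.25 = 3200.0
# est_count = 10/0.5   + 5/0.25   = 40.0
# est_mean  = 3200.0 / 40.0       = 80.0
```

Dividing per row before summing matters when rows carry different sampling rates, as in this example.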
Example: reconstruct counter totals (bytes + events) from net_io:

    SELECT
        toStartOfMinute(window_start) AS ts,
        sum(`sum` / sampling_rate) AS estimated_bytes,
        sum(count / sampling_rate) AS estimated_events
    FROM net_io
    WHERE meta_network_name = 'mainnet'
      AND window_start >= now() - INTERVAL 1 HOUR
    GROUP BY ts
    ORDER BY ts;

Example: reconstruct latency mean from syscall_futex:

    SELECT
        toStartOfMinute(window_start) AS ts,
        sum(`sum` / sampling_rate) / nullIf(sum(count / sampling_rate), 0) AS estimated_mean_ns
    FROM syscall_futex
    WHERE meta_network_name = 'mainnet'
      AND window_start >= now() - INTERVAL 1 HOUR
    GROUP BY ts
    ORDER BY ts;

Notes and caveats:
- This reconstruction is appropriate for additive stats (`sum`, `count`) and means derived from them.
- `min`/`max` are sampled extrema and are not exactly reconstructable.
- Exact quantiles are not reconstructable from sampled rows; weighted histogram buckets can provide approximations.
- `sampling_mode='probability'` is generally better for unbiased estimation.
- `sampling_mode='nth'` can introduce workload-dependent bias in some dimensions.
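
The bias caveat for `nth` sampling can be illustrated with a synthetic workload. This is a sketch under stated assumptions: it treats `nth` as "keep every n-th event" and `probability` as an independent coin flip per event, which may not match Observoor's exact behavior, and `nth_sample` and `probability_sample` are hypothetical helpers.

```python
import random


def nth_sample(values, n):
    """Keep every n-th event (assumed meaning of 'nth' mode)."""
    return values[n - 1::n]


def probability_sample(values, rate, rng):
    """Keep each event independently with probability `rate`."""
    return [v for v in values if rng.random() < rate]


# Synthetic workload: small and large events strictly alternate,
# so the event pattern has the same period as the sampler (n = 2).
values = [1, 100] * 50_000
true_total = sum(values)  # 5_050_000
rate = 0.5

# nth sampling only ever sees the large events, so scaling overshoots.
nth_estimate = sum(nth_sample(values, 2)) / rate

# Probability sampling sees both kinds in proportion (seeded for repeatability).
prob_estimate = sum(probability_sample(values, rate, random.Random(0))) / rate
```

Here `nth_estimate` comes out at 10,000,000, almost double the true total of 5,050,000, while the probability-based estimate stays close to the true total. Real workloads are rarely this regular, so the example shows the failure mode rather than a typical error size.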
Migrations are embedded in the binary from src/migrate/sql/ and use a schema compatible with golang-migrate.
When sinks.aggregated.clickhouse.migrations.enabled: true, Observoor applies pending migrations automatically at startup.

    # Requires Linux with kernel headers and libbpf
    make build

Use the built-in performance suite to guard correctness, allocations, and CPU hot paths:

    # Full blackbox + alloc + Criterion suite (cross-platform default: --no-default-features)
    make perf-suite

    # Criterion-only full run
    make bench

    # Fast Criterion smoke run
    make bench-smoke

If you want to run benchmarks with default features (for Linux production parity), override:

    make bench PERF_CARGO_ARGS=""

Pull requests also run `.github/workflows/perf-gate.yaml`, which compares `hot_paths` against the PR base commit and fails on significant regressions in key throughput paths.

    sudo ./observoor --config config.yaml

Root (or `CAP_BPF` + `CAP_PERFMON`) is required for eBPF program loading.
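
A pre-flight check for this requirement could look like the sketch below. It is an illustration, not part of Observoor: `can_load_ebpf` and `read_cap_eff` are hypothetical helpers, and the check covers only the two capabilities named above. The bit numbers (38 for `CAP_PERFMON`, 39 for `CAP_BPF`) come from `linux/capability.h`; the effective set is read from the `CapEff` line of `/proc/self/status`.

```python
import os

# Capability bit numbers from linux/capability.h.
CAP_PERFMON = 38
CAP_BPF = 39


def can_load_ebpf(euid: int, cap_eff_hex: str) -> bool:
    """True if the process is root or holds both CAP_BPF and CAP_PERFMON."""
    if euid == 0:
        # Simplification: assumes root has not dropped its capabilities.
        return True
    caps = int(cap_eff_hex, 16)
    needed = (1 << CAP_BPF) | (1 << CAP_PERFMON)
    return caps & needed == needed


def read_cap_eff(status_path: str = "/proc/self/status") -> str:
    """Return the hex CapEff mask of the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    print(can_load_ebpf(os.geteuid(), read_cap_eff()))
```

Kernels older than 5.8 do not define these two capabilities, in which case running as root is the practical option.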
