monitoring: migrate dashboards to native histograms#5048
afaiu, after this change users who don't have Should we document this in some way? |
I am not a fan of this aspect either. We could have native dashboards and non-native dashboards with ifs in the Helm chart, but that's also very ugly. wdyt?
…back
Converts all classic Prometheus histogram queries to native histogram
equivalents, with a Helm flag to switch back to classic queries for
environments that haven't enabled scrape_native_histograms yet.
- Add dashboards.nativeHistograms (default: true) to values.yaml
- Wrap every histogram expr block in {{- if .Values.dashboards.nativeHistograms }}
/ {{- else }} / {{- end }} across all four dashboard templates (107 blocks)
- Transform rules per branch:
  - histogram_sum/count(rate(X)) ↔ rate(X_sum/count)
  - histogram_quantile without _bucket ↔ adds _bucket + by (le, ...)
  - histogram_avg(sum(rate(X))) ↔ sum(rate(X_sum)) / sum(rate(X_count))
- Add tenantQueryClassic to values.yaml for the tenant mapping panel
- Add --output.dashboards.path flag to monitoring-chart-extractor
- make helm/check now generates both:
  - operations/monitoring/dashboards/ (native, default)
  - operations/monitoring/dashboards-classic-histogram/ (classic)
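To make the transform rules above easier to compare side by side, here is a minimal sketch that writes each classic/native pair as Prometheus recording rules. This is illustration only: the PR changes dashboard templates, not rule files, and the metric name example_request_duration_seconds, the group name, the record names, and the 5m window are all hypothetical.

```yaml
# Illustrative rule file; every name below is an assumption, not taken from the PR.
groups:
  - name: histogram_transform_examples   # hypothetical group name
    rules:
      # Average latency, classic form: ratio of the _sum and _count series.
      - record: example:request_duration_seconds:avg_classic
        expr: sum(rate(example_request_duration_seconds_sum[5m])) / sum(rate(example_request_duration_seconds_count[5m]))
      # Average latency, native form: histogram_avg over the histogram series itself.
      - record: example:request_duration_seconds:avg_native
        expr: histogram_avg(sum(rate(example_request_duration_seconds[5m])))
      # p99, classic form: needs the _bucket series and aggregation by le.
      - record: example:request_duration_seconds:p99_classic
        expr: histogram_quantile(0.99, sum by (le) (rate(example_request_duration_seconds_bucket[5m])))
      # p99, native form: no _bucket suffix and no le label.
      - record: example:request_duration_seconds:p99_native
        expr: histogram_quantile(0.99, sum(rate(example_request_duration_seconds[5m])))
```

The native expressions only return data if the series was scraped as a native histogram; otherwise the classic forms are the ones that work, which is the reason for the switch-back flag.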
Force-pushed from 844230e to d024237.
Bump k8s-monitoring subchart from 3.7.2 to 3.8.8 which added the scrapeNativeHistograms global toggle, and enable it so Alloy scrapes native histograms from Pyroscope's protobuf endpoint. The dashboard queries from the previous commit already use native histogram functions; this unblocks them.
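For setups that scrape Pyroscope with a plain Prometheus instead of the k8s-monitoring subchart, the equivalent switch is the scrape_native_histograms setting mentioned in this PR's test plan. A minimal sketch, assuming a Prometheus version that supports this per-job setting; the job name and target address are hypothetical.

```yaml
# prometheus.yml (fragment). Job name and target are assumptions for illustration.
scrape_configs:
  - job_name: pyroscope                # hypothetical job name
    scrape_native_histograms: true     # ingest native histograms for this job
    static_configs:
      - targets: ["pyroscope:4040"]    # hypothetical address of a Pyroscope instance
```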
TruffleHog Scan Results Summary: Found 1 potential secrets (0 verified, 1 unverified)
Review: Check if unverified secrets are false positives. Ignoring False Positives: This works for files that support line numbers (most source files). After adding the comment, push your changes and the scan will re-run. |
Not too sure if we can ignore this the same way |
Okay, I have implemented a variant with and without native histograms. Our Helm chart also now supports native histograms since the chart update (bumped k8s-monitoring 3.7.2 → 3.8.8). @marcsanmi please take another look.
Summary
Converts all classic Prometheus histogram queries to native histogram equivalents across all monitoring dashboards and Helm chart templates.
- Remove the _bucket suffix and by (le) from histogram_quantile calls
- Replace rate(metric_count{...}) with histogram_count(rate(metric{...}))
- Replace rate(metric_sum{...}) with histogram_sum(rate(metric{...}))
- Replace the _sum/_count average ratio with histogram_avg(sum(rate(metric{...})))
- values.yaml tenantQuery default
Dashboards affected: operational, v2-metastore, v2-read-path, v2-write-path
The raw JSON dashboards under operations/monitoring/dashboards/ are generated via make helm/check; only the Helm templates were hand-edited.
Test plan
- make helm/check passes (already verified locally)
- (scrape_native_histograms: true on the Prometheus scrape config for Pyroscope)
Note
Low Risk
Low risk: this only changes Makefile generation steps for rendered dashboards and does not affect runtime code paths, but it can change generated dashboard artifacts and CI outputs.
Overview
make helm/check now generates monitoring dashboards/rules for native histograms by default, and additionally renders a classic Prometheus histogram dashboard set by templating pyroscope-monitoring with --set dashboards.nativeHistograms=false and exporting to operations/monitoring/dashboards-classic-histogram/.
Reviewed by Cursor Bugbot for commit 82cbf68.
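The --set dashboards.nativeHistograms=false override can also be kept in a values file, which may be more convenient for environments that stay on classic histograms. A minimal sketch, assuming a hypothetical file name values-classic.yaml passed to Helm with -f; only the dashboards.nativeHistograms key comes from this PR.

```yaml
# values-classic.yaml (hypothetical file name)
# Switch the rendered dashboards back to classic histogram queries.
dashboards:
  nativeHistograms: false   # chart default is true
```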