SCHED-301 Scale metric collection for KSM #2506

Open
ChessProfessor wants to merge 2 commits into main from chessprofessor/SCHED-301/scale-metric-collection-for-kube-state-metrics

ChessProfessor commented May 4, 2026

Problem

On large clusters, kube-state-metrics can return a /metrics response larger than the default vmagent scrape size limit. Scrapes start failing at roughly 1.1k nodes.

Solution

Added a configurable max_scrape_size for the main HTTP scrape endpoint of kube-state-metrics. The chart default is not raised: unless overridden, kube-state-metrics inherits the existing global vmagent promscrape.maxScrapeSize. Large-cluster deployments can set observability.vmStack.values.kubeStateMetrics.maxScrapeSize explicitly.

The global vmagent limit remains unchanged for other targets.
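
A minimal sketch of what a large-cluster override might look like in a values file. The key path is the one named above; the value "64MiB" and its unit format are assumptions for illustration only, so check the chart's values.yaml for the accepted format and size the limit to the cluster's actual /metrics response.

```yaml
# Hypothetical values override for a large cluster.
# Key path taken from the PR description; the value and its unit
# format are assumed, not confirmed by the chart.
observability:
  vmStack:
    values:
      kubeStateMetrics:
        maxScrapeSize: "64MiB"
```

Leaving the key unset keeps the inherited global limit, so existing deployments should see no change in behavior.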

Testing

Ran:

- helm template test-release helm/soperator-fluxcd
- helm unittest -f 'tests/kube_state_metrics_test.yaml' helm/soperator-fluxcd
- helm unittest helm/soperator-fluxcd
- helm lint helm/soperator-fluxcd
- git diff --check

