Collect JVM memory metrics #2937
Conversation
core/src/main/java/google/registry/monitoring/whitebox/StackdriverModule.java
ptkach left a comment
Overall looks good, but did you say we collect it every 60s? It's for every pod in every workload, so I think that's way too many metrics for what we need - I'd suggest lowering it to every 5 mins.
@ptkach reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jicelhay).
Yes, we're just registering new metrics in the default MetricRegistry and piggybacking on the logic for writing them. The existing writer is configured to write every 60 secs (see StackdriverModule.provideMetricReporter and writeIntervalSeconds in the default config yaml). From what I see in the metrics explorer, we're already pushing 30+ custom metrics, so I don't think pushing 3 more is a big issue. Creating a configuration for a new writer, providing it, and initializing it as its own process just to have these new metrics report at a different interval doesn't seem worth it to me; let me know what you think.
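To illustrate the piggybacking point, here is a minimal, self-contained sketch of how a single periodic reporter flushes everything on one shared interval. This is plain java.util.concurrent, not the actual Nomulus MetricReporter; in the real code the interval comes from writeIntervalSeconds in the config yaml.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: one scheduled task flushes all registered
// metrics on a single shared interval, which is why adding a few more
// gauges to the existing registry needs no second writer or process.
public final class PeriodicReporterSketch {

  public static void main(String[] args) {
    // Stands in for writeIntervalSeconds from the default config yaml.
    long writeIntervalSeconds = 60;

    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(
        // In the real reporter this would write every metric in the
        // registry (existing ones plus the new JVM gauges) to Stackdriver.
        () -> System.out.println("flushing all registered metrics..."),
        writeIntervalSeconds,
        writeIntervalSeconds,
        TimeUnit.SECONDS);
  }
}
```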
ptkach left a comment
Let's merge and see how the metrics volume changes after a week of running this in prod.
@ptkach made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jicelhay).
Adds JVM memory monitoring by registering new gauge metrics in the existing MetricRegistryImpl object. Metrics in MetricRegistryImpl are already being pushed to Cloud Monitoring every 60 secs by a MetricReporter.
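For reference, a minimal sketch of where such gauge values come from, using only the standard java.lang.management API. The metric name jvm/memory/max matches the one in the metrics explorer link below, but the class name here and the way the PR wires these values into MetricRegistryImpl are assumptions, not the exact code from this change.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Sketch: read the JVM heap values a gauge callback could report.
// The actual PR registers these as gauges in MetricRegistryImpl so the
// existing MetricReporter pushes them on its normal schedule.
public final class JvmMemoryMetricsSketch {

  public static void main(String[] args) {
    MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = memoryBean.getHeapMemoryUsage();

    // Three values of the kind the new gauges expose, sampled on each write.
    long used = heap.getUsed();
    long committed = heap.getCommitted();
    long max = heap.getMax(); // may be -1 if the max is undefined

    System.out.printf("jvm/memory used=%d committed=%d max=%d%n", used, committed, max);
  }
}
```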
Example of metrics in crash: https://console.cloud.google.com/monitoring/metrics-explorer;duration=PT1H?referrer=search&project=domain-registry-crash&pageState=%7B%22xyChart%22:%7B%22constantLines%22:%5B%5D,%22dataSets%22:%5B%7B%22plotType%22:%22LINE%22,%22pointConnectionMethod%22:%22GAP_DETECTION%22,%22targetAxis%22:%22Y1%22,%22timeSeriesFilter%22:%7B%22aggregations%22:%5B%7B%22alignmentPeriod%22:%2260s%22,%22crossSeriesReducer%22:%22REDUCE_SUM%22,%22groupByFields%22:%5B%22metric.label.%5C%22type%5C%22%22,%22resource.label.%5C%22pod_id%5C%22%22%5D,%22perSeriesAligner%22:%22ALIGN_MEAN%22%7D%5D,%22apiSource%22:%22DEFAULT_CLOUD%22,%22crossSeriesReducer%22:%22REDUCE_SUM%22,%22filter%22:%22metric.type%3D%5C%22custom.googleapis.com%2Fjvm%2Fmemory%2Fmax%5C%22%20resource.type%3D%5C%22gke_container%5C%22%22,%22groupByFields%22:%5B%22metric.label.%5C%22type%5C%22%22,%22resource.label.%5C%22pod_id%5C%22%22%5D,%22minAlignmentPeriod%22:%2260s%22,%22perSeriesAligner%22:%22ALIGN_MEAN%22%7D%7D%5D,%22options%22:%7B%22mode%22:%22COLOR%22%7D,%22y1Axis%22:%7B%22label%22:%22%22,%22scale%22:%22LINEAR%22%7D%7D%7D
b/468031702