[improvement](executor) use real elapsed time to compute workload group metrics refresh interval #63537
Open
bosswnx wants to merge 2 commits into
…up metrics refresh interval

Replace the fixed config-based interval with the actual monotonic time delta between two refreshes when calculating per-second CPU and scan IO rates in WorkloadGroupMetrics, so the rates stay accurate even when the refresh thread is delayed or the configured interval is changed at runtime.

Also add a guard against division by zero when two refreshes happen within less than one second, and add unit tests covering:
- Real elapsed time rate computation
- Sub-second interval safety (no division by zero)
- Proportional rate vs interval relationship
- Memory metrics correctness
- First-refresh boundary behavior
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Author
/review
Author
run buildall
Contributor
TPC-H: Total hot run time: 31508 ms
Contributor
TPC-DS: Total hot run time: 169734 ms
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
The original implementation of WorkloadGroupMetrics::refresh_metrics() uses config::workload_group_metrics_interval_ms / 1000 as a fixed divisor to compute per-second CPU and scan IO rates. This is inaccurate when:
- the refresh thread is delayed by scheduling;
- the configured interval is changed at runtime.
In both cases, the reported per-second CPU/IO rates diverge from reality.
This PR replaces the fixed config-based interval with the actual monotonic time delta between two consecutive refreshes, so the rates stay
accurate regardless of thread scheduling delays or runtime config changes. It also adds a division-by-zero guard for sub-second refresh
intervals and corresponding unit tests.
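
As a rough illustration of the idea (not the code in this PR), the sketch below computes a per-second rate from a cumulative counter using the measured time between two refreshes. `RateTracker`, `refresh`, and `monotonic_now_ns` are hypothetical names; returning 0 on the first refresh and on a non-positive elapsed time are assumptions made for the example, and the actual guard in the PR may differ.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for illustration; not the Doris WorkloadGroupMetrics class.
class RateTracker {
public:
    // Monotonic timestamp in nanoseconds (steady_clock never goes backwards).
    static int64_t monotonic_now_ns() {
        using namespace std::chrono;
        return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
    }

    // now_ns: monotonic timestamp of this refresh.
    // total:  cumulative counter value (e.g. CPU time or scanned bytes).
    // Returns the per-second rate since the previous refresh.
    int64_t refresh(int64_t now_ns, int64_t total) {
        if (!_initialized) {
            // Assumption: the first refresh has no baseline, so report 0.
            _initialized = true;
            _last_ns = now_ns;
            _last_total = total;
            return 0;
        }
        const int64_t elapsed_ns = now_ns - _last_ns;
        const int64_t delta = total - _last_total;
        _last_ns = now_ns;
        _last_total = total;
        if (elapsed_ns <= 0) {
            // Guard: never divide by a zero (or negative) interval.
            return 0;
        }
        // Divide by fractional seconds so sub-second intervals do not
        // truncate to zero the way integer seconds would.
        const double elapsed_sec = static_cast<double>(elapsed_ns) / 1e9;
        return static_cast<int64_t>(static_cast<double>(delta) / elapsed_sec);
    }

private:
    bool _initialized = false;
    int64_t _last_ns = 0;
    int64_t _last_total = 0;
};
```

Passing the timestamp in as a parameter, rather than reading the clock inside `refresh`, keeps the sketch deterministic under test; a caller would typically pass `RateTracker::monotonic_now_ns()`.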
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)