resource_control: add PredictedReadBytes hint for RC paging pre-charge#10599
resource_control: add PredictedReadBytes hint for RC paging pre-charge#10599YuhaoZhang00 wants to merge 10 commits intotikv:masterfrom
Conversation
Add PagingSizeBytes() to RequestInfo interface. When a read request carries a byte budget (from RC paging), BeforeKVRequest pre-charges the estimated read RU in Phase 1, and AfterKVRequest subtracts it in Phase 2 to maintain correct total cost. This makes concurrent workers throttle at Phase 1 instead of all hitting Phase 2 simultaneously. Signed-off-by: JmPotato <github@ipotato.me>
Test that BeforeKVRequest pre-charges pagingSizeBytes * ReadBytesCost, AfterKVRequest subtracts it, and the net total equals baseCost + actualCost. Signed-off-by: JmPotato <github@ipotato.me>
…ment When paging byte budget (pagingSizeBytes) is enabled, BeforeKVRequest pre-charges ReadBytesCost * pagingSizeBytes RRU into the token limiter. AfterKVRequest then computes the actual cost and subtracts the pre-charge. If the actual cost is less than the pre-charge (common case, since pagingSizeBytes is an upper bound), the settlement delta is negative. Previously, the negative delta was correctly recorded in the consumption counter but silently dropped by the token limiter due to a `v > 0` guard — causing permanent token leakage proportional to (pagingSizeBytes - actualReadBytes) per request. Fix: add Limiter.RefundTokens (inverse of RemoveTokens) and call it from both onResponseImpl and onResponseWaitImpl when the settlement delta is negative. This ensures the limiter's available token balance accurately reflects actual consumption after each request completes. RefundTokens design notes: - No burst cap applied (consistent with Reconfigure; getTokens handles lazy capping on the next limiter operation). - No maybeNotify call (refunding moves balance away from the low-token threshold, never toward it). Signed-off-by: JmPotato <github@ipotato.me>
Introduce an optional predictedReadBytesProvider interface on RequestInfo. When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging RPCs) supplies a non-zero PredictedReadBytes, BeforeKVRequest/AfterKVRequest use that value as the byte basis for the paging pre-charge instead of PagingSizeBytes. PagingSizeBytes remains the fallback and worst-case cap. This lets TiDB replace the current fixed 4 MiB pre-charge (which matches the paging byte budget but typically overshoots actual scanned bytes and stalls concurrent workers at Phase 1) with a learned estimate, without changing kvproto or TiKV behavior. The hint is added as an optional interface (not a method on RequestInfo) so existing RequestInfo implementations compile unchanged; they continue to fall back to PagingSizeBytes. Ref: per-logical-scan EMA pre-deduction design Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @YuhaoZhang00. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10599 +/- ##
==========================================
+ Coverage 78.88% 78.94% +0.06%
==========================================
Files 530 532 +2
Lines 71548 72069 +521
==========================================
+ Hits 56439 56895 +456
- Misses 11092 11139 +47
- Partials 4017 4035 +18
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Adds four Prometheus metrics under namespace resource_manager_client to
make the PredictedReadBytes-based paging pre-charge fix directly
observable end-to-end:
- paging_precharge_source_total{source=predicted|fallback}
- paging_precharge_bytes_total{source}
- paging_actual_bytes_total{source}
- paging_prediction_residual_bytes (histogram, predicted path only)
They are wired into onRequestWaitImpl / onResponseImpl so no change to
the ResourceCalculator interface is needed. Labels are cached per group
to keep the hot path allocation-free.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
Added 4 observability metrics on the PD client side for upcoming e2e validation of the paging pre-charge path:
No public interface change; instrumentation is at the |
The paging actual-bytes counter was only updated from onResponseImpl, but most responses under RC throttling flow through onResponseWaitImpl, so the counter stayed at 0 in practice and the over-charge ratio metric could not be computed. Mirror the observePagingActual call into the wait path. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Pre-charge now requires a learned PredictedReadBytes hint. Without a hint (EMA cold start or unhinted caller) the request is not pre-charged and is billed at Phase 2 by actual read bytes only. PagingSizeBytes is no longer consulted for pre-charge so the protocol-level paging cap and the RU-billing pre-charge are fully decoupled. Motivation: Round 2 data showed the fallback path over-charged 4-268x because PagingSizeBytes is a worst-case cap, not an expectation. Under concurrency this produced large artificial Phase 1 throttling, pushing QPS/latency into tier-dependent non-linear regressions. The cold window loses Phase 1 pre-throttling with this change; Phase 2 token-bucket billing still enforces the quota. Rewrote the model tests to reflect the new semantics: - TestPredictedReadBytesPreCharge asserts hint-driven pre-charge is unaffected by PagingSizeBytes. - TestNoPreChargeWithoutPredictedReadBytes asserts PagingSizeBytes alone no longer triggers pre-charge. - Existing refund/settlement tests now drive pre-charge via PredictedReadBytes instead of PagingSizeBytes. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…trics Follow-up to removing the PagingSizeBytes fallback: the source label on paging pre-charge metrics had only two values (predicted / fallback) and fallback is no longer reachable. Drop the label dimension entirely and rename PagingPrechargeSourceCounter to PagingPrechargeCounter; the remaining three counters and one histogram now measure pre-charged requests only (those with a PredictedReadBytes hint > 0). Inlined estimatePrechargeSource at its three call sites by calling estimatedReadBytes directly, and simplified observePagingPrecharge / observePagingActual to their single remaining case. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Record RPCs that implement the predicted-bytes hint interface but report zero (EMA cold-start or feature-disabled) and therefore skip Phase 1 pre-charge. Two new counters tagged by resource group: - paging_nonprecharge_total: count of bypassed RPCs - paging_nonprecharge_actual_bytes_total: actual bytes read by them Observed from the Phase 2 settlement path alongside the existing pre-charge actual-bytes metric, gated on non-write requests. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…g precharge Replace "Phase 1" / "Phase 2" in comments and test names with the existing API terms (BeforeKVRequest pre-charge / AfterKVRequest settle). The Phase 1/2 framing was introduced only in this branch's own commits and is not part of any preexisting controller convention; staying with the BeforeKVRequest/AfterKVRequest vocabulary keeps the prose readable to anyone already familiar with ResourceCalculator. No behavior change; only comments, test variable names, and metric Help strings. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
[FORMAT CHECKER NOTIFICATION] Notice: To remove the 📖 For more info, you can check the "Linking issues" section in the CONTRIBUTING.md. |
Summary
Add an optional
predictedReadBytesProviderinterface onRequestInfo.When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging
cop RPCs) supplies a non-zero
PredictedReadBytes,BeforeKVRequestandAfterKVRequestuse that value as the byte basis for the paging pre-chargeinstead of
PagingSizeBytes.PagingSizeBytesremains the fallback andworst-case cap.
Existing
RequestInfoimplementations compile unchanged — the hint is anoptional interface, not a method on
RequestInfo.Why
The fixed 4 MiB paging pre-charge introduced in #10548 typically overshoots
actual scanned bytes and stalls concurrent workers at Phase 1 under tight
resource tiers. A learned per-scan estimate from the TiDB side eliminates
the over-estimation without changing kvproto or TiKV behavior.
Status
Draft. Part of a stacked change:
tikv/client-go: addstikvrpc.Request.PredictedReadBytes+RequestInfogetterpingcap/tidb: maintains per-logical-scan TAEMA oncopTask, feeds prediction on every paging RPCNot yet ready for review; e2e validation on the simulation cluster is pending.
Test plan
client/resource_group/controller/cover hint-present and hint-absent paths, plus an excess-pre-charge refund case