store/copr: refine RC paging pre-charge with per-scan EMA#67759
Conversation
When Resource Control is enabled, force-enable paging for eligible coprocessor requests and set a byte budget (paging_size_bytes) per page. This limits the scanned bytes of each page so that RU cost is bounded and pre-charged in Phase 1, preventing concurrent workers from all hitting Phase 2 simultaneously. Signed-off-by: JmPotato <github@ipotato.me>
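The bound described above can be illustrated with a small sketch; `preChargeRU` and the RU-per-byte rate are hypothetical stand-ins for illustration, not TiDB's actual cost-model names or constants:

```go
package main

import "fmt"

// preChargeRU sketches the Phase-1 bound described above: because each
// page scans at most pagingSizeBytes, the RU pre-charged up front for a
// single page is capped, whatever the real row widths turn out to be.
// ruPerReadByte is an illustrative rate, not TiDB's actual RU model.
func preChargeRU(pagingSizeBytes uint64, ruPerReadByte float64) float64 {
	return float64(pagingSizeBytes) * ruPerReadByte
}

func main() {
	// With a 4 MiB page budget and a nominal rate of 1 RU per 64 KiB,
	// each page pre-charges at most 64 RU in Phase 1.
	fmt.Printf("%.0f\n", preChargeRU(4<<20, 1.0/(64*1024))) // prints 64
}
```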
…ation Test rcPagingEligible conditions (RC enabled, resource group, TiKV, DAG) and verify pagingSizeBytes is correctly propagated to copTasks and cleared when a small limit disables paging. Signed-off-by: JmPotato <github@ipotato.me>
Add a new session variable `tidb_rc_paging_size_bytes` to allow dynamically controlling the byte budget per page for RC paging via SQL.

- Default: 4MB (paging.MaxPagingSizeBytes)
- Set to 0 to disable byte-budget paging
- Supports both GLOBAL and SESSION scope

Usage:

SET @@tidb_rc_paging_size_bytes = 4194304; -- 4MB (default)
SET @@tidb_rc_paging_size_bytes = 1048576; -- 1MB
SET @@tidb_rc_paging_size_bytes = 0;       -- disable

Signed-off-by: JmPotato <github@ipotato.me>
Increase the bucket count from 10 (max 512KB) to 19 (max 256MB) for the tidb_distsql_copr_resp_size histogram to improve percentile accuracy when observing large coprocessor responses during paging size tuning. Signed-off-by: JmPotato <github@ipotato.me>
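For reference, the widened layout follows the usual exponential-bucket arithmetic; the helper below reimplements what `prometheus.ExponentialBuckets(start, factor, count)` returns so the sketch needs no external dependency:

```go
package main

import "fmt"

// exponentialBuckets mirrors prometheus.ExponentialBuckets: count bucket
// upper bounds starting at start, each factor times the previous.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	old := exponentialBuckets(1, 2, 10)  // KiB: 1 .. 512
	wide := exponentialBuckets(1, 2, 19) // KiB: 1 .. 262144 (256 MiB)
	fmt.Println(old[len(old)-1], wide[len(wide)-1]) // prints: 512 262144
}
```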
- Rename all RC-prefixed identifiers to generic names (e.g. RCPagingSizeBytes → PagingSizeBytes, rcPagingEligible → pagingBytesEligible) since byte-budget paging is general infrastructure, not RC-specific.
- Remove RC gates (EnableResourceControl, ResourceGroupName) from pagingBytesEligible; only TiKV + DAG checks remain.
- Remove unused MinPagingSizeBytes/MaxPagingSizeBytes constants.
- Change DefPagingSizeBytes default from 4MB to 0 (disabled, opt-in).
- Move byte-budget paging force-enable before checkStoreBatchCopr to preserve the paging-disables-batch-copr invariant.
- Add tidb_paging_size_bytes to SET_VAR hint allowlist.
- Fix copr_resp_size histogram: use float division to preserve sub-KiB precision, extend buckets to 256MB, correct help text unit to KiB.
- Fix anonymous Paging struct literal in copr_test to include the new PagingSizeBytes field.
- Add PagingSizeBytes to TestContextDetach fixture.
- Add integration test for tidb_paging_size_bytes session variable.

Signed-off-by: JmPotato <github@ipotato.me>
Update the PD client replace directive to pick up commit fda21466b565 which adds Limiter.RefundTokens and fixes the pre-charge token leakage in paging byte budget scenarios. Signed-off-by: JmPotato <github@ipotato.me>
Maintain a time-aware EMA of actual per-RPC MVCC read bytes on copTask and, once enough samples are in, pass the prediction to PD via the new tikvrpc.Request.PredictedReadBytes hint so pre-charge uses the learned estimate instead of the 4 MiB worst-case cap. Falls back to PagingSizeBytes on cold start. Propagated across region/lock retries; no-op for non-paging tasks. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Point to the demo/ema-precharge working branches in ~/work/client-go and ~/work/pd for local build + e2e simulation. Drop this commit before publishing as a PR; replace with pseudo-versions of pushed forks. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Adds tidb_distsql_copr_ema_observation_total with a {state=cold|ready}
label to quantify, in aggregate, how many per-logical-scan copTask EMA
Observe() calls happen while the EMA is still below its readiness
threshold vs after. This complements the PD-side paging pre-charge
source/bytes counters by showing how often the TiDB side could emit a
non-zero PredictedReadBytes hint for downstream pre-charge.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Added one observability metric on the TiDB side for upcoming e2e validation of the per-logical-scan EMA path: `tidb_distsql_copr_ema_observation_total{state=cold|ready}` (see the commit message above).

Paired with the 4 PD-side paging pre-charge metrics (see tikv/pd#10599 comment), this lets the validation report show the cold-start window quantitatively rather than inferring it from QPS alone. Draft status unchanged.
Move ruEMA init from lazy (handleTaskOnce) to eager (copIterator copTask constructors). Guarantees task.ema is non-nil on this path, so callers drop nil-safety in IsReady/Predict and the Observe call site. Trim comments. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Move the RC paging EMA from per-copTask to per-copIterator so every paging RPC under the same kv.Request feeds a single learned estimate. Workers share a pointer to the iterator's EMA; access is now guarded by an internal mutex since multiple workers may Observe/Predict concurrently. Dropped the remain-task ema propagation: remain tasks re-enter through the same worker, which still holds the iterator's EMA, so there is nothing to copy. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
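The shared-estimate scheme can be sketched as below; `sharedEMA`, the fixed smoothing factor, and the two-sample readiness rule are illustrative assumptions rather than the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// sharedEMA sketches the per-copIterator estimate described above: one
// EMA shared by all paging workers under the same kv.Request, guarded by
// a mutex since Observe/Predict may run concurrently.
type sharedEMA struct {
	mu      sync.Mutex
	value   float64
	samples int
}

func (e *sharedEMA) Observe(bytes float64) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.samples == 0 {
		e.value = bytes
	} else {
		const alpha = 0.5 // illustrative fixed smoothing factor
		e.value = alpha*bytes + (1-alpha)*e.value
	}
	e.samples++
}

func (e *sharedEMA) Predict(fallback float64) float64 {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.samples < 2 {
		return fallback // cold start
	}
	return e.value
}

func main() {
	ema := &sharedEMA{}
	var wg sync.WaitGroup
	// Several workers feed the same estimate, as paging workers would.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ema.Observe(1 << 20)
		}()
	}
	wg.Wait()
	fmt.Printf("%.0f\n", ema.Predict(4<<20)) // prints 1048576
}
```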
…opr_ema_send_total
The previous copr_ema_observation_total counter measured EMA readiness at
response-handling time (inside the paging branch), which is the wrong place
to answer "did this RPC's pre-charge dispatch a hint". The send-time decision
(req.PredictedReadBytes) happens earlier and for every cop RPC, while the
response-side counter's denominator is filtered to paging-branch responses
and its cold/ready flag can flip mid-flight due to concurrent workers'
Observe calls on the shared EMA.
Introduce copr_ema_send_total{type=cold|ready}, incremented at hint dispatch
on every outgoing cop RPC. This is the authoritative TiDB-side basis for
pre-charge coverage ratios and aligns with the PD-side M1/(M1+M6a) precharge
coverage metric.
Drop copr_ema_observation_total entirely: once send-side coverage is the
correct metric, the response-side variant has no independent use that isn't
better served by the raw EMA cold/ready sample counts exposed elsewhere.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
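The send-time counting the commit describes can be sketched as follows, with a plain map standing in for the prometheus counter vec and hypothetical `request`/`dispatch` names:

```go
package main

import "fmt"

// emaSendTotal is a map-backed stand-in for the real counter vec: one
// increment per outgoing cop RPC, labeled by whether the shared EMA was
// ready to supply a PredictedReadBytes hint at dispatch time.
var emaSendTotal = map[string]int{}

type request struct{ PredictedReadBytes uint64 }

func dispatch(req *request, emaReady bool, predicted uint64) {
	if emaReady {
		req.PredictedReadBytes = predicted // non-zero hint for pre-charge
		emaSendTotal["ready"]++
	} else {
		// Cold start: leave the hint zero so the pre-charge falls back
		// to the worst-case paging_size_bytes charge.
		emaSendTotal["cold"]++
	}
}

func main() {
	var r1, r2 request
	dispatch(&r1, false, 0)     // first RPC: EMA not ready yet
	dispatch(&r2, true, 655360) // later RPC: learned estimate dispatched
	fmt.Println(emaSendTotal["cold"], emaSendTotal["ready"], r2.PredictedReadBytes)
	// prints: 1 1 655360
}
```

Counting here, rather than at response handling, gives a denominator that covers every cop RPC and cannot flip between cold and ready after the send decision is made.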
What problem does this PR solve?
Issue Number: ref #N/A (draft — issue to be filed)
Problem Summary:

Under Resource Control, TiDB pre-charges RUs for each paging coprocessor
RPC using a fixed `paging_size_bytes` (currently 4 MiB, set by #67504).
This worst-case charge typically overshoots the actual scanned bytes by a
large margin, and under a tight resource tier it stalls concurrent workers
at Phase 1 (token acquisition) even when the cluster has spare capacity.
The result is that large `paging_size_bytes` values collapse QPS on
workloads like a 1500-thread JOIN benchmark, while small values (256 KiB)
look fine.
What changed and how does it work?
Maintain a time-aware EMA of the actual per-RPC MVCC read bytes on
`copTask`, and once enough samples are in (>= 2), pass the prediction to
PD via the new `tikvrpc.Request.PredictedReadBytes` hint so the pre-charge
uses the learned estimate instead of the 4 MiB cap. On cold start and for
non-paging tasks the behavior is unchanged (falls back to `PagingSizeBytes`).

The EMA state lives on `copTask` and is naturally scoped to one logical
scan across its paging rounds; no cross-RPC or cross-process sync is needed.
The prediction is propagated across region/lock retries so learning is not
lost.
Stacked with:

- tikv/client-go#1944: adds the `PredictedReadBytes` field on
  `tikvrpc.Request` and the `RequestInfo` getter
- tikv/pd#10599: consumes the hint in `BeforeKVRequest` / `AfterKVRequest`

Check List
Tests
Side effects
Documentation
Release note
Draft note: E2E validation on the simulation cluster is pending. Once
the client-go + PD companion PRs land (or are pointed to via go.mod replace
for CI), this PR's CI will turn green. This is a draft for early visibility
and discussion.