Skip to content

[refactor](be) Remove redundant remaining conjunct roots#63525

Draft
mrhhsg wants to merge 1 commit into
apache:masterfrom
mrhhsg:codex/remove-remaining-conjunct-roots-refactor
Draft

[refactor](be) Remove redundant remaining conjunct roots#63525
mrhhsg wants to merge 1 commit into
apache:masterfrom
mrhhsg:codex/remove-remaining-conjunct-roots-refactor

Conversation

@mrhhsg
Copy link
Copy Markdown
Member

@mrhhsg mrhhsg commented May 22, 2026

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: SegmentIterator kept a separate remaining_conjunct_roots list in addition to common_expr_ctxs_push_down. Both represented the same residual common expression state, so index pruning paths had to keep them synchronized and could drift when an expression was removed after index evaluation. This change removes the duplicated remaining_conjunct_roots plumbing and uses common_expr_ctxs_push_down as the single source of residual common expression state for condition cache decisions, lazy materialization column extraction, and expression execution checks.

Release note

None

Check List (For Author)

  • Test: Build
    • ./build.sh --be
    • build-support/clang-format.sh
    • git diff --check
  • Behavior changed: No
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: SegmentIterator kept a separate remaining_conjunct_roots list in addition to common_expr_ctxs_push_down. Both represented the same residual common expression state, so index pruning paths had to keep them synchronized and could drift when an expression was removed after index evaluation. This change removes the duplicated remaining_conjunct_roots plumbing and uses common_expr_ctxs_push_down as the single source of residual common expression state for condition cache decisions, lazy materialization column extraction, and expression execution checks.

### Release note

None

### Check List (For Author)

- Test: Build
    - ./build.sh --be
    - build-support/clang-format.sh
    - git diff --check
- Behavior changed: No
- Does this need documentation: No
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 22, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: no blocking findings identified. The refactor consistently removes the duplicated remaining_conjunct_roots plumbing and uses common_expr_ctxs_push_down as the single residual common-expression state across scan params, tablet/rowset reader context, SegmentIterator condition-cache gating, index-pruned expression removal, lazy-materialization column extraction, and expression execution checks.

Critical checkpoint conclusions:

  • Goal/test: The code accomplishes the stated refactor goal; no leftover references to the removed fields were found. The PR reports BE build/format/diff-check, but I did not rerun a full build in this review.
  • Scope: The change is small and focused on one data path and removes duplicated state rather than adding new behavior.
  • Concurrency/lifecycle: No new threads, shared mutable cross-thread state, locks, or special lifecycle/static initialization concerns are introduced.
  • Configuration/compatibility/protocol: No new config, persisted format, wire protocol, or rolling-upgrade compatibility impact found.
  • Parallel paths: The scan -> tablet reader -> rowset reader -> segment iterator forwarding path is updated consistently for the removed field.
  • Conditional logic: The remaining conditional checks now use common_expr_ctxs_push_down directly; this matches the post-index residual-expression state used for execution.
  • Test coverage/results: No test files were added or changed in the actual PR diff. Given this is a narrow refactor, build coverage is plausible, but an execution/regression case around common expr pushdown with inverted/ANN index pruning would further reduce risk.
  • Observability: No new user-visible path or operational state requiring additional logging/metrics was introduced.
  • Transactions/persistence/data writes: Not applicable; this is read-path scan filtering state only.
  • FE/BE variables: No FE-BE thrift or transmitted variable changes found.
  • Performance: The change removes duplicate vector state and synchronization work; no obvious new hot-path overhead was found.

User focus points: none were provided.

@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 22, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31259 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a6d98094ef484315eb564a9b9d8e1569426ff3a4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17615	3852	3830	3830
q2	q3	10741	1375	827	827
q4	4687	483	344	344
q5	7691	2338	2145	2145
q6	242	177	136	136
q7	923	790	635	635
q8	9378	1629	1541	1541
q9	5225	4934	4980	4934
q10	6393	2082	1797	1797
q11	440	284	250	250
q12	628	429	294	294
q13	18149	3623	2772	2772
q14	264	259	237	237
q15	q16	822	786	702	702
q17	1020	867	1021	867
q18	6942	5737	5553	5553
q19	1339	1250	1075	1075
q20	535	511	327	327
q21	6175	2827	2672	2672
q22	455	384	321	321
Total cold run time: 99664 ms
Total hot run time: 31259 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4856	4518	4503	4503
q2	q3	4832	5261	4627	4627
q4	2121	2189	1415	1415
q5	4787	4823	4608	4608
q6	228	170	130	130
q7	1817	1756	1604	1604
q8	2416	2090	2054	2054
q9	7747	7575	7185	7185
q10	4527	4430	3971	3971
q11	525	373	352	352
q12	707	714	505	505
q13	3064	3438	2742	2742
q14	276	285	246	246
q15	q16	680	688	610	610
q17	1278	1239	1239	1239
q18	7285	6946	6748	6748
q19	1128	1126	1104	1104
q20	2235	2205	1918	1918
q21	5298	4595	4532	4532
q22	522	448	430	430
Total cold run time: 56329 ms
Total hot run time: 50523 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170104 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a6d98094ef484315eb564a9b9d8e1569426ff3a4, data reload: false

query5	4325	670	529	529
query6	327	229	201	201
query7	4234	565	330	330
query8	351	240	219	219
query9	8830	4066	4020	4020
query10	455	348	300	300
query11	5751	2416	2162	2162
query12	191	132	125	125
query13	1266	612	471	471
query14	6022	5329	5014	5014
query14_1	4391	4351	4350	4350
query15	212	202	183	183
query16	1020	460	426	426
query17	1020	731	593	593
query18	2452	478	361	361
query19	212	201	162	162
query20	141	130	127	127
query21	214	141	120	120
query22	13544	13691	13344	13344
query23	17297	16297	16034	16034
query23_1	16153	16168	16131	16131
query24	7450	1767	1312	1312
query24_1	1283	1311	1321	1311
query25	598	496	449	449
query26	1327	319	175	175
query27	2712	574	350	350
query28	4456	1950	1986	1950
query29	1000	654	521	521
query30	312	241	203	203
query31	1138	1070	947	947
query32	87	79	80	79
query33	572	365	317	317
query34	1212	1125	657	657
query35	785	801	697	697
query36	1336	1343	1151	1151
query37	156	109	92	92
query38	3244	3168	3065	3065
query39	936	931	900	900
query39_1	876	872	885	872
query40	232	154	134	134
query41	75	71	69	69
query42	114	114	113	113
query43	326	326	289	289
query44	
query45	219	211	199	199
query46	1054	1153	730	730
query47	2331	2328	2179	2179
query48	420	425	321	321
query49	663	505	404	404
query50	1006	358	253	253
query51	4390	4343	4317	4317
query52	106	108	99	99
query53	253	291	203	203
query54	332	287	277	277
query55	98	94	89	89
query56	331	339	315	315
query57	1422	1382	1324	1324
query58	334	299	290	290
query59	1561	1660	1389	1389
query60	343	390	318	318
query61	159	152	157	152
query62	672	612	559	559
query63	242	203	208	203
query64	2429	789	648	648
query65	
query66	1723	471	348	348
query67	29474	29921	29866	29866
query68	
query69	484	352	307	307
query70	1002	1000	1024	1000
query71	307	272	273	272
query72	2982	2729	2463	2463
query73	855	757	416	416
query74	5100	4951	4749	4749
query75	2683	2578	2238	2238
query76	2263	1148	770	770
query77	418	422	340	340
query78	12239	12166	11626	11626
query79	1466	1070	737	737
query80	656	532	455	455
query81	456	273	242	242
query82	1389	159	122	122
query83	357	281	246	246
query84	303	143	115	115
query85	899	558	452	452
query86	390	321	318	318
query87	3447	3359	3228	3228
query88	3544	2695	2690	2690
query89	440	395	334	334
query90	1995	206	196	196
query91	195	170	144	144
query92	83	81	74	74
query93	1512	1467	935	935
query94	538	341	310	310
query95	658	464	348	348
query96	1025	846	346	346
query97	2715	2675	2568	2568
query98	239	229	253	229
query99	1102	1110	979	979
Total cold run time: 252932 ms
Total hot run time: 170104 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.70% (20768/38677)
Line Coverage 37.28% (196751/527749)
Region Coverage 33.60% (154212/458903)
Branch Coverage 34.61% (67172/194088)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (7/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.95% (27258/37883)
Line Coverage 55.26% (290909/526412)
Region Coverage 52.40% (242791/463339)
Branch Coverage 53.60% (104424/194829)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants