
[fix](variant) preserve TIMESTAMPTZ values in sparse path#63522

Open
csun5285 wants to merge 1 commit into apache:master from csun5285:fix/DORIS-25915-variant-timestamptz-sparse

Conversation

@csun5285
Contributor

@csun5285 csun5285 commented May 22, 2026

Add the missing write_one_cell_to_binary override mirroring DataTypeDateTimeV2SerDe so the writer also emits the scale byte. Reader is already correct.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

DataTypeTimeStampTzSerDe inherited DataTypeNumberSerDe's default
write_one_cell_to_binary, which emits [type:1][value:8]. The matching
reader branch in DataTypeNumberSerDe<TYPE_TIMESTAMPTZ>::deserialize_binary_to_*
skips a scale byte before reading the value, expecting [type:1][scale:1][value:8].
The 1-byte layout mismatch shifted every read by one byte, leaving only the
timezone-offset bits intact, so CAST(var['ts'] AS string) on a typed variant
path that fell back to the sparse path returned just "+08:00" (DORIS-25915).
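
To make the one-byte shift concrete, here is a minimal standalone sketch of the mismatch. It is not the Doris code: the tag value `kTimestampTzTag`, the helper names, and the idea of two cells sitting back to back in one buffer are all assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical tag value, for illustration only (not the real Doris tag).
constexpr uint8_t kTimestampTzTag = 0x2A;

// Old writer layout described above: [type:1][value:8], no scale byte.
void append_cell_without_scale(std::vector<uint8_t>& out, uint64_t value) {
    out.push_back(kTimestampTzTag);
    const size_t pos = out.size();
    out.resize(pos + sizeof(value));
    std::memcpy(out.data() + pos, &value, sizeof(value));
}

// Reader layout described above: skip the type byte, skip a scale byte,
// then read 8 bytes. Returns false if fewer than 10 bytes remain.
bool read_cell_expecting_scale(const std::vector<uint8_t>& buf, size_t offset,
                               uint64_t* value) {
    constexpr size_t kExpected = 1 + 1 + sizeof(uint64_t);
    if (offset > buf.size() || buf.size() - offset < kExpected) return false;
    std::memcpy(value, buf.data() + offset + 2, sizeof(uint64_t));
    return true;
}

// Writes two cells with the old layout and decodes the first one with the
// reader. The decoded 8 bytes are the last 7 bytes of `first` followed by
// the type byte of the next cell, so the result generally differs from `first`.
uint64_t misread_first_cell(uint64_t first, uint64_t second) {
    std::vector<uint8_t> buf;
    append_cell_without_scale(buf, first);
    append_cell_without_scale(buf, second);
    uint64_t decoded = 0;
    const bool ok = read_cell_expecting_scale(buf, 0, &decoded);
    assert(ok);
    (void)ok;  // keep release builds warning-free
    return decoded;
}
```

Calling `misread_first_cell(0x0102030405060708ULL, 7)` returns a value other than the one written, which is the kind of corruption the sparse path produced.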

Add the missing write_one_cell_to_binary override mirroring
DataTypeDateTimeV2SerDe so the writer also emits the scale byte. Reader is
already correct.
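
As a rough sketch of what the fixed writer does, assuming a serde object that carries its own scale: the struct name, method name, and tag value below are hypothetical and do not reproduce the real `write_one_cell_to_binary` signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical tag value, for illustration only (not the real Doris tag).
constexpr uint8_t kTimestampTzTag = 0x2A;

// Hypothetical stand-in for a serde that knows the scale of its column.
struct TimestampTzSerDeSketch {
    uint8_t scale;  // fractional-second digits, e.g. 0, 3 or 6

    // Appends one cell as [type:1][scale:1][value:8], matching what the
    // reader described above expects.
    void write_one_cell(uint64_t value, std::vector<uint8_t>& out) const {
        out.push_back(kTimestampTzTag);
        out.push_back(scale);
        const size_t pos = out.size();
        out.resize(pos + sizeof(value));
        std::memcpy(out.data() + pos, &value, sizeof(value));
    }
};
```

With the scale byte in place, every cell is 10 bytes and the 8-byte value starts at offset 2, where the reader looks for it.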

Tests:
- regression-test/suites/variant_p0/test_variant_timestamptz_sparse.groovy
  reproduces the Jira case (more typed paths than variant_max_subcolumns_count,
  with variant_enable_typed_paths_to_sparse=true) and asserts the read value
  contains the date portion.
- BE UT data_type_serde_timestamptz_test.cpp adds binary_roundtrip covering
  scale=0/3/6, checking the 10-byte layout and roundtrip via both
  DataTypeSerDe::deserialize_binary_to_column and ::deserialize_binary_to_field.
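
The shape of the binary_roundtrip check can be approximated as below. This is a simplified model, not the actual unit test: it uses one hypothetical decoder in place of the two Doris deserialize entry points, and the tag value and helper names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint8_t kTimestampTzTag = 0x2A;               // hypothetical tag value
constexpr size_t kCellSize = 1 + 1 + sizeof(uint64_t);  // type + scale + value

struct DecodedCell {
    uint8_t scale;
    uint64_t value;
};

// Encodes one cell as [type:1][scale:1][value:8].
std::vector<uint8_t> encode_cell(uint8_t scale, uint64_t value) {
    std::vector<uint8_t> buf(kCellSize);
    buf[0] = kTimestampTzTag;
    buf[1] = scale;
    std::memcpy(buf.data() + 2, &value, sizeof(value));
    return buf;
}

// Decodes one cell; rejects buffers with the wrong size or type byte.
bool decode_cell(const std::vector<uint8_t>& buf, DecodedCell* out) {
    if (buf.size() != kCellSize || buf[0] != kTimestampTzTag) return false;
    out->scale = buf[1];
    std::memcpy(&out->value, buf.data() + 2, sizeof(out->value));
    return true;
}

// Returns true when each scale in {0, 3, 6} produces a 10-byte cell that
// decodes back to the same scale and value.
bool roundtrip_all_scales(uint64_t value) {
    const uint8_t scales[] = {0, 3, 6};
    for (uint8_t scale : scales) {
        const std::vector<uint8_t> buf = encode_cell(scale, value);
        DecodedCell cell{};
        if (buf.size() != 10) return false;
        if (!decode_cell(buf, &cell)) return false;
        if (cell.scale != scale || cell.value != value) return false;
    }
    return true;
}
```

Checking the encoded size as well as the decoded value matters here, because the original bug was a layout error that a value-only comparison against a matching but wrong writer and reader pair would not catch.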

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@csun5285
Contributor Author

run buildall

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

TPC-H: Total hot run time: 31308 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 807ea93d508aa0cc3842a48ffe84893a4a26fab5, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17658	3864	3868	3864
q2	
q3	10787	1393	810	810
q4	4690	486	351	351
q5	7610	2292	2110	2110
q6	382	179	141	141
q7	932	787	646	646
q8	9574	1696	1656	1656
q9	7040	4912	4951	4912
q10	6446	2098	1814	1814
q11	432	268	243	243
q12	638	439	306	306
q13	18123	3487	2741	2741
q14	257	252	229	229
q15	
q16	814	781	707	707
q17	1008	896	924	896
q18	6897	5686	5463	5463
q19	1174	1240	1241	1240
q20	553	424	291	291
q21	5898	2748	2565	2565
q22	454	374	323	323
Total cold run time: 101367 ms
Total hot run time: 31308 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4601	4526	4664	4526
q2	
q3	4827	5117	4682	4682
q4	2186	2204	1451	1451
q5	4877	4662	4647	4647
q6	227	184	132	132
q7	1884	1730	1510	1510
q8	2296	1915	1915	1915
q9	7259	7309	7174	7174
q10	4499	4405	4008	4008
q11	532	383	357	357
q12	712	732	514	514
q13	2995	3351	2832	2832
q14	284	288	248	248
q15	
q16	673	697	604	604
q17	1311	1252	1252	1252
q18	7404	6942	6870	6870
q19	1121	1113	1107	1107
q20	2224	2207	1920	1920
q21	5374	4647	4520	4520
q22	526	475	434	434
Total cold run time: 55812 ms
Total hot run time: 50703 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169416 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 807ea93d508aa0cc3842a48ffe84893a4a26fab5, data reload: false

query5	4333	656	515	515
query6	328	217	201	201
query7	4226	589	295	295
query8	328	234	221	221
query9	8853	3964	3975	3964
query10	451	345	300	300
query11	5793	2381	2181	2181
query12	181	126	122	122
query13	1277	598	416	416
query14	5968	5360	5028	5028
query14_1	4342	4340	4314	4314
query15	216	206	185	185
query16	1041	459	428	428
query17	935	739	590	590
query18	2437	494	374	374
query19	233	215	169	169
query20	136	132	132	132
query21	219	139	121	121
query22	13627	13558	13335	13335
query23	17281	16301	16029	16029
query23_1	16115	16277	16234	16234
query24	7552	1789	1308	1308
query24_1	1308	1307	1322	1307
query25	582	505	443	443
query26	1318	314	172	172
query27	2743	556	341	341
query28	4540	1958	1954	1954
query29	991	640	522	522
query30	307	244	204	204
query31	1124	1062	933	933
query32	90	78	76	76
query33	562	362	310	310
query34	1209	1171	659	659
query35	760	784	675	675
query36	1357	1338	1271	1271
query37	156	113	91	91
query38	3206	3148	3037	3037
query39	941	921	898	898
query39_1	885	882	868	868
query40	233	148	129	129
query41	74	70	67	67
query42	114	121	114	114
query43	326	323	287	287
query44	
query45	214	204	200	200
query46	1062	1220	734	734
query47	2315	2377	2226	2226
query48	412	402	314	314
query49	651	530	376	376
query50	977	338	250	250
query51	4379	4328	4199	4199
query52	102	103	93	93
query53	252	272	204	204
query54	318	276	263	263
query55	91	86	86	86
query56	286	334	295	295
query57	1439	1407	1330	1330
query58	288	270	267	267
query59	1558	1672	1412	1412
query60	320	311	318	311
query61	165	161	164	161
query62	666	632	551	551
query63	253	198	204	198
query64	2415	800	640	640
query65	
query66	1718	487	355	355
query67	30076	29951	29843	29843
query68	
query69	472	342	308	308
query70	1032	999	977	977
query71	301	271	269	269
query72	3041	2780	2432	2432
query73	819	763	419	419
query74	5061	4949	4732	4732
query75	2678	2598	2264	2264
query76	2307	1115	757	757
query77	397	423	341	341
query78	12298	12068	11637	11637
query79	1434	1083	729	729
query80	668	550	457	457
query81	458	275	242	242
query82	1367	156	122	122
query83	355	274	247	247
query84	312	140	113	113
query85	900	551	446	446
query86	401	334	305	305
query87	3413	3428	3221	3221
query88	3520	2648	2657	2648
query89	433	387	334	334
query90	1987	187	182	182
query91	178	172	141	141
query92	77	73	72	72
query93	1495	1429	818	818
query94	557	357	305	305
query95	662	472	357	357
query96	1047	795	333	333
query97	2703	2693	2572	2572
query98	231	225	231	225
query99	1137	1091	986	986
Total cold run time: 253396 ms
Total hot run time: 169416 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (13/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.66% (20756/38678)
Line Coverage 37.26% (196639/527778)
Region Coverage 33.59% (154157/458908)
Branch Coverage 34.58% (67123/194094)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (13/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.75% (27940/37884)
Line Coverage 57.64% (303425/526432)
Region Coverage 54.72% (253524/463337)
Branch Coverage 56.31% (109709/194821)

