[regression-test](streaming-job) add cdc cases for source/jdbc timezone and TIMESTAMP/timestamptz pk#63543
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
/review
I found test-stability issues that should be fixed before this regression-only PR is merged.
Critical checkpoint conclusions:
- Goal/test proof: The PR aims to add CDC timezone and timestamp/timestamptz PK regression coverage. The coverage is useful, but two suites can fail for reasons unrelated to the intended behavior.
- Scope/focus: The change is focused on regression tests and expected result files.
- Concurrency/lifecycle/config/storage/transaction compatibility: Not applicable; this PR does not change production concurrency, lifecycle, config, persistence, or storage behavior.
- Parallel code paths: MySQL and PostgreSQL variants are mostly covered in parallel; one timezone-output stability concern applies to both jdbc_servertimezone suites.
- Test coverage/results: Coverage is broad, but test_streaming_postgres_job_source_timezone includes a timetz expectation while current CDC conversion still lacks explicit io.debezium.time.ZonedTime handling, and the jdbc timezone suites derive runtime timezone-dependent output while committing fixed +08:00 .out files.
- Observability/performance: Not applicable for test-only changes beyond existing Awaitility diagnostics.
User focus: No additional user-provided review focus was supplied.
    tstz0 timestamptz(0),
    tstz3 timestamptz(3),
    tstz6 timestamptz(6),
    ttz time with time zone,
This adds time with time zone to a P0 regression suite, but the current CDC converter still does not have a specific branch for Debezium io.debezium.time.ZonedTime (the nearby comment also says the expected values assume an upstream-fix behavior). In the current code path, named schemas not matched by Time, Timestamp, ZonedTimestamp, etc. fall through to dbzObj.toString(), so this test can fail or encode the offset differently from the committed .out (+08 / -05). Please either add/land the actual ZonedTime conversion before enabling this assertion, or remove ttz from this P0 case until the behavior is implemented.
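As a rough illustration of the missing branch described above, the following is a minimal Java sketch, not the converter in this repository. It assumes that Debezium delivers time with time zone values as ISO-8601 offset-time strings under the schema name io.debezium.time.ZonedTime, and that the desired output is the same instant re-expressed in a target offset. The class name ZonedTimeSketch, the method convert, and the choice of output format are hypothetical.

```java
import java.time.OffsetTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class ZonedTimeSketch {
    // Schema name Debezium attaches to TIME WITH TIME ZONE values.
    static final String ZONED_TIME_SCHEMA = "io.debezium.time.ZonedTime";

    /**
     * Hypothetical converter: handles ZonedTime explicitly and falls back to
     * toString() for any other schema, mirroring the fall-through the review describes.
     */
    static String convert(String schemaName, Object dbzObj, ZoneOffset target) {
        if (ZONED_TIME_SCHEMA.equals(schemaName) && dbzObj instanceof String) {
            // Assumption: the value is an ISO-8601 offset time such as "10:00:00Z".
            OffsetTime parsed = OffsetTime.parse((String) dbzObj);
            // Keep the instant, change only the offset it is rendered in.
            return parsed.withOffsetSameInstant(target)
                         .format(DateTimeFormatter.ISO_OFFSET_TIME);
        }
        return String.valueOf(dbzObj);
    }

    public static void main(String[] args) {
        System.out.println(convert(ZONED_TIME_SCHEMA, "10:00:00Z", ZoneOffset.ofHours(8)));
        System.out.println(convert(ZONED_TIME_SCHEMA, "10:00:00Z", ZoneOffset.ofHours(-5)));
    }
}
```

Whether the real converter should re-offset the value or pass it through unchanged is a design decision for the PR author; the sketch only shows that an explicit branch makes the rendered offset predictable, which a committed .out file needs.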
    String driver_url = "https://${bucket}.${s3_endpoint}/regression/jdbc_driver/mysql-connector-j-8.4.0.jar"

    // Read Doris session tz so the cdc job aligns with it.
    def dorisTz = (sql "select @@time_zone")[0][0]
The test reads @@time_zone at runtime and uses it as serverTimezone, but the committed .out is fixed for Doris +08:00 (ts0 is expected as 2024-06-15T18:00). On any runner whose Doris session timezone is not +08:00, the CDC job will correctly render a different wall clock and this regression will fail. The PostgreSQL jdbc_servertimezone case has the same pattern. Please make the suite deterministic, for example by setting the Doris session timezone to the value used by the .out before reading it, or by using a fixed timezone in both the URL and expected output.
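The non-determinism can be reproduced outside Doris with a small Java sketch. It assumes the source row holds the instant 2024-06-15T10:00:00Z, which is a guess chosen to be consistent with the expected 2024-06-15T18:00 at +08:00; the class WallClockSketch and the method render are hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

final class WallClockSketch {
    /** Renders an instant as the wall-clock time seen in the given session timezone. */
    static LocalDateTime render(Instant instant, String sessionTz) {
        return LocalDateTime.ofInstant(instant, ZoneId.of(sessionTz));
    }

    public static void main(String[] args) {
        // Assumed source instant; not taken from the PR.
        Instant source = Instant.parse("2024-06-15T10:00:00Z");
        // The same row yields a different wall clock per session timezone.
        for (String tz : new String[] {"+08:00", "UTC", "-05:00"}) {
            System.out.println(tz + " -> " + render(source, tz));
        }
    }
}
```

Only the +08:00 rendering matches the committed .out, so pinning the session timezone before reading @@time_zone, or hard-coding one timezone in both the URL and the expected output, removes the dependency on the runner's default.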
What problem does this PR solve?
Add CDC streaming-job regression coverage for timezone behavior and TIMESTAMP / timestamptz chunk-key paths that the existing suites do not exercise.
Cases added
- *_source_timezone (mysql + pg): … timetz); timetz column kept as a regression guard for the upstream JVM-tz handling
- *_jdbc_servertimezone (mysql + pg): jdbc_url's serverTimezone / timezone with Doris session time_zone (read at runtime so it works on any default tz)
- *_timestamp_pk (mysql + pg): LocalDateTime / OffsetDateTime chunk-bound restore in AbstractCdcSourceReader.convertBound
Release note
Add CDC streaming-job regression suites for source/jdbc timezone and TIMESTAMP / timestamptz chunk-key.
Check List (For Author)
Test
Behavior changed:
Does this need documentation?