[fix](fe) Reject non-null defaults for complex columns #63528
Open
mrhhsg wants to merge 3 commits into
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Complex columns could be created or added with non-null string defaults such as ARRAY DEFAULT '[]'. Such a default is stored as a literal string rather than a typed complex value, which makes CREATE TABLE, ALTER TABLE ADD COLUMN, and partial-update default behavior inconsistent. Reject non-null defaults for ARRAY, MAP, STRUCT, JSON, and VARIANT columns during column definition validation, while still allowing columns with no default or an explicit DEFAULT NULL. Update existing regression DDLs and expected outputs that previously relied on empty array defaults.
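The rule described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual `ColumnDefinition` code from the patch: the class `ComplexDefaultCheck`, the `ColType` enum, the method names, and the error message are all hypothetical and chosen only for this example.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; not the Doris ColumnDefinition implementation.
class ComplexDefaultCheck {
    // Simplified stand-in for column types; only the complex ones matter here.
    enum ColType { INT, VARCHAR, ARRAY, MAP, STRUCT, JSON, VARIANT }

    // The complex types named in the PR description.
    private static final Set<ColType> COMPLEX = EnumSet.of(
            ColType.ARRAY, ColType.MAP, ColType.STRUCT, ColType.JSON, ColType.VARIANT);

    /**
     * hasDefaultClause == false                 : no DEFAULT clause at all (allowed)
     * hasDefaultClause == true, literal == null : explicit DEFAULT NULL (allowed)
     * hasDefaultClause == true, literal != null : non-null default (rejected for complex types)
     */
    static void validateDefault(String columnName, ColType type,
                                boolean hasDefaultClause, String literal) {
        if (COMPLEX.contains(type) && hasDefaultClause && literal != null) {
            // Assumed wording; the real error message may differ.
            throw new IllegalArgumentException(
                    "Column " + columnName + " of type " + type
                            + " only supports DEFAULT NULL, got: " + literal);
        }
    }

    // Convenience wrapper that turns the exception into a boolean.
    static boolean isAccepted(String columnName, ColType type,
                              boolean hasDefaultClause, String literal) {
        try {
            validateDefault(columnName, type, hasDefaultClause, literal);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("tags", ColType.ARRAY, false, null)); // no default
        System.out.println(isAccepted("tags", ColType.ARRAY, true, null));  // DEFAULT NULL
        System.out.println(isAccepted("tags", ColType.ARRAY, true, "[]"));  // DEFAULT '[]'
        System.out.println(isAccepted("cnt", ColType.INT, true, "0"));      // scalar default
    }
}
```

Under these assumptions, only the `DEFAULT '[]'` case is rejected. Scalar columns are unaffected, and a complex column with no default or with `DEFAULT NULL` still passes.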
### Release note
Reject non-null default literals for complex type columns.
### Check List (For Author)
- Test:
    - Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.commands.info.ColumnDefinitionTest
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d datatype_p0/complex_types -s test_complex_default_value
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d query_p0/expression -s test_default_expr
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d schema_change_p0 -s test_alter_table_column
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d unique_with_mow_p0/partial_update -s test_primary_key_partial_update_default_value
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d unique_with_mow_p0/partial_update -s test_primary_key_partial_update_complex_type
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d unique_with_mow_c_p0/partial_update -s test_primary_key_partial_update_complex_type
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d query_p0/sql_functions/table_function -s explode
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d schema_change_p0 -s test_dup_schema_key_add
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d schema_change_p0 -s test_unique_schema_key_change_add
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d schema_change_p0 -s test_agg_schema_key_add
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d schema_change_p0 -s test_modify_struct
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d mysql_fulltext_array_contains -s load
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d nereids_function_p0 -s load
- Behavior changed: Yes. Non-null default literals for complex columns are rejected.
- Does this need documentation: No
Member
Author
/review
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
No blocking issues found.
Checkpoint conclusions:
- Goal and tests: The PR rejects non-null defaults for ARRAY, MAP, STRUCT, JSON, and VARIANT columns while preserving omitted defaults and explicit DEFAULT NULL. The new unit test and regression case cover create-table failures, allowed null/no-default cases, and ALTER ADD COLUMN rejection; existing expected outputs were updated for the new NULL default behavior.
- Scope and clarity: The main validation change is small and focused in ColumnDefinition, with broad test DDL updates required by the behavior change.
- Concurrency and lifecycle: Not applicable; this is DDL validation and test data update logic with no new shared mutable state, locks, or lifecycle-sensitive objects.
- Configuration and compatibility: No new configuration items. Existing persisted tables/defaults are not migrated here; the change affects new DDL validation only.
- Parallel paths: Create table and schema-change ADD COLUMN paths both flow through ColumnDefinition validation. Legacy ColumnDef translation also delegates into ColumnDefinition, so the old and Nereids paths appear covered.
- Data correctness: Rejecting string literals for complex defaults avoids storing untyped complex defaults and aligns DEFAULT() / partial-update behavior with nullable complex columns.
- Test coverage: Coverage is reasonable for the changed behavior. I did not run the test suite locally in this review runner; I reviewed the author-listed test coverage and checked the patch for whitespace issues with git diff --check.
- Observability: Not applicable; this is user-facing analysis validation with explicit error messages.
- User focus: No additional user-provided review focus was present.
Member
Author
run buildall
Contributor
TPC-H: Total hot run time: 31359 ms
Contributor
TPC-DS: Total hot run time: 169370 ms
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Complex array columns no longer use non-null empty array defaults. Some Cloud P0 expected outputs still assumed omitted array columns were filled with empty arrays, and the array-function expected output still assumed the nullable boolean-array load table kept missing values. Refresh the affected stream-load, HTTP-stream, and array-function expected outputs to match the new DEFAULT NULL/no-default behavior and explicitly loaded boolean-array data.
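To illustrate why the expected outputs change, here is a toy sketch of filling columns that a load omits. It is not Doris load code: `OmittedColumnFill`, `fill`, and the column names `k1` and `tags` are hypothetical, and the old default is modeled as the literal string `"[]"`, following the PR description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the Doris stream-load implementation.
class OmittedColumnFill {
    /**
     * Builds an output row in schema order. A column present in the loaded
     * data keeps its value; an omitted column takes its declared default,
     * which is null when the column has no default or DEFAULT NULL.
     */
    static Map<String, Object> fill(Map<String, Object> loaded,
                                    Map<String, Object> declaredDefaults) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Map.Entry<String, Object> col : declaredDefaults.entrySet()) {
            String name = col.getKey();
            row.put(name, loaded.containsKey(name) ? loaded.get(name) : col.getValue());
        }
        return row;
    }

    public static void main(String[] args) {
        // The load supplies only k1; the array column "tags" is omitted.
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("k1", 1);

        // Old DDL (now rejected): tags ARRAY ... DEFAULT '[]'
        Map<String, Object> oldDefaults = new LinkedHashMap<>();
        oldDefaults.put("k1", null);
        oldDefaults.put("tags", "[]");

        // New DDL: tags declared with no default (or DEFAULT NULL)
        Map<String, Object> newDefaults = new LinkedHashMap<>();
        newDefaults.put("k1", null);
        newDefaults.put("tags", null);

        System.out.println(fill(loaded, oldDefaults));
        System.out.println(fill(loaded, newDefaults));
    }
}
```

In this model the omitted `tags` value is an empty-array literal under the old DDL and NULL under the new one, which is the kind of difference the refreshed expected outputs record.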
### Release note
None
### Check List (For Author)
- Test:
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d load_p0/stream_load -s test_stream_load_properties
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d load_p0/http_stream -s test_http_stream_properties
    - Regression test: ./run-regression-test.sh --conf output/local-regression/regression-conf-46001.groovy --run -d nereids_function_p0/scalar_function -s nereids_scalar_fn_Array1
- Behavior changed: No
- Does this need documentation: No
Member
Author
run cloud_p0
Contributor
Possible file(s) that should be tracked in LFS detected: 🚨 The following file(s) exceeds the file size limit:
Consider using
Member
Author
run compile
Contributor
FE Regression Coverage Report
Increment line coverage