Skip to content

HIVE-29372: Iceberg: [V3] Fix inconsistencies when vectorizartion is enabled in case of variant shredding.#6245

Merged
ayushtkn merged 1 commit intoapache:masterfrom
ayushtkn:HIVE-29372
Dec 19, 2025
Merged

HIVE-29372: Iceberg: [V3] Fix inconsistencies when vectorizartion is enabled in case of variant shredding.#6245
ayushtkn merged 1 commit intoapache:masterfrom
ayushtkn:HIVE-29372

Conversation

@ayushtkn
Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

As of now Vectorization isn't supported for Shredded Variants, so we fallback to non-vectorized mode in that case

Why are the changes needed?

#6152 (comment)

Does this PR introduce any user-facing change?

Vectorization enabled/disabled doesn't bother.

How was this patch tested?

UT

}

public static boolean shouldUseVariantShredding(Properties tableProps, Schema tableSchema) {
return isVariantShreddingEnabled(tableProps) && hasVariantFields(tableSchema);
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use Maps.fromProperties to avoid code duplication. or introduce a generic signature UnaryOperator<String> tableProps -> tableProps::getProperty

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, pending tests

(1, parse_json('{"a": 1}')),
(2, parse_json('{"b": 2}'));

SELECT
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it cover some specific test-case? with qtest you can't test the physical layout anyways.

Copy link
Copy Markdown
Member Author

@ayushtkn ayushtkn Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

earlier it was giving wrong results:

NULL	2
NULL	NULL

now it is correct, I added the scenario that I tried in #6152 (comment)

I didn't put like all cases to show output is same, just it is correct, the previous original case was throwing exception, this was giving wrong result.

Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if we same issue could be reproduced on existing table tbl_shredded_variant by inserting (1, parse_json('{"name": "John", "age": 30, "active": true}')) with complete schema last

SELECT 
   variant_get(data, '$.name') as name;
   variant_get(data, '$.age', 'int') as age;
FROM tbl_shredded_variant

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, added vectorization support in #6224

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No Denys, it throws the same MALFORMED_VARIANT exception like now, I tried pulling the full one below, added a new record but any query is giving the same exception as in the previous thread.

Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK, works fine for me: 1062dc5#diff-779297369ca87b0f30855401004c6819e0b64fb82167861f08dcba08e2ae3859

Execution mode: vectorized
1	NULL
NULL	2

@sonarqubecloud
Copy link
Copy Markdown

@ayushtkn ayushtkn merged commit 3d58586 into apache:master Dec 19, 2025
2 of 3 checks passed
DanielZhu58 pushed a commit to DanielZhu58/hive that referenced this pull request Jan 12, 2026
nareshpr pushed a commit to nareshpr/hive-1 that referenced this pull request Feb 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants