HIVE-29262: Incorrect column ordering output in case of different ordering of mutual columns in query & window function in vectorized PTF #6512
Open: tanishq-chugh wants to merge 5 commits into apache:master from tanishq-chugh:HIVE-29262
+14,534
−0
Commits:
636bbfc HIVE-29262: Incorrect column ordering output in case of different ord… (tanishq-chugh)
f859a18 Fix failing test cases (tanishq-chugh)
2cab7be Update reorderPartitionColumnsToMatchOutputOrder not reorder partitio… (tanishq-chugh)
4c9733f Address Sonarqube - 1 (tanishq-chugh)
a5e3bfc Address Review comments (tanishq-chugh)
155 changes: 155 additions & 0 deletions
ql/src/test/queries/clientpositive/vector_ptf_cols_order.q
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,155 @@ | ||
| SET hive.vectorized.execution.enabled=true; | ||
| create table web_sales_txt | ||
| ( | ||
| ws_sold_date_sk int, | ||
| ws_sold_time_sk int, | ||
| ws_ship_date_sk int, | ||
| ws_item_sk int, | ||
| ws_bill_customer_sk int, | ||
| ws_bill_cdemo_sk int, | ||
| ws_bill_hdemo_sk int, | ||
| ws_bill_addr_sk int, | ||
| ws_ship_customer_sk int, | ||
| ws_ship_cdemo_sk int, | ||
| ws_ship_hdemo_sk int, | ||
| ws_ship_addr_sk int, | ||
| ws_web_page_sk int, | ||
| ws_web_site_sk int, | ||
| ws_ship_mode_sk int, | ||
| ws_warehouse_sk int, | ||
| ws_promo_sk int, | ||
| ws_order_number int, | ||
| ws_quantity int, | ||
| ws_wholesale_cost decimal(7,2), | ||
| ws_list_price decimal(7,2), | ||
| ws_sales_price decimal(7,2), | ||
| ws_ext_discount_amt decimal(7,2), | ||
| ws_ext_sales_price decimal(7,2), | ||
| ws_ext_wholesale_cost decimal(7,2), | ||
| ws_ext_list_price decimal(7,2), | ||
| ws_ext_tax decimal(7,2), | ||
| ws_coupon_amt decimal(7,2), | ||
| ws_ext_ship_cost decimal(7,2), | ||
| ws_net_paid decimal(7,2), | ||
| ws_net_paid_inc_tax decimal(7,2), | ||
| ws_net_paid_inc_ship decimal(7,2), | ||
| ws_net_paid_inc_ship_tax decimal(7,2), | ||
| ws_net_profit decimal(7,2) | ||
| ) | ||
| row format delimited fields terminated by '|' | ||
| stored as textfile; | ||
| | ||
| LOAD DATA LOCAL INPATH '../../data/files/web_sales_2k' OVERWRITE INTO TABLE web_sales_txt; | ||
| | ||
| -- Baseline query to verify data load. | ||
| select ws_bill_customer_sk, ws_item_sk from web_sales_txt; | ||
| | ||
| -- Vectorized LAG: verify output column order follows the SELECT list | ||
| -- (ws_bill_customer_sk, ws_item_sk), not the PARTITION BY order, when window | ||
| -- functions use different PARTITION BY column orderings. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| LAG(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS prev_sales_price, | ||
| ws_sales_price - LAG(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS sales_price_diff | ||
| FROM | ||
| web_sales_txt; | ||
| | ||
| -- Vectorized LEAD: same column-ordering check using LEAD window functions. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| LEAD(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS next_sales_price, | ||
| LEAD(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) - ws_sales_price AS sales_price_diff | ||
| FROM | ||
| web_sales_txt; | ||
| | ||
| -- Vectorized FIRST_VALUE/LAST_VALUE: same column-ordering check using FIRST_VALUE and | ||
| -- LAST_VALUE window functions with differing PARTITION BY orderings. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| FIRST_VALUE(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS first_price, | ||
| LAST_VALUE(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | ||
| ) AS last_price | ||
| FROM | ||
| web_sales_txt; | ||
| | ||
| SET hive.vectorized.execution.enabled=false; | ||
| | ||
| -- Non-vectorized LAG: reference results for validating vectorized LAG query execution above. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| LAG(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS prev_sales_price, | ||
| ws_sales_price - LAG(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS sales_price_diff | ||
| FROM | ||
| web_sales_txt; | ||
| | ||
| -- Non-vectorized LEAD: reference results for validating vectorized LEAD query execution above. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| LEAD(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS next_sales_price, | ||
| LEAD(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) - ws_sales_price AS sales_price_diff | ||
| FROM | ||
| web_sales_txt; | ||
| | ||
| -- Non-vectorized FIRST_VALUE/LAST_VALUE: reference results for validating vectorized | ||
| -- FIRST_VALUE/LAST_VALUE query execution above. | ||
| SELECT | ||
| ws_bill_customer_sk, | ||
| ws_item_sk, | ||
| ws_sold_date_sk, | ||
| ws_sales_price, | ||
| FIRST_VALUE(ws_sales_price) OVER ( | ||
| PARTITION BY ws_item_sk, ws_bill_customer_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ) AS first_price, | ||
| LAST_VALUE(ws_sales_price) OVER ( | ||
| PARTITION BY ws_bill_customer_sk, ws_item_sk | ||
| ORDER BY ws_sold_date_sk | ||
| ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | ||
| ) AS last_price | ||
| FROM | ||
| web_sales_txt; | ||
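
The queries above depend on two expectations: `PARTITION BY ws_item_sk, ws_bill_customer_sk` and `PARTITION BY ws_bill_customer_sk, ws_item_sk` define the same partitions, and the output columns follow the SELECT list rather than the PARTITION BY order. The following is a minimal sketch of those expectations, run on SQLite through Python's `sqlite3` module instead of Hive, so it illustrates the intended semantics and does not reproduce the vectorized PTF bug. The table name `web_sales`, the five sample rows, and the helper `run_lag_check` are hypothetical, and the sketch assumes the bundled SQLite is 3.25 or newer (window-function support).

```python
import sqlite3

# Hypothetical miniature of web_sales_txt: only the columns the queries use.
# (ws_bill_customer_sk, ws_item_sk, ws_sold_date_sk, ws_sales_price)
ROWS = [
    (1, 10, 100, 5.0),
    (1, 10, 101, 7.5),
    (2, 10, 100, 3.0),
    (2, 20, 102, 9.0),
    (2, 20, 103, 4.0),
]

# Same shape as the LAG query in the Qtest: two windows whose PARTITION BY
# lists name the same columns in a different order.
QUERY = """
SELECT
  ws_bill_customer_sk,
  ws_item_sk,
  ws_sold_date_sk,
  ws_sales_price,
  LAG(ws_sales_price) OVER (
    PARTITION BY ws_item_sk, ws_bill_customer_sk
    ORDER BY ws_sold_date_sk
  ) AS prev_a,
  LAG(ws_sales_price) OVER (
    PARTITION BY ws_bill_customer_sk, ws_item_sk
    ORDER BY ws_sold_date_sk
  ) AS prev_b
FROM web_sales
ORDER BY ws_bill_customer_sk, ws_item_sk, ws_sold_date_sk
"""


def run_lag_check():
    """Run the query on an in-memory table and return (column names, rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE web_sales ("
        "ws_bill_customer_sk INTEGER, ws_item_sk INTEGER, "
        "ws_sold_date_sk INTEGER, ws_sales_price REAL)"
    )
    conn.executemany("INSERT INTO web_sales VALUES (?, ?, ?, ?)", ROWS)
    cur = conn.execute(QUERY)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return columns, rows


if __name__ == "__main__":
    columns, rows = run_lag_check()
    print(columns)
    for row in rows:
        print(row)
```

In every returned row, `prev_a` and `prev_b` should match, and the first two values should be the customer key followed by the item key, as in the SELECT list.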
Can you add comments before this and the other SELECT queries in this .q file explaining which test case each one covers?
Yes, comments are essential for explaining this Qtest. I added comments describing every SELECT query in the Qtest in commit a5e3bfc.
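
The comments in the Qtest describe the non-vectorized queries as reference results for the vectorized ones. A minimal sketch of that comparison follows, assuming the rows of each run have already been collected into Python lists. The helper `same_rows` and the sample rows are hypothetical and are not part of Hive's test harness. Rows are compared as multisets because the queries have no outer ORDER BY.

```python
from collections import Counter


def same_rows(vectorized_rows, reference_rows):
    """Return True when both runs produced the same rows, ignoring row order."""
    return Counter(vectorized_rows) == Counter(reference_rows)


# Hypothetical rows: (ws_bill_customer_sk, ws_item_sk, prev_sales_price)
reference = [(1, 10, None), (1, 10, 5.0), (2, 20, None)]

# Same rows in a different order: should count as a match.
good_run = [(2, 20, None), (1, 10, None), (1, 10, 5.0)]

# First two columns swapped, the kind of symptom the PR title describes.
swapped_run = [(item, cust, prev) for cust, item, prev in reference]

if __name__ == "__main__":
    print(same_rows(good_run, reference))
    print(same_rows(swapped_run, reference))
```

A multiset comparison keeps duplicate rows significant, which a plain set comparison would not.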