Support global sort and improve UDF overload resolution in Beam SQL#38832
Support global sort and improve UDF overload resolution in Beam SQL#38832damccorm wants to merge 5 commits into
Conversation
- Enable ORDER BY without LIMIT (global sort) in BeamSortRel by sorting in-memory. - Add AssertSorted test helper and testOrderBy_noLimit to verify global sort. - Map common Java classes to Calcite SqlTypeName in CalciteUtils.sqlTypeWithAutoCast, and add testSqlTypeWithAutoCast. - Prioritize overloaded methods with maximum parameter count in UdfImpl lookup, and add UdfImplTest. - Update LazyAggregateCombineFnTest to expect SQL BIGINT type instead of Java Long class. TAG=agy CONV=0df243da-2867-4795-9889-6334ba7d1599
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several enhancements to Beam SQL, primarily focusing on expanding query capabilities and improving type system robustness. Key changes include enabling global sorting for queries without a LIMIT clause, refining how overloaded user-defined functions are resolved, and improving the mapping between Java types and Calcite SQL types. Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds support for ORDER BY without a LIMIT in BeamSortRel by grouping all rows under a dummy key and sorting them in-memory. It also updates UdfImpl to prioritize overloaded methods with the maximum number of parameters and maps standard Java classes to Calcite SQL types in CalciteUtils. The reviewer suggests introducing a deterministic tie-breaker in UdfImpl to handle overloaded methods with the same parameter count consistently, and simplifying the long if-else chain in CalciteUtils using a switch statement and dynamic nullability checks.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
…terministic UdfImpl lookup - Simplify Java class to Calcite type mapping in CalciteUtils using a switch statement and dynamic nullability check. - Use deterministic tie-breaker in UdfImpl.findMethod when resolving overloaded methods with the same parameter count. TAG=agy CONV=0df243da-2867-4795-9889-6334ba7d1599
|
Assigning reviewers: R: @ahmedabu98 for label java. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for ORDER BY without a LIMIT in Beam SQL by routing all rows to a single dummy key and sorting them in-memory. It also updates UDF method resolution to prioritize overloads with the maximum number of parameters and maps standard Java classes to Calcite SQL types with proper nullability. The review feedback highlights a critical scalability bottleneck in the global sorting implementation that could lead to OutOfMemoryError on large datasets. Additionally, the reviewer suggests using direct Class comparisons instead of string-based class name checks in CalciteUtils for better performance, and warns that the new UDF overload resolution strategy may shadow valid overloads with fewer parameters of different types.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Use a static ImmutableMap for Java to SQL type mapping instead of a switch statement, improving lookup performance and code readability. TAG=agy CONV=0df243da-2867-4795-9889-6334ba7d1599
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for ORDER BY without a LIMIT in BeamSortRel by grouping all rows under a dummy key and sorting them in-memory. It also improves overloaded method resolution in UdfImpl to prioritize methods with more parameters, and maps standard Java classes to Calcite SQL types in CalciteUtils. Feedback on these changes includes addressing a missing import for WithKeys in BeamSortRel that would cause a compilation error, and optimizing UdfImpl by using getParameterCount() instead of getParameterTypes().length to avoid unnecessary array allocations.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
…ensions/sql/impl/UdfImpl.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Add a few SQL improvements