Add Prefix and Wildcard Query Support to DSL Query Executor #5
Open
abhishek00159 wants to merge 1 commit into vinaykpud:main from
Summary
Implements PrefixQueryTranslator and WildcardQueryTranslator, which convert OpenSearch prefix and wildcard queries to Calcite LIKE expressions, with support for case-insensitive matching and escaping of SQL LIKE special characters.
Features
Prefix Query
Converts prefix queries to SQL LIKE with trailing wildcard:
```json
{"prefix": {"name": "lap"}}
// Converts to: name LIKE 'lap%'
{"prefix": {"name": {"value": "LAP", "case_insensitive": true}}}
// Converts to: LOWER(name) LIKE 'lap%'
```
Supported parameters:
• value - The prefix string
• case_insensitive - Case-insensitive matching (default: false)
Unsupported parameters (throw ConversionException):
• boost - Query boosting not supported
• rewrite - Rewrite method not supported
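The parameter rules above can be sketched as a small parser that accepts both the short form and the object form. This is an illustration only: the class name `PrefixParamsSketch`, the `parse` method, and the local `ConversionException` stand-in are hypothetical, and rejecting every key outside the supported set (not just `boost` and `rewrite`) is an assumption about how the PR's translator behaves.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of prefix parameter handling; not the PR's actual code. */
final class PrefixParamsSketch {

    /** Local stand-in for the ConversionException named in this PR. */
    static final class ConversionException extends RuntimeException {
        ConversionException(String message) {
            super(message);
        }
    }

    private static final Set<String> SUPPORTED = Set.of("value", "case_insensitive");

    final String value;
    final boolean caseInsensitive;

    private PrefixParamsSketch(String value, boolean caseInsensitive) {
        this.value = value;
        this.caseInsensitive = caseInsensitive;
    }

    /** Accepts the short form ("lap") or the object form ({"value": ...}). */
    static PrefixParamsSketch parse(Object body) {
        if (body instanceof String) {
            return new PrefixParamsSketch((String) body, false);
        }
        if (!(body instanceof Map)) {
            throw new ConversionException("Expected a string or an object");
        }
        Map<?, ?> params = (Map<?, ?>) body;
        // Assumption: any key outside the supported set is rejected,
        // which covers boost and rewrite.
        for (Object key : params.keySet()) {
            if (!SUPPORTED.contains(key)) {
                throw new ConversionException("Unsupported parameter: " + key);
            }
        }
        Object value = params.get("value");
        if (!(value instanceof String)) {
            throw new ConversionException("Missing string parameter: value");
        }
        // case_insensitive defaults to false when absent.
        boolean ci = Boolean.TRUE.equals(params.get("case_insensitive"));
        return new PrefixParamsSketch((String) value, ci);
    }

    public static void main(String[] args) {
        PrefixParamsSketch p = parse(Map.of("value", "LAP", "case_insensitive", true));
        System.out.println(p.value + " " + p.caseInsensitive); // LAP true
    }
}
```

Passing a map that contains `boost` or `rewrite` makes `parse` throw, which mirrors the "Unsupported parameters" list.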
Wildcard Query
Converts wildcard patterns to SQL LIKE with pattern translation:
```json
{"wildcard": {"name": "lap*"}}
// Converts to: name LIKE 'lap%'
{"wildcard": {"name": "l?ptop"}}
// Converts to: name LIKE 'l_ptop'
{"wildcard": {"name": {"value": "*BOOK*", "case_insensitive": true}}}
// Converts to: LOWER(name) LIKE '%book%'
```
Wildcard characters:
• * - Matches any character sequence → SQL %
• ? - Matches any single character → SQL _
Supported parameters:
• value - The wildcard pattern
• case_insensitive - Case-insensitive matching (default: false)
Unsupported parameters (throw ConversionException):
• boost - Query boosting not supported
• rewrite - Rewrite method not supported
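A minimal sketch of the pattern translation, assuming a single pass over the input in which `*` and `?` become LIKE wildcards and literal `%`, `_` and `\` are escaped with a backslash. The class and method names are hypothetical, and the sketch works on plain strings, whereas the PR's translator presumably builds Calcite expressions.

```java
/** Hypothetical sketch; not the PR's WildcardQueryTranslator. */
final class WildcardLikeSketch {

    /** Translates an OpenSearch wildcard pattern into a SQL LIKE pattern. */
    static String toLikePattern(String wildcard) {
        StringBuilder like = new StringBuilder(wildcard.length() + 4);
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            switch (c) {
                case '*':
                    like.append('%'); // any character sequence
                    break;
                case '?':
                    like.append('_'); // exactly one character
                    break;
                case '%':
                case '_':
                case '\\':
                    like.append('\\').append(c); // literal character: escape it for LIKE
                    break;
                default:
                    like.append(c);
            }
        }
        return like.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLikePattern("lap*"));   // lap%
        System.out.println(toLikePattern("l?ptop")); // l_ptop
    }
}
```

Because each input character is visited exactly once, a `%` produced from `*` can never be mistaken for a literal `%` that needs escaping.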
Special Character Escaping
Both translators escape the SQL LIKE special characters with a backslash:
• % → \% (SQL wildcard for any characters)
• _ → \_ (SQL wildcard for single character)
• \ → \\ (escape character)
Example:
```json
{"prefix": {"path": "C:\\test_"}}
// Converts to: path LIKE 'C:\\test\_%'
{"wildcard": {"name": "a%b_c\\d*"}}
// Converts to: name LIKE 'a\%b\_c\\d%'
```
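The order of the two steps matters: literal characters have to be escaped before `*` and `?` are turned into LIKE wildcards. The sketch below, with hypothetical names and plain `String.replace` calls, contrasts the two orders; it is an illustration of the idea and not the PR's implementation.

```java
/** Hypothetical sketch contrasting the two possible step orders. */
final class EscapeOrderSketch {

    /** Escapes \ first so that backslashes added for % and _ are not escaped again. */
    static String escapeLike(String s) {
        return s.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_");
    }

    /** Maps the wildcard metacharacters onto their LIKE equivalents. */
    static String convertWildcards(String s) {
        return s.replace('*', '%').replace('?', '_');
    }

    /** Correct order: only literal characters from the input get escaped. */
    static String escapeThenConvert(String pattern) {
        return convertWildcards(escapeLike(pattern));
    }

    /** Wrong order: the % generated from * is escaped and stops being a wildcard. */
    static String convertThenEscape(String pattern) {
        return escapeLike(convertWildcards(pattern));
    }

    public static void main(String[] args) {
        System.out.println(escapeThenConvert("50%*")); // 50\%%
        System.out.println(convertThenEscape("50%*")); // 50\%\%
    }
}
```

For a prefix query only `escapeLike` is needed, followed by appending the trailing `%`.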
Case Insensitive Support
When case_insensitive: true:
• Applies LOWER() function to field reference
• Converts pattern to lowercase
• Works for both prefix and wildcard queries
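As a rough illustration of the two steps listed above, the following sketch renders a LIKE comparison as SQL text. The names are hypothetical, the pattern is assumed to be already escaped, and SQL string quoting is ignored for brevity.

```java
import java.util.Locale;

/** Hypothetical sketch of the case-insensitive rewrite; not the PR's actual code. */
final class CaseInsensitiveLikeSketch {

    /** Renders "field LIKE 'pattern'", lowercasing both sides when requested. */
    static String render(String field, String likePattern, boolean caseInsensitive) {
        if (!caseInsensitive) {
            return field + " LIKE '" + likePattern + "'";
        }
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
        String lowered = likePattern.toLowerCase(Locale.ROOT);
        return "LOWER(" + field + ") LIKE '" + lowered + "'";
    }

    public static void main(String[] args) {
        System.out.println(render("name", "LAP%", true));  // LOWER(name) LIKE 'lap%'
        System.out.println(render("name", "lap%", false)); // name LIKE 'lap%'
    }
}
```

Lowercasing the pattern after escaping is safe because `%`, `_` and `\` have no case.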
Examples
Prefix in Bool Query
```json
{
  "bool": {
    "must": [{"prefix": {"category": "electron"}}],
    "should": [
      {"prefix": {"brand": "sam"}},
      {"prefix": {"brand": "app"}}
    ]
  }
}
```
Converts to: (category LIKE 'electron%') AND ((brand LIKE 'sam%') OR (brand LIKE 'app%'))
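The shape of that output can be reproduced with a small composition helper: must clauses joined with AND, should clauses joined with OR and wrapped as one group. The helper is hypothetical and only mirrors the example; how the executor actually combines bool clauses in other cases is not described here.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch that mirrors the bool example; not the executor's code. */
final class BoolComposeSketch {

    /** Wraps each clause in parentheses and joins them with the given operator. */
    private static String join(List<String> clauses, String op) {
        return clauses.stream()
                .map(c -> "(" + c + ")")
                .collect(Collectors.joining(" " + op + " "));
    }

    /** Combines must clauses with AND and should clauses with OR. */
    static String compose(List<String> must, List<String> should) {
        String mustPart = join(must, "AND");
        if (should.isEmpty()) {
            return mustPart;
        }
        String shouldPart = "(" + join(should, "OR") + ")";
        return must.isEmpty() ? shouldPart : mustPart + " AND " + shouldPart;
    }

    public static void main(String[] args) {
        System.out.println(compose(
                List.of("category LIKE 'electron%'"),
                List.of("brand LIKE 'sam%'", "brand LIKE 'app%'")));
    }
}
```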
Complex Wildcard Pattern
```json
{"wildcard": {"sku": "*-2021-*"}}
```
Converts to: sku LIKE '%-2021-%'
Testing
Unit Tests:
• tests for PrefixQueryTranslator
• tests for WildcardQueryTranslator
• Coverage: basic patterns, case sensitivity, escaping, edge cases, error handling
Integration Tests:
• DslPrefixQueryIT - end-to-end validation against real index data
• DslWildcardQueryIT - end-to-end validation against real index data
Documentation
Implementation Docs:
• Updated README.md to list the supported queries.
Technical Notes
• Both translators validate field existence in schema
• Throw ConversionException for unsupported parameters (boost, rewrite)
• Character escaping performed before wildcard conversion to prevent conflicts
• Case-insensitive queries use SQL LOWER() function
• Registered in QueryRegistryFactory for O(1) lookup
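The O(1) lookup mentioned above is what a hash map keyed by query type provides on average. The sketch below is a hypothetical registry, not the PR's QueryRegistryFactory; translators are modelled as simple string functions to keep it self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry sketch; the real QueryRegistryFactory may differ. */
final class QueryRegistrySketch {

    // Query type name (for example "prefix") mapped to its translator.
    private final Map<String, Function<String, String>> translators = new HashMap<>();

    void register(String queryType, Function<String, String> translator) {
        translators.put(queryType, translator);
    }

    /** Looks up the translator by query type and applies it to the value. */
    String translate(String queryType, String value) {
        Function<String, String> translator = translators.get(queryType); // average O(1)
        if (translator == null) {
            throw new IllegalArgumentException("Unsupported query type: " + queryType);
        }
        return translator.apply(value);
    }

    public static void main(String[] args) {
        QueryRegistrySketch registry = new QueryRegistrySketch();
        registry.register("prefix", v -> v + "%");
        System.out.println(registry.translate("prefix", "lap")); // lap%
    }
}
```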
Performance Considerations
• LIKE patterns that begin with literal characters (for example 'lap%') are generally efficient
• Case-insensitive queries using LOWER() may not use indexes effectively
• Wildcard patterns starting with * may require full table scans
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.