expression, planner: allow more MATCH AGAINST boolean queries | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#67242
Conversation
📝 WalkthroughWalkthroughThe pull request modifies full-text search functionality to lift restrictions on multi-term MATCH AGAINST boolean queries. Changes include conditional SHOULD clause handling in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/expression/fts_helper.go (1)
329-333: Make the SHOULD-pruning contract explicitly TiCI-specific.Dropping
group.Shouldhere is right for the no-score TiCI filter path, but this helper sits behind a generic-lookingRewriteMySQLMatchAgainstRecursivelyAPI inpkg/expression. Please make it explicit that this rule is TiCI-only and applies to the current boolean group, so it does not get reused later as a general MATCH rewrite.Suggested wording
- // TiCI currently executes MATCH ... AGAINST as a no-score filter. When the - // boolean query contains at least one MUST term, MySQL treats SHOULD terms as - // optional score contributors, which cannot be represented in the filter-only - // path. In that case we keep the MUST / MUST NOT filters and ignore SHOULD. + // TiCI currently executes MATCH ... AGAINST as a no-score filter. + // This is intentionally a TiCI-specific rule for the current boolean group: + // once the group contains at least one MUST term, MySQL treats SHOULD terms + // as optional score contributors. The filter-only path cannot represent that, + // so we keep MUST / MUST NOT filters and drop SHOULD terms here.As per coding guidelines, "Code SHOULD remain maintainable for future readers with basic TiDB familiarity" and "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/expression/fts_helper.go` around lines 329 - 333, Update the comment around includeShouldInFilter := len(group.Must) == 0 to explicitly mark the SHOULD-pruning as a TiCI-specific behavior and local to the current boolean group used by the TiCI no-score filter path; mention the rule does NOT change the global semantics of RewriteMySQLMatchAgainstRecursively and only applies when TiCI is executing MATCH ... AGAINST as a filter (so group.Should may be dropped here but should not be assumed safe for other callers). Reference the boolean group (group.Should / group.Must) and the RewriteMySQLMatchAgainstRecursively API in the comment so future readers know this is a TiCI-only, per-group optimization and not a general MATCH rewrite.pkg/expression/util_test.go (1)
810-818: Add one mixed MUST/MUST NOT/SHOULD rewrite case.The new checks cover SHOULD-only and MUST+SHOULD, but not the branch where SHOULD is dropped while MUST NOT is still preserved. A
+hello -world foocase would pin thelen(group.MustNot)path inrewriteBooleanGroupToFTSExprand catch regressions in thestartIdxslicing logic.Minimal test addition
expr, _, err = RewriteMySQLMatchAgainstRecursively(ctx, buildMatchAgainst("+hello world"), model.FullTextParserTypeStandardV1) require.NoError(t, err) require.False(t, hasMySQLMatchAgainst(expr)) root, ok = expr.(*ScalarFunction) require.True(t, ok) require.Equal(t, ast.FTSMatchWord, root.FuncName.L) require.ElementsMatch(t, []ftsLeaf{ {funcName: ast.FTSMatchWord, query: "hello", columns: []string{"title"}}, }, collectFTSLeaves(expr)) +expr, _, err = RewriteMySQLMatchAgainstRecursively(ctx, buildMatchAgainst("+hello -world foo"), model.FullTextParserTypeStandardV1) +require.NoError(t, err) +require.False(t, hasMySQLMatchAgainst(expr)) +require.ElementsMatch(t, []ftsLeaf{ + {funcName: ast.FTSMatchWord, query: "hello", columns: []string{"title"}}, + {funcName: ast.FTSMatchWord, query: "world", columns: []string{"title"}, underNot: true}, +}, collectFTSLeaves(expr))As per coding guidelines, "Prefer extending existing test suites and fixtures over creating new scaffolding" and "Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/expression/util_test.go` around lines 810 - 818, Add a new test case in util_test.go alongside the existing RewriteMySQLMatchAgainstRecursively checks that calls RewriteMySQLMatchAgainstRecursively with buildMatchAgainst("+hello -world foo"), asserts no error and require.False(hasMySQLMatchAgainst(expr)), casts root to *ScalarFunction and checks root.FuncName (as in the nearby tests), and then assert collectFTSLeaves(expr) contains both the MUST leaf for "hello" and the MUST NOT leaf for "world" (mirroring the existing ftsLeaf expectations) so the branch in rewriteBooleanGroupToFTSExpr that preserves MustNot while dropping SHOULD is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/expression/fts_helper.go`:
- Around line 329-333: Update the comment around includeShouldInFilter :=
len(group.Must) == 0 to explicitly mark the SHOULD-pruning as a TiCI-specific
behavior and local to the current boolean group used by the TiCI no-score filter
path; mention the rule does NOT change the global semantics of
RewriteMySQLMatchAgainstRecursively and only applies when TiCI is executing
MATCH ... AGAINST as a filter (so group.Should may be dropped here but should
not be assumed safe for other callers). Reference the boolean group
(group.Should / group.Must) and the RewriteMySQLMatchAgainstRecursively API in
the comment so future readers know this is a TiCI-only, per-group optimization
and not a general MATCH rewrite.
In `@pkg/expression/util_test.go`:
- Around line 810-818: Add a new test case in util_test.go alongside the
existing RewriteMySQLMatchAgainstRecursively checks that calls
RewriteMySQLMatchAgainstRecursively with buildMatchAgainst("+hello -world foo"),
asserts no error and require.False(hasMySQLMatchAgainst(expr)), casts root to
*ScalarFunction and checks root.FuncName (as in the nearby tests), and then
assert collectFTSLeaves(expr) contains both the MUST leaf for "hello" and the
MUST NOT leaf for "world" (mirroring the existing ftsLeaf expectations) so the
branch in rewriteBooleanGroupToFTSExpr that preserves MustNot while dropping
SHOULD is exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 72cd12a2-1a29-4367-a7fc-f8a540387344
📒 Files selected for processing (3)
pkg/expression/fts_helper.gopkg/expression/util_test.gopkg/planner/core/casetest/tici/tici_test.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## feature/fts #67242 +/- ##
===================================================
- Coverage 76.8610% 75.1893% -1.6718%
===================================================
Files 1960 1936 -24
Lines 555677 557304 +1627
===================================================
- Hits 427099 419033 -8066
- Misses 127116 138265 +11149
+ Partials 1462 6 -1456
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: hawkingrei, wshwsh12 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
/retest |
1 similar comment
|
/retest |
|
/cherry-pick release-fts-202602 |
|
@winoros: new pull request created to branch DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
Co-authored-by: Yiding Cui <winoros@gmail.com>
What problem does this PR solve?
Issue Number: ref #65626
Problem Summary:
MATCH ... AGAINST (... IN BOOLEAN MODE)still rejected some valid boolean queries during TiCI rewrite. Queries likehello worldand+hello worldreturnedTiDB only supports multiple terms with +/- modifiers.What changed and how does it work?
Check List
Tests
go test -run TestRewriteMySQLMatchAgainst --tags=intest ./pkg/expressiongo test -run TestTiCIMatchAgainstValidation --tags=intest ./pkg/planner/core/casetest/ticiSide effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit