expression, planner: allow more MATCH AGAINST boolean queries (#67242) | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts by ti-chi-bot · Pull Request #67266 · pingcap/tidb

ti-chi-bot · 2026-03-24T18:45:53Z

This is an automated cherry-pick of #67242

What problem does this PR solve?

Issue Number: ref #65626

Problem Summary:

MATCH ... AGAINST (... IN BOOLEAN MODE) still rejected some valid boolean queries during TiCI rewrite. Queries like hello world and +hello world returned TiDB only supports multiple terms with +/- modifiers.

What changed and how does it work?

Removed the extra BOOLEAN MODE validation that rejected multiple SHOULD clauses and mixed MUST/SHOULD clauses.
Rewrote pure SHOULD queries into OR-based FTS filters.
Kept MUST / MUST NOT filters when SHOULD terms appear in a no-score TiCI query, so optional terms no longer become mandatory filters.
Added regression coverage in expression rewrite tests and planner TiCI validation tests.

Check List

Tests

Unit test
go test -run TestRewriteMySQLMatchAgainst --tags=intest ./pkg/expression
go test -run TestTiCIMatchAgainstValidation --tags=intest ./pkg/planner/core/casetest/tici
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Allow more valid MATCH ... AGAINST BOOLEAN MODE queries during TiCI rewrite, including multiple SHOULD terms and mixed MUST/SHOULD queries.

Summary by CodeRabbit

New Features
- Enhanced fulltext search in BOOLEAN MODE to support more flexible queries with multiple unmodified search terms, eliminating the requirement for explicit +/- modifiers on all terms.
Bug Fixes
- Simplified and relaxed validation logic for fulltext search boolean group handling, removing unnecessary restrictions on query term combinations.

ti-chi-bot · 2026-03-24T18:47:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: winoros
Once this PR has been reviewed and has the lgtm label, please assign windtalker for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

pkg/expression/OWNERS
~~pkg/planner/OWNERS~~ [winoros]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-03-24T18:47:08Z

[LGTM Timeline notifier]

Timeline:

2026-03-24 18:47:08.591399949 +0000 UTC m=+294024.627470209: ☑️ agreed by winoros.

coderabbitai · 2026-03-24T18:47:18Z

📝 Walkthrough

Walkthrough

Modified MATCH...AGAINST boolean group rewriting to conditionally include SHOULD terms based on MUST term presence. Removed prior validation restrictions on SHOULD-with-MUST/MUST-NOT combinations and multiple SHOULD-only terms. Updated tests to verify successful plan generation for previously rejected multi-term queries.

Changes

Cohort / File(s)	Summary
FTS Boolean Group Rewriting Logic `pkg/expression/fts_helper.go`	Modified `rewriteBooleanGroupToFTSExpr` to conditionally include SHOULD terms only when MUST terms are absent. Updated `validateMySQLMatchAgainstBooleanGroup` to remove TiDB limitation checks on SHOULD-MUST combinations and multiple SHOULD terms, now validating only non-nil input.
FTS Rewrite Test Expectations `pkg/expression/util_test.go`	Updated `TestRewriteMySQLMatchAgainst` to assert successful rewriting (no errors) and correct FTS structure for multi-term patterns `"hello world"` and `"+hello world"`, with proper leaf composition using `FTSMatchWord` and `LogicOr` nodes.
TiCI Plan Generation Tests `pkg/planner/core/casetest/tici/tici_test.go`	Changed `TestTiCIMatchAgainstValidation` from expecting errors to verifying successful query plan generation for BOOLEAN MODE multi-term queries, validating `fts_match_word` composition and `istrue_with_null` guards.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

pingcap/tidb#67242: Makes identical code-level changes to rewriteBooleanGroupToFTSExpr and validateMySQLMatchAgainstBooleanGroup for SHOULD-only rewrite support.
pingcap/tidb#67128: Also modifies MATCH...AGAINST rewriting and validation logic in pkg/expression/fts_helper.go.
pingcap/tidb#67208: Modifies boolean-group to FTS rewrite logic in rewriteBooleanGroupToFTSExpr and related clause handling.

Suggested labels

approved, lgtm

Suggested reviewers

winoros
wshwsh12
hawkingrei

Poem

🐰 A boolean dream now takes its flight,
SHOULD terms dance when MUST's not in sight,
No more limits on "hello world"—
FTS queries bloom and gently swirl,
TiDB searches now run more right! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description follows the template structure with Issue Number, Problem Summary, What Changed, completed Check List (unit tests marked), Side effects evaluated, and a release note provided.
Title check	✅ Passed	The title 'expression, planner: allow more MATCH AGAINST boolean queries' accurately reflects the main change: relaxing validation restrictions to support more valid boolean queries in MATCH AGAINST expressions.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-03-24T18:59:27Z

Codecov Report

❌ Patch coverage is 84.61538% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release-fts-202602@1a76dfb). Learn more about missing BASE report.

Additional details and impacted files

@@                   Coverage Diff                   @@
##             release-fts-202602     #67266   +/-   ##
=======================================================
  Coverage                      ?   75.1052%           
=======================================================
  Files                         ?       1936           
  Lines                         ?     556494           
  Branches                      ?          0           
=======================================================
  Hits                          ?     417956           
  Misses                        ?     138532           
  Partials                      ?          6

Flag	Coverage Δ
integration	`39.6360% <0.0000%> (?)`
unit	`74.1821% <84.6153%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`56.7974% <0.0000%> (?)`
parser	`∅ <0.0000%> (?)`
br	`48.7657% <0.0000%> (?)`

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

🧹 Nitpick comments (3)

pkg/planner/core/casetest/tici/tici_test.go (1)

265-267: Use a narrower negative match than plain world.

CheckNotContain("world") is broader than the behavior under test and can start failing on unrelated explain-output changes. Matching the specific helper fragment instead, e.g. fts_match_word("world", test.t1.title), would keep this regression focused on the rewrite.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/casetest/tici/tici_test.go` around lines 265 - 267, The
negative assertion in the tk.MustQuery(...) call is too broad: replace
CheckNotContain("world") with a narrower negative match targeting the specific
explain helper fragment (e.g. CheckNotContain(`fts_match_word("world",
test.t1.title)`)); update the assertion that follows the explain for the query
using match(title) against ('+hello world' IN BOOLEAN MODE) so it only rejects
the exact helper output rather than any occurrence of "world", keeping the test
focused on the rewrite logic around tk.MustQuery and CheckNotContain for the
t1.title FTS helper.

pkg/expression/fts_helper.go (1)

334-367: Capture the SHOULD slice boundary from the built slice.

startIdx := len(group.Must) + len(group.MustNot) assumes every earlier clause contributes exactly one appended expression. That is true today, but it is a subtle invariant across recursive rewrites. Recording the boundary from len(searchFuncs) before appending group.Should would keep the compaction logic local to the actual appends and make this path less fragile if clause rewriting ever stops being 1:1.

Suggested refactor

-	if includeShouldInFilter {
-		for _, item := range group.Should {
-			f, err := rewriteBooleanClauseToFTSExpr(bctx, matchCols, item, parserType)
-			if err != nil {
-				return nil, err
-			}
-			searchFuncs = append(searchFuncs, f)
-		}
-	}
-	if includeShouldInFilter && len(group.Should) > 0 {
-		startIdx := len(group.Must) + len(group.MustNot)
-		endIdx := startIdx + len(group.Should)
-		orShould := ComposeDNFCondition(bctx, searchFuncs[startIdx:endIdx]...)
-		searchFuncs[startIdx] = orShould
-		searchFuncs = searchFuncs[:startIdx+1]
+	if includeShouldInFilter {
+		shouldStart := len(searchFuncs)
+		for _, item := range group.Should {
+			f, err := rewriteBooleanClauseToFTSExpr(bctx, matchCols, item, parserType)
+			if err != nil {
+				return nil, err
+			}
+			searchFuncs = append(searchFuncs, f)
+		}
+		if len(group.Should) > 0 {
+			orShould := ComposeDNFCondition(bctx, searchFuncs[shouldStart:]...)
+			searchFuncs[shouldStart] = orShould
+			searchFuncs = searchFuncs[:shouldStart+1]
+		}
 	}

As per coding guidelines, code SHOULD be self-documenting through clear naming and structure.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/expression/fts_helper.go` around lines 334 - 367, The code assumes
earlier clauses append exactly one expression by computing startIdx :=
len(group.Must) + len(group.MustNot); instead capture the actual boundary before
appending the SHOULD items: record startIdx := len(searchFuncs) right before the
loop that appends group.Should, then append each rewritten expression via
rewriteBooleanClauseToFTSExpr into searchFuncs and compute endIdx :=
len(searchFuncs) after the loop; pass searchFuncs[startIdx:endIdx] to
ComposeDNFCondition and replace searchFuncs[startIdx] with the composed OR and
truncate searchFuncs to startIdx+1—this keeps the boundary tied to the real
appends and references symbols searchFuncs, group.Should,
rewriteBooleanClauseToFTSExpr, includeShouldInFilter, and ComposeDNFCondition.

pkg/expression/util_test.go (1)

799-818: Add one mixed SHOULD + MUST NOT regression here.

These new assertions pin down pure SHOULD and MUST+SHOULD, but not the new path where negative terms must survive SHOULD compaction. A case like hello -world and/or +hello world -banana would cover the startIdx := len(group.Must)+len(group.MustNot) slice logic and catch accidental dropping of MUST NOT filters.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/expression/util_test.go` around lines 799 - 818, Add a regression test
that covers mixed SHOULD + MUST NOT cases so negative terms survive SHOULD
compaction: call RewriteMySQLMatchAgainstRecursively with
buildMatchAgainst("hello -world") and with buildMatchAgainst("+hello world
-banana"), require no error and hasMySQLMatchAgainst(expr) is false, then use
collectFTSLeaves(expr) (and the same root/type checks as the other tests) to
assert the returned []ftsLeaf contains entries for both positive terms (e.g.,
"hello", "world" or "hello","world") and the negative terms (e.g., "world",
"banana") so that MUST NOT leaves are not dropped; reference functions
RewriteMySQLMatchAgainstRecursively, buildMatchAgainst, hasMySQLMatchAgainst,
collectFTSLeaves and the ftsLeaf struct when adding the assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/expression/fts_helper.go`:
- Around line 334-367: The code assumes earlier clauses append exactly one
expression by computing startIdx := len(group.Must) + len(group.MustNot);
instead capture the actual boundary before appending the SHOULD items: record
startIdx := len(searchFuncs) right before the loop that appends group.Should,
then append each rewritten expression via rewriteBooleanClauseToFTSExpr into
searchFuncs and compute endIdx := len(searchFuncs) after the loop; pass
searchFuncs[startIdx:endIdx] to ComposeDNFCondition and replace
searchFuncs[startIdx] with the composed OR and truncate searchFuncs to
startIdx+1—this keeps the boundary tied to the real appends and references
symbols searchFuncs, group.Should, rewriteBooleanClauseToFTSExpr,
includeShouldInFilter, and ComposeDNFCondition.

In `@pkg/expression/util_test.go`:
- Around line 799-818: Add a regression test that covers mixed SHOULD + MUST NOT
cases so negative terms survive SHOULD compaction: call
RewriteMySQLMatchAgainstRecursively with buildMatchAgainst("hello -world") and
with buildMatchAgainst("+hello world -banana"), require no error and
hasMySQLMatchAgainst(expr) is false, then use collectFTSLeaves(expr) (and the
same root/type checks as the other tests) to assert the returned []ftsLeaf
contains entries for both positive terms (e.g., "hello", "world" or
"hello","world") and the negative terms (e.g., "world", "banana") so that MUST
NOT leaves are not dropped; reference functions
RewriteMySQLMatchAgainstRecursively, buildMatchAgainst, hasMySQLMatchAgainst,
collectFTSLeaves and the ftsLeaf struct when adding the assertions.

In `@pkg/planner/core/casetest/tici/tici_test.go`:
- Around line 265-267: The negative assertion in the tk.MustQuery(...) call is
too broad: replace CheckNotContain("world") with a narrower negative match
targeting the specific explain helper fragment (e.g.
CheckNotContain(`fts_match_word("world", test.t1.title)`)); update the assertion
that follows the explain for the query using match(title) against ('+hello
world' IN BOOLEAN MODE) so it only rejects the exact helper output rather than
any occurrence of "world", keeping the test focused on the rewrite logic around
tk.MustQuery and CheckNotContain for the t1.title FTS helper.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dd7ba501-25f5-4f2c-8f26-beaa7f725750

📥 Commits

Reviewing files that changed from the base of the PR and between 1a76dfb and d44c32f.

📒 Files selected for processing (3)

pkg/expression/fts_helper.go
pkg/expression/util_test.go
pkg/planner/core/casetest/tici/tici_test.go

winoros · 2026-03-24T19:59:50Z

/retest

winoros · 2026-03-25T11:53:28Z

/retest

winoros · 2026-03-25T18:08:53Z

/retest

expression, planner: allow more MATCH AGAINST boolean queries

d44c32f

ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files. type/cherry-pick-for-release-fts-202602 labels Mar 24, 2026

ti-chi-bot mentioned this pull request Mar 24, 2026

expression, planner: allow more MATCH AGAINST boolean queries | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts #67242

Merged

13 tasks

ti-chi-bot assigned winoros Mar 24, 2026

winoros approved these changes Mar 24, 2026

View reviewed changes

ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Mar 24, 2026

coderabbitai Bot reviewed Mar 24, 2026

View reviewed changes

winoros removed their assignment Mar 25, 2026

winoros merged commit 45244ec into pingcap:release-fts-202602 Mar 25, 2026
22 of 23 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

expression, planner: allow more MATCH AGAINST boolean queries (#67242) | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#67266

expression, planner: allow more MATCH AGAINST boolean queries (#67242) | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#67266
winoros merged 1 commit intopingcap:release-fts-202602from
ti-chi-bot:cherry-pick-67242-to-release-fts-202602

ti-chi-bot commented Mar 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

ti-chi-bot Bot commented Mar 24, 2026

Uh oh!

ti-chi-bot Bot commented Mar 24, 2026

Uh oh!

coderabbitai Bot commented Mar 24, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Mar 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

winoros commented Mar 24, 2026

Uh oh!

winoros commented Mar 25, 2026

Uh oh!

winoros commented Mar 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ti-chi-bot commented Mar 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What changed and how does it work?

Check List

Release note

Summary by CodeRabbit

Uh oh!

ti-chi-bot Bot commented Mar 24, 2026

Uh oh!

ti-chi-bot Bot commented Mar 24, 2026

[LGTM Timeline notifier]

Uh oh!

coderabbitai Bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

winoros commented Mar 24, 2026

Uh oh!

winoros commented Mar 25, 2026

Uh oh!

winoros commented Mar 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ti-chi-bot commented Mar 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 24, 2026 •

edited

Loading

codecov Bot commented Mar 24, 2026 •

edited

Loading