Skip to content

expression, planner: allow more MATCH AGAINST boolean queries | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts#67242

Merged
ti-chi-bot[bot] merged 1 commit intopingcap:feature/ftsfrom
winoros:mysql-match-against
Mar 24, 2026

Conversation

@winoros
Copy link
Copy Markdown
Member

@winoros winoros commented Mar 23, 2026

What problem does this PR solve?

Issue Number: ref #65626

Problem Summary:

MATCH ... AGAINST (... IN BOOLEAN MODE) still rejected some valid boolean queries during TiCI rewrite. Queries like hello world and +hello world returned TiDB only supports multiple terms with +/- modifiers.

What changed and how does it work?

  • Removed the extra BOOLEAN MODE validation that rejected multiple SHOULD clauses and mixed MUST/SHOULD clauses.
  • Rewrote pure SHOULD queries into OR-based FTS filters.
  • Kept MUST / MUST NOT filters when SHOULD terms appear in a no-score TiCI query, so optional terms no longer become mandatory filters.
  • Added regression coverage in expression rewrite tests and planner TiCI validation tests.

Check List

Tests

  • Unit test
    go test -run TestRewriteMySQLMatchAgainst --tags=intest ./pkg/expression
    go test -run TestTiCIMatchAgainstValidation --tags=intest ./pkg/planner/core/casetest/tici
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Allow more valid MATCH ... AGAINST BOOLEAN MODE queries during TiCI rewrite, including multiple SHOULD terms and mixed MUST/SHOULD queries.

Summary by CodeRabbit

  • Bug Fixes
    • Removed previous limitations on full-text search in BOOLEAN MODE. MATCH AGAINST now supports multi-term search patterns ("hello world", "+hello world") without requiring explicit +/- modifiers, improving alignment with standard MySQL full-text search functionality and expanding supported query types.

@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 23, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 23, 2026

📝 Walkthrough

Walkthrough

The pull request modifies full-text search functionality to lift restrictions on multi-term MATCH AGAINST boolean queries. Changes include conditional SHOULD clause handling in rewriteBooleanGroupToFTSExpr, removal of validation errors for previously unsupported query patterns, and updated test expectations to verify proper plan generation instead of error cases.

Changes

Cohort / File(s) Summary
FTS Boolean Group Rewriting
pkg/expression/fts_helper.go
Modified rewriteBooleanGroupToFTSExpr to conditionally include group.Should clauses based on group.Must state and simplified validateMySQLMatchAgainstBooleanGroup by removing prior TiDB-specific limitation checks for MUST/MUST NOT with SHOULD term combinations.
FTS Rewrite Test Expectations
pkg/expression/util_test.go
Updated TestRewriteMySQLMatchAgainst to expect successful rewrites for multi-term patterns like "hello world" and "+hello world" instead of errors, with assertions verifying correct FTSMatchWord AST structure and absence of FTSMysqlMatchAgainst nodes.
TiCI Plan Validation
pkg/planner/core/casetest/tici/tici_test.go
Converted negative assertions in TestTiCIMatchAgainstValidation to positive plan-content checks, validating that no-score and +-prefixed boolean queries generate expected or(...) and fts_match_word search function structures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

sig/planner, size/M, approved, lgtm

Suggested reviewers

  • AilinKid
  • wshwsh12
  • hawkingrei

Poem

🐰✨ A rabbit hops through search fields free,
No more restrictions on SHOULD we see!
Multi-terms now dance in boolean light,
Where MUST and SHOULD clauses unite
The query rewrites, oh what a sight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The PR description addresses the template requirements: it references issue #65626, explains the problem and changes clearly, marks required tests as completed (unit tests), and provides an appropriate release note for the user-facing behavior change.
Title check ✅ Passed The title accurately captures the main objective: allowing more valid MATCH AGAINST boolean queries in TiDB by removing overly restrictive validation and enabling OR-based FTS filters for SHOULD clauses.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/expression/fts_helper.go (1)

329-333: Make the SHOULD-pruning contract explicitly TiCI-specific.

Dropping group.Should here is right for the no-score TiCI filter path, but this helper sits behind a generic-looking RewriteMySQLMatchAgainstRecursively API in pkg/expression. Please make it explicit that this rule is TiCI-only and applies to the current boolean group, so it does not get reused later as a general MATCH rewrite.

Suggested wording
-	// TiCI currently executes MATCH ... AGAINST as a no-score filter. When the
-	// boolean query contains at least one MUST term, MySQL treats SHOULD terms as
-	// optional score contributors, which cannot be represented in the filter-only
-	// path. In that case we keep the MUST / MUST NOT filters and ignore SHOULD.
+	// TiCI currently executes MATCH ... AGAINST as a no-score filter.
+	// This is intentionally a TiCI-specific rule for the current boolean group:
+	// once the group contains at least one MUST term, MySQL treats SHOULD terms
+	// as optional score contributors. The filter-only path cannot represent that,
+	// so we keep MUST / MUST NOT filters and drop SHOULD terms here.

As per coding guidelines, "Code SHOULD remain maintainable for future readers with basic TiDB familiarity" and "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/expression/fts_helper.go` around lines 329 - 333, Update the comment
around includeShouldInFilter := len(group.Must) == 0 to explicitly mark the
SHOULD-pruning as a TiCI-specific behavior and local to the current boolean
group used by the TiCI no-score filter path; mention the rule does NOT change
the global semantics of RewriteMySQLMatchAgainstRecursively and only applies
when TiCI is executing MATCH ... AGAINST as a filter (so group.Should may be
dropped here but should not be assumed safe for other callers). Reference the
boolean group (group.Should / group.Must) and the
RewriteMySQLMatchAgainstRecursively API in the comment so future readers know
this is a TiCI-only, per-group optimization and not a general MATCH rewrite.
pkg/expression/util_test.go (1)

810-818: Add one mixed MUST/MUST NOT/SHOULD rewrite case.

The new checks cover SHOULD-only and MUST+SHOULD, but not the branch where SHOULD is dropped while MUST NOT is still preserved. A +hello -world foo case would pin the len(group.MustNot) path in rewriteBooleanGroupToFTSExpr and catch regressions in the startIdx slicing logic.

Minimal test addition
 expr, _, err = RewriteMySQLMatchAgainstRecursively(ctx, buildMatchAgainst("+hello world"), model.FullTextParserTypeStandardV1)
 require.NoError(t, err)
 require.False(t, hasMySQLMatchAgainst(expr))
 root, ok = expr.(*ScalarFunction)
 require.True(t, ok)
 require.Equal(t, ast.FTSMatchWord, root.FuncName.L)
 require.ElementsMatch(t, []ftsLeaf{
 	{funcName: ast.FTSMatchWord, query: "hello", columns: []string{"title"}},
 }, collectFTSLeaves(expr))

+expr, _, err = RewriteMySQLMatchAgainstRecursively(ctx, buildMatchAgainst("+hello -world foo"), model.FullTextParserTypeStandardV1)
+require.NoError(t, err)
+require.False(t, hasMySQLMatchAgainst(expr))
+require.ElementsMatch(t, []ftsLeaf{
+	{funcName: ast.FTSMatchWord, query: "hello", columns: []string{"title"}},
+	{funcName: ast.FTSMatchWord, query: "world", columns: []string{"title"}, underNot: true},
+}, collectFTSLeaves(expr))

As per coding guidelines, "Prefer extending existing test suites and fixtures over creating new scaffolding" and "Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/expression/util_test.go` around lines 810 - 818, Add a new test case in
util_test.go alongside the existing RewriteMySQLMatchAgainstRecursively checks
that calls RewriteMySQLMatchAgainstRecursively with buildMatchAgainst("+hello
-world foo"), asserts no error and require.False(hasMySQLMatchAgainst(expr)),
casts root to *ScalarFunction and checks root.FuncName (as in the nearby tests),
and then assert collectFTSLeaves(expr) contains both the MUST leaf for "hello"
and the MUST NOT leaf for "world" (mirroring the existing ftsLeaf expectations)
so the branch in rewriteBooleanGroupToFTSExpr that preserves MustNot while
dropping SHOULD is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/expression/fts_helper.go`:
- Around line 329-333: Update the comment around includeShouldInFilter :=
len(group.Must) == 0 to explicitly mark the SHOULD-pruning as a TiCI-specific
behavior and local to the current boolean group used by the TiCI no-score filter
path; mention the rule does NOT change the global semantics of
RewriteMySQLMatchAgainstRecursively and only applies when TiCI is executing
MATCH ... AGAINST as a filter (so group.Should may be dropped here but should
not be assumed safe for other callers). Reference the boolean group
(group.Should / group.Must) and the RewriteMySQLMatchAgainstRecursively API in
the comment so future readers know this is a TiCI-only, per-group optimization
and not a general MATCH rewrite.

In `@pkg/expression/util_test.go`:
- Around line 810-818: Add a new test case in util_test.go alongside the
existing RewriteMySQLMatchAgainstRecursively checks that calls
RewriteMySQLMatchAgainstRecursively with buildMatchAgainst("+hello -world foo"),
asserts no error and require.False(hasMySQLMatchAgainst(expr)), casts root to
*ScalarFunction and checks root.FuncName (as in the nearby tests), and then
assert collectFTSLeaves(expr) contains both the MUST leaf for "hello" and the
MUST NOT leaf for "world" (mirroring the existing ftsLeaf expectations) so the
branch in rewriteBooleanGroupToFTSExpr that preserves MustNot while dropping
SHOULD is exercised.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 72cd12a2-1a29-4367-a7fc-f8a540387344

📥 Commits

Reviewing files that changed from the base of the PR and between f3f0b09 and dbb2750.

📒 Files selected for processing (3)
  • pkg/expression/fts_helper.go
  • pkg/expression/util_test.go
  • pkg/planner/core/casetest/tici/tici_test.go

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 84.61538% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.1893%. Comparing base (bd3b320) to head (dbb2750).
⚠️ Report is 24 commits behind head on feature/fts.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           feature/fts     #67242        +/-   ##
===================================================
- Coverage      76.8610%   75.1893%   -1.6718%     
===================================================
  Files             1960       1936        -24     
  Lines           555677     557304      +1627     
===================================================
- Hits            427099     419033      -8066     
- Misses          127116     138265     +11149     
+ Partials          1462          6      -1456     
Flag Coverage Δ
integration 39.6923% <0.0000%> (-5.9171%) ⬇️
unit 74.2870% <84.6153%> (+0.2388%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 56.7974% <ø> (+0.0929%) ⬆️
parser ∅ <ø> (∅)
br 48.7657% <ø> (-17.4790%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Mar 24, 2026
@winoros winoros changed the title expression, planner: allow more MATCH AGAINST boolean queries expression, planner: allow more MATCH AGAINST boolean queries | tidb-test=13ccf8de48e8db2290ff884598444d0508606bbf tiflash=feature-fts Mar 24, 2026
@winoros
Copy link
Copy Markdown
Member Author

winoros commented Mar 24, 2026

/retest

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hawkingrei, wshwsh12
Once this PR has been reviewed and has the lgtm label, please assign xuhuaiyu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 24, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Mar 24, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-24 02:13:24.889855203 +0000 UTC m=+234400.925925453: ☑️ agreed by wshwsh12.
  • 2026-03-24 07:23:18.845986963 +0000 UTC m=+252994.882057213: ☑️ agreed by hawkingrei.

@winoros
Copy link
Copy Markdown
Member Author

winoros commented Mar 24, 2026

/retest

1 similar comment
@winoros
Copy link
Copy Markdown
Member Author

winoros commented Mar 24, 2026

/retest

@ti-chi-bot ti-chi-bot Bot merged commit 15ec875 into pingcap:feature/fts Mar 24, 2026
23 checks passed
@winoros winoros deleted the mysql-match-against branch March 24, 2026 18:45
@winoros
Copy link
Copy Markdown
Member Author

winoros commented Mar 24, 2026

/cherry-pick release-fts-202602

@ti-chi-bot
Copy link
Copy Markdown
Member

@winoros: new pull request created to branch release-fts-202602: #67266.

Details

In response to this:

/cherry-pick release-fts-202602

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

winoros added a commit that referenced this pull request Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants