Fixes #27158: ingestion slowdown from tag_usage seq-scan on Postgres #27745
sonika-shah wants to merge 1 commit into main
force-pushed from 6243f22 to 5e01f10
Pull request overview
Restores efficient Postgres execution for tag_usage prefix-LIKE lookups by reintroducing a usable index for the current query shape and removing the brittle coupling between query predicates and partial index predicates.
Changes:
- Add a non-partial btree index on `tag_usage.targetfqnhash_lower` using `text_pattern_ops` to serve prefix `LIKE` queries.
- Rebuild the existing `tag_usage` partial indexes (previously `WHERE state = 1`) as non-partial indexes to avoid future predicate-coupling regressions.
- Rebuild the existing `gin_tag_usage_targetfqn_trgm` index without the partial predicate.
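As a rough sketch of the first change, the new index might be created along these lines. The index name is the one that appears in the EXPLAIN output quoted in the PR description; the exact statement in the migration file is an assumption and may differ.

```sql
-- Sketch only; the actual migration statement may differ.
-- text_pattern_ops lets a btree serve left-anchored LIKE 'prefix%' lookups
-- even when the database collation is not "C".
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_tag_usage_targetfqnhash_lower_pattern
    ON tag_usage (targetfqnhash_lower text_pattern_ops);
```

Because the index carries no `WHERE` clause, the planner can consider it for any prefix lookup on `targetfqnhash_lower`, regardless of which other predicates the query includes.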
The 1.11.0 perf migration (#23054) added four `WHERE state = 1` partial indexes on tag_usage; #24063 dropped the matching `state = 1` predicate from getTagsInternalByPrefix (Suggested-state rows are valid for both classification and glossary derivation), leaving every partial index inapplicable. Postgres fell back to a parallel seq scan; MySQL was unaffected because its 1.11.0 indexes were never partial.

Adds a non-partial single-col btree on targetfqnhash_lower (mirrors MySQL's idx_targetfqnhash_lower) and rebuilds the four partials as non-partial: same shape, same INCLUDE columns, with the predicate coupling removed so future query changes can't silently invalidate them.

Verified end-to-end against a local Postgres with 50k rows: seq scan reproduced before the fix (matches reporter's EXPLAIN), bitmap index scan after, for both the inline and prepared-statement paths.
force-pushed from 9520cc4 to bc73c29
Gitar Code Review ✅ Approved
Restores the prefix-LIKE index on the tag_usage table to resolve performance regressions. No issues found.
-- Issue #27158: tag_usage seq-scan on Postgres. #24063 dropped the
-- `state = 1` predicate that 1.11.0's partial indexes required.
-- Fix: add a single-col index, and drop the `WHERE state = 1` filter
-- from the existing partials so query changes can't invalidate them.
PR description references applying the fix under native/2.0.1/postgres/schemaChanges.sql, but the actual change is introduced in native/1.12.8/postgres/schemaChanges.sql. Please align the PR description (and/or title) with the versioned migration directory that is actually being modified to avoid confusion during verification/rollout.
🔴 Playwright Results: 1 failure(s), 18 flaky
✅ 3954 passed · ❌ 1 failed · 🟡 18 flaky · ⏭️ 86 skipped
Genuine Failures (failed on all attempts) ❌
Fixes #27158
Summary
`getTagsInternalByPrefix` parallel seq-scans `tag_usage` on Postgres, causing RDS CPU spikes during ingestion (#27158).
Cause: 1.11.0 perf migration (#23054) added four partial indexes on `tag_usage` filtered `WHERE state = 1`. #24063 dropped the matching `AND tu.state = 1` from the query (Suggested rows are valid for both classification and glossary derivation), leaving every partial index inapplicable. MySQL was unaffected because its 1.11.0 indexes were never partial (no partial-index syntax in MySQL).
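The coupling described above can be illustrated with a minimal sketch. The index name, the `'abc%'` prefix, and the `SELECT *` column list are hypothetical and only stand in for the real 1.11.0 indexes and the real query.

```sql
-- Sketch only: idx_example_partial is a hypothetical stand-in for one of
-- the 1.11.0 partial indexes.
CREATE INDEX idx_example_partial
    ON tag_usage (targetfqnhash_lower text_pattern_ops)
    WHERE state = 1;

-- Old query shape: the WHERE clause implies the index predicate (state = 1),
-- so the planner is allowed to use the partial index.
SELECT * FROM tag_usage tu
WHERE tu.targetfqnhash_lower LIKE 'abc%'
  AND tu.state = 1;

-- Query shape after #24063: nothing implies state = 1, so rows outside the
-- partial index could match and the planner cannot use it.
SELECT * FROM tag_usage tu
WHERE tu.targetfqnhash_lower LIKE 'abc%';
```

Postgres only considers a partial index when it can prove the query's conditions imply the index predicate, which is why removing a single `AND` clause was enough to disable all four indexes at once.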
Fix
`bootstrap/sql/migrations/native/1.12.8/postgres/schemaChanges.sql`:
- Adds a non-partial single-column btree on `targetfqnhash_lower` (mirrors MySQL's `idx_targetfqnhash_lower`), which serves prefix-LIKE queries with no `source` predicate.
- Rebuilds the four partial indexes as non-partial, with the same shape and the same INCLUDE columns; only `WHERE state = 1` is removed so predicate changes can't silently invalidate them again. No current `tag_usage` query filters `state = 1`, so this is purely additive in coverage.

All DDL is `CONCURRENTLY` and idempotent. The new single-col index is created first so `getTagsInternalByPrefix` stays served during the composite rebuilds. No Java/query change. No MySQL change (1.12.8/mysql holds placeholders only; MySQL's 1.11.0 indexes were already non-partial).
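The rebuild step might look roughly like the following for one of the composite indexes. The index name, key columns, and INCLUDE list here are hypothetical placeholders, not the ones in the migration; only the drop-then-recreate pattern is the point.

```sql
-- Sketch only: index name, key columns, and INCLUDE list are hypothetical.
-- Assumed to have run already: the new single-column index exists, so
-- prefix lookups stay served while this composite index is rebuilt.

-- Drop the partial index; IF EXISTS keeps a re-run from failing.
DROP INDEX CONCURRENTLY IF EXISTS idx_tag_usage_example_composite;

-- Recreate the same shape without the WHERE state = 1 predicate.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_tag_usage_example_composite
    ON tag_usage (source, targetfqnhash_lower)
    INCLUDE (tagfqn, labeltype, state);
```

One reason to drop before creating, rather than relying on `IF NOT EXISTS` alone, is that an interrupted `CREATE INDEX CONCURRENTLY` can leave an invalid index behind under the same name; the drop clears it on the next run. Neither statement can run inside a transaction block.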
Verification
50k synthetic rows in local Postgres:
- Before: `Seq Scan` (`Rows Removed by Filter: 49010`)
- After: `Bitmap Index Scan on idx_tag_usage_targetfqnhash_lower_pattern`
- `LIKE LOWER($1)` → `~>=~` / `~<~` range
Test plan
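A check of this kind could be reproduced with something like the sketch below. The `'abc%'` prefix, the `SELECT *` column list, and the prepared-statement name are assumptions for illustration; the real query issued by `getTagsInternalByPrefix` selects specific columns.

```sql
-- Sketch only: 'abc%' is a placeholder prefix, and SELECT * stands in for
-- the real column list used by getTagsInternalByPrefix.

-- Inline path.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tag_usage tu
WHERE tu.targetfqnhash_lower LIKE 'abc%';

-- Prepared-statement path; tags_by_prefix is a hypothetical name.
PREPARE tags_by_prefix(text) AS
SELECT * FROM tag_usage tu
WHERE tu.targetfqnhash_lower LIKE LOWER($1);

EXPLAIN (ANALYZE, BUFFERS) EXECUTE tags_by_prefix('abc%');
DEALLOCATE tags_by_prefix;
```

Running both statements before and after the migration should show whether the plan node changes from a sequential scan to an index-based scan; the row counts and timings will depend on the data loaded.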
Summary by Gitar
- `pg_trgm` extension to support GIN indexing for partial string matching.
- `gin_tag_usage_targetfqn_trgm` GIN index on `targetFQNHash` to improve search performance.