Skip to content

fix(profiler): N+1 / missing-index regression on /tables/.../columns?fields=profile (#3488)#27746

Open
sonika-shah wants to merge 1 commit intomainfrom
fix/profiler-n1-missing-index-3488
Open

fix(profiler): N+1 / missing-index regression on /tables/.../columns?fields=profile (#3488)#27746
sonika-shah wants to merge 1 commit intomainfrom
fix/profiler-n1-missing-index-3488

Conversation

@sonika-shah
Copy link
Copy Markdown
Collaborator

Fixes collate#3488
Related to / extends #26855

Root Cause

The 1.9.9 migration introduced two separate index regressions on profiler_data_time_series that weren't addressed by #26855 (which fixed tag N+1 on the list-tables path but left the profiler index and customMetrics N+1 on the /tables/name/{fqn}/columns sub-path).

1. PostgreSQL: unique constraint dropped, never recreated

bootstrap/sql/migrations/native/1.9.9/postgres/schemaChanges.sql explicitly dropped profiler_data_time_series_unique_hash_extension_ts (entityFQNHash, extension, operation, timestamp) to allow changing the generated operation column expression, but never recreated it. After the migration only the (extension, timestamp) index remained — useless for queries that filter by entityFQNHash.

2. MySQL + PostgreSQL: all temporary indexes dropped at end of postDataMigration

bootstrap/sql/migrations/native/1.9.9/mysql(postgres)/postDataMigrationSQLScript.sql created idx_pdts_entityFQNHash, idx_pdts_composite, and related indexes for its bulk UPDATE pass, then dropped all of them at the end of the migration, including the only index covering entityFQNHash.

Impact

The batch query in getLatestExtensionsBatch(), called whenever fields=profile is requested on the columns endpoint:

SELECT entityFQNHash, MAX(timestamp) FROM profiler_data_time_series
WHERE entityFQNHash IN (...50 hashes...) AND extension = 'table.columnProfile'
GROUP BY entityFQNHash

needs an (entityFQNHash, extension, timestamp) index. Without it the database performs a full table scan. On production deployments with millions of profiler rows this caused 106 770 ms response time (99% DB; 93 dbOps). The same endpoint without profile in fields returns in ~150-220 ms.

Secondary: customMetrics N+1

getTableColumnsInternal called getCustomMetrics(table, column.getName()) once per paginated column (up to 50 calls), each fetching the same full set of custom metrics for the table and filtering in Java. This is N identical queries instead of 1.

Secondary: duplicate populateEntityFieldTags

When both tags and profile are in the fields param, populateEntityFieldTags was called twice — once in the tags block and again unconditionally inside the profile block.

Changes

File Change
bootstrap/sql/migrations/native/2.0.2/mysql/schemaChanges.sql CREATE INDEX IF NOT EXISTS idx_pdts_fqnhash_ext_ts ON profiler_data_time_series(entityFQNHash, extension, timestamp)
bootstrap/sql/migrations/native/2.0.2/postgres/schemaChanges.sql Same index for PostgreSQL
TableRepository.java Batch customMetrics fetch: one query for all column metrics, grouped in Java
TableRepository.java Skip duplicate populateEntityFieldTags call when tags already fetched
TableResourceIT.java Regression test: test_getColumnsWithProfileField_correctnessAndNoBatchRegression — verifies profile data correctness for fields=profile, fields=tags,customMetrics,extension,profile, and fields=tags,profile within a 30s timeout

Migration Note

CREATE INDEX IF NOT EXISTS is idempotent — safe on all upgrade paths including fresh installs and systems that already ran 1.9.9.

Test Plan

  • Regression test covers all three field combinations (profile, tags,customMetrics,extension,profile, tags,profile)
  • Verifies column profiles returned correctly (min/max values, null for unprofile columns)
  • Verifies tags populated when both tags and profile are requested
  • mvn spotless:apply passes clean
  • Migration files use IF NOT EXISTS guard

Generated by Claude Code

…fields=profile (#3488)

Root cause
----------
The 1.9.9 migration introduced two separate index regressions on
`profiler_data_time_series`:

1. **PostgreSQL**: `schemaChanges.sql` explicitly dropped the unique
   constraint `profiler_data_time_series_unique_hash_extension_ts`
   (entityFQNHash, extension, operation, timestamp) to allow altering the
   generated `operation` column expression, but never recreated it.  After
   the migration the table kept only the `(extension, timestamp)` index,
   which is useless for queries filtering by `entityFQNHash`.

2. **MySQL/both**: `postDataMigrationSQLScript.sql` created temporary indexes
   (idx_pdts_entityFQNHash, idx_pdts_composite, etc.) for its bulk UPDATE
   pass and then dropped **all** of them, including the only index covering
   `entityFQNHash`.

The batch query issued by `getLatestExtensionsBatch()` when
`fields=profile` is requested:

  SELECT entityFQNHash, MAX(timestamp) FROM profiler_data_time_series
  WHERE entityFQNHash IN (...N hashes...) AND extension = 'table.columnProfile'
  GROUP BY entityFQNHash

required an `(entityFQNHash, extension, timestamp)` index.  Without it the
database performs a full table scan.  On production deployments with
millions of profiler rows this caused 100+ second response times (Grafana:
106 770 ms; 99 % in DB; 93 dbOps).  Without `profile` in the fields param
the same endpoint returned in ~150-220 ms.

A secondary N+1 bug existed independently of the index: `customMetrics`
in fields called `getCustomMetrics(table, column)` once per paginated
column, issuing up to N identical queries against `entity_extension` and
then filtering in Java.

Fix
---
* **migration 2.0.2** (MySQL + PostgreSQL): `CREATE INDEX IF NOT EXISTS
  idx_pdts_fqnhash_ext_ts ON profiler_data_time_series(entityFQNHash,
  extension, timestamp)`.  The `IF NOT EXISTS` guard makes the migration
  safe to re-run and handles both upgrade and fresh-install paths.

* **`getTableColumnsInternal`** — `customMetrics` block: fetch all column
  custom metrics for the table in one query, group by column name in Java,
  then distribute.  Reduces N queries to 1.

* **`getTableColumnsInternal`** — `profile` block: skip the duplicate
  `populateEntityFieldTags` call when `tags` was already fetched earlier in
  the same request, saving one prefix-scan on `tag_usage` per request.

Related: PR #26855 (fixed N+1 tag queries on the list-tables path but left
the profiler-index and customMetrics N+1 untouched on the columns sub-path).
Copilot AI review requested due to automatic review settings April 26, 2026 20:55
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 26, 2026
@sonika-shah
Copy link
Copy Markdown
Collaborator Author

Fix Summary

Root cause confirmed

The 106-second latency is caused by a full table scan on profiler_data_time_series introduced by the 1.9.9 migration.

PostgreSQL path (most severe)
1.9.9/postgres/schemaChanges.sql drops profiler_data_time_series_unique_hash_extension_ts (entityFQNHash, extension, operation, timestamp) to allow changing the generated operation column expression. The constraint is never recreated. After the migration the table has only profiler_data_time_series_combined_id_ts (extension, timestamp) — useless for predicates filtering by entityFQNHash.

Both databases
1.9.9/*/postDataMigrationSQLScript.sql creates idx_pdts_entityFQNHash, idx_pdts_composite, etc. for its bulk UPDATE pass, then drops all of them at the end (lines 167-171 in MySQL, equivalent in Postgres), leaving no index on entityFQNHash.

The net effect: getLatestExtensionsBatch() runs

SELECT entityFQNHash, MAX(timestamp) FROM profiler_data_time_series
WHERE entityFQNHash IN (...50 hashes...) AND extension = 'table.columnProfile'
GROUP BY entityFQNHash

with a full-table scan on every request that includes fields=profile.

What this PR adds / changes

Details
Migration 2.0.2 CREATE INDEX IF NOT EXISTS idx_pdts_fqnhash_ext_ts ON profiler_data_time_series(entityFQNHash, extension, timestamp) for both MySQL and PostgreSQL
customMetrics N+1 fix Was: N calls to getCustomMetrics(table, col) each fetching all metrics then filtering in Java → Now: 1 query, grouped in Java
Duplicate tag fetch fix populateEntityFieldTags was called twice when both tags and profile were in fields → skip the second call

Expected timing improvement (from production symptom analysis)

Scenario Before After (expected)
fields=profile (50 cols, large table) ~106 770 ms ~50-200 ms (batch index scan)
fields=tags,customMetrics,extension,profile ~106 770 ms ~50-200 ms
fields=tags,customMetrics,extension (no profile) ~150-220 ms ~150-220 ms (no regression)

Note: local reproduction requires a large profiler_data_time_series table (millions of rows) to surface the index-scan vs. full-scan difference. The regression test verifies correctness and uses a 30s timeout as a safety net.

Migration ID added

2.0.2bootstrap/sql/migrations/native/2.0.2/{mysql,postgres}/schemaChanges.sql


Generated by Claude Code

Comment on lines +18 to +19
CREATE INDEX IF NOT EXISTS idx_pdts_fqnhash_ext_ts
ON profiler_data_time_series (entityFQNHash, extension, timestamp);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Bug: MySQL migration uses unsupported IF NOT EXISTS for CREATE INDEX

The MySQL migration uses CREATE INDEX IF NOT EXISTS, but this syntax is not supported in MySQL prior to 8.0.29. No other MySQL migration in the project uses this pattern — all 148 other CREATE INDEX statements in mysql/*.sql files use plain CREATE INDEX without IF NOT EXISTS. The project's docker-compose pins mysql:8.0, which today resolves to ≥8.0.29, but users running older 8.0.x versions (e.g., 8.0.28 or below) will get a syntax error and the migration will fail, blocking the upgrade.

PostgreSQL correctly supports CREATE INDEX IF NOT EXISTS in all supported versions, so the Postgres file is fine.

Suggested fix:

-- Use a conditional check via stored procedure,
-- or match existing project pattern with plain CREATE INDEX
-- wrapped in a procedure that tolerates 'duplicate key' error:
SET @index_exists = (SELECT COUNT(1) FROM information_schema.STATISTICS
  WHERE TABLE_SCHEMA = DATABASE()
    AND TABLE_NAME = 'profiler_data_time_series'
    AND INDEX_NAME = 'idx_pdts_fqnhash_ext_ts');
SET @sql = IF(@index_exists = 0,
  'CREATE INDEX idx_pdts_fqnhash_ext_ts ON profiler_data_time_series (entityFQNHash, extension, timestamp)',
  'SELECT 1');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

}
for (Column column : paginatedColumns) {
column.setCustomMetrics(getCustomMetrics(table, column.getName()));
column.setCustomMetrics(metricsByColumn.getOrDefault(column.getName(), new ArrayList<>()));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Unnecessary ArrayList allocation per column without metrics

getOrDefault(column.getName(), new ArrayList<>()) allocates a new ArrayList for every column that has no custom metrics. Since the returned list is only set on the column and never mutated afterwards, an immutable empty list would avoid unnecessary allocations.

Suggested fix:

column.setCustomMetrics(
    metricsByColumn.getOrDefault(column.getName(), List.of()));

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Apr 26, 2026

Code Review 🚫 Blocked 0 resolved / 2 findings

Adds index and query optimizations for the profiler API but contains an unsupported MySQL migration syntax and redundant ArrayList allocations.

🚨 Bug: MySQL migration uses unsupported IF NOT EXISTS for CREATE INDEX

📄 bootstrap/sql/migrations/native/2.0.2/mysql/schemaChanges.sql:18-19

The MySQL migration uses CREATE INDEX IF NOT EXISTS, but this syntax is not supported in MySQL prior to 8.0.29. No other MySQL migration in the project uses this pattern — all 148 other CREATE INDEX statements in mysql/*.sql files use plain CREATE INDEX without IF NOT EXISTS. The project's docker-compose pins mysql:8.0, which today resolves to ≥8.0.29, but users running older 8.0.x versions (e.g., 8.0.28 or below) will get a syntax error and the migration will fail, blocking the upgrade.

PostgreSQL correctly supports CREATE INDEX IF NOT EXISTS in all supported versions, so the Postgres file is fine.

Suggested fix
-- Use a conditional check via stored procedure,
-- or match existing project pattern with plain CREATE INDEX
-- wrapped in a procedure that tolerates 'duplicate key' error:
SET @index_exists = (SELECT COUNT(1) FROM information_schema.STATISTICS
  WHERE TABLE_SCHEMA = DATABASE()
    AND TABLE_NAME = 'profiler_data_time_series'
    AND INDEX_NAME = 'idx_pdts_fqnhash_ext_ts');
SET @sql = IF(@index_exists = 0,
  'CREATE INDEX idx_pdts_fqnhash_ext_ts ON profiler_data_time_series (entityFQNHash, extension, timestamp)',
  'SELECT 1');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
💡 Quality: Unnecessary ArrayList allocation per column without metrics

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java:2908

getOrDefault(column.getName(), new ArrayList<>()) allocates a new ArrayList for every column that has no custom metrics. Since the returned list is only set on the column and never mutated afterwards, an immutable empty list would avoid unnecessary allocations.

Suggested fix
column.setCustomMetrics(
    metricsByColumn.getOrDefault(column.getName(), List.of()));
🤖 Prompt for agents
Code Review: Adds index and query optimizations for the profiler API but contains an unsupported MySQL migration syntax and redundant ArrayList allocations.

1. 🚨 Bug: MySQL migration uses unsupported `IF NOT EXISTS` for CREATE INDEX
   Files: bootstrap/sql/migrations/native/2.0.2/mysql/schemaChanges.sql:18-19

   The MySQL migration uses `CREATE INDEX IF NOT EXISTS`, but this syntax is **not supported in MySQL prior to 8.0.29**. No other MySQL migration in the project uses this pattern — all 148 other `CREATE INDEX` statements in `mysql/*.sql` files use plain `CREATE INDEX` without `IF NOT EXISTS`. The project's docker-compose pins `mysql:8.0`, which today resolves to ≥8.0.29, but users running older 8.0.x versions (e.g., 8.0.28 or below) will get a syntax error and the migration will fail, blocking the upgrade.
   
   PostgreSQL correctly supports `CREATE INDEX IF NOT EXISTS` in all supported versions, so the Postgres file is fine.

   Suggested fix:
   -- Use a conditional check via stored procedure,
   -- or match existing project pattern with plain CREATE INDEX
   -- wrapped in a procedure that tolerates 'duplicate key' error:
   SET @index_exists = (SELECT COUNT(1) FROM information_schema.STATISTICS
     WHERE TABLE_SCHEMA = DATABASE()
       AND TABLE_NAME = 'profiler_data_time_series'
       AND INDEX_NAME = 'idx_pdts_fqnhash_ext_ts');
   SET @sql = IF(@index_exists = 0,
     'CREATE INDEX idx_pdts_fqnhash_ext_ts ON profiler_data_time_series (entityFQNHash, extension, timestamp)',
     'SELECT 1');
   PREPARE stmt FROM @sql;
   EXECUTE stmt;
   DEALLOCATE PREPARE stmt;

2. 💡 Quality: Unnecessary ArrayList allocation per column without metrics
   Files: openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java:2908

   `getOrDefault(column.getName(), new ArrayList<>())` allocates a new `ArrayList` for every column that has no custom metrics. Since the returned list is only set on the column and never mutated afterwards, an immutable empty list would avoid unnecessary allocations.

   Suggested fix:
   column.setCustomMetrics(
       metricsByColumn.getOrDefault(column.getName(), List.of()));

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Addresses a performance regression on the /tables/.../columns?fields=profile path by restoring an efficient composite index for profiler lookups and removing N+1 behavior when loading column customMetrics.

Changes:

  • Adds a composite index on profiler_data_time_series(entityFQNHash, extension, timestamp) for PostgreSQL and MySQL migrations.
  • Updates TableRepository#getTableColumnsInternal to fetch all column custom-metrics in one query and avoid duplicate tag population when tags is already requested.
  • Adds an integration regression test covering fields=profile and combinations with tags/customMetrics/extension.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java Batches column custom-metrics retrieval and avoids duplicate tag population when profile is requested alongside tags.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java Adds regression coverage for columns API with fields=profile and related combinations.
bootstrap/sql/migrations/native/2.0.2/postgres/schemaChanges.sql Adds composite index to prevent profiler full-table scans on PostgreSQL.
bootstrap/sql/migrations/native/2.0.2/mysql/schemaChanges.sql Adds composite index intended to prevent profiler full-table scans on MySQL.

Comment on lines +18 to +19
CREATE INDEX IF NOT EXISTS idx_pdts_fqnhash_ext_ts
ON profiler_data_time_series (entityFQNHash, extension, timestamp);
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MySQL does not support CREATE INDEX IF NOT EXISTS syntax (unlike PostgreSQL). Existing MySQL migrations use a conditional information_schema.statistics check + prepared statement to keep index creation idempotent (e.g., bootstrap/sql/migrations/native/1.13.0/mysql/schemaChanges.sql:27-40). This statement will fail on MySQL during migration; please rewrite it using the established conditional pattern (or equivalent idempotent ALTER TABLE logic).

Suggested change
CREATE INDEX IF NOT EXISTS idx_pdts_fqnhash_ext_ts
ON profiler_data_time_series (entityFQNHash, extension, timestamp);
SET @sql = (
SELECT IF(
EXISTS(
SELECT 1
FROM information_schema.statistics
WHERE table_schema = DATABASE()
AND table_name = 'profiler_data_time_series'
AND index_name = 'idx_pdts_fqnhash_ext_ts'
),
'SELECT 1',
'CREATE INDEX idx_pdts_fqnhash_ext_ts ON profiler_data_time_series (entityFQNHash, extension, timestamp)'
)
);
PREPARE statement FROM @sql;
EXECUTE statement;
DEALLOCATE PREPARE statement;

Copilot uses AI. Check for mistakes.
}
for (Column column : paginatedColumns) {
column.setCustomMetrics(getCustomMetrics(table, column.getName()));
column.setCustomMetrics(metricsByColumn.getOrDefault(column.getName(), new ArrayList<>()));
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metricsByColumn.getOrDefault(column.getName(), new ArrayList<>()) allocates a new list for every column without metrics. Prefer Collections.emptyList() (or a shared constant empty list) to avoid unnecessary allocations and to signal the list is intentionally empty.

Suggested change
column.setCustomMetrics(metricsByColumn.getOrDefault(column.getName(), new ArrayList<>()));
column.setCustomMetrics(
metricsByColumn.getOrDefault(column.getName(), Collections.emptyList()));

Copilot uses AI. Check for mistakes.
Comment on lines +5969 to +5975
// Verify all four field combinations don't regress:
// (a) fields=profile — profile data returned, no full-table-scan on profiler_data_time_series
TableColumnList withProfile =
assertTimeout(
Duration.ofSeconds(30),
() -> client.tables().getColumns(table.getId(), "profile"),
"columns?fields=profile should complete within 30s");
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test comment says it verifies “all four field combinations”, but this test exercises three calls: profile, tags,customMetrics,extension,profile, and tags,profile. Please correct the comment (or add the missing 4th combination) so the test documentation matches what’s actually asserted.

Copilot uses AI. Check for mistakes.
Comment on lines +6018 to +6023
// (c) fields=tags,profile — duplicate populateEntityFieldTags must not run twice
TableColumnList withTagsAndProfile =
assertTimeout(
Duration.ofSeconds(30),
() -> client.tables().getColumns(table.getId(), "tags,profile"),
"columns?fields=tags,profile should complete within 30s");
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims this call validates that duplicate populateEntityFieldTags “must not run twice”, but the assertions only check correctness of returned tags/profile (no assertion around number of calls/queries). Please reword the comment to describe what is actually verified, or add an assertion/metric that specifically detects the duplicate-tag population regression.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

🟡 Playwright Results — all passed (14 flaky)

✅ 3959 passed · ❌ 0 failed · 🟡 14 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 297 0 2 4
🟡 Shard 2 755 0 4 8
🟡 Shard 3 730 0 2 7
🟡 Shard 4 758 0 1 18
✅ Shard 5 687 0 0 41
🟡 Shard 6 732 0 5 8
🟡 14 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
  • Features/DataQuality/TestLibrary.spec.ts › should maintain page on edit and reset to first page on delete (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 4, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Create and Delete user (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants