Skip to content

fix: external role duplicate rows#1890

Merged
moshloop merged 2 commits intomainfrom
fix-ext-role
Apr 16, 2026
Merged

fix: external role duplicate rows#1890
moshloop merged 2 commits intomainfrom
fix-ext-role

Conversation

@yashmehrotra
Copy link
Copy Markdown
Member

@yashmehrotra yashmehrotra commented Apr 16, 2026

Summary by CodeRabbit

  • Bug Fixes
    • Remove stale duplicate external role records created prior to a migration cutoff, improving data consistency and system reliability. The cleanup targets legacy entries left in an incomplete state and uses migration history to determine the safe cutoff for removal.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 16, 2026

Benchstat (RLS)

Base: a1bbbec68c7cdb747816c5c54868f5bdca5667b0
Head: 189df1160b180c7d8d3850947e1f3bbe7eca72e9

📊 7 minor regression(s) (all within 5% threshold)

Benchmark Base Head Change p-value
RLS/Sample-15000/config_summary/Without_RLS-4 40.88m 42.42m +3.79% 0.002
RLS/Sample-15000/change_types/With_RLS-4 3.403m 3.456m +1.56% 0.015
RLS/Sample-15000/configs/Without_RLS-4 4.897m 4.961m +1.29% 0.004
RLS/Sample-15000/config_names/Without_RLS-4 8.720m 8.803m +0.94% 0.009
RLS/Sample-15000/config_summary/With_RLS-4 490.4m 494.5m +0.85% 0.002
RLS/Sample-15000/config_names/With_RLS-4 84.84m 85.38m +0.64% 0.002
RLS/Sample-15000/analysis_types/Without_RLS-4 2.594m 2.609m +0.60% 0.041
✅ 2 improvement(s)
Benchmark Base Head Change p-value
RLS/Sample-15000/catalog_changes/With_RLS-4 90.03m 87.82m -2.45% 0.002
RLS/Sample-15000/config_classes/With_RLS-4 85.56m 84.83m -0.85% 0.002
Full benchstat output
goos: linux
goarch: amd64
pkg: github.com/flanksource/duty/bench
cpu: AMD EPYC 9V74 80-Core Processor                
                                               │ bench-base.txt │          bench-head.txt           │
                                               │     sec/op     │   sec/op     vs base              │
RLS/Sample-15000/catalog_changes/Without_RLS-4      3.389m ± 1%   3.404m ± 2%       ~ (p=0.699 n=6)
RLS/Sample-15000/catalog_changes/With_RLS-4         90.03m ± 1%   87.82m ± 1%  -2.45% (p=0.002 n=6)
RLS/Sample-15000/config_changes/Without_RLS-4       3.376m ± 1%   3.405m ± 1%       ~ (p=0.240 n=6)
RLS/Sample-15000/config_changes/With_RLS-4          87.78m ± 0%   87.70m ± 0%       ~ (p=0.132 n=6)
RLS/Sample-15000/config_detail/Without_RLS-4        2.643m ± 1%   2.639m ± 1%       ~ (p=0.394 n=6)
RLS/Sample-15000/config_detail/With_RLS-4           84.02m ± 1%   84.51m ± 0%       ~ (p=0.093 n=6)
RLS/Sample-15000/config_names/Without_RLS-4         8.720m ± 1%   8.803m ± 1%  +0.94% (p=0.009 n=6)
RLS/Sample-15000/config_names/With_RLS-4            84.84m ± 1%   85.38m ± 0%  +0.64% (p=0.002 n=6)
RLS/Sample-15000/config_summary/Without_RLS-4       40.88m ± 1%   42.42m ± 1%  +3.79% (p=0.002 n=6)
RLS/Sample-15000/config_summary/With_RLS-4          490.4m ± 0%   494.5m ± 1%  +0.85% (p=0.002 n=6)
RLS/Sample-15000/configs/Without_RLS-4              4.897m ± 1%   4.961m ± 1%  +1.29% (p=0.004 n=6)
RLS/Sample-15000/configs/With_RLS-4                 84.75m ± 0%   84.10m ± 1%       ~ (p=0.093 n=6)
RLS/Sample-15000/analysis_types/Without_RLS-4       2.594m ± 1%   2.609m ± 2%  +0.60% (p=0.041 n=6)
RLS/Sample-15000/analysis_types/With_RLS-4          2.628m ± 1%   2.640m ± 1%       ~ (p=0.699 n=6)
RLS/Sample-15000/analyzer_types/Without_RLS-4       2.483m ± 1%   2.488m ± 2%       ~ (p=0.065 n=6)
RLS/Sample-15000/analyzer_types/With_RLS-4          2.495m ± 1%   2.519m ± 4%       ~ (p=0.065 n=6)
RLS/Sample-15000/change_types/Without_RLS-4         3.428m ± 1%   3.429m ± 1%       ~ (p=0.818 n=6)
RLS/Sample-15000/change_types/With_RLS-4            3.403m ± 1%   3.456m ± 2%  +1.56% (p=0.015 n=6)
RLS/Sample-15000/config_classes/Without_RLS-4       2.172m ± 3%   2.191m ± 1%       ~ (p=0.485 n=6)
RLS/Sample-15000/config_classes/With_RLS-4          85.56m ± 1%   84.83m ± 0%  -0.85% (p=0.002 n=6)
RLS/Sample-15000/config_types/Without_RLS-4         2.618m ± 1%   2.624m ± 1%       ~ (p=0.394 n=6)
RLS/Sample-15000/config_types/With_RLS-4            84.91m ± 3%   84.92m ± 0%       ~ (p=0.699 n=6)
geomean                                             12.86m        12.91m       +0.45%

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a0476b8-2268-4e42-89a3-153c1562f8d8

📥 Commits

Reviewing files that changed from the base of the PR and between 1f297ed and 189df11.

📒 Files selected for processing (1)
  • views/047_external_role_duplicate_rows_fix.sql
✅ Files skipped from review due to trivial changes (1)
  • views/047_external_role_duplicate_rows_fix.sql

Walkthrough

Added a SQL migration that deletes rows from external_roles where array_length(aliases, 1) = 1, updated_at IS NULL, role_type is ClusterRole or Role, and created_at is earlier than a cutoff computed as COALESCE((SELECT MAX(updated_at) FROM migration_logs), NOW()).

Changes

Cohort / File(s) Summary
SQL Migration
views/047_external_role_duplicate_rows_fix.sql
New migration deleting external_roles rows with single-element aliases, updated_at = NULL, role_type in (ClusterRole, Role), and created_at before cutoff derived from migration_logs.updated_at or NOW().
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: external role duplicate rows' directly and clearly describes the main change—fixing duplicate rows in the external role table, which matches the SQL migration's purpose.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-ext-role
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix-ext-role

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 16, 2026

Benchstat (Other)

Base: a1bbbec68c7cdb747816c5c54868f5bdca5667b0
Head: 189df1160b180c7d8d3850947e1f3bbe7eca72e9

✅ No significant performance changes detected

Full benchstat output
goos: linux
goarch: amd64
pkg: github.com/flanksource/duty/bench
cpu: AMD EPYC 7763 64-Core Processor                
                                                       │ bench-base.txt │           bench-head.txt           │
                                                       │     sec/op     │    sec/op     vs base              │
InsertionForRowsWithAliases/external_users.aliases-4       578.4µ ±  1%   579.3µ ±  7%       ~ (p=0.937 n=6)
InsertionForRowsWithAliases/config_items.external_id-4     1.092m ± 11%   1.096m ± 10%       ~ (p=0.937 n=6)
ResourceSelectorConfigs/name-4                             218.8µ ±  2%   219.1µ ±  2%       ~ (p=0.937 n=6)
ResourceSelectorConfigs/name_and_type-4                    238.8µ ±  3%   231.7µ ±  3%       ~ (p=0.065 n=6)
ResourceSelectorConfigs/tags-4                             29.40m ±  6%   28.83m ±  4%       ~ (p=0.589 n=6)
ResourceSelectorQueryBuild/name-4                          42.72µ ±  5%   42.33µ ±  1%       ~ (p=0.180 n=6)
ResourceSelectorQueryBuild/name_and_type-4                 62.30µ ±  1%   62.84µ ±  1%       ~ (p=0.132 n=6)
ResourceSelectorQueryBuild/tags-4                          16.77µ ±  1%   16.84µ ±  2%       ~ (p=0.190 n=6)
geomean                                                    284.8µ         283.4µ        -0.50%

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
functions/external_role_duplicate_rows_fix.sql (1)

1-11: Consider soft delete and explicit transaction handling.

The current implementation uses a hard DELETE, which permanently removes data and loses the audit trail. The external_roles table has a deleted_at column (per the model definition), suggesting soft deletes are supported. Additionally, explicit transaction handling with error recovery would improve reliability.

♻️ Alternative approach using soft delete
 DO $$
+DECLARE
+  affected_count INT;
 BEGIN
   IF EXISTS (
       SELECT 1
       FROM pg_tables
       WHERE schemaname = 'public'
       AND tablename = 'external_roles'
   ) THEN
-      DELETE FROM external_roles WHERE array_length(aliases, 1) = 1 AND updated_at IS NULL AND role_type in ('ClusterRole','Role') AND created_at < '2026-05-01' ;
+      -- Soft delete instead of hard delete to preserve audit trail
+      UPDATE external_roles 
+      SET deleted_at = NOW()
+      WHERE array_length(aliases, 1) = 1 
+        AND updated_at IS NULL 
+        AND role_type IN ('ClusterRole','Role') 
+        AND created_at < '2026-05-01'
+        AND deleted_at IS NULL;
+      
+      GET DIAGNOSTICS affected_count = ROW_COUNT;
+      RAISE NOTICE 'Soft deleted % duplicate external_roles rows', affected_count;
   END IF;
+EXCEPTION
+  WHEN OTHERS THEN
+    RAISE NOTICE 'Error during external_roles cleanup: %', SQLERRM;
+    RAISE;
 END $$;

This approach:

  • Preserves data for audit purposes
  • Avoids foreign key constraint issues (soft-deleted rows can still be referenced)
  • Includes error handling and logging
  • Can be reverted if needed by setting deleted_at = NULL
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@functions/external_role_duplicate_rows_fix.sql` around lines 1 - 11, Replace
the hard DELETE with a soft-delete UPDATE on external_roles: set deleted_at =
current_timestamp (and optionally updated_at = current_timestamp) for rows
matching array_length(aliases,1) = 1, updated_at IS NULL, role_type IN
('ClusterRole','Role') and created_at < '2026-05-01'; wrap the operation in an
explicit transaction block (BEGIN; ... COMMIT;) and add an EXCEPTION handler to
ROLLBACK on error and RAISE or RAISE NOTICE the error for logging so failures
are recoverable and auditable; keep the initial IF EXISTS (...) check to avoid
running when the table is absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@functions/external_role_duplicate_rows_fix.sql`:
- Line 9: The DELETE currently runs silently; modify the statement to use
RETURNING to capture deleted rows and persist observability (either by inserting
deleted row metadata into an audit table like external_role_deletions or at
minimum emitting a RAISE NOTICE with the count). Concretely, wrap the DELETE
FROM external_roles ... in a CTE or transaction, use DELETE ... RETURNING id,
aliases, role_type, created_at, updated_at to collect deleted rows, then INSERT
those rows into an audit/log table (external_role_deletions) with a timestamp
and actor, or use RAISE NOTICE to log the total deleted count; ensure the SQL in
functions/external_role_duplicate_rows_fix.sql references the same DELETE and
RETURNING logic so you can verify what was removed.

---

Nitpick comments:
In `@functions/external_role_duplicate_rows_fix.sql`:
- Around line 1-11: Replace the hard DELETE with a soft-delete UPDATE on
external_roles: set deleted_at = current_timestamp (and optionally updated_at =
current_timestamp) for rows matching array_length(aliases,1) = 1, updated_at IS
NULL, role_type IN ('ClusterRole','Role') and created_at < '2026-05-01'; wrap
the operation in an explicit transaction block (BEGIN; ... COMMIT;) and add an
EXCEPTION handler to ROLLBACK on error and RAISE or RAISE NOTICE the error for
logging so failures are recoverable and auditable; keep the initial IF EXISTS
(...) check to avoid running when the table is absent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ebfe151-dd8a-4718-96e4-ce182bd4fc63

📥 Commits

Reviewing files that changed from the base of the PR and between 03a5b64 and 0e661de.

📒 Files selected for processing (1)
  • functions/external_role_duplicate_rows_fix.sql

WHERE schemaname = 'public'
AND tablename = 'external_roles'
) THEN
DELETE FROM external_roles WHERE array_length(aliases, 1) = 1 AND updated_at IS NULL AND role_type in ('ClusterRole','Role') AND created_at < '2026-05-01' ;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add observability for the deletion operation.

The deletion happens silently with no logging or output. This makes it difficult to verify what was deleted, troubleshoot issues, or maintain an audit trail.

📊 Proposed fix to add logging
 DO $$
+DECLARE
+  deleted_count INT;
 BEGIN
   IF EXISTS (
       SELECT 1
       FROM pg_tables
       WHERE schemaname = 'public'
       AND tablename = 'external_roles'
   ) THEN
+      -- Log rows to be deleted
+      RAISE NOTICE 'Deleting duplicate external_roles with criteria: array_length(aliases,1)=1, updated_at IS NULL, role_type IN (ClusterRole,Role), created_at < 2026-05-01';
+      
+      SELECT COUNT(*) INTO deleted_count
+      FROM external_roles 
+      WHERE array_length(aliases, 1) = 1 
+        AND updated_at IS NULL 
+        AND role_type IN ('ClusterRole','Role') 
+        AND created_at < '2026-05-01';
+      
+      RAISE NOTICE 'Found % rows matching deletion criteria', deleted_count;
+      
       DELETE FROM external_roles WHERE array_length(aliases, 1) = 1 AND updated_at IS NULL AND role_type in ('ClusterRole','Role') AND created_at < '2026-05-01' ;
+      
+      RAISE NOTICE 'Successfully deleted % external_roles rows', deleted_count;
   END IF;
 END $$;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@functions/external_role_duplicate_rows_fix.sql` at line 9, The DELETE
currently runs silently; modify the statement to use RETURNING to capture
deleted rows and persist observability (either by inserting deleted row metadata
into an audit table like external_role_deletions or at minimum emitting a RAISE
NOTICE with the count). Concretely, wrap the DELETE FROM external_roles ... in a
CTE or transaction, use DELETE ... RETURNING id, aliases, role_type, created_at,
updated_at to collect deleted rows, then INSERT those rows into an audit/log
table (external_role_deletions) with a timestamp and actor, or use RAISE NOTICE
to log the total deleted count; ensure the SQL in
functions/external_role_duplicate_rows_fix.sql references the same DELETE and
RETURNING logic so you can verify what was removed.

@yashmehrotra yashmehrotra requested a review from moshloop April 16, 2026 08:07
@moshloop moshloop enabled auto-merge (squash) April 16, 2026 14:09
@moshloop moshloop merged commit 464449e into main Apr 16, 2026
18 checks passed
@moshloop moshloop deleted the fix-ext-role branch April 16, 2026 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants