fix(insights): history dedup uses title fallback, limits after dedup by izadoesdev · Pull Request #461 · databuddy-analytics/Databuddy

izadoesdev · 2026-05-28T14:36:13Z

Summary

Fixes three issues from PR #460 review:

Empty subjectKey no longer collapses unrelated insights into one (P1)
Fetch 50 rows then dedup to 12 unique subjects (was LIMIT 12 before dedup)
Remove unused runId from select

Test plan

bun test in apps/insights

Summary by cubic

Fix insight history dedup to prevent empty subjectKey from merging unrelated insights, and apply the 12-item cap after dedup by fetching 50 rows first. Remove the unused runId from the query and clean up HTML spacing in packages/evals/ui.

^{Written for commit 50b1ddc. Summary will update on new commits.}

Review in cubic

- Empty subjectKey no longer collapses unrelated insights (P1) - Fetch 50 rows then dedup to 12 unique subjects (was LIMIT 12 before dedup) - Remove unused runId from select

vercel · 2026-05-28T14:36:22Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
databuddy-status	Building	Preview, Comment	May 28, 2026 2:36pm

2 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
dashboard	Skipped		May 28, 2026 2:36pm
documentation	Skipped		May 28, 2026 2:36pm

coderabbitai · 2026-05-28T14:36:24Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a63108f9-fdff-4db4-bac4-a017271cd64a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch izadoesdev/fix-history-dedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-05-28T14:38:38Z

Greptile Summary

This PR fixes three issues from review #460: it prevents empty subjectKey from collapsing unrelated insights during dedup, fetches 50 rows before applying the 12-subject limit, and drops the unused runId from the select.

The subjectKey || title fallback correctly prevents empty-key collisions, and the LIMIT 50 → dedup → 12 pipeline ordering is now right.
The dedup loop's early-exit guard (lines.length >= RECENT_INSIGHTS_PROMPT_LIMIT) counts total output lines rather than unique subjects; since each insight can contribute up to 3 lines (title, description, rootCause), the loop can stop after as few as ~4 subjects instead of the intended 12.
The packages/evals/ui/index.html change is cosmetic indentation only.

Confidence Score: 3/5

The core dedup fix is correct, but the early-exit guard in the dedup loop still misbehaves when insights carry description and rootCause fields.

An org with fully-populated insights would see only 4-6 subjects in the history context instead of the intended 12, quietly degrading AI prompt quality on every run.

apps/insights/src/prompts.ts — the dedup loop and its early-exit condition

Important Files Changed

Filename	Overview
apps/insights/src/prompts.ts	Fixes empty-subjectKey dedup collapse and moves LIMIT after dedup, but the limit guard checks total output lines instead of unique-subject count, which can prematurely stop iteration at far fewer than 12 subjects.
packages/evals/ui/index.html	Whitespace/indentation-only changes; no functional impact.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[DB query LIMIT 50 ordered by createdAt desc] --> B{rows empty?}
    B -- Yes --> C[return empty string]
    B -- No --> D[Build subjectCounts map using subjectKey OR title fallback]
    D --> E[Loop over rows]
    E --> F{lines.length >= RECENT_INSIGHTS_PROMPT_LIMIT?}
    F -- Yes --> G[break]
    F -- No --> H{key already in seen set?}
    H -- Yes --> E
    H -- No --> I[Add key to seen and push 1 to 3 lines per row]
    I --> E
    G --> J[Return formatted context block]
    I --> J

_{Reviews (1): Last reviewed commit: "fix(insights): history dedup uses title ..." | Re-trigger Greptile}

greptile-apps · 2026-05-28T14:38:42Z

 	const seen = new Set<string>();
 	const lines: string[] = [];
 	for (const row of rows) {
-		if (seen.has(row.subjectKey)) {
+		if (lines.length >= RECENT_INSIGHTS_PROMPT_LIMIT) {
+			break;
+		}
+		const key = row.subjectKey || row.title;
+		if (seen.has(key)) {
 			continue;
 		}
-		seen.add(row.subjectKey);
+		seen.add(key);


Use a dedicated counter for unique subjects rather than lines.length, so insights with description and rootCause don't erroneously reduce the number of distinct subjects shown. The PR description explicitly calls for "12 unique subjects" but the current guard measures output lines.

Suggested change

const seen = new Set<string>();

const lines: string[] = [];

for (const row of rows) {

if (seen.has(row.subjectKey)) {

if (lines.length >= RECENT_INSIGHTS_PROMPT_LIMIT) {

break;

}

const key = row.subjectKey || row.title;

if (seen.has(key)) {

continue;

}

seen.add(row.subjectKey);

seen.add(key);

const seen = new Set<string>();

const lines: string[] = [];

let subjectCount = 0;

for (const row of rows) {

if (subjectCount >= RECENT_INSIGHTS_PROMPT_LIMIT) {

break;

}

const key = row.subjectKey || row.title;

if (seen.has(key)) {

continue;

}

seen.add(key);

subjectCount++;

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

fix(insights): history dedup uses title fallback, limits after dedup

50b1ddc

- Empty subjectKey no longer collapses unrelated insights (P1) - Fetch 50 rows then dedup to 12 unique subjects (was LIMIT 12 before dedup) - Remove unused runId from select

izadoesdev merged commit 5693333 into staging May 28, 2026
3 of 4 checks passed

unkey-deploy Bot deployed to links - preview May 28, 2026 14:36 View deployment

vercel Bot deployed to Preview – databuddy-status May 28, 2026 14:36 View deployment

unkey-deploy Bot deployed to api - preview May 28, 2026 14:37 View deployment

greptile-apps Bot reviewed May 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(insights): history dedup uses title fallback, limits after dedup#461

fix(insights): history dedup uses title fallback, limits after dedup#461
izadoesdev merged 1 commit into
stagingfrom
izadoesdev/fix-history-dedup

izadoesdev commented May 28, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

Uh oh!

vercel Bot commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

greptile-apps Bot commented May 28, 2026

Uh oh!

greptile-apps Bot May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

izadoesdev commented May 28, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by cubic

Uh oh!

Uh oh!

vercel Bot commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

greptile-apps Bot commented May 28, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

izadoesdev commented May 28, 2026 •

edited by cubic-dev-ai Bot

Loading