fix(analytics): multi-topic clustering and per-topic cooldown for confusion spike detection#643
Conversation
…fusion spike detection
|
@SRSoham is attempting to deploy a commit to the Karan Mani Tripathi 's projects Team on Vercel. A member of the Team first needs to authorize it. |
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
WalkthroughThe PR refactors the Inngest confusion detection from single-concept spike analysis to multi-topic clustering. It introduces configurable time windows and thresholds, calls Groq with a new clustering prompt, parses the resulting clusters, filters by size, applies per-topic cooldown checks, and persists each cluster as an alert row to the database. ChangesConfusion Spike Detection Refactor
Sequence DiagramsequenceDiagram
participant DetectionFn as Inngest Function
participant DoubtsDB as Doubts DB
participant GroqAPI as Groq API
participant AlertsDB as Confusion Alerts DB
DetectionFn->>DoubtsDB: Fetch recent doubts<br/>(LOOKBACK_MS window)
DoubtsDB-->>DetectionFn: Recent doubts array
alt Spike Threshold Met
DetectionFn->>GroqAPI: Send doubts with<br/>clustering prompt
GroqAPI-->>DetectionFn: JSON { clusters: [...] }
DetectionFn->>DetectionFn: Parse response &<br/>filter by cluster size
loop For each qualifying cluster
DetectionFn->>AlertsDB: Check per-topic<br/>cooldown
alt Cooldown passed
DetectionFn->>AlertsDB: Insert cluster alert<br/>(topic, summary,<br/>confidence, etc.)
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
| const spikeClusters = clusteringResult.clusters.filter( | ||
| (c) => Array.isArray(c.doubtIndices) && c.doubtIndices.length >= SPIKE_THRESHOLD && c.topic | ||
| ); |
There was a problem hiding this comment.
Suggestion: Spike qualification and stored doubt count are based on raw model-provided indices without validating they map to real doubts, so invalid/duplicate indices can trigger false spikes and incorrect counts. Validate index bounds and uniqueness first, then apply threshold and persist count from validated mapped IDs. [logic error]
Severity Level: Major ⚠️
- ⚠️ Spike detection can trigger on malformed LLM indices.
- ⚠️ Stored doubtCount may overstate affected students significantly.Steps of Reproduction ✅
1. `detectConfusionSpikes` builds `dynamicDoubts` from the database at lines 49-67 and
sends a formatted list to Groq at lines 77-81, instructing it (lines 82-101) to return
`doubtIndices` referencing 1-based positions.
2. The Groq response is parsed at lines 113-124 into `clusteringResult.clusters`, but
there is no validation that `doubtIndices` are within bounds, unique, or non-duplicated
before further use.
3. The filter at lines 127-130 constructs `spikeClusters` solely based on
`Array.isArray(c.doubtIndices)` and `c.doubtIndices.length >= SPIKE_THRESHOLD && c.topic`,
so a cluster containing indices like `[1, 1, 1, 1, 1]` or `[1, 2, 999, 1000, 1001]` will
still qualify as a spike even though many indices are duplicates or out of range.
4. Later, when storing the alert at lines 169-179, `mappedIds` (lines 162-165) filters out
invalid IDs from `dynamicDoubts`, but `doubtCount` is persisted as
`cluster.doubtIndices.length` (line 175), not the number of validated `mappedIds`, so
malformed or duplicated indices can trigger a spike and record an inflated `doubtCount`
even though far fewer real doubts actually support that topic.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/app/api/inngest/ConfusionSpikeDetector.ts
**Line:** 128:130
**Comment:**
*Logic Error: Spike qualification and stored doubt count are based on raw model-provided indices without validating they map to real doubts, so invalid/duplicate indices can trigger false spikes and incorrect counts. Validate index bounds and uniqueness first, then apply threshold and persist count from validated mapped IDs.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| .where( | ||
| and( | ||
| eq(confusionAlertsTable.classroomId, Number(classroomId)), | ||
| eq(confusionAlertsTable.topic, cluster.topic), |
There was a problem hiding this comment.
Suggestion: Topic matching uses raw LLM text, so small wording/casing/whitespace variations bypass the per-topic cooldown and create repeated alerts for the same concept. Normalize topic strings (trim, case-fold, optionally canonicalize) before both cooldown lookup and insert. [logic error]
Severity Level: Major ⚠️
- ⚠️ Per-topic cooldown fails on tiny topic text changes.
- ⚠️ confusion_alerts table accumulates near-duplicate conceptual topics.Steps of Reproduction ✅
1. `detectConfusionSpikes` (lines 33-188 in
`src/app/api/inngest/ConfusionSpikeDetector.ts`) queries recent doubts, calls Groq, and
builds `spikeClusters` (lines 127-130) where each `cluster.topic` is free-form text from
the LLM.
2. On the first execution for a classroom, suppose Groq returns a cluster with `topic`
`"Binary search basics"` and enough `doubtIndices` to pass the spike filter; the code at
lines 169-179 inserts a row into `confusion_alerts` with that exact `topic` string.
3. Later within the `COOLDOWN_MS` window, another batch of doubts for the same conceptual
topic triggers `detectConfusionSpikes` again, but Groq now describes the topic as `"binary
search basics "` (different casing and trailing space) while still referring to the same
underlying concept.
4. The cooldown query at lines 145-153 uses `eq(confusionAlertsTable.topic,
cluster.topic)` (line 151) against the `topic` column defined as `varchar({ length: 255
})` in `src/configs/schema.ts:208-212`; because the string does not match exactly,
`recentAlert` is `undefined`, the early return at lines 158-160 is skipped, and a new
alert row is inserted at lines 169-179, creating multiple alerts for what is effectively
the same topic within one cooldown window.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/app/api/inngest/ConfusionSpikeDetector.ts
**Line:** 151:151
**Comment:**
*Logic Error: Topic matching uses raw LLM text, so small wording/casing/whitespace variations bypass the per-topic cooldown and create repeated alerts for the same concept. Normalize topic strings (trim, case-fold, optionally canonicalize) before both cooldown lookup and insert.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix|
CodeAnt AI finished reviewing your PR. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/app/api/inngest/ConfusionSpikeDetector.ts (1)
150-152: ⚡ Quick winConsider normalizing topic strings for more reliable cooldown matching.
The cooldown check uses exact string equality on
cluster.topic. If Groq returns slightly different strings for semantically equivalent topics (e.g., "React Hooks" vs "react hooks"), the cooldown would be bypassed, potentially causing duplicate alerts.Normalizing the topic (e.g., lowercase, trimmed) before comparison and storage would improve cooldown reliability.
♻️ Suggested normalization
+ const normalizedTopic = cluster.topic.toLowerCase().trim(); + const [recentAlert] = await db .select({ id: confusionAlertsTable.id }) .from(confusionAlertsTable) .where( and( eq(confusionAlertsTable.classroomId, Number(classroomId)), - eq(confusionAlertsTable.topic, cluster.topic), + eq(confusionAlertsTable.topic, normalizedTopic), gte(confusionAlertsTable.createdAt, lookbackTime) ) )And similarly update line 171 to use
normalizedTopicwhen inserting.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/app/api/inngest/ConfusionSpikeDetector.ts` around lines 150 - 152, The cooldown check uses exact equality on cluster.topic; normalize the topic string (e.g., const normalizedTopic = cluster.topic.trim().toLowerCase()) and use normalizedTopic in the query (replace eq(confusionAlertsTable.topic, cluster.topic) with eq(confusionAlertsTable.topic, normalizedTopic)) and also use normalizedTopic when inserting new rows (the insert call referenced near the later insert/line 171) so stored topics and cooldown comparisons match consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/app/api/inngest/ConfusionSpikeDetector.ts`:
- Line 174: The computed confidence value can exceed 0–100 when
cluster.confidenceScore is outside 0–1; update the assignment that sets
confidence (currently: confidence: Math.round((cluster.confidenceScore || 0) *
100)) to clamp the result into the 0–100 range by first defaulting
cluster.confidenceScore to 0, multiplying by 100, rounding, then applying
Math.min(Math.max(value, 0), 100) so the final value is always an integer
between 0 and 100.
- Around line 139-140: The step ID currently uses the raw topic string in the
loop (step.run(`process-cluster-${cluster.topic}`, ...)), which can create
collisions/invalid/long IDs; change the ID generation in the loop over
spikeClusters to use a stable index-based or sanitized identifier (for example
use the loop index or a short hashed/sanitized form of cluster.topic) when
calling step.run so the ID becomes deterministic and safe (e.g.,
`process-cluster-${i}` or `process-cluster-${sanitize(cluster.topic)}`); update
the code that references that ID within this loop (the step.run call and any
related logging/variables like clusterResult) so all usages use the new safe ID
generation.
---
Nitpick comments:
In `@src/app/api/inngest/ConfusionSpikeDetector.ts`:
- Around line 150-152: The cooldown check uses exact equality on cluster.topic;
normalize the topic string (e.g., const normalizedTopic =
cluster.topic.trim().toLowerCase()) and use normalizedTopic in the query
(replace eq(confusionAlertsTable.topic, cluster.topic) with
eq(confusionAlertsTable.topic, normalizedTopic)) and also use normalizedTopic
when inserting new rows (the insert call referenced near the later insert/line
171) so stored topics and cooldown comparisons match consistently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f03025c4-9fa3-470c-9a1f-93fe5366ff8f
📒 Files selected for processing (1)
src/app/api/inngest/ConfusionSpikeDetector.ts
| for (const cluster of spikeClusters) { | ||
| const clusterResult = await step.run(`process-cluster-${cluster.topic}`, async () => { |
There was a problem hiding this comment.
Sanitize the step ID to prevent collisions and invalid characters.
The step ID process-cluster-${cluster.topic} uses the raw topic string from Groq's response. This can cause:
- Step ID collisions if Groq returns duplicate topic names
- Invalid characters in the step ID (spaces, special chars, etc.)
- Excessively long IDs if the topic string is long
Use an index-based ID or sanitize the topic string.
🛠️ Suggested fix using loop index
- for (const cluster of spikeClusters) {
- const clusterResult = await step.run(`process-cluster-${cluster.topic}`, async () => {
+ for (let i = 0; i < spikeClusters.length; i++) {
+ const cluster = spikeClusters[i];
+ const clusterResult = await step.run(`process-cluster-${i}`, async () => {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for (const cluster of spikeClusters) { | |
| const clusterResult = await step.run(`process-cluster-${cluster.topic}`, async () => { | |
| for (let i = 0; i < spikeClusters.length; i++) { | |
| const cluster = spikeClusters[i]; | |
| const clusterResult = await step.run(`process-cluster-${i}`, async () => { |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/app/api/inngest/ConfusionSpikeDetector.ts` around lines 139 - 140, The
step ID currently uses the raw topic string in the loop
(step.run(`process-cluster-${cluster.topic}`, ...)), which can create
collisions/invalid/long IDs; change the ID generation in the loop over
spikeClusters to use a stable index-based or sanitized identifier (for example
use the loop index or a short hashed/sanitized form of cluster.topic) when
calling step.run so the ID becomes deterministic and safe (e.g.,
`process-cluster-${i}` or `process-cluster-${sanitize(cluster.topic)}`); update
the code that references that ID within this loop (the step.run call and any
related logging/variables like clusterResult) so all usages use the new safe ID
generation.
| topic: cluster.topic, | ||
| summary: cluster.summary || "Multiple students are exhibiting core structural concept doubts.", | ||
| suggestedAction: computedAction, | ||
| confidence: Math.round((cluster.confidenceScore || 0) * 100), |
There was a problem hiding this comment.
Clamp confidence score to the 0–100 range.
The schema documents confidence as an integer between 0–100. If Groq returns a confidenceScore outside the 0–1 range, the computed value could exceed these bounds (e.g., 1.5 → 150).
🛠️ Suggested fix
- confidence: Math.round((cluster.confidenceScore || 0) * 100),
+ confidence: Math.min(100, Math.max(0, Math.round((cluster.confidenceScore || 0) * 100))),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| confidence: Math.round((cluster.confidenceScore || 0) * 100), | |
| confidence: Math.min(100, Math.max(0, Math.round((cluster.confidenceScore || 0) * 100))), |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/app/api/inngest/ConfusionSpikeDetector.ts` at line 174, The computed
confidence value can exceed 0–100 when cluster.confidenceScore is outside 0–1;
update the assignment that sets confidence (currently: confidence:
Math.round((cluster.confidenceScore || 0) * 100)) to clamp the result into the
0–100 range by first defaulting cluster.confidenceScore to 0, multiplying by
100, rounding, then applying Math.min(Math.max(value, 0), 100) so the final
value is always an integer between 0 and 100.
|
Hi @SRSoham! Thanks for submitting this PR. Here is a detailed review of your changes: 🤖 Bot Review Feedback to ResolveCodeRabbit and CodeAnt have highlighted some important issues that need to be addressed. Please prioritize fixing these:
<de...
Sanitize the step ID to prevent collisions and invalid characters. The step ID
Clamp confidence score to the 0–100 range. The schema documents Details⚙️ CI/CD PipelineYour PR currently has some failing checks:
Note: We have recently fixed database configuration and This is an automated review assessment. Please address the feedback above to move your PR forward. |
User description
Closes #535 (partially — addresses the multi-cluster and per-topic cooldown checklist items)
ConfusionSpikeDetector.ts now prompts Groq to return multiple topic clusters per run instead of a single isSpike verdict, so distinct confusion themes occurring simultaneously are each detected.
Cooldown logic moved from a flat 30-min classroom-wide check to a per-topic check against confusionAlertsTable, queried right before inserting each cluster's alert.
Spike threshold (5 doubts) is now applied per cluster rather than to the whole batch.
No schema changes required — existing confusionAlertsTable columns are sufficient.
Not addressed in this PR: notificationsTable integration — the existing /api/confusion + ConfusionAlertPanel dashboard already serves as the "teacher notification," but flagging in case the maintainer wants a separate notification row too.
CodeAnt-AI Description
Detect multiple confusion topics and skip repeats per topic
What Changed
Impact
✅ Fewer missed confusion spikes✅ Fewer repeated alerts for the same topic✅ Clearer teacher notifications for mixed-topic doubt bursts💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.
Summary by CodeRabbit