Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis change extends analytics queries and handlers to track inbound and outbound metrics for calls and leads. The implementation adds call_direction filtering, computes inbound/outbound aggregations using SQL FILTER clauses, and updates result structures across multiple analytics paths including time-series, grouped, and aggregate results. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/database/queries/breeze_buddy/analytics.py`:
- Around line 514-517: The CASE expression that sets the alias lead_key
currently uses ELSE request_id which can be NULL for outbound calls; update each
occurrence of that CASE (the places that define lead_key) to use
COALESCE(request_id, id::text) instead of plain request_id so outbound rows get
a non-NULL fallback (i.e., CASE WHEN call_direction = 'INBOUND' THEN id::text
ELSE COALESCE(request_id, id::text) END AS lead_key); apply this same
replacement to all three locations where lead_key is defined.
- Around line 675-683: lead_keys is returning one row per call from
filtered_data which causes lead_aggregates (joining on lead_key to
base_lead_trends) to multiply rows and overcount aggregates; modify the
lead_keys CTE to deduplicate by grouping on lead_key (and include any necessary
aggregated or MIN/MAX call_initiated_time if that field is required downstream)
— update the lead_keys definition so it SELECTs lead_key, call_direction,
MIN(call_initiated_time) or appropriate aggregate and add GROUP BY lead_key (and
call_direction if kept) to prevent N rows per lead_key before joining in
lead_aggregates.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/api/routers/breeze_buddy/analytics/handlers.pyapp/database/accessor/breeze_buddy/analytics.pyapp/database/queries/breeze_buddy/analytics.py
| CASE | ||
| WHEN call_direction = 'INBOUND' THEN id::text | ||
| ELSE request_id | ||
| END as lead_key, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "analytics.py" | head -20Repository: juspay/clairvoyance
Length of output: 374
🏁 Script executed:
wc -l ./app/database/queries/breeze_buddy/analytics.pyRepository: juspay/clairvoyance
Length of output: 115
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/analytics.py | sed -n '510,520p; 572,582p; 674,684p'Repository: juspay/clairvoyance
Length of output: 1349
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/analytics.py | sed -n '500,540p'Repository: juspay/clairvoyance
Length of output: 1924
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/analytics.py | sed -n '540,560p'Repository: juspay/clairvoyance
Length of output: 1338
🏁 Script executed:
fd -t f -name "*.py" | xargs rg -l "LEAD_CALL_TRACKER" | head -5Repository: juspay/clairvoyance
Length of output: 465
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/lead_call_tracker.py | head -100Repository: juspay/clairvoyance
Length of output: 3959
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/analytics.py | sed -n '560,600p'Repository: juspay/clairvoyance
Length of output: 1861
🏁 Script executed:
cat -n ./app/database/queries/breeze_buddy/analytics.py | sed -n '660,710p'Repository: juspay/clairvoyance
Length of output: 2329
Prevent NULL lead_key for outbound rows.
ELSE request_id can yield NULL in outbound calls. When this happens:
COUNT(DISTINCT lead_key)ignores NULL values, skewing lead countsGROUP BY lead_keycreates separate NULL groups, collapsing unrelated recordsJOINoperations onlead_keyfail to match NULL values
Replace with COALESCE(request_id, id::text) in all three locations (lines 514–517, 576–578, 678–680) to provide a non-NULL fallback.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/database/queries/breeze_buddy/analytics.py` around lines 514 - 517, The
CASE expression that sets the alias lead_key currently uses ELSE request_id
which can be NULL for outbound calls; update each occurrence of that CASE (the
places that define lead_key) to use COALESCE(request_id, id::text) instead of
plain request_id so outbound rows get a non-NULL fallback (i.e., CASE WHEN
call_direction = 'INBOUND' THEN id::text ELSE COALESCE(request_id, id::text) END
AS lead_key); apply this same replacement to all three locations where lead_key
is defined.
| lead_keys AS ( | ||
| SELECT | ||
| CASE | ||
| WHEN call_direction = 'INBOUND' THEN id::text | ||
| ELSE request_id | ||
| END as lead_key, | ||
| call_direction, | ||
| call_initiated_time | ||
| FROM filtered_data |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
wc -l app/database/queries/breeze_buddy/analytics.pyRepository: juspay/clairvoyance
Length of output: 113
🏁 Script executed:
sed -n '670,710p' app/database/queries/breeze_buddy/analytics.pyRepository: juspay/clairvoyance
Length of output: 1692
🏁 Script executed:
# Get broader context around the lead_keys CTE
sed -n '675,750p' app/database/queries/breeze_buddy/analytics.pyRepository: juspay/clairvoyance
Length of output: 3010
🏁 Script executed:
sed -n '695,710p' app/database/queries/breeze_buddy/analytics.pyRepository: juspay/clairvoyance
Length of output: 777
Fix row multiplication in lead-based trends aggregation.
lead_keys currently returns one row per call (no deduplication), but lead_aggregates joins it to already-aggregated base_lead_trends by lead_key only. This causes row multiplication: if a lead_key has N calls in filtered_data, the join produces N rows for that lead instead of 1, inflating SUM(blt.total_calls_for_lead) and SUM(blt.finished_calls) by a factor of N.
Add GROUP BY lead_key to lead_keys to deduplicate:
Suggested fix
lead_keys AS (
SELECT
CASE
WHEN call_direction = 'INBOUND' THEN id::text
ELSE request_id
END as lead_key,
- call_direction,
- call_initiated_time
+ MAX(call_direction) as call_direction
FROM filtered_data
+ GROUP BY 1
),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| lead_keys AS ( | |
| SELECT | |
| CASE | |
| WHEN call_direction = 'INBOUND' THEN id::text | |
| ELSE request_id | |
| END as lead_key, | |
| call_direction, | |
| call_initiated_time | |
| FROM filtered_data | |
| lead_keys AS ( | |
| SELECT | |
| CASE | |
| WHEN call_direction = 'INBOUND' THEN id::text | |
| ELSE request_id | |
| END as lead_key, | |
| MAX(call_direction) as call_direction | |
| FROM filtered_data | |
| GROUP BY 1 | |
| ), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/database/queries/breeze_buddy/analytics.py` around lines 675 - 683,
lead_keys is returning one row per call from filtered_data which causes
lead_aggregates (joining on lead_key to base_lead_trends) to multiply rows and
overcount aggregates; modify the lead_keys CTE to deduplicate by grouping on
lead_key (and include any necessary aggregated or MIN/MAX call_initiated_time if
that field is required downstream) — update the lead_keys definition so it
SELECTs lead_key, call_direction, MIN(call_initiated_time) or appropriate
aggregate and add GROUP BY lead_key (and call_direction if kept) to prevent N
rows per lead_key before joining in lead_aggregates.
There was a problem hiding this comment.
Pull request overview
This PR adds support for tracking and reporting inbound vs outbound calls separately in the analytics system. Previously, all call analytics were aggregated together regardless of direction. The changes introduce call_direction filtering across all analytics queries and endpoints.
Changes:
- Added inbound_calls and outbound_calls metrics to call-based analytics
- Added inbound_leads and outbound_leads metrics to lead-based analytics
- Modified SQL queries to use call_direction for proper lead identification (INBOUND uses id, OUTBOUND uses request_id)
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| app/database/queries/breeze_buddy/analytics.py | Updated analytics SQL queries to include call_direction field and add separate counts for inbound/outbound calls and leads; removed request_id NOT NULL filter to support inbound calls |
| app/database/accessor/breeze_buddy/analytics.py | Added inbound_calls and outbound_calls fields to summary analytics response structure with null coalescing |
| app/api/routers/breeze_buddy/analytics/handlers.py | Added inbound/outbound metrics to handler responses and implemented lead direction counting logic for aggregated results |
| SUM(blt.total_calls_for_lead) as total_calls, | ||
| SUM(blt.finished_calls) as finished_calls | ||
| FROM base_lead_trends blt | ||
| LEFT JOIN lead_keys lk ON blt.lead_key = lk.lead_key |
There was a problem hiding this comment.
The JOIN between base_lead_trends and lead_keys could produce duplicate rows because lead_keys contains one row per call (not per unique lead_key). If a lead has multiple calls, it will have multiple rows in lead_keys with the same lead_key but different call_initiated_time values. This could cause a single lead to join with multiple rows from lead_keys, potentially resulting in incorrect counts if a lead somehow has calls with different call_directions. Consider using a deduplicated version of lead_keys, such as SELECT DISTINCT lead_key, call_direction FROM lead_keys, or use a subquery to get a single call_direction per lead_key.
| LEFT JOIN lead_keys lk ON blt.lead_key = lk.lead_key | |
| LEFT JOIN ( | |
| SELECT DISTINCT | |
| lead_key, | |
| call_direction | |
| FROM lead_keys | |
| ) lk ON blt.lead_key = lk.lead_key |
| SELECT | ||
| request_id, | ||
| lead_key, | ||
| MAX(call_direction) as call_direction, |
There was a problem hiding this comment.
Using MAX(call_direction) to determine the call direction for a lead is problematic. In PostgreSQL, MAX on an enum or string will use alphabetical ordering, so 'OUTBOUND' > 'INBOUND'. This means if a lead somehow has multiple calls with different directions (which shouldn't happen in practice), it would always be classified as OUTBOUND. Consider using a more deterministic approach, such as ANY_VALUE(call_direction) or ensuring business logic prevents mixed call directions for the same lead_key.
| MAX(call_direction) as call_direction, | |
| CASE | |
| WHEN BOOL_OR(call_direction = 'OUTBOUND') THEN 'OUTBOUND' | |
| ELSE 'INBOUND' | |
| END as call_direction, |
| CASE | ||
| WHEN call_direction = 'INBOUND' THEN id::text | ||
| ELSE request_id | ||
| END as lead_key, |
There was a problem hiding this comment.
The removal of the request_id IS NOT NULL condition could cause issues for outbound calls that have a NULL request_id. In the lead_key CASE expression, outbound calls use request_id as the lead_key. If request_id is NULL, the lead_key will be NULL, which could lead to incorrect aggregations or filtering of valid data. Consider adding a NULL check or ensuring that all outbound calls have a non-NULL request_id through database constraints or application logic.
| SELECT | ||
| CASE | ||
| WHEN call_direction = 'INBOUND' THEN id::text | ||
| ELSE request_id |
There was a problem hiding this comment.
The removal of the request_id IS NOT NULL condition could cause issues for outbound calls that have a NULL request_id. In the lead_key CASE expression, outbound calls use request_id as the lead_key. If request_id is NULL, the lead_key will be NULL, which could lead to incorrect aggregations or filtering of valid data. Consider adding a NULL check or ensuring that all outbound calls have a non-NULL request_id through database constraints or application logic.
| ELSE request_id | |
| ELSE COALESCE(request_id, id::text) |
| CASE | ||
| WHEN call_direction = 'INBOUND' THEN id::text | ||
| ELSE request_id | ||
| END as lead_key, |
There was a problem hiding this comment.
The removal of the request_id IS NOT NULL condition could cause issues for outbound calls that have a NULL request_id. In the lead_key CASE expression, outbound calls use request_id as the lead_key. If request_id is NULL, the lead_key will be NULL, which could lead to incorrect aggregations or filtering of valid data. Consider adding a NULL check or ensuring that all outbound calls have a non-NULL request_id through database constraints or application logic.
| FROM filtered_data | ||
| ), | ||
| unique_leads AS ( | ||
| SELECT DISTINCT lead_key, {group_by}, call_direction FROM lead_keys |
There was a problem hiding this comment.
The unique_leads CTE selects DISTINCT lead_key, group_by, and call_direction. If a lead (lead_key) somehow has calls with different call_directions (which shouldn't happen in practice), it will create multiple rows in unique_leads, causing the lead to be counted in both inbound_leads and outbound_leads. Consider adding a constraint to ensure each lead_key has only one call_direction, or use an aggregate function like MAX or MIN to deterministically select a single call_direction per lead_key and group_by combination.
| SELECT DISTINCT lead_key, {group_by}, call_direction FROM lead_keys | |
| SELECT | |
| lead_key, | |
| {group_by}, | |
| MIN(call_direction) AS call_direction | |
| FROM lead_keys | |
| GROUP BY lead_key, {group_by} |
- Add inbound_calls count to call-based summary query - Add inbound_count CTE to lead-based query for inbound lead tracking - Expose outbound_leads and inbound_leads in API response - Update accessor to include inbound metrics in all return paths
f57340a to
4ea48c6
Compare
Summary by CodeRabbit