Skip to content

feat lead scoring and sorting#375

Open
saurabhhhcodes wants to merge 4 commits into
Kuldeeep18:mainfrom
saurabhhhcodes:fix/lead-scoring-engine
Open

feat lead scoring and sorting#375
saurabhhhcodes wants to merge 4 commits into
Kuldeeep18:mainfrom
saurabhhhcodes:fix/lead-scoring-engine

Conversation

@saurabhhhcodes

@saurabhhhcodes saurabhhhcodes commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

What changed

  • Added lead scoring derived from campaign engagement and campaign-lead status.
  • Added a backfill command to populate scores for existing leads.
  • Added filtering and sorting controls in the leads view for score ranges and score-based ordering.

Why

  • Lead lists were missing an actionable ranking signal, which made high-value leads harder to surface.
  • Existing records also needed a way to be normalized after the scoring logic landed.

Validation

  • python3 -m py_compile /tmp/leadorbit-335/backend/campaigns/signals.py /tmp/leadorbit-335/backend/leads/views.py /tmp/leadorbit-335/backend/leads/tests.py /tmp/leadorbit-335/backend/leads/management/commands/backfill_lead_scores.py
  • git -C /tmp/leadorbit-335 diff --check

Closes #311
Closes #181
Closes #91

Summary by CodeRabbit

  • New Features
    • Added lead scoring support and expose it in the leads UI with filtering by score range and selectable sorting (newest/highest/lowest/oldest).
    • Added a management command to backfill/recompute stored lead scores.
  • Improvements
    • Updated campaign engagement tracking to increment/decrement cached counters on lead create/update/delete and keep related lead scores synchronized.
  • Tests
    • Added coverage for incremental counter updates via signals and for lead list default ordering.

@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds an end-to-end lead scoring system. Django signal handlers on CampaignLead compute and persist a score on related Lead records using engagement counts and penalties. A new backfill_lead_scores management command retrofits existing records. LeadViewSet gains min_score, max_score, and sort_by query parameters. The frontend filter panel exposes corresponding score and sort controls.

Changes

Lead Scoring Engine

Layer / File(s) Summary
Delta-based counter updates and lead score calculation
backend/campaigns/signals.py
Introduces helpers for counter flag computation, delta application, and score calculation from CampaignLead engagement data (opens, clicks, replies, actives) with bounce and unsubscribe penalties clamped to non-negative. Adds pre_save receiver to capture prior state, rewrites post_save to compute and apply net counter deltas (including cross-campaign moves) then update Lead.score via targeted update_fields, and rewrites post_delete to subtract counter flags and recompute score with exception guards.
Signal handler test coverage
backend/campaigns/tests.py
CampaignCounterSignalTests verifies campaign counters and lead scores are maintained correctly via signals: creates/updates/deletes CampaignLead records and asserts Campaign cached counter fields and Lead.score reflect expected incremental changes, patching _update_campaign_counters to verify no full recount is triggered.
Backfill management command
backend/leads/management/commands/backfill_lead_scores.py
New backfill_lead_scores command accepts optional --lead-id argument to limit scope, iterates leads with .iterator(), calls _calculate_lead_score for each, saves only the score field via bulk_update, and reports processed count.
API score filtering and sorting
backend/leads/views.py
LeadViewSet.get_queryset() gains min_score/max_score integer filters (invalid input returns empty queryset) and a sort_by allowlist that applies ordering and returns early with distinct().
Lead filtering test updates
backend/leads/tests.py
LeadFilterTests adds timezone/timedelta imports, backdates created_at on fixture leads for reliable ordering, and asserts unfiltered results default to newest-first order.
Frontend filter panel and JS logic
frontend/leads.html
Adds "Min score", "Max score" numeric inputs, and a "Sort by" dropdown to the filter panel; wires them into getFilterParams(), updateFilterIndicator() (with human-readable badges), and clearFilters().

Sequence Diagram(s)

sequenceDiagram
  participant User as User<br/>(Frontend)
  participant Frontend as leads.html<br/>(JS Filter)
  participant API as LeadViewSet<br/>(Backend)
  participant Signals as CampaignLead<br/>Signals
  participant Lead as Lead Model

  User->>Frontend: Set min_score, max_score, sort_by
  Frontend->>Frontend: getFilterParams()
  Frontend->>Frontend: updateFilterIndicator()
  Frontend->>API: GET /leads?min_score=X&max_score=Y&sort_by=Z
  
  API->>API: parse min_score, max_score, sort_by
  API->>Lead: filter(score__gte=min_score, score__lte=max_score).order_by(...)
  API-->>Frontend: filtered, sorted lead list

  Note over Signals: Score maintained via signals
  Signals->>Lead: _update_lead_score(lead)
  Signals->>Lead: save(update_fields=['score'])
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #311: This PR addresses the algorithmic lead scoring engine requested in the issue by implementing a score field update pipeline, a backfill utility, and API/frontend support for sorting and filtering by score.

Poem

🐇 A score for every lead, computed with care,
Bounces penalized, clicks rewarded fair,
Signals fire swiftly, the fields get saved,
Deltas applied smoothly, cross-campaigns behaved.
Filter by min, sort from high to low —
Now the best prospects are first in the flow! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: implementing lead scoring and sorting functionality as described in issue #311.
Linked Issues check ✅ Passed All core requirements from issue #311 are met: score field added, backend calculation logic implemented, signal handlers manage incremental updates, management command provides backfill capability, API endpoint supports sorting and filtering by score, and frontend UI includes score controls.
Out of Scope Changes check ✅ Passed All changes directly support the lead scoring and sorting feature. No unrelated modifications detected; changes span signals for score updates, views for filtering/sorting, management commands for backfills, tests for validation, and frontend controls.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/campaigns/signals.py (1)

60-79: ⚡ Quick win

Consider aggregating counts in a single query for better performance.

_calculate_lead_score executes 5 separate COUNT queries against CampaignLead for each lead. While acceptable in signal handlers (single lead at a time), this becomes an N+1 issue in the backfill command where it's called in a loop.

A single aggregated query would reduce database round-trips:

♻️ Proposed optimization using aggregate
 def _calculate_lead_score(lead):
     """
     Derive a simple engagement score for a lead from related CampaignLead rows.
     """
-    qs = CampaignLead.objects.filter(lead=lead)
-    open_count = qs.filter(last_opened_at__isnull=False).count()
-    click_count = qs.filter(last_clicked_at__isnull=False).count()
-    reply_count = qs.filter(status='REPLIED').count()
-    bounced_count = qs.filter(status='BOUNCED').count()
-    active_count = qs.filter(status='ACTIVE').count()
+    from django.db.models import Count, Q
+
+    agg = CampaignLead.objects.filter(lead=lead).aggregate(
+        open_count=Count('id', filter=Q(last_opened_at__isnull=False)),
+        click_count=Count('id', filter=Q(last_clicked_at__isnull=False)),
+        reply_count=Count('id', filter=Q(status='REPLIED')),
+        bounced_count=Count('id', filter=Q(status='BOUNCED')),
+        active_count=Count('id', filter=Q(status='ACTIVE')),
+    )
 
     score = (
-        (open_count * 5)
-        + (click_count * 7)
-        + (reply_count * 10)
-        + (active_count * 2)
-        - (bounced_count * 5)
+        (agg['open_count'] * 5)
+        + (agg['click_count'] * 7)
+        + (agg['reply_count'] * 10)
+        + (agg['active_count'] * 2)
+        - (agg['bounced_count'] * 5)
         - (5 if lead.global_unsubscribe else 0)
     )
     return max(score, 0)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/campaigns/signals.py` around lines 60 - 79, The _calculate_lead_score
function performs 5 separate database COUNT queries on the CampaignLead queryset
(for open_count, click_count, reply_count, bounced_count, and active_count),
which causes N+1 query problems when called in loops. Replace these 5 separate
filter().count() calls with a single aggregate() query using Django's Count,
Case, and When functions to calculate all metrics in one database round-trip,
then extract the aggregated values to compute the score.
backend/leads/management/commands/backfill_lead_scores.py (1)

18-30: ⚖️ Poor tradeoff

Backfill will be slow for large datasets; consider batching and progress feedback.

For organizations with many leads, this command will be slow (5+ queries per lead, no batched writes). Consider:

  1. Adding periodic progress output so operators know backfill is progressing
  2. Using bulk_update for batched writes
♻️ Proposed improvement with progress feedback
     def handle(self, *args, **options):
         lead_id = options.get('lead_id')
         leads = Lead.objects.all()
         if lead_id:
             leads = leads.filter(id=lead_id)
 
         processed = 0
         for lead in leads.iterator():
             lead.score = _calculate_lead_score(lead)
             lead.save(update_fields=['score'])
             processed += 1
+            if processed % 1000 == 0:
+                self.stdout.write(f'Processed {processed} leads...')
 
         self.stdout.write(self.style.SUCCESS(f'Successfully backfilled scores for {processed} lead(s).'))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/leads/management/commands/backfill_lead_scores.py` around lines 18 -
30, The handle method processes leads one at a time with individual save()
calls, which is inefficient for large datasets. Replace the current loop that
iterates through leads and calls save(update_fields=['score']) for each lead
with a batched approach using bulk_update. Collect the updated leads in batches
(e.g., every 100-1000 leads) and call bulk_update on them together, which will
reduce the number of database queries. Additionally, add periodic progress
output within the loop (using self.stdout.write) at regular intervals to provide
feedback to operators that the backfill is progressing, such as every 100 or
1000 processed leads.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/leads/views.py`:
- Around line 95-99: The issue is that when sort_by is empty (the default
frontend selection), the code returns qs.distinct() without applying any
ordering, resulting in arbitrary database order instead of the expected
newest-first ordering. To fix this, modify the logic to check if sort_by is
empty or not in the allowed list, and apply a default ordering of '-created_at'
in those cases. Keep the existing allowed set check for valid sort options, but
add an elif condition (or modify the return logic) to handle the empty/invalid
sort_by case by returning qs.order_by('-created_at').distinct() instead of just
qs.distinct().

---

Nitpick comments:
In `@backend/campaigns/signals.py`:
- Around line 60-79: The _calculate_lead_score function performs 5 separate
database COUNT queries on the CampaignLead queryset (for open_count,
click_count, reply_count, bounced_count, and active_count), which causes N+1
query problems when called in loops. Replace these 5 separate filter().count()
calls with a single aggregate() query using Django's Count, Case, and When
functions to calculate all metrics in one database round-trip, then extract the
aggregated values to compute the score.

In `@backend/leads/management/commands/backfill_lead_scores.py`:
- Around line 18-30: The handle method processes leads one at a time with
individual save() calls, which is inefficient for large datasets. Replace the
current loop that iterates through leads and calls save(update_fields=['score'])
for each lead with a batched approach using bulk_update. Collect the updated
leads in batches (e.g., every 100-1000 leads) and call bulk_update on them
together, which will reduce the number of database queries. Additionally, add
periodic progress output within the loop (using self.stdout.write) at regular
intervals to provide feedback to operators that the backfill is progressing,
such as every 100 or 1000 processed leads.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c28677e-9a7f-47ff-b1db-1666bd14159c

📥 Commits

Reviewing files that changed from the base of the PR and between 0474fe9 and 404e0da.

📒 Files selected for processing (6)
  • backend/campaigns/signals.py
  • backend/leads/management/__init__.py
  • backend/leads/management/commands/__init__.py
  • backend/leads/management/commands/backfill_lead_scores.py
  • backend/leads/views.py
  • frontend/leads.html

Comment thread backend/leads/views.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/campaigns/tests.py (1)

1467-1496: ⚡ Quick win

Missing Lead.score assertions despite PR objectives.

The PR objective explicitly calls for lead scoring, and signals.py calls _update_lead_score(instance.lead) on save/delete. However, none of these tests assert on self.lead.score. Consider adding assertions to verify the lead score is updated correctly when engagement events change:

Suggested additions
     def test_campaignlead_create_updates_cached_counts_incrementally(self):
         with patch('campaigns.signals._update_campaign_counters', side_effect=AssertionError('full recount should not run')):
             CampaignLead.objects.create(
                 organization=self.organization,
                 campaign=self.campaign,
                 lead=self.lead,
                 status='ACTIVE',
                 last_opened_at=timezone.now(),
             )

         self.campaign.refresh_from_db()
         self.assertEqual(self.campaign.leads_count, 1)
         # ... existing assertions ...
+
+        self.lead.refresh_from_db()
+        self.assertGreater(self.lead.score, 0, "Lead score should increase on engagement")

     def test_campaignlead_update_and_delete_apply_delta_changes(self):
         # ... existing test code ...

         campaign_lead.status = 'REPLIED'
         campaign_lead.last_opened_at = timezone.now()
         campaign_lead.last_clicked_at = timezone.now()
         campaign_lead.save()

         self.campaign.refresh_from_db()
         # ... existing campaign assertions ...
+
+        self.lead.refresh_from_db()
+        score_before_delete = self.lead.score
+        self.assertGreater(score_before_delete, 0)

         campaign_lead.delete()
         self.campaign.refresh_from_db()
         # ... existing assertions ...
+
+        self.lead.refresh_from_db()
+        # Score should be recalculated (likely 0 if no other campaign engagements)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/campaigns/tests.py` around lines 1467 - 1496, The test method
test_campaignlead_update_and_delete_apply_delta_changes verifies delta changes
to campaign counts but is missing assertions for lead score updates. The
signals.py module calls _update_lead_score(instance.lead) on save and delete
operations, but the test does not verify this scoring behavior. After updating
the campaign_lead with engagement events (status change to REPLIED,
last_opened_at, last_clicked_at), refresh self.lead from the database and add
assertions to verify the score has been incremented correctly based on the
engagement activity. After deleting the campaign_lead, refresh self.lead again
and add assertions to verify the score has been decremented or adjusted
appropriately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/campaigns/signals.py`:
- Around line 191-193: The try/except block that catches all Exception types is
too broad and masks actual database errors. Remove the broad except Exception
clause entirely since the guard in _apply_campaign_counter_delta already handles
the case of a missing campaign_id, or if you must keep exception handling,
narrow it to only catch specific expected exceptions like DatabaseError from the
database module. This ensures that actual programming errors and database
connectivity issues are not silently suppressed.

---

Nitpick comments:
In `@backend/campaigns/tests.py`:
- Around line 1467-1496: The test method
test_campaignlead_update_and_delete_apply_delta_changes verifies delta changes
to campaign counts but is missing assertions for lead score updates. The
signals.py module calls _update_lead_score(instance.lead) on save and delete
operations, but the test does not verify this scoring behavior. After updating
the campaign_lead with engagement events (status change to REPLIED,
last_opened_at, last_clicked_at), refresh self.lead from the database and add
assertions to verify the score has been incremented correctly based on the
engagement activity. After deleting the campaign_lead, refresh self.lead again
and add assertions to verify the score has been decremented or adjusted
appropriately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6462c75a-2ba2-4274-82a7-6fd774a50099

📥 Commits

Reviewing files that changed from the base of the PR and between 404e0da and c6c5266.

📒 Files selected for processing (2)
  • backend/campaigns/signals.py
  • backend/campaigns/tests.py

Comment thread backend/campaigns/signals.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/leads/management/commands/backfill_lead_scores.py (1)

27-28: 🏗️ Heavy lift

Backfill still does one score-aggregation query per lead.

bulk_update improves writes, but Line 28 calls _calculate_lead_score(lead), which runs an aggregate query per lead. On large datasets this becomes an N-query read bottleneck. Consider replacing per-lead score calculation with a set-based aggregation keyed by lead_id, then batch-apply computed scores.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/leads/management/commands/backfill_lead_scores.py` around lines 27 -
28, The loop iterating through leads calls `_calculate_lead_score(lead)` for
each individual lead, triggering an aggregate query per lead which creates a
read bottleneck on large datasets. Refactor the code to perform a single
set-based aggregation keyed by lead_id to compute scores for all leads at once,
then batch-apply these pre-computed scores to the leads collection instead of
calculating scores one-by-one within the iteration loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/leads/tests.py`:
- Line 469: The list comprehension at line 469 uses the ambiguous
single-character loop variable `l` which triggers the Ruff E741 lint error and
reduces code readability. Rename the loop variable `l` to a descriptive name
like `lead` throughout the list comprehension `[l['email'] for l in resp.data]`
so it becomes `[lead['email'] for lead in resp.data]` to satisfy linting
requirements and improve clarity.

---

Nitpick comments:
In `@backend/leads/management/commands/backfill_lead_scores.py`:
- Around line 27-28: The loop iterating through leads calls
`_calculate_lead_score(lead)` for each individual lead, triggering an aggregate
query per lead which creates a read bottleneck on large datasets. Refactor the
code to perform a single set-based aggregation keyed by lead_id to compute
scores for all leads at once, then batch-apply these pre-computed scores to the
leads collection instead of calculating scores one-by-one within the iteration
loop.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: fd4c0704-01ef-4789-bf90-dbfa0cd9f000

📥 Commits

Reviewing files that changed from the base of the PR and between c6c5266 and a44c04b.

📒 Files selected for processing (5)
  • backend/campaigns/signals.py
  • backend/campaigns/tests.py
  • backend/leads/management/commands/backfill_lead_scores.py
  • backend/leads/tests.py
  • backend/leads/views.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/leads/views.py
  • backend/campaigns/tests.py
  • backend/campaigns/signals.py

Comment thread backend/leads/tests.py
def test_no_filter_defaults_to_newest_first(self):
resp = self._get()
self.assertEqual(resp.status_code, status.HTTP_200_OK)
emails = [l['email'] for l in resp.data]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Rename ambiguous loop variable to satisfy lint and readability

Line 469 uses l, which triggers Ruff E741. Rename it to a descriptive name (e.g., lead) to keep CI lint green.

Suggested patch
-        emails = [l['email'] for l in resp.data]
+        emails = [lead['email'] for lead in resp.data]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
emails = [l['email'] for l in resp.data]
emails = [lead['email'] for lead in resp.data]
🧰 Tools
🪛 Ruff (0.15.17)

[error] 469-469: Ambiguous variable name: l

(E741)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/leads/tests.py` at line 469, The list comprehension at line 469 uses
the ambiguous single-character loop variable `l` which triggers the Ruff E741
lint error and reduces code readability. Rename the loop variable `l` to a
descriptive name like `lead` throughout the list comprehension `[l['email'] for
l in resp.data]` so it becomes `[lead['email'] for lead in resp.data]` to
satisfy linting requirements and improve clarity.

Source: Linters/SAST tools

@Kuldeeep18

Copy link
Copy Markdown
Owner

Hi @saurabhhhcodes 👋

LeadOrbit Bot here 🤖

A repository star ⭐ is required for PR review and merge.

We noticed that you haven't starred the repository yet. Please star the repo and reply once completed so we can continue with the review process.

Thanks for contributing to LeadOrbit! 🚀

@Kuldeeep18

Copy link
Copy Markdown
Owner

Hi @saurabhhhcodes 👋

LeadOrbit Bot here 🤖

We noticed you've opened a Pull Request but haven't starred the repository yet.

Starring the repository is mandatory for PR review and merge.

Please:

  1. Star the repository.
  2. Reply to this comment with "Done".

Once you've done that, the bot will continue processing your PR.

Note: PRs from contributors who haven't starred the repository will remain pending until this requirement is completed.

Thanks for contributing to LeadOrbit! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants