Skip to content

⚡ Bolt: optimize RAG retrieval performance#712

Open
RohanExploit wants to merge 1 commit intomainfrom
bolt-rag-optimization-8262278304064565290
Open

⚡ Bolt: optimize RAG retrieval performance#712
RohanExploit wants to merge 1 commit intomainfrom
bolt-rag-optimization-8262278304064565290

Conversation

@RohanExploit
Copy link
Copy Markdown
Owner

@RohanExploit RohanExploit commented Apr 29, 2026

💡 What: Optimized the Jaccard similarity calculation in the CivicRAG retrieval service.
🎯 Why: The previous implementation used set.union() which allocates a new set object on every iteration, leading to significant overhead in the retrieval loop.
📊 Impact: Measured a ~3x performance improvement in retrieval latency (0.0127 ms -> 0.0041 ms per retrieval).
🔬 Measurement: Verified with a dedicated benchmark script (backend/tests/benchmark_rag.py) and confirmed functional correctness with pytest backend/tests/test_rag_service.py.


PR created automatically by Jules for task 8262278304064565290 started by @RohanExploit


Summary by cubic

Speeds up RAG retrieval by optimizing Jaccard similarity calculation and avoiding costly set.union() allocations. Reduces per-retrieval latency by ~3x (0.0127 ms → 0.0041 ms).

  • Performance
    • Pre-calculated token counts during init.
    • Added isdisjoint() early exit for non-overlapping sets.
    • Computed union size via |A| + |B| - |A∩B| to avoid set allocations.
    • Benchmarked with backend/tests/benchmark_rag.py; verified with backend/tests/test_rag_service.py.

Written for commit f97de93. Summary will update on new commits. Review in cubic

Summary by CodeRabbit

  • Performance
    • Optimized the retrieval pipeline to improve query response times and system efficiency.

- Pre-calculate token counts for policies during initialization.
- Use `isdisjoint()` for fast early-exit on non-matching policies.
- Use inclusion-exclusion principle to calculate union size mathematically, avoiding expensive `set.union()` allocations.
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 29, 2026

Deploy Preview for fixmybharat canceled.

Name Link
🔨 Latest commit f97de93
🔍 Latest deploy log https://app.netlify.com/projects/fixmybharat/deploys/69f20d2e6e4bc10008a4d9f6

@github-actions
Copy link
Copy Markdown

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

The pull request optimizes Jaccard similarity calculations in the RAG retrieval pipeline by eliminating explicit set.union() calls. Changes include precomputing token counts during preparation, introducing early-exit logic via isdisjoint(), and computing union sizes using the inclusion-exclusion principle during scoring.

Changes

Cohort / File(s) Summary
Performance Documentation
.jules/bolt.md
Added performance note documenting optimization strategy for Jaccard similarity calculations, recommending avoidance of set.union() construction and detailing alternative count-based approach.
RAG Service Optimization
backend/rag_service.py
Precomputes per-policy token counts during preparation; updates query handling with early-return for empty tokenization; replaces set-based union construction with count-based Jaccard computation using inclusion-exclusion principle and isdisjoint() early exits.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 A rabbit's ode to swift retrieval:

No unions built where counts suffice,
Pre-counted tokens—oh how nice!
Inclusion-exclusion's ancient art,
Makes Jaccard scoring zip and dart. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: optimizing RAG retrieval performance through performance improvements (indicated by ⚡ and 'optimize').
Description check ✅ Passed The description covers the key template sections: What/Why/Impact explanation, Type of Change (Performance improvement), Testing Done (with specific test files mentioned), and includes verification details. All critical sections are well-populated.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bolt-rag-optimization-8262278304064565290

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimizes the CivicRAG retrieval hot path by reducing per-iteration allocations during Jaccard similarity scoring.

Changes:

  • Pre-compute and store content_token_count during policy preparation to avoid repeated len() calls and enable arithmetic union sizing.
  • Replace set.union() allocation with an inclusion–exclusion union-size calculation and add an isdisjoint() early-exit to skip non-overlapping policies.
  • Document the optimization approach in the Bolt learning log.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
backend/rag_service.py Removes per-policy set.union() allocations by using precomputed token counts and arithmetic union sizing in retrieve().
.jules/bolt.md Adds a note describing the mathematical union-size optimization strategy for Jaccard similarity.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/rag_service.py (1)

102-106: Use isdisjoint() for the title boost check.

This branch only needs to know whether any title token matched, so you can avoid building another temporary intersection set in the hot path.

Suggested tweak
-            title_match = len(query_tokens.intersection(title_tokens))
-            if title_match > 0:
+            if not query_tokens.isdisjoint(title_tokens):
                 score += 0.2  # Bonus for title match
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/rag_service.py` around lines 102 - 106, Replace the temporary
intersection allocation used to compute title_match with a cheap existence
check: instead of building prepared['title_tokens'].intersection(query_tokens)
and testing its length, use the set method isdisjoint to check if any token
overlaps (e.g., if not title_tokens.isdisjoint(query_tokens)) and then apply the
+0.2 boost to score; update the branch around title_tokens, title_match,
query_tokens and score to use this boolean check and remove the unnecessary
intersection/len work.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.jules/bolt.md:
- Around line 89-91: The changelog entry titled "2026-05-18 - Mathematical Set
Operations for Jaccard Similarity" is future-dated; update the header date to
the correct (non-future) date for this PR so entries remain chronologically
consistent, e.g., replace "2026-05-18" with today's or the PR date in that
header text, keeping the rest of the entry unchanged.

In `@backend/rag_service.py`:
- Around line 90-95: The comment uses the Unicode union symbol `∪` which
triggers Ruff; update the comment above the union_count calculation to use plain
ASCII (e.g., "A U B" or the word "union") instead of `∪`. Locate the block
around variables intersection_count, query_tokens, policy_tokens,
query_token_count and prepared['content_token_count'] (the union_count
computation) and replace the Unicode symbol in the explanatory comment with an
ASCII alternative.

---

Nitpick comments:
In `@backend/rag_service.py`:
- Around line 102-106: Replace the temporary intersection allocation used to
compute title_match with a cheap existence check: instead of building
prepared['title_tokens'].intersection(query_tokens) and testing its length, use
the set method isdisjoint to check if any token overlaps (e.g., if not
title_tokens.isdisjoint(query_tokens)) and then apply the +0.2 boost to score;
update the branch around title_tokens, title_match, query_tokens and score to
use this boolean check and remove the unnecessary intersection/len work.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a3f69c5e-a172-4b7c-b627-5797eea1bfe7

📥 Commits

Reviewing files that changed from the base of the PR and between 3166316 and f97de93.

📒 Files selected for processing (2)
  • .jules/bolt.md
  • backend/rag_service.py

Comment thread .jules/bolt.md
Comment on lines +89 to +91
## 2026-05-18 - Mathematical Set Operations for Jaccard Similarity
**Learning:** Calculating Jaccard similarity (|A ∩ B| / |A ∪ B|) using `set.union()` inside a retrieval loop incurs significant O(N) memory allocation and population overhead. Since |A ∪ B| = |A| + |B| - |A ∩ B|, the union size can be calculated via O(1) arithmetic if set sizes are pre-calculated.
**Action:** Pre-calculate set lengths for static data. In retrieval loops, use `isdisjoint()` for early exits and the inclusion-exclusion formula to avoid explicit set union operations.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid future-dating this note.

Line 89 uses 2026-05-18, which is after this PR’s current date. That makes the note order look inconsistent and can confuse readers/tools that sort these entries chronologically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md around lines 89 - 91, The changelog entry titled "2026-05-18
- Mathematical Set Operations for Jaccard Similarity" is future-dated; update
the header date to the correct (non-future) date for this PR so entries remain
chronologically consistent, e.g., replace "2026-05-18" with today's or the PR
date in that header text, keeping the rest of the entry unchanged.

Comment thread backend/rag_service.py
Comment on lines +90 to +95
# Jaccard Similarity: |A ∩ B| / |A ∪ B|
intersection_count = len(query_tokens.intersection(policy_tokens))

if not union:
# Performance: Use mathematical formula for union length: |A ∪ B| = |A| + |B| - |A ∩ B|
# This avoids O(N) allocation and population of a new union set.
union_count = query_token_count + prepared['content_token_count'] - intersection_count
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Replace the Unicode union symbol in the comment.

Ruff is already flagging ; using plain ASCII here will keep the note portable and silence the warning.

Suggested tweak
-            # Jaccard Similarity: |A ∩ B| / |A ∪ B|
+            # Jaccard Similarity: |A ∩ B| / |A union B|
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 90-90: Comment contains ambiguous (UNION). Did you mean U (LATIN CAPITAL LETTER U)?

(RUF003)


[warning] 93-93: Comment contains ambiguous (UNION). Did you mean U (LATIN CAPITAL LETTER U)?

(RUF003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/rag_service.py` around lines 90 - 95, The comment uses the Unicode
union symbol `∪` which triggers Ruff; update the comment above the union_count
calculation to use plain ASCII (e.g., "A U B" or the word "union") instead of
`∪`. Locate the block around variables intersection_count, query_tokens,
policy_tokens, query_token_count and prepared['content_token_count'] (the
union_count computation) and replace the Unicode symbol in the explanatory
comment with an ASCII alternative.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants