
⚡ Bolt: Offload blocking I/O to thread pool in async handlers#696

Open
RohanExploit wants to merge 1 commit into main from bolt-async-offloading-6398104808224236360

Conversation

RohanExploit (Owner) commented Apr 22, 2026

💡 What: Offloaded blocking synchronous I/O (Database and File System) to a thread pool in critical async endpoints (voice/submit-issue, field-officer/visit/upload-images) and background tasks.

🎯 Why: In FastAPI, synchronous operations (SQLAlchemy queries/commits and `with open(...)` file writes) inside async def functions block the main event loop. This leads to severe performance degradation under concurrent load, as the entire server hangs until the I/O operation completes (see the sketch below).

📊 Impact: Expected to significantly reduce tail latency (P99) and improve throughput for concurrent users by keeping the main event loop available for other requests.

🔬 Measurement: Verified with a full backend test suite (PYTHONPATH=. pytest backend/tests/) with 107/107 passing tests, confirming that functional integrity and blockchain chaining remain correct after offloading.
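
The failure mode and fix reduce to one pattern, sketched below. This is a minimal, hypothetical example (the route, fetch_report_sync, and the time.sleep stand-in are illustrative, not code from this repository):

import time

from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

def fetch_report_sync(report_id: int) -> dict:
    # Stand-in for blocking I/O (a synchronous SQLAlchemy query,
    # open().write(), etc.); time.sleep blocks its thread exactly
    # the way a sync DB driver does.
    time.sleep(0.5)
    return {"report_id": report_id}

@app.get("/reports/{report_id}")
async def get_report(report_id: int) -> dict:
    # Calling fetch_report_sync(report_id) directly here would freeze the
    # event loop for 0.5 s per request; offloading it to the thread pool
    # keeps the loop free to serve other requests concurrently.
    return await run_in_threadpool(fetch_report_sync, report_id)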


PR created automatically by Jules for task 6398104808224236360 started by @RohanExploit


Summary by cubic

Offloaded blocking DB and file I/O to a thread pool in async paths to keep the event loop responsive. Improves tail latency and throughput for voice issue submission and visit image uploads.

  • Refactors
    • Voice: Offloaded DB read for previous hash and issue persistence via fastapi.concurrency.run_in_threadpool.
    • Field officer: Offloaded visit lookup, image file writes, and DB commit to a thread pool.
    • Background tasks: Changed create_grievance_from_issue_background to def so FastAPI runs it in a thread pool; wrapped blocking DB work in run_in_threadpool for process_action_plan_background.
    • Docs: Added threadpool offloading guidance to .jules/bolt.md.

Written for commit 69ff2d5. Summary will update on new commits.

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added best practices guidance for optimizing backend asynchronous input/output operations
  • Performance Improvements

    • Optimized image upload handling to prevent blocking operations during request processing
    • Improved blockchain hash lookup performance with better concurrent execution patterns
    • Enhanced background task execution for issue tracking and grievance creation workflows to increase system throughput and responsiveness

- Wrapped synchronous DB and File I/O in `run_in_threadpool` for `voice` and `field_officer` routers.
- Optimized background tasks by leveraging standard `def` for blocking-only logic (see the sketch after this list).
- Ensured event loop responsiveness during heavy submission paths.
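
The background-task change relies on how FastAPI schedules tasks: a plain def task is run in the thread pool automatically, while an async def task runs on the event loop and blocks it if it does synchronous work. A minimal sketch (handler and task names are hypothetical, not this repo's code):

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def create_grievance_sync(issue_id: int) -> None:
    # Plain `def`: Starlette executes this task in its thread pool, so
    # blocking DB or file work here cannot stall the event loop.
    print(f"processing issue {issue_id}")  # placeholder for blocking work

@app.post("/issues/{issue_id}/grievance")
async def submit_grievance(issue_id: int, background_tasks: BackgroundTasks) -> dict:
    # Were create_grievance_sync declared `async def` with blocking calls
    # inside, it would execute on the event loop and block it instead.
    background_tasks.add_task(create_grievance_sync, issue_id)
    return {"status": "queued"}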
Copilot AI review requested due to automatic review settings April 22, 2026 14:08
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.


netlify Bot commented Apr 22, 2026

Deploy Preview for fixmybharat canceled.

🔨 Latest commit: 69ff2d5
🔍 Latest deploy log: https://app.netlify.com/projects/fixmybharat/deploys/69e8d6762df2ec000876fdb5

@github-actions

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.


coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

This PR documents and implements FastAPI tail-latency mitigation by offloading blocking database and filesystem I/O operations from async handlers to a threadpool. Multiple backend files are updated to use run_in_threadpool for synchronous database queries and commits, and one background task is converted from async to sync for automatic threadpool execution.

Changes

Documentation: .jules/bolt.md
Added guidance on FastAPI tail-latency mitigation, recommending offload of blocking I/O operations via run_in_threadpool in async handlers and synchronous def functions for background work.

Async Handler I/O Offloading: backend/routers/field_officer.py, backend/routers/voice.py
Wrapped blocking database queries, commits, and filesystem writes with await run_in_threadpool(...) to remove synchronous I/O from the async event loop. voice.py also introduces a centralized save_issue_db helper for database persistence operations.

Background Task Refactoring: backend/tasks.py
Wrapped blocking database queries and commits in await run_in_threadpool(...) within async context, and converted create_grievance_from_issue_background from async def to def for automatic threadpool execution by the task scheduler.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

size/m

Poem

🐰 The rabbit hops with glee,
No more blocking on the event tree!
To threadpools swift, the queries go,
Async handlers fast as spring's first flow—
Tail-latency tamed, our whiskers say hurrah! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change—offloading blocking I/O to thread pool in async handlers—which is the primary objective of the entire changeset.
Description check ✅ Passed The description covers key required sections: what (blocking I/O offload), why (event loop blocking), impact (tail latency/throughput), and measurement (test verification). Type of change is identifiable as performance improvement. However, the 'Related Issue' section lacks a proper issue link and some checklist items are unchecked.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Contributor

Copilot AI left a comment


Pull request overview

This PR aims to reduce event-loop blocking in FastAPI async endpoints and background tasks by offloading synchronous database and filesystem operations to a thread pool, improving concurrency and tail latency.

Changes:

  • Offloads synchronous DB queries/commits and file writes to run_in_threadpool in voice submission and field-officer image upload handlers.
  • Updates background task DB operations to use run_in_threadpool, and converts a purely-blocking background task to def for automatic threadpool execution.
  • Adds a Bolt performance note documenting threadpool offloading guidance.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

backend/tasks.py: Wraps DB query/commit in run_in_threadpool and converts a background task to sync def.
backend/routers/voice.py: Offloads DB query and issue persistence, plus file write, to run_in_threadpool in /voice/submit-issue.
backend/routers/field_officer.py: Offloads visit lookup, image file writes, and DB commit to run_in_threadpool in visit image upload.
.jules/bolt.md: Adds a new performance “learning/action” entry about threadpool offloading.


Comment thread backend/tasks.py
Comment on lines +22 to 30
        # Performance Optimization: Wrap blocking DB operations in threadpool
        issue = await run_in_threadpool(
            lambda: db.query(Issue).filter(Issue.id == issue_id).first()
        )
        if issue:
            current_plan = issue.action_plan or {}
            issue.action_plan = {**current_plan, **action_plan}
-            db.commit()
+            await run_in_threadpool(db.commit)


Copilot AI Apr 22, 2026


db (a SQLAlchemy Session) is created on the event-loop thread but then used inside run_in_threadpool, and the returned Issue ORM instance is subsequently mutated outside the threadpool. SQLAlchemy Sessions/ORM instances are not thread-safe, and crossing thread boundaries like this can lead to subtle corruption or lazy-load failures. Keep the entire DB unit-of-work (load + update + commit) inside a single threadpool function that creates/uses/closes its own SessionLocal, and only return primitive data back to the async context.

Comment on lines +285 to +288
        # Performance Optimization: Wrap blocking DB query in threadpool
        visit = await run_in_threadpool(
            lambda: db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
        )

Copilot AI Apr 22, 2026


This loads a full FieldOfficerVisit ORM object in a worker thread and then uses/mutates it in the async context. SQLAlchemy ORM instances (and their associated Session) are not thread-safe across thread boundaries. Prefer moving all DB interaction for this request into a single run_in_threadpool call (or a sync helper) that queries/updates/commits using its own Session, and have the async handler only do file reads/writes and pass primitives (visit_id, image_paths) into that helper.

Comment on lines 357 to +361
        visit.visit_images = existing_images
        visit.updated_at = datetime.now(timezone.utc)

-        db.commit()
+        # Performance Optimization: Wrap blocking DB commit in threadpool
+        await run_in_threadpool(db.commit)

Copilot AI Apr 22, 2026


db.commit is offloaded to the threadpool, but the visit ORM instance being committed was created/modified outside the threadpool (and may have been loaded in a different worker thread). This mixes Session/ORM usage across threads, which SQLAlchemy does not support. Do the load+mutation+commit in the same thread (single threadpool helper) or use an UPDATE ... query in the threadpool with a fresh Session, rather than committing an ORM instance created in another thread.
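
A minimal sketch of the second option (a fresh Session plus a single Core UPDATE, all inside one threadpool call); SessionLocal and FieldOfficerVisit are assumed from the project's database module, and the column names mirror the diff above:

from datetime import datetime, timezone

from sqlalchemy import update

def commit_visit_images(visit_id: int, image_paths: list) -> None:
    # Fresh Session plus one UPDATE statement, all on one worker thread:
    # no Session or ORM instance ever crosses a thread boundary.
    db = SessionLocal()  # assumed project session factory
    try:
        db.execute(
            update(FieldOfficerVisit)
            .where(FieldOfficerVisit.id == visit_id)
            .values(visit_images=image_paths, updated_at=datetime.now(timezone.utc))
        )
        db.commit()
    finally:
        db.close()

# In the async handler:
# await run_in_threadpool(commit_visit_images, visit_id, image_paths)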

Comment thread backend/routers/voice.py
Comment on lines +305 to 309
+        # Performance Optimization: Wrap blocking DB operations in threadpool to keep event loop responsive
+        await run_in_threadpool(save_issue_db, db, new_issue)

         # Update cache for next report AFTER successful DB commit
         blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash")

Copilot AI Apr 22, 2026


save_issue_db is run in a worker thread using the request-scoped db Session and the new_issue ORM instance, but the async context reads new_issue.id afterwards. Passing SQLAlchemy Sessions/ORM instances across threads is not supported and can break under different DB drivers or when lazy attributes are accessed. Prefer a threadpool helper that creates its own Session and returns the new issue ID (and any other primitives) to the async handler.
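
A minimal sketch of that suggestion, assuming SessionLocal is the project's session factory and Issue its ORM model (names inferred from context, not verified):

def save_issue_and_get_id(payload: dict) -> int:
    # Owns the entire Session lifecycle on one worker thread and returns
    # only a primitive, so no Session or ORM instance crosses threads.
    db = SessionLocal()  # assumed project session factory
    try:
        issue = Issue(**payload)  # assumed ORM model
        db.add(issue)
        db.commit()
        db.refresh(issue)  # load the autogenerated primary key while attached
        return issue.id
    finally:
        db.close()

# In the async handler:
# new_issue_id = await run_in_threadpool(save_issue_and_get_id, payload)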

coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/routers/field_officer.py (1)

339-349: ⚠️ Potential issue | 🟠 Major

Prevent concurrent uploads from overwriting image files.

visit_{visit_id}_{timestamp}_{idx} can collide for two simultaneous uploads to the same visit within the same second, and the threadpool write can overwrite the earlier file. Add a UUID or other unique suffix.

🐛 Proposed fix
+import uuid
 from datetime import datetime, timezone
             # Generate secure filename
             timestamp = datetime.now(timezone.utc).strftime('%Y%m%d_%H%M%S')
-            safe_filename = f"visit_{visit_id}_{timestamp}_{idx}.{extension}"
+            safe_filename = f"visit_{visit_id}_{timestamp}_{idx}_{uuid.uuid4().hex}.{extension}"
             file_path = os.path.join(VISIT_IMAGES_DIR, safe_filename)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/field_officer.py` around lines 339 - 349, The filename
generation can collide for concurrent uploads because timestamp and idx may be
identical; update the safe_filename creation to append a unique identifier
(e.g., uuid4 hex) so each file is globally unique. Locate the block that builds
timestamp, safe_filename and file_path (symbols: timestamp, safe_filename,
file_path) and modify the safe_filename to include a UUID suffix before the
extension; keep the rest of the save logic (_save_image and run_in_threadpool)
unchanged so writes go to distinct paths under VISIT_IMAGES_DIR.
backend/routers/voice.py (1)

268-306: ⚠️ Potential issue | 🟠 Major

Synchronize hash chain read-compute-write to prevent concurrent forks.

Two concurrent requests can race on the cache read: both fetch the same prev_hash, compute different integrity_hash values from it, then save independently. This creates sibling links instead of a linear chain.

T1: Request A reads cache → prev_hash="hash0"
T2: Request B reads cache → prev_hash="hash0"
T3: Request A saves Issue(prev="hash0", integrity="hashA"), updates cache
T4: Request B saves Issue(prev="hash0", integrity="hashB"), overwrites cache

Wrap the full critical section (prev_hash read → hash computation → DB save → cache update) in a synchronized block using asyncio.Lock or serialize it with a database-level write lock to guarantee each request chains from the most recent hash.
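
A minimal sketch of the asyncio.Lock option; save_issue_db and blockchain_last_hash_cache are names from this PR, while compute_integrity_hash and the cache's get signature are hypothetical stand-ins:

import asyncio

from fastapi.concurrency import run_in_threadpool

blockchain_chain_lock = asyncio.Lock()  # module-level: one lock per process

async def append_issue_to_chain(db, new_issue) -> None:
    # Serializes read -> compute -> save -> cache update so two concurrent
    # requests can never both chain off the same prev_hash.
    async with blockchain_chain_lock:
        prev_hash = blockchain_last_hash_cache.get("last_hash")  # assumed cache API
        new_issue.integrity_hash = compute_integrity_hash(prev_hash, new_issue)  # hypothetical helper
        await run_in_threadpool(save_issue_db, db, new_issue)
        blockchain_last_hash_cache.set(data=new_issue.integrity_hash, key="last_hash")

Note that an asyncio.Lock serializes only within one process; a deployment running multiple Uvicorn workers would still need the database-level locking mentioned above.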

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/voice.py` around lines 268 - 306, The prev_hash read →
integrity_hash compute → DB save → cache update sequence must be serialized to
avoid forks: introduce a shared asyncio.Lock (e.g., blockchain_chain_lock) and
acquire it around the critical section that includes the run_in_threadpool call
that reads prev_issue, the hash_content/integrity_hash computation, the call to
save_issue_db, and the blockchain_last_hash_cache.set call; ensure the same lock
is used wherever issues are created (reference new_issue creation code paths) so
each request reads the latest prev_hash, computes its integrity_hash, saves the
Issue (save_issue_db) and then updates blockchain_last_hash_cache atomically
while holding the lock. If you prefer DB-level serialization instead, perform
these steps inside a transaction with a row/table lock so the
read→compute→insert→cache update is atomic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2aab711a-6d3b-458f-8d7b-1323f6e38ff0

📥 Commits

Reviewing files that changed from the base of the PR and between ea329e9 and 69ff2d5.

📒 Files selected for processing (4)
  • .jules/bolt.md
  • backend/routers/field_officer.py
  • backend/routers/voice.py
  • backend/tasks.py

Comment thread .jules/bolt.md
Comment on lines +85 to +87
## 2026-06-12 - Threadpool Offloading for Tail Latency
**Learning:** Mixed I/O operations (Database and File System) in FastAPI `async def` endpoints block the event loop, causing severe tail latency spikes under concurrency. Explicitly offloading these to `run_in_threadpool` is essential for maintaining responsiveness.
**Action:** Wrap all synchronous DB and File I/O operations in `run_in_threadpool`. For purely blocking background tasks, use standard `def` instead of `async def` to leverage FastAPI's automatic threadpool execution.

⚠️ Potential issue | 🟡 Minor

Use the actual PR date for this learning entry.

Line 85 is dated 2026-06-12, which is in the future relative to this PR/current date: April 22, 2026.

📝 Proposed fix
-## 2026-06-12 - Threadpool Offloading for Tail Latency
+## 2026-04-22 - Threadpool Offloading for Tail Latency
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md around lines 85 - 87, The learning entry titled "Threadpool
Offloading for Tail Latency" currently uses the future date "2026-06-12"; update
that header date to the actual PR date ("2026-04-22") so the entry reflects the
correct timeline, leaving the rest of the content (references to FastAPI,
run_in_threadpool, and the guidance about async def vs def) unchanged.

Comment on lines +285 to +288
        # Performance Optimization: Wrap blocking DB query in threadpool
        visit = await run_in_threadpool(
            lambda: db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
        )

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect database setup and field officer DB threadpool boundaries.

echo "## backend/database.py"
fd -a '^database\.py$' backend | xargs -r sed -n '1,220p'

echo
echo "## field_officer.py DB offloading around upload_visit_images"
rg -n -C5 'upload_visit_images|run_in_threadpool\(|db\.query|db\.commit|visit\.visit_images' backend/routers/field_officer.py

Repository: RohanExploit/VishwaGuru

Length of output: 10096


Refactor visit image update into a single synchronous database operation.

The upload_visit_images endpoint splits the database session across three different execution contexts:

  1. Line 286–288: Visit is queried in a threadpool worker
  2. Line 357–358: The ORM object is mutated on the event loop (visit.visit_images assignment)
  3. Line 361: Changes are committed in another threadpool worker

SQLAlchemy sessions are thread-unsafe and not designed to handle object state modifications across thread boundaries. The visit object becomes detached from its session between the query and mutation, risking state inconsistency.

Create a synchronous helper function that bundles the query, object mutation, and commit into a single database operation within one thread:

def _update_visit_with_images(visit_id: int, image_paths: list, db: Session):
    visit = db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
    if visit:
        existing = visit.visit_images or []
        existing.extend(image_paths)
        visit.visit_images = existing
        visit.updated_at = datetime.now(timezone.utc)
        db.commit()
    return visit

# In upload_visit_images, after file I/O completes:
await run_in_threadpool(_update_visit_with_images, visit_id, image_paths, db)

This keeps all session-bound operations atomic within a single thread.

Also applies to lines 355–361.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/field_officer.py` around lines 285 - 288, The current
upload_visit_images flow splits DB work across threads causing session
detachment; create a synchronous helper (e.g.
_update_visit_with_images(visit_id: int, image_paths: list, db: Session)) that
does the query, mutates visit.visit_images and visit.updated_at, and calls
db.commit() all inside the same thread, then invoke it via await
run_in_threadpool(_update_visit_with_images, visit_id, image_paths, db) after
file I/O completes; update references in upload_visit_images (replace the
separate threadpool query, on-event-loop mutation of visit.visit_images, and
separate commit) so the session-bound operations for FieldOfficerVisit are
performed atomically in that helper.

Comment thread backend/tasks.py
Comment on lines +22 to +29
        # Performance Optimization: Wrap blocking DB operations in threadpool
        issue = await run_in_threadpool(
            lambda: db.query(Issue).filter(Issue.id == issue_id).first()
        )
        if issue:
            current_plan = issue.action_plan or {}
            issue.action_plan = {**current_plan, **action_plan}
-            db.commit()
+            await run_in_threadpool(db.commit)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find async functions that create/use SessionLocal and offload only individual DB operations.

rg -n -C4 'async def|SessionLocal\(|run_in_threadpool\(|db\.commit|db\.close' backend/tasks.py

Repository: RohanExploit/VishwaGuru

Length of output: 2617


Move the entire SQLAlchemy update into one threadpool helper.

The current code creates SessionLocal() on the event loop, queries in a threadpool, mutates the ORM object back on the event loop, commits in a threadpool, and closes synchronously. This violates SQLAlchemy's thread-affinity requirements. Keep the full session lifecycle on a single thread by extracting the database operation into a synchronous helper.

♻️ Proposed refactor
+def _merge_action_plan_into_issue(issue_id: int, action_plan: dict) -> bool:
+    db = SessionLocal()
+    try:
+        issue = db.query(Issue).filter(Issue.id == issue_id).first()
+        if not issue:
+            return False
+
+        current_plan = issue.action_plan or {}
+        issue.action_plan = {**current_plan, **action_plan}
+        db.commit()
+        return True
+    finally:
+        db.close()
+
 async def process_action_plan_background(issue_id: int, description: str, category: str, language: str, image_path: str):
-    db = SessionLocal()
     try:
         # Generate Action Plan (AI)
         action_plan = await generate_action_plan(description, category, language, image_path)
 
         # Update issue in DB
-        # Performance Optimization: Wrap blocking DB operations in threadpool
-        issue = await run_in_threadpool(
-            lambda: db.query(Issue).filter(Issue.id == issue_id).first()
-        )
-        if issue:
-            current_plan = issue.action_plan or {}
-            issue.action_plan = {**current_plan, **action_plan}
-            await run_in_threadpool(db.commit)
+        updated = await run_in_threadpool(_merge_action_plan_into_issue, issue_id, action_plan)
+        if updated:
 
             # Invalidate cache to ensure users get the updated action plan
             recent_issues_cache.clear()
     except Exception as e:
         logger.error(f"Background action plan generation failed for issue {issue_id}: {e}", exc_info=True)
-    finally:
-        db.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tasks.py` around lines 22 - 29, The current code mixes async and sync
SQLAlchemy operations; wrap the entire DB lifecycle and update in one
synchronous helper so the Session never crosses threads: create a function
(e.g., _update_issue_action_plan(issue_id, action_plan)) that instantiates the
session, queries Issue by id, merges/sets issue.action_plan =
{**(issue.action_plan or {}), **action_plan}, commits, and always closes the
session in a finally block, then call it via await
run_in_threadpool(_update_issue_action_plan, issue_id, action_plan) instead of
calling query/commit/close separately on the event loop.

Contributor

cubic-dev-ai Bot left a comment


3 issues found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/routers/voice.py">

<violation number="1" location="backend/routers/voice.py:306">
P1: Do not pass the same SQLAlchemy Session instance into `run_in_threadpool`; use a session created inside the worker thread (or an async session) to avoid cross-thread session usage.</violation>
</file>

<file name="backend/routers/field_officer.py">

<violation number="1" location="backend/routers/field_officer.py:286">
P1: SQLAlchemy Session/ORM object `visit` is used across thread boundaries: queried in a threadpool worker here, mutated on the event loop (lines 357–358), then committed in yet another threadpool dispatch. SQLAlchemy Sessions are not thread-safe and ORM instances are bound to the session that loaded them. This can cause `DetachedInstanceError`, state corruption, or silent data loss.

Extract a single synchronous helper that performs the query, mutation, and commit atomically within one thread, and call it via `run_in_threadpool`.</violation>
</file>

<file name="backend/tasks.py">

<violation number="1" location="backend/tasks.py:23">
P1: The `SessionLocal()` is created on the event loop, the query is dispatched to a threadpool, the ORM `issue` object is mutated back on the event loop, and then `db.commit` is dispatched to yet another threadpool call. This splits a single SQLAlchemy Session across multiple threads, violating its thread-affinity contract.

Extract all DB work (query + mutation + commit + close) into a single synchronous helper and invoke it once via `run_in_threadpool`, returning only primitive data to the async context.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread backend/routers/voice.py
-        db.commit()
-        db.refresh(new_issue)
+        # Performance Optimization: Wrap blocking DB operations in threadpool to keep event loop responsive
+        await run_in_threadpool(save_issue_db, db, new_issue)
Contributor

cubic-dev-ai Bot Apr 22, 2026


P1: Do not pass the same SQLAlchemy Session instance into run_in_threadpool; use a session created inside the worker thread (or an async session) to avoid cross-thread session usage.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/voice.py, line 306:

<comment>Do not pass the same SQLAlchemy Session instance into `run_in_threadpool`; use a session created inside the worker thread (or an async session) to avoid cross-thread session usage.</comment>

<file context>
@@ -300,10 +302,8 @@ def _save_audio_file():
-        db.commit()
-        db.refresh(new_issue)
+        # Performance Optimization: Wrap blocking DB operations in threadpool to keep event loop responsive
+        await run_in_threadpool(save_issue_db, db, new_issue)
 
         # Update cache for next report AFTER successful DB commit
</file context>

     try:
-        visit = db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
+        # Performance Optimization: Wrap blocking DB query in threadpool
+        visit = await run_in_threadpool(
Contributor

cubic-dev-ai Bot Apr 22, 2026


P1: SQLAlchemy Session/ORM object visit is used across thread boundaries: queried in a threadpool worker here, mutated on the event loop (lines 357–358), then committed in yet another threadpool dispatch. SQLAlchemy Sessions are not thread-safe and ORM instances are bound to the session that loaded them. This can cause DetachedInstanceError, state corruption, or silent data loss.

Extract a single synchronous helper that performs the query, mutation, and commit atomically within one thread, and call it via run_in_threadpool.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/field_officer.py, line 286:

<comment>SQLAlchemy Session/ORM object `visit` is used across thread boundaries: queried in a threadpool worker here, mutated on the event loop (lines 357–358), then committed in yet another threadpool dispatch. SQLAlchemy Sessions are not thread-safe and ORM instances are bound to the session that loaded them. This can cause `DetachedInstanceError`, state corruption, or silent data loss.

Extract a single synchronous helper that performs the query, mutation, and commit atomically within one thread, and call it via `run_in_threadpool`.</comment>

<file context>
@@ -281,7 +282,10 @@ async def upload_visit_images(
     try:
-        visit = db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
+        # Performance Optimization: Wrap blocking DB query in threadpool
+        visit = await run_in_threadpool(
+            lambda: db.query(FieldOfficerVisit).filter(FieldOfficerVisit.id == visit_id).first()
+        )
</file context>

Comment thread backend/tasks.py
         # Update issue in DB
-        issue = db.query(Issue).filter(Issue.id == issue_id).first()
+        # Performance Optimization: Wrap blocking DB operations in threadpool
+        issue = await run_in_threadpool(
Contributor

cubic-dev-ai Bot Apr 22, 2026


P1: The SessionLocal() is created on the event loop, the query is dispatched to a threadpool, the ORM issue object is mutated back on the event loop, and then db.commit is dispatched to yet another threadpool call. This splits a single SQLAlchemy Session across multiple threads, violating its thread-affinity contract.

Extract all DB work (query + mutation + commit + close) into a single synchronous helper and invoke it once via run_in_threadpool, returning only primitive data to the async context.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tasks.py, line 23:

<comment>The `SessionLocal()` is created on the event loop, the query is dispatched to a threadpool, the ORM `issue` object is mutated back on the event loop, and then `db.commit` is dispatched to yet another threadpool call. This splits a single SQLAlchemy Session across multiple threads, violating its thread-affinity contract.

Extract all DB work (query + mutation + commit + close) into a single synchronous helper and invoke it once via `run_in_threadpool`, returning only primitive data to the async context.</comment>

<file context>
@@ -18,11 +19,14 @@ async def process_action_plan_background(issue_id: int, description: str, catego
         # Update issue in DB
-        issue = db.query(Issue).filter(Issue.id == issue_id).first()
+        # Performance Optimization: Wrap blocking DB operations in threadpool
+        issue = await run_in_threadpool(
+            lambda: db.query(Issue).filter(Issue.id == issue_id).first()
+        )
</file context>
