
Optimize app with vector db integration #1

Open
hawkh wants to merge 1 commit into main from cursor/optimize-app-with-vector-db-integration-ddf4

Conversation


hawkh (Owner) commented Aug 14, 2025

Refactor and optimize the PDF Chat Application for improved performance, modularity, and scalability, including enhanced vector database management and caching.

The original application suffered from a monolithic architecture, basic vector database usage (local FAISS), lack of caching, and limited scalability. This PR introduces a modular design, multi-level caching, support for various vector databases (Chroma, Pinecone, Qdrant), improved text processing, and comprehensive monitoring, transforming it into a production-ready system.



…vements

Co-authored-by: kommi.avanthi <kommi.avanthi@gmail.com>
@hawkh hawkh self-assigned this Aug 14, 2025
Repository owner deleted a comment from cursor bot Aug 14, 2025
@hawkh hawkh marked this pull request as ready for review August 14, 2025 16:06

devin-ai-integration bot left a comment


Devin Review found 3 potential issues.

View issues and 4 additional flags in Devin Review.


# Generate cache key
cache_key = f"llm_response_{hashlib.md5(question.encode()).hexdigest()}"


🔴 Cache key for LLM responses ignores context documents, causing stale responses

In app_optimized_simple.py, the generate_response method generates a cache key based solely on the question hash:

cache_key = f"llm_response_{hashlib.md5(question.encode()).hexdigest()}"

This ignores the context_docs parameter entirely. When a user asks the same question after uploading new documents, the cached response from the old context will be returned instead of generating a new response based on the updated documents.

Impact: Users will receive incorrect, outdated answers that don't reflect newly uploaded documents, even though the system found new relevant context.

Recommendation: Include a hash of the context documents' content in the cache key, e.g.: cache_key = f"llm_response_{hashlib.md5((question + ''.join(d.page_content for d in context_docs)).encode()).hexdigest()}"
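The recommended fix can be sketched as a small helper. This is a minimal sketch, not the PR's actual code: `Document` here is a hypothetical stand-in for the app's document type (only `page_content` is used), and `make_llm_cache_key` is an illustrative name.

```python
import hashlib


class Document:
    """Hypothetical stand-in for the app's Document type (assumption)."""

    def __init__(self, page_content: str):
        self.page_content = page_content


def make_llm_cache_key(question: str, context_docs: list) -> str:
    """Build a cache key that changes whenever the question OR its context changes."""
    context_blob = "".join(d.page_content for d in context_docs)
    digest = hashlib.md5((question + context_blob).encode()).hexdigest()
    return f"llm_response_{digest}"


# Same question against different context documents yields different keys,
# so a stale cached answer is never served after new uploads.
docs_v1 = [Document("Revenue grew 5% in 2023.")]
docs_v2 = [Document("Revenue grew 12% in 2024.")]
key_a = make_llm_cache_key("How did revenue change?", docs_v1)
key_b = make_llm_cache_key("How did revenue change?", docs_v2)
```

Concatenating every `page_content` keeps the key cheap to compute while still binding the cached answer to the exact retrieval results.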



Comment on lines +213 to +224
cache_key = f"search_{hashlib.md5(query.encode()).hexdigest()}"

# Check cache
cached_result = cache.get(cache_key)
if cached_result:
    return cached_result

# Perform search
results = self.vector_store.similarity_search(query, k=k)

# Cache results
cache.set(cache_key, results)

🔴 Similarity search cache not invalidated when documents are added

In app_optimized_simple.py, the similarity_search method caches results based only on the query:

cache_key = f"search_{hashlib.md5(query.encode()).hexdigest()}"

When add_documents is called to add new documents to the vector store (lines 181-204), the search cache is never invalidated. This means subsequent searches with the same query will return stale cached results that don't include the newly added documents.

Impact: After uploading new PDFs, users may not find relevant content from those documents because the search returns cached results from before the documents were added.

Recommendation: Either invalidate the search cache when documents are added (call cache.clear() or selectively clear search-related keys in add_documents), or include a version/hash of the vector store state in the cache key.



    'args': args,
    'kwargs': sorted(kwargs.items())
}
key_string = json.dumps(key_data, sort_keys=True)

🔴 Cache key generation crashes on non-JSON-serializable objects like Document

In utils/cache.py, the _generate_key method uses json.dumps without a default parameter:

key_string = json.dumps(key_data, sort_keys=True)

The @cached decorator is applied to LLMService.generate_response (in services/llm_service.py:109) which takes context_docs: List[Document] as an argument. When the decorator tries to generate a cache key, json.dumps will raise a TypeError because Document objects are not JSON serializable.

Impact: The generate_response method will crash with a TypeError: Object of type Document is not JSON serializable when called, making the LLM service unusable in the main app.py application.

Recommendation: Add a default=str parameter to json.dumps, or implement custom serialization logic that handles non-serializable objects like Document by extracting their content hash.
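The `default=str` fix can be sketched like this. It is a minimal illustration, not the repo's actual `_generate_key`; the function name `generate_cache_key` is chosen here for the example.

```python
import hashlib
import json


def generate_cache_key(func_name, args, kwargs):
    """Key builder that survives non-JSON-serializable arguments
    (such as Document objects) via json.dumps(..., default=str)."""
    key_data = {
        'function': func_name,
        'args': args,
        'kwargs': sorted(kwargs.items()),
    }
    # default=str stringifies anything json can't encode natively. Note this
    # relies on the object's str() being stable; for Documents, hashing
    # page_content directly would be more robust than the generic fallback.
    key_string = json.dumps(key_data, sort_keys=True, default=str)
    return hashlib.md5(key_string.encode()).hexdigest()
```

Caveat: the default `object.__str__` embeds the memory address, so keys built from objects without a content-based `__str__` are only stable within a single process; the content-hash approach from the first review comment avoids that.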


