Conversation
```python
# Generate cache key
cache_key = f"llm_response_{hashlib.md5(question.encode()).hexdigest()}"
```
🔴 Cache key for LLM responses ignores context documents, causing stale responses
In app_optimized_simple.py, the generate_response method generates a cache key based solely on the question hash:
`cache_key = f"llm_response_{hashlib.md5(question.encode()).hexdigest()}"`

This ignores the `context_docs` parameter entirely. When a user asks the same question after uploading new documents, the cached response from the old context is returned instead of a new response generated from the updated documents.
Impact: Users will receive incorrect, outdated answers that don't reflect newly uploaded documents, even though the system found new relevant context.
Recommendation: Include a hash of the context documents' content in the cache key, e.g.:

`cache_key = f"llm_response_{hashlib.md5((question + ''.join(d.page_content for d in context_docs)).encode()).hexdigest()}"`
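One way to implement the recommendation is a small helper that folds the retrieved documents' content into the digest. This is a sketch, not code from the PR: `make_llm_cache_key` and the minimal `Doc` stand-in are hypothetical names (the real app presumably uses LangChain's `Document`, which also exposes `page_content`).

```python
import hashlib

class Doc:
    """Minimal stand-in for a retrieved document (hypothetical)."""
    def __init__(self, page_content):
        self.page_content = page_content

def make_llm_cache_key(question, context_docs):
    # Fold the context contents into the digest so the key changes
    # whenever the retrieved documents change, not just the question.
    context_blob = "".join(d.page_content for d in context_docs)
    digest = hashlib.md5((question + context_blob).encode()).hexdigest()
    return f"llm_response_{digest}"
```

The same question asked against different context now yields a different key, so stale responses are never served after new documents are uploaded.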
```python
cache_key = f"search_{hashlib.md5(query.encode()).hexdigest()}"

# Check cache
cached_result = cache.get(cache_key)
if cached_result:
    return cached_result

# Perform search
results = self.vector_store.similarity_search(query, k=k)

# Cache results
cache.set(cache_key, results)
```
🔴 Similarity search cache not invalidated when documents are added
In app_optimized_simple.py, the similarity_search method caches results based only on the query:
`cache_key = f"search_{hashlib.md5(query.encode()).hexdigest()}"`

When add_documents is called to add new documents to the vector store (lines 181-204), the search cache is never invalidated. Subsequent searches with the same query therefore return stale cached results that do not include the newly added documents.
Impact: After uploading new PDFs, users may not find relevant content from those documents because the search returns cached results from before the documents were added.
Recommendation: Either invalidate the search cache when documents are added (call cache.clear() or selectively clear search-related keys in add_documents), or include a version/hash of the vector store state in the cache key.
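The second option in the recommendation, folding a version of the vector store state into the key, can be sketched as follows. All names here (`VersionedSearchCache`, `invalidate`) are illustrative, not from the PR; bumping the version from `add_documents` makes every old search key unreachable without touching the stored entries.

```python
import hashlib

class VersionedSearchCache:
    """Sketch: fold a version counter into every search cache key and
    bump it whenever documents are added, so stale entries are never hit."""

    def __init__(self):
        self._store = {}
        self._version = 0

    def _key(self, query):
        return f"search_v{self._version}_{hashlib.md5(query.encode()).hexdigest()}"

    def get(self, query):
        return self._store.get(self._key(query))

    def set(self, query, results):
        self._store[self._key(query)] = results

    def invalidate(self):
        # Called from add_documents(): old keys become unreachable.
        self._version += 1
```

Compared with `cache.clear()`, this keeps unrelated (non-search) cache entries intact and costs nothing at invalidation time; the trade-off is that superseded entries linger until evicted.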
```python
    'args': args,
    'kwargs': sorted(kwargs.items())
}
key_string = json.dumps(key_data, sort_keys=True)
```
🔴 Cache key generation crashes on non-JSON-serializable objects like Document
In utils/cache.py, the _generate_key method uses json.dumps without a default parameter:
`key_string = json.dumps(key_data, sort_keys=True)`

The @cached decorator is applied to LLMService.generate_response (in services/llm_service.py:109), which takes context_docs: List[Document] as an argument. When the decorator tries to generate a cache key, json.dumps raises a TypeError because Document objects are not JSON serializable.
Impact: Every call to generate_response crashes with `TypeError: Object of type Document is not JSON serializable`, making the LLM service unusable in the main app.py application.
Recommendation: Add a default=str parameter to json.dumps, or implement custom serialization logic that handles non-serializable objects like Document by extracting their content hash.
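A sketch of the `default=str` fix, with a caveat: it only yields stable, content-dependent keys if `str()` on the argument reflects its content. The `Doc` class and `generate_key` signature below are hypothetical stand-ins; an object without a content-based `__str__` would serialize to its default repr, which embeds a memory address and changes between runs.

```python
import hashlib
import json

class Doc:
    """Hypothetical stand-in; the content-based __str__ is what makes
    default=str produce stable, content-dependent cache keys."""
    def __init__(self, page_content):
        self.page_content = page_content
    def __str__(self):
        return self.page_content

def generate_key(func_name, args, kwargs):
    # default=str keeps json.dumps from raising TypeError on objects
    # like Document; they are serialized via str() instead.
    key_data = {
        "func": func_name,
        "args": args,
        "kwargs": sorted(kwargs.items()),
    }
    key_string = json.dumps(key_data, sort_keys=True, default=str)
    return hashlib.md5(key_string.encode()).hexdigest()
```

Extracting and hashing `page_content` explicitly (the comment's alternative recommendation) is the more robust option, since it does not depend on how each object happens to stringify.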
Refactor and optimize the PDF Chat Application for improved performance, modularity, and scalability, including enhanced vector database management and caching.
The original application suffered from a monolithic architecture, basic vector-database usage (local FAISS), a lack of caching, and limited scalability. This PR introduces a modular design, multi-level caching, support for multiple vector databases (Chroma, Pinecone, Qdrant), improved text processing, and comprehensive monitoring, transforming the app into a production-ready system.