docs: Update supported file types, provider management, connector management#1192
Conversation
> If you use different embedding models for different documents, you can create [filters](/knowledge-filters) to separate documents that were embedded with different models.
> If you use multiple embedding models, be aware that similarity search (in **Chat**) can take longer as the agent searches each model's embeddings separately.
I don't think this is true. Please correct me if otherwise.
I think that statement is true.
src/services/search_service.py detects all indexed embedding_model values, generates a query embedding for each, then builds multiple KNN clauses (one per model field) and runs them together via dis_max. So with more embedding models, retrieval work increases (more embedding calls + larger query), which can make Chat retrieval slower, depending on provider latency and OpenSearch load.
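The retrieval flow described above can be sketched roughly as follows. This is a hedged illustration, not OpenRAG's actual code: the `embed()` helper, the per-model field name `embedding__<model>`, and the model names are all assumptions for demonstration; only the overall shape (one KNN clause per indexed embedding model, combined via `dis_max`) comes from the comment above.

```python
def embed(text, model):
    # Placeholder: a real implementation would call the configured
    # provider's embedding endpoint for `model`. Returns a dummy vector.
    return [0.0, 0.0, 0.0]

def build_dis_max_query(query_text, embedding_models, k=10):
    """Build one KNN clause per detected embedding model and wrap them
    in dis_max, so each document is scored against the embedding space
    it was actually indexed with."""
    knn_clauses = []
    for model in embedding_models:
        vector = embed(query_text, model)  # one embedding call per model
        knn_clauses.append({
            "knn": {
                f"embedding__{model}": {  # illustrative per-model field name
                    "vector": vector,
                    "k": k,
                }
            }
        })
    return {"query": {"dis_max": {"queries": knn_clauses}}}

query = build_dis_max_query("what is OpenRAG?", ["model-a", "model-b"])
# Two models -> two embedding calls and two KNN clauses in a single
# query, which is why retrieval cost grows with the number of models.
```

This makes the latency claim concrete: each additional embedding model adds another embedding call (provider round trip) plus another KNN clause for OpenSearch to evaluate.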
Ok thanks! I will add it back.
Caveat: If you remove a provider, then Chat cannot generate an embedding with the appropriate embedding model. What happens in that case? Will Chat ignore documents embedded by the missing provider? Or does it search with a "fallback" embedding model (potentially giving incorrect or missing results because the embedding dimensions/structure don't match)?
This is my understanding, but always good to double-check with @edwinjosechittilappilly.
If a provider is removed, OpenRAG still discovers embedding models from indexed documents and attempts to generate a query embedding for each one. If one of those models is no longer available, retrieval may fail, and Chat may return an error or no useful answer. There is no safe cross-model fallback for those documents, because embedding spaces and dimensions differ, and the system does not consistently fall back to searching only the documents tied to still-available models.
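A defensive pattern one could imagine here (explicitly not current OpenRAG behavior, per the comment above) is to filter the discovered models down to those whose provider is still configured, skipping documents indexed with unavailable models rather than failing or mixing embedding spaces. The provider names and the `MODEL_TO_PROVIDER` mapping below are purely hypothetical.

```python
AVAILABLE_PROVIDERS = {"provider-a"}   # illustrative: provider-b was removed
MODEL_TO_PROVIDER = {                  # hypothetical model -> provider mapping
    "model-a": "provider-a",
    "model-b": "provider-b",
}

def usable_models(indexed_models):
    """Keep only models whose provider can still generate query embeddings.

    Documents indexed with other models are silently skipped instead of
    being searched with a mismatched embedding space (whose dimensions
    would not line up with the stored vectors anyway)."""
    return [m for m in indexed_models
            if MODEL_TO_PROVIDER.get(m) in AVAILABLE_PROVIDERS]

print(usable_models(["model-a", "model-b"]))  # -> ['model-a']
```

The trade-off is that skipped documents become invisible in Chat until their provider is restored or they are re-embedded, which is arguably safer than returning results scored against the wrong embedding space.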
Is it ok to delete this?
Build successful! ✅
> :::tip
> **Fetch latest docs** _only_ gets the latest OpenRAG documentation.
>
> To update docs ingested from cloud storage connectors], see [Configure connectors](/knowledge-connectors).
Suggested change (remove the stray `]`):

> To update docs ingested from cloud storage connectors, see [Configure connectors](/knowledge-connectors).
**Wallgau** left a comment:
Hey @aimurphy! Just synced with the team; @edwinjosechittilappilly is running tests to check file types, so let's hold this PR until we can give you a solid list.
cc @edwinjosechittilappilly @prasanthcaibmcom
Closes #1128