fix(embedding): generic fix for embedding_dimensions parameter validation, resolving HTTP 400 errors from OpenAI-compatible APIs such as SiliconFlow #8807
Rat0323 wants to merge 3 commits into
Code Review
This pull request simplifies the _embedding_kwargs method in openai_embedding_source.py by removing a temporary workaround for the SiliconFlow provider and ensuring that the dimensions parameter is only set if its value is greater than zero. Feedback suggests checking for None or empty string values for embedding_dimensions before attempting to convert it to an integer, which prevents unnecessary warning logs from flooding the console when the field is left blank in the WebUI.
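The blank-value handling suggested in this review could look roughly like the sketch below. It is not the code from the pull request: `parse_embedding_dimensions` and the logger name are hypothetical, and the only behavior assumed is what the review describes (skip `None` or empty values quietly, warn only on values that cannot be converted to an integer).

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("embedding_sketch")


def parse_embedding_dimensions(raw):
    """Return a positive int, or None when the setting should be ignored.

    Hypothetical helper; the real project may structure this differently.
    """
    # Blank values (None, "", whitespace) mean "not configured": skip quietly,
    # so a field left empty in the WebUI does not produce warning logs.
    if raw is None or (isinstance(raw, str) and not raw.strip()):
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Only genuinely malformed input is worth a warning.
        logger.warning("embedding_dimensions %r is not an integer; ignoring it", raw)
        return None
    # Zero and negative values are treated as "do not send `dimensions`".
    return value if value > 0 else None
```

With this shape, a blank field and a `0` both end up as `None`, so the caller only has to check for a value before adding `dimensions` to the request.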
Hey - I've left some high level feedback:
- When `embedding_dimensions` is configured as `0` or a negative value it is now silently ignored; consider logging a low-level warning or info in that case so misconfigurations are easier to spot while still avoiding the SiliconFlow 400s.
- Since the provider-specific SiliconFlow branch is removed, it might be worth adding a short inline comment near the `dim_val > 0` check to clarify that omitting `dimensions` is intentional for OpenAI-compatible providers that reject this parameter entirely.
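One way to act on both points is sketched below. This is an illustration under assumptions, not the project's actual `_embedding_kwargs`: the function name `build_embedding_kwargs`, the logger name, and the log wording are all hypothetical.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("embedding_sketch")


def build_embedding_kwargs(model, dim_val):
    """Build request kwargs, attaching `dimensions` only for positive values.

    Hypothetical stand-in for the real `_embedding_kwargs` method.
    """
    kwargs = {"model": model}
    if dim_val > 0:
        kwargs["dimensions"] = dim_val
    else:
        # Intentionally omit `dimensions`: some OpenAI-compatible providers
        # reject the parameter (for example, HTTP 400 on `dimensions: 0`).
        logger.info(
            "embedding_dimensions=%d is not positive; omitting `dimensions`",
            dim_val,
        )
    return kwargs
```

Logging at info level keeps a `0` or negative setting visible in the logs without turning it into an error.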
ec362e4 to 85a3543
85a3543 to 2b52c22
2b52c22 to 3c4ac17
- Allow automatic dimension inference when configured_dim is 0 to prevent blocking new KB creation.
- Filter out dimensions parameter in OpenAI embedding requests when value is 0 to improve compatibility with providers like SiliconFlow.
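📝 PR Summary
A generic fix for the compatibility problem and the knowledge base creation blockage triggered when `embedding_dimensions` is 0 (or left at its default). It removes the hard-coding aimed at a single provider and improves general compatibility with OpenAI-style APIs (such as SiliconFlow and vLLM).
🐛 Root cause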
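- Requests carried `dimensions: 0`, so platforms that strictly validate this parameter (such as SiliconFlow) returned HTTP 400.
- `knowledge_base_service.py` strictly checked `1024 != config_dim(0)`, so a valid model was wrongly rejected and knowledge base creation failed.
🛠️ Fix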
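- `openai_embedding_source.py` (API layer): `dimensions` is injected into `kwargs` only when `embedding_dimensions > 0`. The domain-specific interception for `api.siliconflow.cn` is removed in favor of generic handling. Official endpoints that support `dimensions` are unaffected, and endpoints that reject invalid dimensions are avoided automatically.
- `knowledge_base_service.py` (service layer): the check becomes `if configured_dim != 0 and actual_dim != configured_dim:`. When the configured value is 0, the real dimension returned by probing the API is trusted and used for Faiss initialization; creation is blocked only when the user has explicitly configured a wrong dimension.
📊 Test verification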
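Test model: SiliconFlow (`BAAI/bge-m3`). Results by configured `embedding_dimensions`:
- `0` / left blank: `dimensions`
- `1024` (positive case): `dimensions: 1024`; `1024 == 1024`, validation passes
- `768` (negative case): `dimensions: 768`; `1024 != 768`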
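
The service-layer condition quoted in the fix can be exercised against the three test cases above with a small sketch. Only the `if` condition comes from the PR description; the function name `resolve_index_dim`, the exception type, and the message are assumptions made for illustration.

```python
def resolve_index_dim(configured_dim, actual_dim):
    """Pick the dimension for the vector index, or raise on a real mismatch.

    Hypothetical helper; only the `if` condition is taken from the PR text.
    """
    # 0 means "not configured": trust the dimension probed from the API.
    if configured_dim != 0 and actual_dim != configured_dim:
        raise ValueError(
            f"embedding dimension mismatch: configured {configured_dim}, "
            f"model returned {actual_dim}"
        )
    return actual_dim


# The three configurations from the test section, with a model that returns 1024.
for configured in (0, 1024, 768):
    try:
        print(configured, "->", resolve_index_dim(configured, 1024))
    except ValueError as exc:
        print(configured, "-> rejected:", exc)
```

Under these assumptions, `0` and `1024` both resolve to an index dimension of 1024, while `768` is rejected as an explicit misconfiguration.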