[Fix] normalize multimodal message content #908
No actionable comments were generated in the recent review.
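Walkthrough
The middleware and the multimodal plugins refactor media element handling. The change introduces forwardHistory, fileSizes, and handledAudio for cross-stage tracking, unifies the capability checks and the processing entry point (handleFileElement) for images, files, video, and audio, and adjusts the plugin-side MIME detection and injection semantics (audio is uniformly converted to MP3; images go through description injection or a typed image_url).
Changes
Media processing flow refactor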
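The cross-stage tracking described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `MediaElement` shape and the helper names `recordFileSize` and `markAudioHandled` are hypothetical, and only the names `fileSizes` and `handledAudio` come from the walkthrough.

```typescript
// Hypothetical element shape; the project's real element type is not shown on this page.
interface MediaElement {
  type: "img" | "file" | "video" | "audio";
  url?: string;
}

// Both collections are keyed by the element object itself, so an entry
// becomes collectable as soon as the element is no longer referenced.
const fileSizes = new WeakMap<MediaElement, number>();
const handledAudio = new WeakSet<MediaElement>();

// Remember the byte size measured in an earlier stage (assumed helper).
function recordFileSize(element: MediaElement, bytes: number): void {
  fileSizes.set(element, bytes);
}

// Returns true only the first time an audio element is seen, so a later
// stage can skip elements that were already processed (assumed helper).
function markAudioHandled(element: MediaElement): boolean {
  if (handledAudio.has(element)) return false;
  handledAudio.add(element);
  return true;
}
```

Keying on object identity avoids mutating the element and avoids a manual cleanup step, at the cost of losing the state if a stage clones the element.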
Sequence Diagram
sequenceDiagram
    participant Client
    participant read_chat_message as read_chat_message.ts
    participant plugin_audio as plugins/audio.ts
    participant detect as detectFileType/fromBuffer
    participant Storage as TempStorage
    Client->>read_chat_message: Pass in the message and its elements
    read_chat_message->>plugin_audio: Pass the buffer to detect/process audio
    plugin_audio->>detect: detectFileType(buffer)
    alt NATIVE_AUDIO_MIMES
        plugin_audio->>read_chat_message: Return false (do not inject audio_url)
    else
        plugin_audio->>read_chat_message: Transcode to MP3, inject an audio/mpeg dataUrl, return false
    end
    read_chat_message->>Storage: Write GIFs/large files to disk and call setElementUrl
    read_chat_message->>read_chat_message: Call handleFileElement and update the fileSizes WeakMap
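The `alt`/`else` branch in the diagram can be read as a small decision function. The sketch below is an illustration under assumptions: the contents of `NATIVE_AUDIO_MIMES` are not shown on this page, so the two entries here are placeholders, and `planAudio` and `AudioPlan` are hypothetical names. The actual transcoding and data URL construction are left out.

```typescript
// Placeholder contents; the real NATIVE_AUDIO_MIMES list is not shown on this page.
const NATIVE_AUDIO_MIMES: ReadonlySet<string> = new Set(["audio/mpeg", "audio/wav"]);

// Hypothetical result type describing which branch of the diagram applies.
type AudioPlan =
  | { kind: "skip-injection"; mime: string }
  | { kind: "transcode"; from: string; to: "audio/mpeg" };

// Decide what to do with an audio buffer given its detected MIME type.
// An undetected type is treated as non-native here, which is an assumption.
function planAudio(detectedMime: string | undefined): AudioPlan {
  const mime = detectedMime ?? "application/octet-stream";
  if (NATIVE_AUDIO_MIMES.has(mime)) {
    // "alt" branch: no audio_url is injected.
    return { kind: "skip-injection", mime };
  }
  // "else" branch: transcode to MP3 and inject as audio/mpeg.
  return { kind: "transcode", from: mime, to: "audio/mpeg" };
}
```

Returning a plan rather than performing the work keeps the branch logic testable without an audio encoder.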
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
Code Review
This pull request refactors media and file handling in the chat message middleware and multimodal plugins, introducing weak maps/sets for state tracking, unifying model capability checks, and streamlining file processing. The review feedback highlights several key improvement opportunities: preventing massive base64 data URLs from polluting the text content when storage is unavailable, adding a guard for missing image URLs to avoid unsafe hashing, ensuring GIF images have fallback text placeholders when multimodal services are disabled, and caching downloaded image data on elements to prevent duplicate network requests.
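Three of the suggestions in this feedback (guarding against a missing image URL, caching downloaded data on the element, and keeping base64 data URLs out of text content) can be sketched together. This is a hedged illustration only: `ImageElement`, `Fetcher`, `loadImage`, and `textFallback` are hypothetical names, and the PR's actual fix may look different.

```typescript
// Hypothetical element shape; the project's real type is not shown on this page.
interface ImageElement {
  url?: string;
}

// Injected download function so the sketch does not depend on a network API.
type Fetcher = (url: string) => Promise<Uint8Array>;

// Cache of in-flight or finished downloads, keyed by element identity.
// Note: a rejected promise stays cached in this simple version.
const downloadCache = new WeakMap<ImageElement, Promise<Uint8Array>>();

// Returns undefined when the element has no URL, so callers never hash or
// fetch an undefined value; otherwise downloads at most once per element.
function loadImage(element: ImageElement, fetcher: Fetcher): Promise<Uint8Array> | undefined {
  const url = element.url;
  if (!url) return undefined;
  let pending = downloadCache.get(element);
  if (!pending) {
    pending = fetcher(url);
    downloadCache.set(element, pending);
  }
  return pending;
}

// Text placeholder for the message body: a data URL is replaced by a short
// label so a large base64 payload never ends up in the text content.
function textFallback(url: string, label: string): string {
  return url.startsWith("data:") ? `[${label}]` : `[${label}: ${url}]`;
}
```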
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 244f1a82b2
This PR normalizes multimodal message content handling so that core message reading and the multimodal service neither duplicate nor drop media content.
Bug fixes
Other Changes
Validation