[Fix] normalize multimodal message content #908

Merged: dingyi222666 merged 2 commits into v1-dev from fix/read-chat-message-pr on Jun 10, 2026
@dingyi222666

Member

This PR normalizes multimodal message content handling so that core message reading and the multimodal service neither duplicate nor drop media content.

Bug fixes

  • Keep forward-message ids in middleware-local state instead of leaking internal state through message kwargs.
  • Always add text placeholders for image, file, video, and audio attachments while appending typed multimodal content when the target model supports it.
  • Preserve fallback text when a model does not support a media type instead of dropping the attachment context.
  • Track handled audio elements with local weak state so SST and native audio handling do not duplicate each other.
  • Track inline file total size with middleware-local state while preserving the existing metadata value.
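
A minimal sketch of the middleware-local state idea described above, assuming simplified stand-in types. The names `forwardHistory`, `handledAudio`, and `additional_kwargs.forwardMessageIds` come from this PR's review walkthrough; the `ChatMessage` and `AudioElement` shapes and the three helper functions are hypothetical and only illustrate the pattern, not the actual middleware code.

```typescript
// Hypothetical stand-ins for the real message and audio element types.
interface ChatMessage {
    content: string
    additional_kwargs: Record<string, unknown>
}
interface AudioElement {
    src: string
}

// Module-level weak state: an entry is released together with its key object,
// so nothing internal has to travel inside the message itself.
const forwardHistory = new WeakMap<ChatMessage, string[]>()
const handledAudio = new WeakSet<AudioElement>()

// Record a forwarded-message id without touching message kwargs.
function recordForwardId(message: ChatMessage, id: string): void {
    const ids = forwardHistory.get(message) ?? []
    ids.push(id)
    forwardHistory.set(message, ids)
}

// Returns true only for the first caller, so a speech-to-text path and a
// native audio path cannot both process the same element.
function claimAudio(element: AudioElement): boolean {
    if (handledAudio.has(element)) return false
    handledAudio.add(element)
    return true
}

// Copy the tracked ids onto the message once, at the final transform step.
function exportForwardIds(message: ChatMessage): void {
    const ids = forwardHistory.get(message)
    if (ids !== undefined && ids.length > 0) {
        message.additional_kwargs.forwardMessageIds = [...ids]
    }
}
```

Because the keys are held weakly, no explicit cleanup is needed once a message or element is no longer referenced.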

Other Changes

  • Let the multimodal audio plugin inject converted MP3 content only for non-native audio formats.
  • Let the multimodal image plugin handle GIF frame injection or image description without adding duplicate base64 image parts.
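
The audio rule above can be sketched as a small decision function. The name `NATIVE_AUDIO_MIMES` appears in the review walkthrough, but its contents here are an assumption, and `planAudioInjection` and the `AudioPlan` type are hypothetical names used only for illustration.

```typescript
// Assumed contents; the real NATIVE_AUDIO_MIMES set in the plugin may differ.
const NATIVE_AUDIO_MIMES = new Set<string>(['audio/mpeg', 'audio/wav'])

type AudioPlan =
    | { action: 'keep-native'; mime: string }
    | { action: 'convert-to-mp3'; targetMime: 'audio/mpeg' }

// Decide what to do with a downloaded audio buffer once its MIME type has
// been detected. An undetected type is treated as non-native.
function planAudioInjection(detectedMime: string | undefined): AudioPlan {
    if (detectedMime !== undefined && NATIVE_AUDIO_MIMES.has(detectedMime)) {
        // Natively supported formats are left to the core middleware.
        return { action: 'keep-native', mime: detectedMime }
    }
    // Everything else is transcoded and injected as MP3.
    return { action: 'convert-to-mp3', targetMime: 'audio/mpeg' }
}
```

Keeping the decision separate from the transcoding step makes the native and converted paths mutually exclusive, which is what prevents duplicate audio parts.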

Validation

  • yarn lint-fix (0 errors, existing max-len warnings only)

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2de3c5bf-81d4-4fe3-938c-b2a5206665ae

📥 Commits

Reviewing files that changed from the base of the PR and between 244f1a8 and 91cf176.

⛔ Files ignored due to path filters (1)
  • packages/service-multimodal/package.json is excluded by !**/*.json
📒 Files selected for processing (4)
  • packages/core/src/middlewares/chat/read_chat_message.ts
  • packages/service-multimodal/src/plugins/audio.ts
  • packages/service-multimodal/src/plugins/read_files.ts
  • packages/service-multimodal/src/utils.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/service-multimodal/src/plugins/audio.ts
  • packages/core/src/middlewares/chat/read_chat_message.ts

Walkthrough

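The middleware and the multimodal plugins refactor how media elements are handled: they introduce forwardHistory/fileSizes/handledAudio to track state across stages, unify the capability checks and the processing entry point (handleFileElement) for images, files, video, and audio, and adjust MIME detection and injection semantics on the plugin side (audio is uniformly converted to MP3; images go through description injection or a typed image_url part).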

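Changes

Media handling refactor

| Layer / File(s) | Summary |
| --- | --- |
| State management and forward tracking: packages/core/src/middlewares/chat/read_chat_message.ts | Module-level WeakMap/WeakSet instances (forwardHistory, fileSizes, handledAudio) are initialized to track state across the middleware chain; during transform, ids are read from forwardHistory and written to additional_kwargs.forwardMessageIds. |
| Image handling and capability checks: packages/core/src/middlewares/chat/read_chat_message.ts, packages/service-multimodal/src/plugins/image.ts | img handling introduces modelSupportsElement and setElementUrl; a hash is computed for the image URL; a GIF is written to a temporary file when storage is available and skips assembly; a non-GIF image is read through readImage into a buffer/base64 string, and depending on model capability either fallback text is used or a typed image_url part is appended; the native branch of the image plugin now calls describeAndInject and returns false. |
| Unified file/video/audio handling and size accumulation: packages/core/src/middlewares/chat/read_chat_message.ts | handleFileElement gains a fileSizes parameter to accumulate the total inline base64 size within a single message; internally it derives the MIME type from the downloaded content, decides between writing to disk and inlining, sets file/filename/chatluna_file_url on the element, and validates the MIME type and size limits against the platform fileConfig, returning skip text when a limit is exceeded. |
| Multimodal plugins and MIME detection changes: packages/service-multimodal/src/plugins/audio.ts, packages/service-multimodal/src/plugins/read_files.ts, packages/service-multimodal/src/utils.ts | After download, the audio plugin detects the MIME type from the buffer: for a natively supported type (NATIVE_AUDIO_MIMES) it returns false; otherwise it transcodes to MP3 and injects a fixed audio/mpeg dataUrl (also returning false). read_files and utils add buffer-based MIME detection using file-type (detectFileType), and the fallback logic of detectAudioMimeType now prefers the buffer detection result. |
| Content assembly and helpers: packages/core/src/middlewares/chat/read_chat_message.ts | Adds the helpers addTextPart, addFileSize, readImage, modelSupportsElement, and setElementUrl; removes the old oldImageRead and the message-based getFileTotalSize; toContentParts and the content assembly logic are adjusted accordingly. |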
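
A rough sketch of the per-message size accumulation summarized above. The names `fileSizes` and `addFileSize` appear in the walkthrough, but the signature used here, the `Message` shape, the `base64Size` helper, and the limit value in the usage are assumptions made for illustration only.

```typescript
// Hypothetical stand-in for the real message object used as the WeakMap key.
interface Message {
    id: string
}

// Running total of inlined base64 bytes, tracked per message.
const fileSizes = new WeakMap<Message, number>()

// Length of the base64 encoding (with padding) of a payload of `bytes` bytes.
function base64Size(bytes: number): number {
    return Math.ceil(bytes / 3) * 4
}

// Try to reserve room for another inline file. Returns false, leaving the
// total unchanged, when the addition would exceed the limit; the caller can
// then emit skip text instead of inlining the file.
function addFileSize(message: Message, encodedBytes: number, maxTotalBytes: number): boolean {
    const next = (fileSizes.get(message) ?? 0) + encodedBytes
    if (next > maxTotalBytes) return false
    fileSizes.set(message, next)
    return true
}
```

For example, with a limit of 100 encoded bytes, a 45-byte file (60 bytes as base64) fits, a following 36-byte file (48 bytes as base64) is rejected, and a 30-byte file (40 bytes as base64) still fits afterwards because the rejected file was never counted.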

Sequence Diagram

sequenceDiagram
  participant Client
  participant read_chat_message as read_chat_message.ts
  participant plugin_audio as plugins/audio.ts
  participant detect as detectFileType/fromBuffer
  participant Storage as TempStorage
  Client->>read_chat_message: pass message and elements
  read_chat_message->>plugin_audio: pass buffer to detect/process audio
  plugin_audio->>detect: detectFileType(buffer)
  alt NATIVE_AUDIO_MIMES
    plugin_audio->>read_chat_message: return false (no audio_url injected)
  else
    plugin_audio->>read_chat_message: transcode to MP3, inject audio/mpeg dataUrl, return false
  end
  read_chat_message->>Storage: write GIF/large files to disk and call setElementUrl
  read_chat_message->>read_chat_message: call handleFileElement and update the fileSizes WeakMap

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

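Possibly related PRs

  • ChatLunaLab/chatluna#572: Directly overlaps with this PR's image/GIF handling changes (readImage and GIF behavior changes).
  • ChatLunaLab/chatluna#714: Directly related to this PR's forward-history and forward-id injection logic on the same middleware path.
  • ChatLunaLab/chatluna#747: Overlaps in code paths with this PR's changes to the inline file size limit and handleFileElement.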

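I am a little rabbit writing changes,
WeakMap tracking until the dawn,
images written to disk, slow and steady,
audio uniformly converted to MP3,
a new chapter for content assembly. 🐇✨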

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 46.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title '[Fix] normalize multimodal message content' clearly and concisely describes the main objective of the changeset, which focuses on normalizing how multimodal content is handled across the codebase. |
| Description check | ✅ Passed | The PR description is comprehensive and directly related to the changeset, detailing specific bug fixes, changes, and validation steps for normalizing multimodal message content handling. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request refactors media and file handling in the chat message middleware and multimodal plugins, introducing weak maps/sets for state tracking, unifying model capability checks, and streamlining file processing. The review feedback highlights several key improvement opportunities: preventing massive base64 data URLs from polluting the text content when storage is unavailable, adding a guard for missing image URLs to avoid unsafe hashing, ensuring GIF images have fallback text placeholders when multimodal services are disabled, and caching downloaded image data on elements to prevent duplicate network requests.
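
The review points above can be illustrated with a small sketch: guard against a missing image URL, always emit a short text placeholder (never a full base64 data URL), append a typed image_url part only when the model supports images, and load each URL at most once. All names here (`ContentPart`, `ImageElement`, `readImageOnce`, `imageToParts`) are hypothetical, the placeholder wording is an assumption, and the loader is kept synchronous for brevity although a real download would be asynchronous.

```typescript
type ContentPart =
    | { type: 'text'; text: string }
    | { type: 'image_url'; image_url: { url: string } }

// Hypothetical stand-in for an image element; src may be missing.
interface ImageElement {
    src?: string
}

// Cache of already loaded images, keyed by source URL.
const loadedImages = new Map<string, string>()

// Load an image at most once per URL; repeated calls reuse the cached result.
function readImageOnce(url: string, load: (url: string) => string): string {
    const cached = loadedImages.get(url)
    if (cached !== undefined) return cached
    const dataUrl = load(url)
    loadedImages.set(url, dataUrl)
    return dataUrl
}

function imageToParts(
    element: ImageElement,
    supportsImages: boolean,
    load: (url: string) => string
): ContentPart[] {
    // Guard: without a URL there is nothing to hash, load, or describe.
    if (!element.src) return []
    // Keep the placeholder short so inline data URLs never land in the text.
    const label = element.src.startsWith('data:') ? '[image]' : `[image:${element.src}]`
    const parts: ContentPart[] = [{ type: 'text', text: label }]
    if (supportsImages) {
        const url = readImageOnce(element.src, load)
        parts.push({ type: 'image_url', image_url: { url } })
    }
    return parts
}
```

When the model does not support images, the text placeholder is the only part returned, so the attachment context is preserved rather than dropped.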

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread packages/core/src/middlewares/chat/read_chat_message.ts
Comment thread packages/core/src/middlewares/chat/read_chat_message.ts
Comment thread packages/core/src/middlewares/chat/read_chat_message.ts
Comment thread packages/core/src/middlewares/chat/read_chat_message.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 244f1a82b2

Comment thread packages/core/src/middlewares/chat/read_chat_message.ts Outdated
@dingyi222666 dingyi222666 merged commit fc34282 into v1-dev Jun 10, 2026
5 checks passed
@dingyi222666 dingyi222666 deleted the fix/read-chat-message-pr branch June 10, 2026 16:35