
fix issue#931: multimodal and peerMentions #945

Open
LUOSENGWA wants to merge 3 commits into agentscope-ai:main from LUOSENGWA:fix/multimodal-peermentions-runtime


@LUOSENGWA


Summary
Fix four bugs in the Controller's openclaw.json generation pipeline and one in copaw_worker's bridge.py. These bugs collectively prevented image handling and cross-worker @mention routing in HiClaw agent teams.

Changes (10 files, ~100 lines)
Bug 1: Multimodal — two-layer fix
Controller generator.go: Infer supports_multimodal / supports_image from ModelSpec.Input and write to agents.defaults
copaw_worker bridge.py: Propagate input capability flags (supports_image, supports_video, supports_multimodal) when constructing provider models. Previously, bridge.py discarded input entirely, so supports_multimodal stayed null, evaluated to False, and all images were stripped. This matches the analysis in Issue #931.
Bug 2: peerMentions — five-layer data chain
types.go → generator.go → team_controller.go → member_reconcile.go → deployer.go
Injects channels.matrix.peerMentions into openclaw.json, enabling cross-room Matrix @mention delivery.
Peer calculation: Leader peers = all team workers; Worker peers = leader + all other workers.
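
The peer rule above can be illustrated with a small Python sketch. This is not the Controller's Go code: the function name `compute_peers` and the example member names are hypothetical, and only the rule itself (leader sees all workers, each worker sees the leader plus the other workers) comes from the description.

```python
def compute_peers(leader: str, workers: list[str]) -> dict[str, list[str]]:
    """Hypothetical helper: map each team member to its peerMentions list."""
    # Leader peers = all team workers.
    peers = {leader: list(workers)}
    # Worker peers = leader + all other workers (never the worker itself).
    for worker in workers:
        peers[worker] = [leader] + [w for w in workers if w != worker]
    return peers


# Example team with made-up names.
team = compute_peers("lead", ["alice", "bob", "carol"])
```

In this sketch no member ever appears in its own peer list, which is the property the rule implies.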
Bug 3: Runtime fallback
The ResolveRuntime() default is corrected from RuntimeOpenClaw to RuntimeCopaw (leader path only). The Worker CRD spec is stored separately and is unaffected. The RuntimeOpenClaw constant is preserved for users who set runtime: openclaw explicitly.
Bug 4
Leader model propagation chain verified correct — no code change needed.
Testing
go test -v ./internal/agentconfig/ → 22/22 PASS
go test -v ./internal/backend/ -run TestDockerCreateRuntime → 6/6 PASS
go vet ./... → zero warnings
E2E verified on Node1 production with 20/20 Workers:

Chrome icon recognition (image → text) via Element Web DM ✅
Cross-worker @mention routing ✅

@LUOSENGWA

Author

Bug 1 data flow (two-layer fix):

openclaw.json models.providers[].models[].input: ["text","image"] ✅ Controller
├─ Fix #1: generator.go → agents.defaults.supports_multimodal (QwenPaw reads)
└─ Fix #2: bridge.py → providers.json → extra_models[].supports_multimodal (copaw reads)
Previously bridge.py (line 780) discarded input entirely — both layers needed fixing.
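
The bridge.py side of this flow could look roughly like the Python sketch below. It is an illustration under assumptions, not the actual patch: the helper name `build_provider_model`, the dict shape, and the choice to omit the flags for text-only models are all hypothetical. Only the `input` list and the three `supports_*` flag names come from this PR.

```python
from typing import Any


def build_provider_model(model: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: copy id/name and derive capability flags from `input`."""
    inputs = set(model.get("input") or [])
    entry: dict[str, Any] = {"id": model["id"], "name": model.get("name", model["id"])}
    supports_image = "image" in inputs
    supports_video = "video" in inputs
    # Assumption: flags are written only when some non-text input is declared.
    if supports_image or supports_video:
        entry["supports_image"] = supports_image
        entry["supports_video"] = supports_video
        entry["supports_multimodal"] = True
    return entry


vision = build_provider_model({"id": "vision-model", "input": ["text", "image"]})
text_only = build_provider_model({"id": "text-model", "input": ["text"]})

# A missing flag reads back as None, and bool(None) is False.
# That is the silent failure mode described above when `input` was discarded.
missing_flag_is_falsy = bool(text_only.get("supports_multimodal")) is False
```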

Related: Issue #931 (opened before fix, same root cause analysis)

@LUOSENGWA

Author

Full report in zh-cn
HiClaw-Controller四Bug修复报告.md

@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

❌ Integration Tests Failed (llm-interaction-2 / mgr=openclaw / wk=openclaw)

Commit: e14afd9
Workflow run: #1356

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

❌ Integration Tests Failed (controller-cr / mgr=openclaw / wk=openclaw)

Commit: 82d41dd
Workflow run: #1357

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

❌ Integration Tests Failed (llm-interaction / mgr=openclaw / wk=openclaw)

Commit: e14afd9
Workflow run: #1356

Test Results
No test output captured.
Debug Log (tail)
No debug logs available.

📦 Download full debug logs & test artifacts

Bug 1 (multimodal): Two-layer fix
- Controller generator.go: infer supports_multimodal/supports_image from
  model spec Input field and write to agents.defaults in openclaw.json.
- copaw_worker bridge.py: propagate input capability flags
  (supports_image/supports_video/supports_multimodal) when constructing
  provider models, so copaw react_agent can read them.

Bug 2 (peerMentions): Five-layer data chain
- types.go → generator.go → team_controller.go → member_reconcile.go →
  deployer.go. Injects peerMentions into channels.matrix of openclaw.json,
  enabling cross-room Matrix @mention routing for team workers.

Note: The original Bug 3 (ResolveRuntime fallback) was a misdiagnosis.
leaderWorkerSpec() already hardcodes Runtime="copaw" for team leaders.
The global default of RuntimeOpenClaw is correct — reverting.

Tests: all passing, go vet clean.
Files: 8 (7 Go + 1 Python), +244/-2 lines.
@LUOSENGWA LUOSENGWA force-pushed the fix/multimodal-peermentions-runtime branch from fc76b4e to e14afd9 Compare June 18, 2026 12:33
@LUOSENGWA LUOSENGWA changed the title fix: multimodal, peerMentions, and runtime defaults (4 bugs) fix: multimodal and peerMentions Jun 18, 2026
@LUOSENGWA

LUOSENGWA commented Jun 18, 2026

Author

PR Side Notes: Multimodal & peerMentions fixes

Repository: agentscope-ai/HiClaw | Affected version: Controller 1.1.2
Changes: 8 files (7 Go + 1 Python), +244/-2, ~93 lines of business logic
Tests: 22/22 agentconfig + 6/6 backend PASS, go vet zero warnings
Commit: e14afd9


Bug 1: Multimodal input silently dropped (supports_multimodal missing) 🔴

Problem: Images from all channels (QQ/Element/Matrix) are silently stripped by the framework.

Root cause: Config generation has two gaps:

| # | Gap | File | Description |
|---|-----|------|-------------|
| 1 | agents.defaults.supports_multimodal missing | Controller generator.go | openclaw.json declares input: ["text","image"], but the capability is not propagated to agents.defaults |
| 2 | Provider-layer multimodal=null | copaw_worker bridge.py:783 | When constructing provider models, only id/name are copied and the input field is discarded, so copaw react_agent.py:685 evaluates bool(null)=False |

Fix:

  • generator.go: infer from defaultModelSpec().Input and write supports_multimodal / supports_image
  • bridge.py: infer supports_image / supports_video / supports_multimodal from input when constructing the models list

Tests: 2 new tests (vision model → true, text-only → field absent). Image E2E: Chrome icon recognized correctly on Node1 ✅


Bug 2: peerMentions not generated 🟡

Problem: Leader↔Worker @mentions within a Team fail (3 production incidents).

Root cause: The five-layer data chain in config generation is broken: the Controller holds peerMentions-related data internally but does not propagate it to openclaw.json.

Fix: Complete the five-layer data chain.

The multimodal inference in GenerateOpenClawConfig() was applying
supports_multimodal to ALL configs, including the Manager's openclaw.json.
When the Manager uses the openclaw runtime, the gateway fails to start
(unrecognized field), causing CI integration tests to time out after 300s.

Guard the inference behind req.WorkerName != "manager": the Manager
does not receive or process images, and the openclaw gateway does
not understand agents.defaults.supports_multimodal.
@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

📊 CI Metrics Report

Summary

| Metric | Current | Baseline | Change |
|--------|---------|----------|--------|
| LLM Calls | 84 | 81 | +3 ↑ +3.7% |
| Input Tokens | 2853866 | 2803871 | +49995 ↑ +1.8% |
| Output Tokens | 18926 | 16791 | +2135 ↑ +12.7% |
| Total Tokens | 2872792 | 2820662 | +52130 ↑ +1.8% |

By Role

| Role | Metric | Current | Baseline | Change |
|------|--------|---------|----------|--------|
| 🧠 Manager | LLM Calls | 68 | 68 | 0 — 0% |
| | Input Tokens | 2477842 | 2502214 | -24372 ↓ -1.0% |
| | Output Tokens | 15400 | 13725 | +1675 ↑ +12.2% |
| | Total Tokens | 2493242 | 2515939 | -22697 ↓ -0.9% |
| 🔧 Workers | LLM Calls | 16 | 13 | +3 ↑ +23.1% |
| | Input Tokens | 376024 | 301657 | +74367 ↑ +24.7% |
| | Output Tokens | 3526 | 3066 | +460 ↑ +15.0% |
| | Total Tokens | 379550 | 304723 | +74827 ↑ +24.6% |

Per-Test Breakdown

| Test | Mgr Calls | Wkr Calls | Δ Calls | Mgr In | Wkr In | Mgr Out | Wkr Out | Δ Tokens | Trend |
|------|-----------|-----------|---------|--------|--------|---------|---------|----------|-------|
| 02-create-worker | 9 | 0 | -3 ↓ -25.0% | 254404 | 0 | 1606 | 0 | -102612 ↓ -28.6% | ✅ improved |
| 03-assign-task | 8 | 5 | -2 ↓ -13.3% | 254268 | 114892 | 1661 | 1000 | -101835 ↓ -21.5% | ✅ improved |
| 04-human-intervene | 14 | 0 | +1 ↑ +7.7% | 395357 | 0 | 2684 | 0 | -34957 ↓ -8.1% | ⚠️ regressed |
| 05-heartbeat | 6 | 0 | -1 ↓ -14.3% | 211929 | 0 | 1372 | 0 | -61951 ↓ -22.5% | ✅ improved |
| 06-multi-worker | 31 | 11 | +8 ↑ +23.5% | 1361884 | 261132 | 8077 | 2526 | +353485 ↑ +27.6% | ⚠️ regressed |

Trends

3 test(s) improved (fewer LLM calls)
⚠️ 2 test(s) regressed (more LLM calls)


Generated by HiClaw CI on 2026-06-18 16:10:51 UTC


📦 Download debug logs & test artifacts

…kerName

v1 guard (WorkerName != "manager") left openclaw Workers polluted —
test-15 failed with 180s no-reply because supports_multimodal silently
breaks openclaw message processing. v2 uses AgentRuntime == "copaw"
so multimodal fields only reach agents that actually run copaw.

- types.go: add AgentRuntime string to WorkerConfigRequest
- deployer.go: pass req.Spec.Runtime for Worker and Manager paths
- generator.go: guard on req.AgentRuntime == "copaw"
- test_multimodal_test.go: add AgentRuntime to vision and text tests
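
The v2 guard can be illustrated with a short Python sketch. The real change lives in generator.go (Go); Python is used here only for illustration. The function name `apply_multimodal_defaults` and the config shape are hypothetical; the `"copaw"` runtime value and the `agents.defaults.supports_multimodal` / `supports_image` field names come from this PR.

```python
import copy
from typing import Any


def apply_multimodal_defaults(
    config: dict[str, Any], model_input: list[str], agent_runtime: str
) -> dict[str, Any]:
    """Hypothetical helper: add multimodal flags only for copaw agents."""
    result = copy.deepcopy(config)  # leave the caller's config untouched
    # Guard on the runtime, not the worker name: openclaw agents
    # (Manager or Worker) must never see these fields.
    if agent_runtime != "copaw":
        return result
    if "image" in model_input:
        defaults = result.setdefault("agents", {}).setdefault("defaults", {})
        defaults["supports_multimodal"] = True
        defaults["supports_image"] = True
    return result


base = {"agents": {"defaults": {}}}
copaw_cfg = apply_multimodal_defaults(base, ["text", "image"], "copaw")
openclaw_cfg = apply_multimodal_defaults(base, ["text", "image"], "openclaw")
```

Keying the guard on the runtime rather than on a name such as "manager" means any agent running openclaw is excluded, which is the gap the v1 guard left open.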
@LUOSENGWA LUOSENGWA changed the title fix: multimodal and peerMentions fix#: multimodal and peerMentions Jun 18, 2026
@LUOSENGWA LUOSENGWA changed the title fix#: multimodal and peerMentions fix #931: multimodal and peerMentions Jun 18, 2026
@LUOSENGWA LUOSENGWA changed the title fix #931: multimodal and peerMentions fix issue#931: multimodal and peerMentions Jun 18, 2026
