Add HY-OmniWeaving support for HunyuanVideo 1.5 #13289
ifilipis wants to merge 1 commit into Comfy-Org:master
Conversation
📝 Walkthrough
This pull request adds support for HunyuanVideo 1.5 "Omni" models by extending text encoder detection and checkpoint handling for Qwen2.5-VL encoders, adding attention tensor format conversion for HY-OmniWeave checkpoints, and introducing three new conditioning nodes. The changes include model detection logic updates, checkpoint key normalization, attention tensor merging for split Q/K/V formats, and UI/API extensions to expose the new nodes.

🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
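The "attention tensor merging for split Q/K/V formats" mentioned in the walkthrough can be sketched roughly as below. This is a generic illustration under assumed key names (separate q_proj/k_proj/v_proj tensors fused into a single qkv tensor); the actual HY-OmniWeave checkpoint layout and the PR's conversion code may differ.

```python
import torch

def merge_split_qkv(sd, prefix):
    """Merge separate q/k/v projection tensors into one fused qkv tensor.

    Hypothetical key layout: '<prefix>.{q,k,v}_proj.{weight,bias}' in,
    '<prefix>.qkv.{weight,bias}' out. Other keys pass through untouched.
    """
    out = dict(sd)
    for suffix in ("weight", "bias"):
        keys = [f"{prefix}.{p}_proj.{suffix}" for p in ("q", "k", "v")]
        if all(k in out for k in keys):
            # Concatenate along the output dimension: weights become (3 * hidden, hidden).
            out[f"{prefix}.qkv.{suffix}"] = torch.cat([out.pop(k) for k in keys], dim=0)
    return out

sd = {
    "blocks.0.attn.q_proj.weight": torch.zeros(8, 8),
    "blocks.0.attn.k_proj.weight": torch.zeros(8, 8),
    "blocks.0.attn.v_proj.weight": torch.zeros(8, 8),
}
merged = merge_split_qkv(sd, "blocks.0.attn")
print(sorted(merged))                            # ['blocks.0.attn.qkv.weight']
print(merged["blocks.0.attn.qkv.weight"].shape)  # torch.Size([24, 8])
```

For grouped-query-attention models such as Qwen2.5-VL, the k/v projections are smaller than q, but concatenating along dim 0 still produces one fused tensor.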
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@comfy_extras/nodes_hunyuan.py`:
- Around line 528-535: The omni_mask can exceed 1.0 (e.g., omni_mask[ref_idx]
becomes 2.0), which makes concat_mask negative after computing 1.0 - omni_mask;
clamp omni_mask to the [0,1] range before inverting so concat_mask remains a
proper 0/1 mask. Update the code that computes concat_mask (and/or immediately
before it) to use a clamped version of omni_mask (e.g., torch.clamp(omni_mask,
0.0, 1.0)) when computing 1.0 - omni_mask, referencing omni_mask, concat_mask,
cond_latent, latent_length and the preceding logic that modifies omni_mask
(including _encode_single_image/reference_images handling).
In `@comfy/sd.py`:
- Around line 1270-1276: detect_te_model() accepts checkpoints keyed under
model.language_model.* for both QWEN25_3B and QWEN25_7B, but the 3B loading path
calls omnigen2.te() with the raw sd (whereas the 7B path normalizes prefixes
before loading), which risks silent weight-dropping by
transformer.load_state_dict in SDClipModel.load_sd(); update the 3B branch to
perform the same key-prefix normalization as the 7B loader before calling
omnigen2.te() (i.e., rewrite keys from the model.language_model.* layout to the
expected model.* layout), or alternatively restrict detect_te_model() to only
detect the 7B layout—prefer the former and apply the same prefix-rewrite logic
where the 3B omnigen2.te(...) invocation occurs so the state dict keys match the
model expected by transformer.load_state_dict.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 07947a53-d006-487c-a7ba-e3c834765b33
📒 Files selected for processing (3)
comfy/sd.py
comfy_extras/nodes_hunyuan.py
nodes.py
encoded_ref = cls._encode_single_image(vae, reference_images[:1], width, height)
ref_idx = 1 if latent_length > 1 else 0
cond_latent[:, :, ref_idx:ref_idx + 1] += encoded_ref[:, :, :1]
omni_mask[ref_idx] += 1.0

cond_latent = comfy.utils.resize_to_batch_size(cond_latent, batch_size)
# BaseModel/HunyuanVideo15 inverts concat_mask (mask = 1 - concat_mask), so pass the pre-inverted mask.
concat_mask = (1.0 - omni_mask).view(1, 1, latent_length, 1, 1).expand(cond_latent.shape[0], 1, latent_length, cond_latent.shape[-2], cond_latent.shape[-1]).to(cond_latent.dtype)
Clamp the TiV2V mask before inverting it.
Line 531 increments a slot that is already set to 1.0 by the conditioned-video branch, so omni_mask[ref_idx] becomes 2.0. After the 1.0 - omni_mask transform on Line 535, the TiV2V path sends -1.0 in concat_mask for that frame, which breaks the 0/1 mask semantics used by the other tasks.
Proposed fix
  encoded_ref = cls._encode_single_image(vae, reference_images[:1], width, height)
  ref_idx = 1 if latent_length > 1 else 0
  cond_latent[:, :, ref_idx:ref_idx + 1] += encoded_ref[:, :, :1]
- omni_mask[ref_idx] += 1.0
+ omni_mask[ref_idx] = 1.0
  cond_latent = comfy.utils.resize_to_batch_size(cond_latent, batch_size)
+ omni_mask = omni_mask.clamp_(0.0, 1.0)
  # BaseModel/HunyuanVideo15 inverts concat_mask (mask = 1 - concat_mask), so pass the pre-inverted mask.
  concat_mask = (1.0 - omni_mask).view(1, 1, latent_length, 1, 1).expand(cond_latent.shape[0], 1, latent_length, cond_latent.shape[-2], cond_latent.shape[-1]).to(cond_latent.dtype)
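The effect of the proposed fix can be checked in isolation. The snippet below is a standalone reproduction of the mask arithmetic with a made-up frame count, not the actual node code:

```python
import torch

latent_length = 4
ref_idx = 1 if latent_length > 1 else 0

# The conditioned-video branch has already set this slot to 1.0.
omni_mask = torch.zeros(latent_length)
omni_mask[ref_idx] = 1.0

# Buggy variant: += pushes the slot to 2.0, so the inverted mask goes to -1.0.
buggy = omni_mask.clone()
buggy[ref_idx] += 1.0
print((1.0 - buggy)[ref_idx].item())  # -1.0, breaking the 0/1 mask semantics

# Fixed variant: assign instead of accumulate, then clamp defensively before inverting.
fixed = omni_mask.clone()
fixed[ref_idx] = 1.0
fixed.clamp_(0.0, 1.0)
concat_mask = 1.0 - fixed
print(concat_mask.min().item(), concat_mask.max().item())  # 0.0 1.0
```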
# Qwen-VL checkpoints can be saved under model.language_model.* (e.g. HY-OmniWeave text encoder).
if 'model.language_model.layers.0.self_attn.k_proj.bias' in sd:
    weight = sd['model.language_model.layers.0.self_attn.k_proj.bias']
    if weight.shape[0] == 256:
        return TEModel.QWEN25_3B
    if weight.shape[0] == 512:
        return TEModel.QWEN25_7B
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=$(fd -a '(omnigen2|qwen_image|hunyuan_image|hunyuan_video)\.py$' .)
printf 'Inspecting wrapper files:\n%s\n\n' "$files"
for f in $files; do
echo "== $f =="
rg -n -C2 'model\.language_model|model\.layers\.0\.self_attn\.k_proj|state_dict_prefix_replace|def llama_detect|load_sd|load_state_dict' "$f"
echo
done

Repository: Comfy-Org/ComfyUI
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Examine the critical line ranges in comfy/sd.py
echo "=== Lines 1270-1280 (detect_te_model QWEN25 detection) ==="
sed -n '1270,1280p' comfy/sd.py
echo ""
echo "=== Lines 1420-1445 (QWEN25_7B and prefix normalization) ==="
sed -n '1420,1445p' comfy/sd.py
echo ""
echo "=== Lines 1400-1420 (Context before QWEN25_7B) ==="
sed -n '1400,1420p' comfy/sd.py
echo ""
echo "=== Search for QWEN25_3B handling ==="
rg -n -A8 'QWEN25_3B' comfy/sd.py | head -50
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== omnigen2.te() and llama_detect definition ==="
rg -n -A15 'def te\(' comfy/text_encoders/omnigen2.py | head -60
echo ""
echo "=== llama_detect in comfy/sd.py to see what keys it expects ==="
rg -n -B2 -A20 'def llama_detect' comfy/sd.py
echo ""
echo "=== Check if omnigen2.te handles state_dict_prefix_replace or similar ==="
rg -n 'state_dict_prefix_replace|model\.language_model|model\.visual' comfy/text_encoders/omnigen2.py
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Omnigen2Model class and __init__ ==="
rg -n -B2 -A30 'class Omnigen2Model' comfy/text_encoders/omnigen2.py | head -80
echo ""
echo "=== Check if Omnigen2Model or parent does any prefix rewriting ==="
rg -n 'state_dict_prefix_replace|model\.language_model|load_state_dict' comfy/text_encoders/omnigen2.py
echo ""
echo "=== Check parent class if Omnigen2Model inherits ==="
rg -n 'import.*Omnigen|from.*omnigen' comfy/text_encoders/omnigen2.py | head -10
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check SD1ClipModel ==="
rg -n -B2 -A20 'class SD1ClipModel' comfy/text_encoders/sd1_clip.py | head -60
echo ""
echo "=== Check Qwen25_3BModel ==="
fd 'qwen.*\.py$' --type f | xargs rg -l 'Qwen25_3BModel'
echo ""
echo "=== Search for Qwen25_3BModel definition ==="
rg -n -B2 -A25 'class Qwen25_3BModel' comfy/
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Find SD1ClipModel definition ==="
rg -n 'class SD1ClipModel' comfy/
echo ""
echo "=== Find where sd1_clip is imported from ==="
rg -n 'from.*sd1_clip|import.*sd1_clip' comfy/
echo ""
echo "=== Find Qwen25_3BModel ==="
rg -n 'class Qwen25_3BModel' comfy/
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Qwen25_3BModel definition ==="
sed -n '25,50p' comfy/text_encoders/omnigen2.py
echo ""
echo "=== SDClipModel and SD1ClipModel load_sd or __init__ ==="
sed -n '710,800p' comfy/sd1_clip.py
echo ""
echo "=== Search for load_state_dict in SD1ClipModel area ==="
rg -n -B5 -A15 'def load_state_dict|state_dict_prefix' comfy/sd1_clip.py | head -100
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Search for SDClipModel definition ==="
rg -n 'class SDClipModel' comfy/
echo ""
echo "=== Search for Qwen25_3B in llama.py ==="
rg -n -B2 -A30 'class Qwen25_3B' comfy/text_encoders/llama.py | head -80
echo ""
echo "=== Check if SDClipModel has load_sd or transformers loading ==="
rg -n -B5 -A20 'class SDClipModel' comfy/sd1_clip.py | head -80
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check SDClipModel.load_sd implementation ==="
rg -n -B3 -A25 'def load_sd' comfy/sd1_clip.py | head -80
echo ""
echo "=== Check if Qwen25_3B or Llama2_ handle state dict prefix flexibly ==="
rg -n 'def forward|def load_state_dict|state_dict' comfy/text_encoders/llama.py | head -30
echo ""
echo "=== Verify: check what keys Llama2_ expects by looking at model structure ==="
sed -n '300,400p' comfy/text_encoders/llama.py | head -60
Check whether the 3B path needs the same prefix normalization as the 7B path.
detect_te_model() now accepts model.language_model.* prefixed layouts for both 256-dim (3B) and 512-dim (7B) models (lines 1271–1276). However, the 3B loader at line 1425 passes the state dict directly to omnigen2.te() with no key rewriting, while the 7B loader at lines 1431–1440 normalizes the prefixes before loading.
Since SDClipModel.load_sd() calls transformer.load_state_dict(sd, strict=False), PyTorch will silently ignore the mismatched keys model.language_model.* when the model expects model.layers.*. The checkpoint will appear supported but fail to load any weights.
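This silent-drop behavior is easy to demonstrate with a toy module; the mismatched key name below is illustrative:

```python
import torch

lin = torch.nn.Linear(4, 4)
# With strict=False, keys under an unexpected prefix are only reported, never loaded.
result = lin.load_state_dict({"model.language_model.weight": torch.zeros(4, 4)}, strict=False)
print(result.missing_keys)     # ['weight', 'bias'] -> nothing was actually loaded
print(result.unexpected_keys)  # ['model.language_model.weight']
```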
The 3B path should either rewrite the keys the same way as 7B, or the detection should be scoped to only the 7B branch.
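A generic sketch of the kind of prefix normalization both branches would share; `rewrite_prefix` below is a hypothetical stand-in for whatever helper ComfyUI uses (e.g. a `state_dict_prefix_replace`-style utility), not the actual loader code:

```python
def rewrite_prefix(sd, old, new):
    """Rewrite keys from one prefix layout to another, leaving other keys intact."""
    out = {}
    for k, v in sd.items():
        if k.startswith(old):
            out[new + k[len(old):]] = v
        else:
            out[k] = v
    return out

sd = {
    "model.language_model.layers.0.self_attn.k_proj.bias": 0,
    "model.language_model.embed_tokens.weight": 1,
}
normalized = rewrite_prefix(sd, "model.language_model.", "model.")
print(sorted(normalized))
# ['model.embed_tokens.weight', 'model.layers.0.self_attn.k_proj.bias']
```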
https://huggingface.co/tencent/HY-OmniWeaving
Repackaged models:
https://huggingface.co/vafipas663/HY-OmniWeaving_repackaged/tree/main/split_files
Tested with their model and encoder as-is.
Workflow:
