
Fix device mismatch errors in Qwen 3.5 partial loading with no dynamic vram. #13295

Open
silveroxides wants to merge 3 commits into Comfy-Org:master from silveroxides:fix/qwen35-partial-load

Conversation

@silveroxides
Contributor

Fixing some device mismatch errors in Qwen 3.5.

Ran into issues when the model is partially loaded with dynamic VRAM off: the manually registered parameters in GatedDeltaNet (A_log and dt_bias) stayed on the CPU while everything else moved to the GPU, causing crashes during element-wise ops. Added cast_to_device for those.

Also fixed the vision rotary embeddings, which weren't getting the right device passed down, so RoPE was failing there too. Made them work like the other models, where the execution device is handled explicitly.

Tested it on my end and generation finally goes through without the "at least two devices" error.

  • fixed GatedDeltaNet parameters being left on the CPU
  • vision RoPE now uses the correct device
  • added a safety check in llama apply_rope just in case
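The apply_rope safety check can be sketched roughly like this. This is a minimal stand-alone sketch, not the actual ComfyUI code: the rotate_half-style rotation and the tensor shapes are assumptions, but the device/dtype guard is the point of the fix.

```python
import torch

def rotate_half(x):
    # standard RoPE helper: rotate the last dimension by half
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(xq, xk, cos, sin):
    # Safety check: the RoPE tables may have been built on the CPU while
    # the layer itself was moved to the GPU by the partial loader, so
    # move them to the query tensor's device/dtype before the math.
    cos = cos.to(device=xq.device, dtype=xq.dtype)
    sin = sin.to(device=xq.device, dtype=xq.dtype)
    return (xq * cos + rotate_half(xq) * sin,
            xk * cos + rotate_half(xk) * sin)
```

The `.to()` calls are no-ops when the tensors already match, so the guard costs nothing on fully loaded models.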

silveroxides and others added 3 commits April 5, 2026 11:57
…s (cos, sin, nsin) to the device of the input tensor xq. 2. In Qwen35VisionModel.forward, move position_embeddings to x.device.
@coderabbitai

coderabbitai bot commented Apr 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 56c15373-9794-4cdd-b616-f4ba935eba49

📥 Commits

Reviewing files that changed from the base of the PR and between 8cbbea8 and 987dce1.

📒 Files selected for processing (2)
  • comfy/text_encoders/llama.py
  • comfy/text_encoders/qwen35.py

📝 Walkthrough

This pull request modifies device and dtype handling in two text encoder files. In comfy/text_encoders/llama.py, the apply_rope function moves RoPE tensors (cos, sin, nsin) to match the device of the input tensor before using them in computations. In comfy/text_encoders/qwen35.py, multiple functions are updated to explicitly handle device casting: GatedDeltaNet.forward casts tensors before computation, Qwen35VisionRotaryEmbedding.forward accepts optional device and dtype parameters, and Qwen35VisionModel methods ensure position embeddings are moved to the correct device during forward passes.
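As a rough illustration of the GatedDeltaNet part of that change: the sketch below is hypothetical (the real layer's math differs), but A_log and dt_bias are the parameters named in the PR, and cast_to_device stands in for comfy.model_management.cast_to_device.

```python
import torch
import torch.nn.functional as F

def cast_to_device(t, device, dtype):
    # stand-in for comfy.model_management.cast_to_device
    return t.to(device=device, dtype=dtype)

class GatedDeltaNetSketch(torch.nn.Module):
    def __init__(self, num_heads):
        super().__init__()
        # Manually registered parameters like these are what the partial
        # loader can leave behind on the CPU.
        self.A_log = torch.nn.Parameter(torch.zeros(num_heads))
        self.dt_bias = torch.nn.Parameter(torch.zeros(num_heads))

    def forward(self, dt):
        # Cast both parameters to the input's device before the
        # element-wise ops, mirroring the fix described in the PR.
        A_log = cast_to_device(self.A_log, dt.device, torch.float32)
        dt_bias = cast_to_device(self.dt_bias, dt.device, torch.float32)
        A = -torch.exp(A_log)
        dt = F.softplus(dt.float() + dt_bias)
        return A, dt
```

Without the casts, a partially loaded model would hit the "at least two devices" error as soon as `dt` (on the GPU) met `dt_bias` (still on the CPU).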

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title accurately and specifically describes the main change: fixing device mismatch errors in Qwen 3.5 with dynamic VRAM disabled.
  • Description check: ✅ Passed. The description is well-detailed and directly related to the changeset, explaining the device mismatch issues, the fixes applied, and the testing performed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

