Fix device mismatch errors Qwen 3.5 partial loading with no dynamic vram.#13295
silveroxides wants to merge 3 commits into Comfy-Org:master
…s (cos, sin, nsin) to the device of the input tensor xq. 2. In Qwen35VisionModel.forward, move position_embeddings to x.device.
CodeRabbit review: No actionable comments were generated in the recent review.
Walkthrough: This pull request modifies device and dtype handling in two text encoder files.
Pre-merge checks: 2 passed, 1 failed (1 warning).
Fixing some device mismatch errors in Qwen 3.5.
Ran into issues when the model is partially loaded with dynamic VRAM off. The manually declared parameters in GatedDeltaNet (A_log and dt_bias) were staying on the CPU while everything else moved to the GPU, which crashed element-wise ops. Added cast_to_device for those.
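For anyone following along, here is a rough sketch of the pattern, not the actual diff. `cast_like` and `TinyGate` are hypothetical names made up for illustration, and `cast_like` is only a stand-in for whatever cast helper the codebase uses. The idea is that raw parameters are cast to the input's device and dtype at call time, rather than assumed to already live where the input is.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def cast_like(param: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return `param` on the device and dtype of `ref`."""
    return param.to(device=ref.device, dtype=ref.dtype)


class TinyGate(nn.Module):
    """Toy module (an assumption, not the real GatedDeltaNet) with raw parameters."""

    def __init__(self, num_heads: int):
        super().__init__()
        # Raw parameters like these can be left behind on the CPU when the
        # model is only partially loaded onto the GPU.
        self.A_log = nn.Parameter(torch.zeros(num_heads))
        self.dt_bias = nn.Parameter(torch.ones(num_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Cast at use time so the element-wise ops below see a single device.
        A_log = cast_like(self.A_log, x)
        dt_bias = cast_like(self.dt_bias, x)
        return -torch.exp(A_log) * F.softplus(x + dt_bias)
```

On a CPU-only machine the cast is a no-op for the device, but it still shows the dtype side: a float64 input gives a float64 output even though the parameters are stored as float32.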
Also fixed the vision rotary embeddings: they weren't getting the right device passed down, so RoPE was failing there too. It now works like the other models, where the execution device is handled explicitly.
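A minimal sketch of what handling the device explicitly can look like for RoPE, assuming a plain interleaved rotary layout. `rotary_cos_sin` and `apply_rope` are hypothetical names for illustration and may not match the real Qwen 3.5 vision code. The tables are built on a device passed in by the caller and are also moved to the device of `xq` right before use.

```python
import torch


def rotary_cos_sin(seq_len: int, dim: int, device, theta: float = 10000.0):
    """Build cos/sin tables of shape (seq_len, dim // 2) on the given device."""
    idx = torch.arange(0, dim, 2, device=device, dtype=torch.float32)
    inv_freq = 1.0 / (theta ** (idx / dim))
    pos = torch.arange(seq_len, device=device, dtype=torch.float32)
    freqs = torch.outer(pos, inv_freq)
    return freqs.cos(), freqs.sin()


def apply_rope(xq: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate interleaved feature pairs of xq with shape (batch, seq_len, dim)."""
    # Move the tables to the device and dtype of xq so a mismatch cannot occur here.
    cos = cos.to(device=xq.device, dtype=xq.dtype)
    sin = sin.to(device=xq.device, dtype=xq.dtype)
    x1, x2 = xq[..., 0::2], xq[..., 1::2]
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)
```

The `.to(...)` inside `apply_rope` is cheap when the tables are already on the right device, so doing both (explicit device at build time, defensive move at use time) costs little.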
Tested it on my end and generation finally goes through without the "at least two devices" error.