fix: suppress false-positive warnings when loading whisper audio encoder by octo-patch · Pull Request #13281 · Comfy-Org/ComfyUI

octo-patch · 2026-04-04T05:48:32Z

Problem

When loading a standard full whisper checkpoint (which contains both encoder and decoder weights) via AudioEncoderLoader, two classes of spurious warnings appear in the console:

missing audio encoder: ['feature_extractor.mel_spectrogram.spectrogram.window', 'feature_extractor.mel_spectrogram.mel_scale.fb']
unexpected audio encoder: ['decoder.embed_positions.weight', 'decoder.embed_tokens.weight', ...]

These warnings mislead users into thinking the model loaded incorrectly.

Root causes:

Unexpected decoder keys — Full whisper checkpoints contain decoder.* weights alongside encoder.* weights. Since WhisperLargeV3 is encoder-only, all decoder.* keys are flagged as unexpected. They were never needed and should be silently discarded.
Missing mel-spectrogram buffers — torchaudio.transforms.MelSpectrogram registers a Hann window (spectrogram.window) and a mel filterbank (mel_scale.fb) as PyTorch buffers. Standard whisper checkpoints do not store these constants because they are deterministically computed from the model config at init time. load_state_dict(strict=False) flags them as missing, but they are always correctly initialised by torchaudio — the warning is misleading.
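
Both warning classes come from comparing the checkpoint's keys with the keys the module registers. The sketch below imitates that comparison with plain Python sets instead of calling PyTorch. The helper name `classify_keys` and the key `encoder.conv1.weight` are illustrative assumptions, not taken from ComfyUI.

```python
# Rough model of how load_state_dict(strict=False) classifies keys.
# classify_keys is a hypothetical helper, not a PyTorch or ComfyUI function.

def classify_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # module has, file lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # file has, module lacks
    return missing, unexpected


# Keys the encoder-only module registers (parameters and buffers).
model_keys = [
    "encoder.conv1.weight",  # illustrative encoder weight name (assumption)
    "feature_extractor.mel_spectrogram.spectrogram.window",
    "feature_extractor.mel_spectrogram.mel_scale.fb",
]

# Keys found in a full whisper checkpoint (encoder + decoder).
checkpoint_keys = [
    "encoder.conv1.weight",
    "decoder.embed_positions.weight",
    "decoder.embed_tokens.weight",
]

missing, unexpected = classify_keys(model_keys, checkpoint_keys)
print("missing audio encoder:", missing)
print("unexpected audio encoder:", unexpected)
```

With these inputs the two torchaudio buffers end up in `missing` and the two decoder keys in `unexpected`, mirroring the console output quoted above.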

Solution

Strip decoder.* keys from the state-dict before passing it to load_state_dict, eliminating the "unexpected" warnings for whisper models.
After loading, suppress warnings only for the two known torchaudio-computed buffers (feature_extractor.mel_spectrogram.spectrogram.window and feature_extractor.mel_spectrogram.mel_scale.fb); any other genuinely missing keys are still warned about.
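
The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the function and constant names are hypothetical, integer placeholders stand in for tensors, and `encoder.conv1.weight` and `encoder.layer_norm.weight` are made-up example keys.

```python
# Sketch of the two-step fix; all names below are hypothetical, not ComfyUI's.

# Buffers that torchaudio computes at init time (key names from the PR text).
TORCHAUDIO_BUFFERS = frozenset({
    "feature_extractor.mel_spectrogram.spectrogram.window",
    "feature_extractor.mel_spectrogram.mel_scale.fb",
})


def strip_decoder_keys(state_dict):
    """Return a copy of state_dict without any decoder.* entries."""
    return {k: v for k, v in state_dict.items() if not k.startswith("decoder.")}


def keys_worth_warning(missing_keys):
    """Drop the known torchaudio buffers; keep every other missing key."""
    return [k for k in missing_keys if k not in TORCHAUDIO_BUFFERS]


# Step 1: filter the checkpoint before it reaches load_state_dict.
full_checkpoint = {
    "encoder.conv1.weight": 0,  # placeholder values stand in for tensors
    "decoder.embed_positions.weight": 0,
    "decoder.embed_tokens.weight": 0,
}
encoder_only = strip_decoder_keys(full_checkpoint)

# Step 2: filter the missing-key list that the load reports.
reported_missing = [
    "feature_extractor.mel_spectrogram.spectrogram.window",
    "feature_extractor.mel_spectrogram.mel_scale.fb",
    "encoder.layer_norm.weight",  # illustrative genuinely missing key
]
to_warn = keys_worth_warning(reported_missing)

print(sorted(encoder_only))
print(to_warn)
```

Returning a filtered copy rather than deleting keys in place leaves the caller's dictionary untouched, which is one reasonable design choice; the actual change may do this differently.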

Testing

Verified by inspection: the decoder key filter targets only the whisper branch, and the buffer exclusion set contains only torchaudio-managed names that cannot appear in wav2vec2 checkpoints, so wav2vec2 loading is unaffected.
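
The inspection argument could also be expressed as a small automated check. The sketch below models only the key handling, assuming a hypothetical `is_whisper` flag in place of however the loader selects its branch. The key names in the second check are invented and are not real wav2vec2 parameter names.

```python
# Hypothetical regression check; load_keys models only the key handling.

IGNORED_MISSING = {
    "feature_extractor.mel_spectrogram.spectrogram.window",
    "feature_extractor.mel_spectrogram.mel_scale.fb",
}


def load_keys(checkpoint_keys, model_keys, is_whisper):
    """Return (missing_to_warn, unexpected_to_warn) for one simulated load."""
    keys = set(checkpoint_keys)
    if is_whisper:
        # Decoder filtering applies to the whisper branch only.
        keys = {k for k in keys if not k.startswith("decoder.")}
    missing = set(model_keys) - keys - IGNORED_MISSING
    unexpected = keys - set(model_keys)
    return sorted(missing), sorted(unexpected)


def check_whisper_is_quiet():
    # A full checkpoint loaded into an encoder-only model warns about nothing.
    model = ["encoder.a", *IGNORED_MISSING]
    ckpt = ["encoder.a", "decoder.b"]
    assert load_keys(ckpt, model, is_whisper=True) == ([], [])


def check_other_encoder_still_warns():
    # Made-up key names; genuine problems must still be reported.
    model = ["layers.0.weight", "layers.1.weight"]
    ckpt = ["layers.0.weight", "stray.weight"]
    result = load_keys(ckpt, model, is_whisper=False)
    assert result == (["layers.1.weight"], ["stray.weight"])


check_whisper_is_quiet()
check_other_encoder_still_warns()
```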

When a full whisper checkpoint (encoder + decoder) is loaded via AudioEncoderLoader, two classes of spurious warnings were emitted:

1. 'unexpected audio encoder' for every decoder.* key - the decoder is not part of WhisperLargeV3, so these keys are always present in full whisper checkpoints and should be silently discarded.
2. 'missing audio encoder' for feature_extractor.mel_spectrogram buffers (window and mel_scale.fb) - these are torchaudio buffers computed deterministically from config at init time; they are never stored in standard whisper checkpoints but are always correctly initialised.

Fix: strip decoder keys from the state-dict before loading, and suppress warnings for the two known torchaudio-computed buffer keys.

Fixes Comfy-Org#13276

coderabbitai · 2026-04-04T05:51:27Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a6f19547-107f-4e89-b9b9-2af044302b9f

📥 Commits

Reviewing files that changed from the base of the PR and between f21f6b2 and 3d51d63.

📒 Files selected for processing (1)

comfy/audio_encoders/audio_encoders.py

📝 Walkthrough

The audio encoder checkpoint loading logic in comfy/audio_encoders/audio_encoders.py has been updated to handle Whisper3-style state dictionaries more carefully. When loading weights, the code now filters out all parameters whose keys start with decoder., so that decoder weights in a full checkpoint are no longer reported as unexpected keys for the encoder-only model. Additionally, the missing-parameter warning has been refined to skip two specific torchaudio-derived buffer keys (feature_extractor.mel_spectrogram.spectrogram.window and feature_extractor.mel_spectrogram.mel_scale.fb) while still logging warnings for any other missing keys. Any remaining unexpected keys continue to be logged normally.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: suppressing false-positive warnings when loading whisper audio encoders.
Description check	✅ Passed	The description is directly related to the changeset, explaining the problem and solution with clear context about decoder keys and torchaudio buffers.
Linked Issues check	✅ Passed	The PR addresses issue `#13276` by filtering decoder keys and suppressing warnings for torchaudio buffers, directly fixing both the unexpected and missing encoder warnings reported.
Out of Scope Changes check	✅ Passed	All changes are scoped to fixing false-positive warnings in the audio encoder loader; no unrelated modifications are present.

octo-patch requested review from Kosinkadink, comfyanonymous and guill as code owners April 4, 2026 05:48

octo-patch wants to merge 1 commit into Comfy-Org:master from octo-patch:fix/issue-13276-whisper-encoder-warnings
