Thanks for the interesting work.
I have a question regarding the official pretrained checkpoints.
From the current source code, my understanding is that WTConv1d uses fixed db1/Haar-style wavelet filters (wt_filter / iwt_filter), shared consistently across channels. I also checked a fresh model (without loading any checkpoint), and this is indeed the case.
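For reference, here is a minimal sketch of the layout I would expect from a fresh initialization. It is not the repository's code: the helper name haar_filter_rows is hypothetical, the alternating low/high row order is taken from the rows quoted below, and the sign of the high-pass taps is an assumption based on the sign pattern seen in the checkpoint.

```python
import math


def haar_filter_rows(channels):
    """Build 2 * channels rows of two taps each, alternating low/high per channel.

    Hypothetical helper for illustration only; the row order and the sign of
    the high-pass taps are assumptions, not taken from the repository.
    """
    s = 1.0 / math.sqrt(2.0)  # Haar/db1 tap magnitude, about 0.7071
    low = [s, s]              # low-pass (averaging) filter
    high = [s, -s]            # high-pass (differencing) filter, sign assumed
    rows = []
    for _ in range(channels):
        rows.append(list(low))
        rows.append(list(high))
    return rows


expected = haar_filter_rows(2)
for index, row in enumerate(expected):
    print(index, row)
```

With fixed filters like these, every even row is the same low-pass pair and every odd row is the same high-pass pair, which is what I see in a fresh model.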
However, after loading the official checkpoints synth_only.pth and synth_and_ucr.pth, I found that in:
encoder.layer2.layer.0
encoder.layer3.layer.0
encoder.layer4.layer.0
the wt_filter / iwt_filter are no longer identical across channels.
For example, in encoder.layer2.layer.0.wt_filter, the first few rows in the checkpoint are:
[0] [0.8490355610847473, 0.7476450800895691]
[1] [0.6930041313171387, -0.6657928228378296]
[2] [0.6884918212890625, 0.7316277027130127]
[3] [0.380836546421051, -0.946887731552124]
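Under the current initialization logic, rows [0] and [2] (low-pass filters) should be identical, and rows [1] and [3] (high-pass filters) should also be identical, with every tap close to ±0.7071 in magnitude for db1/Haar. In the checkpoint they are neither identical nor close to those values.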
I checked the raw state_dict directly, and these tensors are already like this before load_state_dict, so this seems to come from the checkpoint contents themselves.
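The comparison I did can be sketched roughly as follows, using the four rows quoted above as plain Python lists. The function name channel_spread and the tolerance are my own choices for illustration, and the even/odd split assumes the rows alternate low/high per channel.

```python
def channel_spread(rows, tol=1e-6):
    """Measure how far same-kind filter rows deviate from each other.

    Assumes rows alternate low-pass (even index) and high-pass (odd index).
    Returns (low_spread, high_spread, shared), where shared is True only if
    both spreads are within tol.
    """
    def spread(group):
        reference = group[0]
        # Largest absolute tap difference between any row and the first row.
        return max(abs(a - b) for row in group for a, b in zip(row, reference))

    low_spread = spread(rows[0::2])
    high_spread = spread(rows[1::2])
    shared = low_spread <= tol and high_spread <= tol
    return low_spread, high_spread, shared


# First four rows of encoder.layer2.layer.0.wt_filter as quoted in this post.
checkpoint_rows = [
    [0.8490355610847473, 0.7476450800895691],
    [0.6930041313171387, -0.6657928228378296],
    [0.6884918212890625, 0.7316277027130127],
    [0.380836546421051, -0.946887731552124],
]

low_spread, high_spread, shared = channel_spread(checkpoint_rows)
print("low spread:", low_spread)
print("high spread:", high_spread)
print("shared across channels:", shared)
```

To run this against the actual file, the rows would come from the tensor stored under that key in the loaded state_dict, reshaped to two taps per row; the exact tensor shape and any nesting inside the .pth file are assumptions I have not spelled out here.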
Could you please clarify whether this is expected?
Are the official checkpoints intentionally using channel-wise different wavelet filters, or might there be an issue with the uploaded .pth files?
Thanks.