It seems the Spokestack framework expects the model to be an RNN and passes it only a single frame at a time. In the paper, the WaveNet model expects a time context of 182 frames (1.83 s). Will we be creating an alternate version of this code that supports longer time contexts for model input?
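To make the question concrete, here is a minimal sketch of one way a longer context could be assembled from single frames: a sliding window that keeps the most recent 182 frames and hands them to the model as one block. This is not Spokestack code. `FrameWindow`, `push`, `ready`, `window`, and the frame width of 40 are hypothetical names and values chosen for illustration; only the 182-frame context comes from the paper.

```python
from collections import deque

CONTEXT_FRAMES = 182  # time context quoted from the paper
FRAME_WIDTH = 40      # assumed feature size per frame (hypothetical)


class FrameWindow:
    """Collects single frames into a fixed-length sliding window."""

    def __init__(self, context=CONTEXT_FRAMES, width=FRAME_WIDTH):
        self.context = context
        self.width = width
        # deque(maxlen=...) silently drops the oldest frame once full
        self.frames = deque(maxlen=context)

    def push(self, frame):
        """Append one frame, as delivered by a frame-at-a-time pipeline."""
        frame = [float(x) for x in frame]
        if len(frame) != self.width:
            raise ValueError(f"expected {self.width} values, got {len(frame)}")
        self.frames.append(frame)

    def ready(self):
        """True once a full context of frames has been collected."""
        return len(self.frames) == self.context

    def window(self):
        """Return the buffered frames, oldest first, as a context x width list."""
        if not self.ready():
            raise RuntimeError("not enough frames buffered yet")
        return list(self.frames)
```

With something like this, the model would only be invoked once `ready()` returns true, and each later frame would shift the window forward by one step. Whether that buffering belongs in the framework or in a wrapper around the model is the open part of the question.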