I'm working on the script to evaluate our wakeword models, and my current approach to calculate FRR is:
- Create a long stream of audio composed of test samples containing "hey snips", each separated by 1 second of silence.
- Set up the SpeechPipeline as it would be for input through a microphone, but select the long "hey snips" wav as the input stream instead.
- Monitor how many times the wakeword is detected, compare that to the number of times it is actually present in the signal, and divide the missed detections by the total duration of the wav to get false rejections/hour.
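The stream-building step above could be sketched roughly like this. This is a minimal sketch, not your actual script: `build_eval_stream` is a hypothetical helper, the clips are random arrays standing in for decoded "hey snips" wavs, and 16 kHz mono float32 audio is assumed.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed pipeline sample rate
SILENCE_SEC = 1.0    # 1 second of silence between positive clips

def build_eval_stream(samples):
    """Concatenate wakeword clips into one long signal, separated by silence.

    `samples` is a list of 1-D float32 arrays, one per positive clip --
    stand-ins here for the decoded test wavs.
    Returns the signal plus the ground-truth count of positives.
    """
    silence = np.zeros(int(SAMPLE_RATE * SILENCE_SEC), dtype=np.float32)
    pieces = []
    for clip in samples:
        pieces.append(clip)
        pieces.append(silence)
    return np.concatenate(pieces), len(samples)

# dummy 1-second clips in place of real test wavs
clips = [np.random.randn(SAMPLE_RATE).astype(np.float32) * 0.1 for _ in range(3)]
stream, n_positives = build_eval_stream(clips)
```

Keeping the ground-truth positive count alongside the signal makes the later detected-vs-present comparison trivial.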
This all seems well and good, and it's clear that we can then adjust the posterior threshold to find the appropriate setting for our desired FRR (or sweep over for evaluation), but thus far the model is not detecting any wakewords using this pipeline. It definitely does when I speak into the microphone, so I'm wondering if this is the best way to go about testing.
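For reference, the metric computation described above can be sketched as follows. `frr_stats` is a hypothetical helper, and it assumes each positive in the stream should produce exactly one detection.

```python
def frr_stats(n_detected, n_present, duration_sec):
    """Compute the false reject rate and false rejects per hour.

    n_detected   -- wakeword activations observed over the stream
    n_present    -- ground-truth number of wakeword occurrences
    duration_sec -- total duration of the evaluation wav, in seconds
    """
    misses = max(n_present - n_detected, 0)
    frr = misses / n_present                    # fraction of positives missed
    per_hour = misses / (duration_sec / 3600.0) # misses normalized per hour
    return frr, per_hour

# e.g. 8 of 10 wakewords detected in a 1-hour stream
frr, fr_per_hour = frr_stats(8, 10, 3600.0)
```

Sweeping the posterior threshold would then just mean re-running detection at each threshold value and recording this pair of numbers.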
Do any of you have thoughts or references I could check out to guide the process?