
Wake Word Model Evaluation #9

@ghost

Description

I'm working on the script to evaluate our wakeword models, and my current approach to calculate FRR is:

  1. Create a long stream of audio composed of test samples containing "hey snips", separated by 1 second of silence.
  2. Set up SpeechPipeline as it would be for input through a microphone, but select the long "hey snips" wav as the input stream instead.
  3. Count how many times the wakeword is detected, compare that to the number of times it is present in the signal, and divide the number of misses by the total duration of the wav to get false rejections/hour.
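To make steps 1 and 3 concrete, here is a minimal sketch of the bookkeeping, independent of SpeechPipeline. The function names are hypothetical, and it assumes the test samples are mono 16-bit PCM byte strings that all share one sample rate. It also reports the miss fraction (misses divided by wakewords present), which is the more common definition of FRR, alongside false rejections/hour.

```python
def join_with_silence(clips, sample_rate, silence_seconds=1.0, sample_width=2):
    """Concatenate mono PCM clips (bytes), separated by digital silence.

    Assumes signed PCM, where silence is all-zero bytes.
    """
    gap = b"\x00" * (int(sample_rate * silence_seconds) * sample_width)
    return gap.join(clips)


def stream_duration_seconds(stream, sample_rate, sample_width=2):
    """Duration of a mono PCM byte stream in seconds."""
    return len(stream) / (sample_rate * sample_width)


def false_rejection_stats(num_present, num_detected, duration_seconds):
    """Return misses, miss fraction (FRR), and false rejections per hour."""
    if num_present <= 0 or duration_seconds <= 0:
        raise ValueError("num_present and duration_seconds must be positive")
    # Clamp at zero in case the detector fires more than once per utterance.
    missed = max(num_present - num_detected, 0)
    return {
        "missed": missed,
        "frr": missed / num_present,
        "false_rejections_per_hour": missed * 3600.0 / duration_seconds,
    }


# Two 1-second clips at 16 kHz with a 1-second gap give a 3-second stream.
clip = b"\x01\x00" * 16000
stream = join_with_silence([clip, clip], sample_rate=16000)
duration = stream_duration_seconds(stream, sample_rate=16000)

# 100 wakewords present, 92 detected, over a 300-second stream.
stats = false_rejection_stats(num_present=100, num_detected=92, duration_seconds=300)
```

Note that a raw detection count can hide double triggers on a single utterance, so matching detections to utterance positions would give a more reliable miss count than the clamp used here.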

This all seems reasonable, and we can then adjust the posterior threshold to find the appropriate setting for our desired FRR (or sweep over thresholds for evaluation). So far, though, the model is not detecting any wakewords through this pipeline. It definitely does when I speak into the microphone, so I'm wondering whether this is the best way to go about testing.
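The threshold sweep could look roughly like the sketch below. It assumes we can log the peak wakeword posterior for each test utterance, and that a detection fires when that peak is at or above the threshold; both are assumptions about the setup, not a description of how the pipeline actually decides. `sweep_thresholds` is a hypothetical helper name.

```python
def sweep_thresholds(peak_posteriors, thresholds):
    """Return (threshold, FRR) pairs for per-utterance peak posteriors.

    An utterance counts as a false rejection when its peak posterior
    falls below the threshold (assumed detection rule).
    """
    if not peak_posteriors:
        raise ValueError("need at least one utterance")
    total = len(peak_posteriors)
    results = []
    for threshold in thresholds:
        missed = sum(1 for p in peak_posteriors if p < threshold)
        results.append((threshold, missed / total))
    return results


# Made-up peak posteriors for four utterances, purely for illustration.
peaks = [0.95, 0.80, 0.60, 0.30]
sweep = sweep_thresholds(peaks, [0.5, 0.7, 0.9])
# sweep == [(0.5, 0.25), (0.7, 0.5), (0.9, 0.75)]
```

If every peak logged this way sits near zero on the wav input, that would point at the input path (sample rate, channel count, or sample format) rather than the threshold.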

Do any of you have thoughts or references I could check out to guide the process?

Labels

question (Further information is requested)
