fix(#1057): fast-open VAD gate for wake word detection #1068
Merged
- Replace 3-frame debounce open with fast-open/slow-close semantics: a single above-threshold frame immediately resets the silence counter (opens gate in 80ms instead of 240ms)
- Keep 3-frame debounce for closing: only after 3 consecutive silent frames does the silence timer begin accumulating
- Lower silenceRmsThreshold 600->300 to catch quieter speech
- Reduce maxSilenceSkipSeconds 3.0->1.0 for more frequent periodic checks during gating
- Update WakeWordSilenceGateTest for new 1s skip interval
Problem
Wake word detection is inconsistent: users often need to say "hey jandal" 2-3 times. The VAD gate requires 3 consecutive frames (240ms) above the RMS threshold before opening, so the classifier misses speech onset. During gated silence, inference runs only once every 3 seconds.
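The latency figures follow from the frame length: 3 frames in 240ms means 80ms per frame. A minimal sketch of that arithmetic, together with a per-frame RMS calculation, is shown here. The function names are hypothetical and the 16-bit PCM sample format is an assumption; neither is taken from OnnxWakeWordDetector.kt.

```kotlin
import kotlin.math.sqrt

// Hypothetical helpers for illustration only; not the PR's code.

/** Root-mean-square amplitude of one frame, assuming 16-bit PCM samples. */
fun frameRms(samples: ShortArray): Double {
    if (samples.isEmpty()) return 0.0
    var sumSquares = 0.0
    for (s in samples) {
        val v = s.toDouble()
        sumSquares += v * v
    }
    return sqrt(sumSquares / samples.size)
}

/** Delay before the gate can open when framesToOpen consecutive loud frames are required. */
fun openLatencyMs(framesToOpen: Int, frameMs: Int = 80): Int = framesToOpen * frameMs
```

With these definitions, openLatencyMs(3) gives the 240ms delay of the old 3-frame debounce, and openLatencyMs(1) gives the 80ms delay of a single-frame open.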
Fix
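Three changes to OnnxWakeWordDetector.kt + WakeWordPreferences.kt:
- Fast-open/slow-close gate: a single above-threshold frame opens the gate (80ms instead of 240ms); closing still requires 3 consecutive silent frames
- silenceRmsThreshold 600→300 to catch quieter speech at normal distance
- maxSilenceSkipSeconds 3.0→1.0 during gated silence

Oracle code review: no correctness issues.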
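The fast-open/slow-close behavior could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from OnnxWakeWordDetector.kt: the class name, method name, and the choice to start with the gate open are hypothetical. Only the parameter names silenceRmsThreshold and maxSilenceSkipSeconds, their new values, the 3-frame close debounce, and the 80ms frame length come from this PR.

```kotlin
// Hypothetical sketch of a fast-open/slow-close silence gate; names are illustrative.
class FastOpenSilenceGate(
    private val silenceRmsThreshold: Double = 300.0,
    private val framesToClose: Int = 3,
    maxSilenceSkipSeconds: Double = 1.0,
    private val frameMs: Int = 80,
) {
    private val maxSilenceSkipMs = (maxSilenceSkipSeconds * 1000).toInt()
    private var consecutiveSilentFrames = 0
    private var skippedMs = 0

    /** Returns true if the classifier should run on a frame with the given RMS. */
    fun shouldRunInference(rms: Double): Boolean {
        if (rms > silenceRmsThreshold) {
            // Fast open: one loud frame clears all silence state.
            consecutiveSilentFrames = 0
            skippedMs = 0
            return true
        }
        consecutiveSilentFrames++
        if (consecutiveSilentFrames < framesToClose) {
            // Slow close: stay open until enough consecutive silent frames arrive.
            return true
        }
        // Gate closed: accumulate skipped time and run a periodic check.
        skippedMs += frameMs
        if (skippedMs >= maxSilenceSkipMs) {
            skippedMs = 0
            return true
        }
        return false
    }
}
```

In this sketch the silence timer only starts accumulating on the third consecutive silent frame, and any single frame above the threshold reopens the gate immediately. Whether the third silent frame itself is skipped or still classified is a detail the PR text does not specify.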
Test results
:core:voice:test ✓
:core:voice:compileDebugKotlin ✓