feat(#935 #939 #982 #1017): central STT transcript normaliser with Kiwi phonetic and unit alias tables #1070

Open
lokhor wants to merge 1 commit into main from feature/stt-normalisation-935-939-982-1017

Conversation

Collaborator

@lokhor lokhor commented Jun 2, 2026

Closes #935, #939, #982, #1017

What

A pure-Kotlin TranscriptNormaliser in core/voice/ that centralises STT post-processing normalisation, running on every transcript before it reaches any downstream consumer (intent router, LLM, RAG, chat).

TranscriptNormaliser (new)

Two replacement tables:

KIWI_PHONETIC_REPLACEMENTS — word-level Kiwi/Māori mishears (#935, #939):

  • fattybaku / farah paco → wharepaku
  • tonifa / tanifa → taniwha
  • comrade → kumara
  • chaka → chocka

STT_UNIT_ALIAS_REPLACEMENTS — unit/abbreviation normalisations (#1017):

  • 300 mls / 300 Mills / 200 mils / 100 ml's → 300 ml / 300 ml / 200 ml / 100 ml (numeric prefix preserved)

All patterns are anchored on word boundaries and are case-insensitive. Normalisation is idempotent and does not collapse whitespace.
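
As a rough illustration of the two tables described above, here is a minimal sketch. The object name `TranscriptNormaliserSketch`, the `ML_ALIAS` pattern, and the `normalise` signature are assumptions made for this example and are not taken from the PR's code; only the replacement pairs come from the description.

```kotlin
// Hypothetical sketch; the PR's real tables and signatures may differ.
object TranscriptNormaliserSketch {

    // Misheard phrase -> intended word. Pairs are the ones listed in the PR description.
    private val KIWI_PHONETIC_REPLACEMENTS: List<Pair<Regex, String>> = listOf(
        "fattybaku" to "wharepaku",
        "farah paco" to "wharepaku",
        "tonifa" to "taniwha",
        "tanifa" to "taniwha",
        "comrade" to "kumara",
        "chaka" to "chocka",
    ).map { (heard, intended) ->
        // Word-boundary anchored and case-insensitive, as the description states.
        Regex("\\b${Regex.escape(heard)}\\b", RegexOption.IGNORE_CASE) to intended
    }

    // Assumed shape of the unit alias rule: a number, optional whitespace, then a
    // misheard millilitre suffix. Group 1 keeps the number and its whitespace intact.
    private val ML_ALIAS =
        Regex("\\b(\\d+\\s*)(?:mls|mills|mils|ml's)\\b", RegexOption.IGNORE_CASE)

    fun normalise(transcript: String): String {
        var out = transcript
        for ((pattern, replacement) in KIWI_PHONETIC_REPLACEMENTS) {
            out = pattern.replace(out, replacement)
        }
        // Only the suffix is rewritten; the numeric prefix is copied through unchanged.
        return ML_ALIAS.replace(out) { match -> "${match.groupValues[1]}ml" }
    }
}
```

Because none of the replacement outputs match any input pattern, calling `TranscriptNormaliserSketch.normalise` on already normalised text leaves it unchanged, which is one simple way to get the idempotence the description asks for.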

Backend wiring

  • NativeAndroidVoiceInputController: normalise in extractBestTranscript
  • VoskOfflineVoiceInputController: normalise in parseTranscript / parsePartialTranscript
  • SherpaOnnxVoiceInputController: normalise in resultTextOnline / resultTextOffline (covers all 5 emission sites)
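
The wiring pattern above, normalising inside the one function that every emission site reads from, might look roughly like the sketch below. `SketchVoiceBackend`, its callbacks, and the `normalise`/`emit` parameters are hypothetical stand-ins for illustration; they are not the PR's controller classes.

```kotlin
// Hypothetical backend skeleton; names are illustrative, not the PR's classes.
class SketchVoiceBackend(
    private val normalise: (String) -> String, // stand-in for the shared normaliser
    private val emit: (String) -> Unit,        // stand-in for the downstream consumer
) {
    // Single choke point: every emission site obtains its text through this function,
    // so no call site can forward a raw, un-normalised transcript.
    private fun resultText(raw: String): String = normalise(raw)

    fun onPartialResult(raw: String) = emit(resultText(raw))
    fun onFinalResult(raw: String) = emit(resultText(raw))
    fun onEndpointDetected(raw: String) = emit(resultText(raw))
}
```

Placing the call in the extraction function rather than at each emission site means a newly added emission site is covered without further changes.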

QuickIntentRouter (#982)

LIST_NAME_TAIL_MISHEAR_RE rewrites trailing lost/lust/last → list only in the regex pre-pass. Chat voice ("I lost my keys") retains the original text via the FallThrough path; this is verified by a regression test.
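
A minimal sketch of that pre-pass idea follows: the rewrite is applied to a copy used only for matching, and the fall-through result carries the untouched transcript. The regex body, `ADD_TO_LIST_RE`, the `Route` types, and `route` are all assumptions for this example; only the name LIST_NAME_TAIL_MISHEAR_RE and the lost/lust/last → list behaviour come from the description.

```kotlin
// Hypothetical sketch of the pre-pass; the PR's actual patterns and types may differ.
sealed interface Route
data class ListCommand(val item: String, val listName: String) : Route
data class FallThrough(val originalText: String) : Route

// Assumed pattern: lost/lust/last as the final word of the utterance.
private val LIST_NAME_TAIL_MISHEAR_RE =
    Regex("\\b(?:lost|lust|last)\\b(?=\\s*$)", RegexOption.IGNORE_CASE)

// Assumed list-command shape, e.g. "add milk to the shopping list".
private val ADD_TO_LIST_RE =
    Regex("add (.+) to (?:the |my )?(\\w+) list", RegexOption.IGNORE_CASE)

fun route(transcript: String): Route {
    // The rewrite exists only on this local candidate; the transcript is never mutated.
    val candidate = LIST_NAME_TAIL_MISHEAR_RE.replace(transcript, "list")
    val match = ADD_TO_LIST_RE.matchEntire(candidate)
    return if (match != null) {
        ListCommand(item = match.groupValues[1], listName = match.groupValues[2])
    } else {
        // Anything that is not a list command falls through with the original text.
        FallThrough(transcript)
    }
}
```

With this shape, an utterance such as "that was the last" is rewritten only on the candidate copy, fails the list-command match, and reaches the chat path with its original wording.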

Testing

Verification

  • ./gradlew :core:voice:testDebugUnitTest — 18/18 pass
  • ./gradlew :core:skills:testDebugUnitTest — all pass (ListRouting + full suite)
  • ./gradlew :app:assembleDebug — clean build
  • ./gradlew :core:voice:lintDebug :core:skills:lintDebug — clean

Out of scope

…wi phonetic and unit alias tables

- New TranscriptNormaliser in core/voice/ runs on all STT output before downstream consumption
- KIWI_PHONETIC_REPLACEMENTS: fattybaku/farah paco → wharepaku, tonifa/tanifa → taniwha, comrade → kumara, chaka → chocka
- STT_UNIT_ALIAS_REPLACEMENTS: mls/Mills/mils/ml's → ml for numeric quantities
- NativeAndroid, Vosk, Sherpa-ONNX backends wrap extractBestTranscript/resultTextOnline/resultTextOffline
- LIST_NAME_TAIL_MISHEAR_RE in QIR pre-router catches lost/lust/last → list for list commands only
- TranscriptNormaliserTest (18 tests), QuickIntentRouterListRoutingTest regression, E4B fallthrough test

Closes #935, #939, #982, #1017

github-actions Bot commented Jun 2, 2026

Debug APK ready

Download app-debug.apk

Commit: a5b8225 - Build #2073

Updated on each push. Removed when PR is merged or closed.

Development

Successfully merging this pull request may close these issues.

feat: STT post-processing — transcription normalisation (Kiwi/Māori + units + number words)