feat: wavekat smart-turn variant + zh fine-tune #19
Merged
Design for shipping WaveKat-trained Smart Turn weights (zh first, more languages to follow) via the wavekat HuggingFace org, with a language-agnostic repo layout (wavekat/smart-turn-ONNX with per-language subdirs) and runtime hf-hub loading aligned with wavekat-tts. Frozen to Pipecat's ONNX contract so the same weights work from both Rust and Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `SmartTurnVariant` / `SmartTurnLang` enums and a `PipecatSmartTurn::with_variant(...)` constructor that selects between the embedded upstream Pipecat v3 weights and WaveKat language-specialized fine-tunes resolved from `wavekat/smart-turn-ONNX` via `hf-hub`.

The new `wavekat-smart-turn` feature is opt-in, implies `pipecat`, and pulls in `hf-hub` 0.5 with the synchronous `ureq` backend so the crate stays runtime-agnostic. WaveKat fine-tunes resolve to `<lang>/smart-turn-cpu.onnx` inside one language-agnostic HF repo, with a `WAVEKAT_TURN_MODEL_DIR` env override for offline / CI builds. Both enums are `#[non_exhaustive]` so new languages don't break callers.

Tests cover the `with_variant(PipecatV3)` path and a `WAVEKAT_TURN_MODEL_DIR`-driven local lookup, both without network.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
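The resolution order described in this commit can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `ModelSource`, `resolve_source`, `model_dir_from_env`, and the `subdir` helper are hypothetical names, the `"zh"` subdirectory name is an assumption based on the `<lang>/smart-turn-cpu.onnx` layout, and the real implementation goes on to load the file (via `hf-hub` in the remote case) rather than just describing where it lives.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Language of a WaveKat fine-tune. `#[non_exhaustive]` means downstream
/// crates must keep a wildcard arm, so adding a language is not breaking.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SmartTurnLang {
    Zh,
}

/// Which weights to use: embedded upstream Pipecat v3 or a WaveKat fine-tune.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SmartTurnVariant {
    PipecatV3,
    Wavekat(SmartTurnLang),
}

impl SmartTurnLang {
    /// Hypothetical helper: per-language subdirectory (name is an assumption).
    fn subdir(self) -> &'static str {
        match self {
            SmartTurnLang::Zh => "zh",
        }
    }
}

/// Hypothetical type describing where a variant's weights come from.
#[derive(Debug, PartialEq, Eq)]
pub enum ModelSource {
    /// Weights compiled into the crate.
    Embedded,
    /// File under the `WAVEKAT_TURN_MODEL_DIR` override; no network needed.
    LocalFile(PathBuf),
    /// File to fetch from a Hugging Face repo at runtime.
    HfHub { repo: &'static str, file: String },
}

/// Reads the offline override, if set.
pub fn model_dir_from_env() -> Option<PathBuf> {
    env::var_os("WAVEKAT_TURN_MODEL_DIR").map(PathBuf::from)
}

/// Decides where weights come from: a local override wins over the HF repo.
pub fn resolve_source(variant: SmartTurnVariant, model_dir: Option<&Path>) -> ModelSource {
    match variant {
        SmartTurnVariant::PipecatV3 => ModelSource::Embedded,
        SmartTurnVariant::Wavekat(lang) => {
            let rel = format!("{}/smart-turn-cpu.onnx", lang.subdir());
            match model_dir {
                Some(dir) => ModelSource::LocalFile(dir.join(rel)),
                None => ModelSource::HfHub {
                    repo: "wavekat/smart-turn-ONNX",
                    file: rel,
                },
            }
        }
    }
}
```

Taking the override directory as a parameter, instead of reading the environment inside `resolve_source`, keeps the decision logic testable without mutating process-wide state; a caller would pass `model_dir_from_env().as_deref()`.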
Pulls `wavekat/smart-turn-ONNX` from HuggingFace and runs the zh variant against the existing fixtures. Gated with `#[ignore]` so the network call is opt-in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Synthesized with wavekat-tts (Qwen3-TTS, zh) at 24 kHz, then resampled to 16 kHz mono via ffmpeg. The mid-utterance clip was found by sweeping trim points and picking one (3.3 s) that the model confidently classifies as Unfinished. The smoke test now asserts state classification for all three zh fixtures; English clips remain print-only since the zh fine-tune isn't expected to score them correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
wavekat-eason pushed a commit that referenced this pull request on May 11, 2026
## 🤖 New release

* `wavekat-turn`: 0.0.7 -> 0.0.8 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.0.8](v0.0.7...v0.0.8) - 2026-05-11

### Added

- wavekat smart-turn variant + zh fine-tune ([#19](#19))

</blockquote>

</p></details>

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
## Summary

- `SmartTurnVariant::Wavekat(SmartTurnLang::Zh)` constructor on `PipecatSmartTurn`: loads the Mandarin fine-tune from `wavekat/smart-turn-ONNX` at runtime via `hf-hub`, cached under `$HF_HOME/hub/`. Existing `PipecatSmartTurn::new()` still ships the upstream Pipecat v3 weights unchanged.
- `wavekat-smart-turn` feature flag (implies `pipecat`, adds `hf-hub`). Off by default.
- `WAVEKAT_TURN_MODEL_DIR=<dir>` escape hatch resolves `<dir>/<lang>/smart-turn-cpu.onnx` and skips the network; used for offline / CI builds.
- zh fixtures synthesized with `wavekat-tts` (Qwen3-TTS, 24 kHz → 16 kHz mono via ffmpeg) under `tests/fixtures/zh_*.wav`.
- `#[ignore]`-gated smoke test `wavekat_hf_download_smoke` (also `make hf-smoke`) downloads the model from HF and asserts the zh fine-tune classifies the three fixtures on the correct side of 0.5.
- Design doc: `docs/04-plan-wavekat-smart-turn.md`.

## Test plan

- `cargo fmt --all -- --check`
- `cargo clippy --workspace --all-features -- -D warnings`
- `make ci` (all feature-flag combos)
- `make hf-smoke`: three zh fixtures classify as expected (finished/finished/unfinished); two English fixtures kept for diagnostics

🤖 Generated with Claude Code
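The "correct side of 0.5" check in the smoke test can be illustrated with a small sketch. It assumes the model's output is a probability that the turn is finished and that values above 0.5 map to Finished; `TurnState`, `classify`, and `mismatches` are hypothetical names for illustration, not the crate's API, and the handling of a score of exactly 0.5 is also an assumption.

```rust
/// Hypothetical stand-in for the crate's turn-state type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnState {
    Finished,
    Unfinished,
}

/// Decision boundary mentioned in the PR summary.
pub const THRESHOLD: f32 = 0.5;

/// Maps an assumed "probability of finished" score to a state.
/// A score of exactly 0.5 is treated as Unfinished here (an assumption).
pub fn classify(p_finished: f32) -> TurnState {
    if p_finished > THRESHOLD {
        TurnState::Finished
    } else {
        TurnState::Unfinished
    }
}

/// Returns the names of fixtures whose classification disagrees with the
/// expected state. Each entry is (fixture name, score, expected state).
pub fn mismatches<'a>(scored: &[(&'a str, f32, TurnState)]) -> Vec<&'a str> {
    scored
        .iter()
        .filter(|(_, p, want)| classify(*p) != *want)
        .map(|(name, _, _)| *name)
        .collect()
}
```

A smoke test in this style would score each zh fixture, assert that `mismatches` returns an empty list, and only print the scores for the English clips, since those are kept for diagnostics.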