test(parity): rust runner for silero VAD parity harness by al8n · Pull Request #5 · Findit-AI/silero

al8n · 2026-05-03T06:52:29Z

Adds tests/parity/Cargo.toml and src/main.rs for the silero-parity-runner binary that loads a 16 kHz mono WAV via ffmpeg-next, runs silero::detect_speech with the bundled ONNX model, and emits a JSON segment list. Pairs with the Python runner (next commit) for side-by-side comparison against upstream silero-vad.

The runner uses ffmpeg-next (not hound) for audio loading so the f32 buffer the model sees is byte-identical to what upstream Python silero-vad consumes via torchaudio/ffmpeg, letting the parity score verify both runners decoded the audio the same way before flagging any output divergence as a model issue. Same pattern whispery's parity harness uses.
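To make "byte-identical" concrete, here is a minimal Python sketch of the decode convention that the Python runner commit in this PR describes: ffmpeg emits `pcm_s16le`, mono, 16 kHz, and each sample is scaled by 1/32768. The helper names `ffmpeg_argv`, `pcm_s16le_to_float` and `load_audio_sketch` are hypothetical, and the ffmpeg argument list is an assumption modelled on WhisperX's `load_audio`, not copied from either runner.

```python
import struct
import subprocess


def ffmpeg_argv(path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes raw mono s16le PCM to stdout.

    Assumption: this flag set approximates what the runners use; it is not
    taken from the harness source.
    """
    return [
        "ffmpeg", "-nostdin", "-i", path,
        "-f", "s16le", "-acodec", "pcm_s16le",
        "-ac", "1", "-ar", str(sample_rate), "-",
    ]


def pcm_s16le_to_float(raw: bytes) -> list[float]:
    """Interpret raw bytes as little-endian int16 and scale into [-1.0, 1.0)."""
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw[: count * 2])
    return [s / 32768.0 for s in samples]


def load_audio_sketch(path: str, sample_rate: int = 16000) -> list[float]:
    """Decode `path` through ffmpeg and return normalised samples."""
    proc = subprocess.run(
        ffmpeg_argv(path, sample_rate), capture_output=True, check=True
    )
    return pcm_s16le_to_float(proc.stdout)
```

Every int16 value divided by 32768 is exactly representable in float32, so two loaders that follow this convention can agree bit for bit as long as ffmpeg produces the same PCM stream on both sides.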

Cargo.lock for the parity binary lives at tests/parity/Cargo.lock and is gitignored via the existing Cargo.lock rule. The parity crate is excluded from cargo package because cargo skips any subdirectory that contains its own Cargo.toml when packaging.

test(parity): python runner + scorer for upstream silero-vad

Adds the upstream Python silero-vad reference side of the parity harness:

python/pyproject.toml: pins silero-vad >= 5.1 and onnxruntime >= 1.18 (the load_silero_vad(onnx=True) path needs onnxruntime).
python/silero_vad_runner.py: same CLI / same JSON schema as the Rust runner. Defaults to --backend onnx so both runners feed byte-identical bytes to ORT — same model file, same backend — and any IoU disagreement is segmenter logic, not the inference runtime. --backend jit available for measuring runtime drift separately. Audio loading is an ffmpeg shell-out matching WhisperX's load_audio (pcm_s16le -ac 1 -ar 16000 → np.float32 / 32768.0), same byte path as the Rust loader.
python/score.py: sequence-position pairing, per-segment IoU, median + p10/p90 + worst-N report, JSON summary on stdout / --out. Pass/fail: median IoU >= 0.95 AND segment counts match.
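The scoring rule in the last item can be sketched as follows. This is an illustration under stated assumptions, not the contents of `score.py`: segments are taken to be `(start, end)` pairs in seconds, the function names `iou` and `score` are hypothetical, and the handling of empty inputs is a guess.

```python
from statistics import median

Segment = tuple[float, float]  # (start_seconds, end_seconds), assumed layout


def iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def score(rust: list[Segment], python: list[Segment], threshold: float = 0.95) -> dict:
    """Pair segments by sequence position and apply the pass/fail rule."""
    ious = [iou(r, p) for r, p in zip(rust, python)]
    counts_match = len(rust) == len(python)
    # Assumption: with no pairs to compare, fall back on the count check alone.
    med = median(ious) if ious else (1.0 if counts_match else 0.0)
    return {
        "median_iou": med,
        "counts_match": counts_match,
        "passed": counts_match and med >= threshold,
    }
```

Pairing by position rather than by best overlap means a single extra or missing segment shifts every later pair, which is why the count check is part of the pass condition.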

test(parity): driver script and README

run.sh brings up the uv venv, runs both runners, and pipes the JSONs through score.py. Accepts a fixture directory (uses clip_16k.wav inside) or a direct WAV path.

run.sh passes --min-silence-ms 132 to the Rust side as a parity override (NOT a crate-default change). The silero crate's SpeechSegmenter::push_probability computes silence_samples = current_sample - silence_start AFTER the current frame's increment, while upstream Python silero-vad computes it BEFORE. This one-frame off-by-one (32 ms at 16 kHz with 512-sample windows) causes the crate to close segments after 4 low-prob frames versus Python's 5. Bumping the override by exactly one frame on the crate side restores identical segment counts (verified on all 5 dia parity fixtures: median IoU 1.0000 across the board).

README documents the layout, prerequisites (cargo + uv + ffmpeg, no ORT_DYLIB_PATH needed because ort 2.0.0-rc.12's default features include download-binaries + copy-dylibs), the canonical fixture set (dia's parity fixtures, intentionally not copied), the parameter alignment table (named defaults match upstream silero-vad 6.2.1 exactly), and the off-by-one silence threshold finding in detail.

fix(detector): silence threshold counter matches upstream silero-vad

The silence counter in SpeechSegmenter::push_probability evaluated silence_samples = current_sample - silence_start AFTER the current frame's contribution had been added to current_sample, while upstream Python silero-vad evaluates the equivalent sil_dur_now = cur_sample - temp_end BEFORE the current frame is consumed. The crate's counter therefore fired one model frame (32 ms at 16 kHz with 512-sample windows) early: at the default min_silence_duration_ms = 100, the crate closed a segment after 4 consecutive low-probability frames where Python tolerates the dip and closes after 5.

Switch the comparator to frame_start.saturating_sub(silence_start), mirroring Python's cur_sample - temp_end evaluated before the frame is consumed. The same correction applies to the min_silence_at_max_speech_samples comparator in the same block, which used the same off-by-one counter.
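The frame counts quoted above can be reproduced with a small simulation. This is a sketch under assumptions, not the crate's code: it assumes the silence run starts at the beginning of the first low-probability frame, that a segment closes once the measured silence is greater than or equal to the threshold in samples, and that milliseconds convert to samples by integer truncation. The function name `frames_until_close` is hypothetical.

```python
FRAME_SAMPLES = 512   # model window at 16 kHz
SAMPLE_RATE = 16000


def frames_until_close(min_silence_ms: int, measure_before_frame: bool) -> int:
    """Count consecutive low-probability frames needed to close a segment.

    measure_before_frame=True mirrors upstream's `cur_sample - temp_end`
    (frame start minus silence start); False mirrors the v0.2.x counter,
    which measured after the frame had been added to the running position.
    """
    min_silence_samples = SAMPLE_RATE * min_silence_ms // 1000
    silence_start = 0  # assumed: start of the first low-probability frame
    frames_seen = 0
    while True:
        frame_start = frames_seen * FRAME_SAMPLES
        frame_end = frame_start + FRAME_SAMPLES
        frames_seen += 1
        position = frame_start if measure_before_frame else frame_end
        if position - silence_start >= min_silence_samples:
            return frames_seen


print(frames_until_close(100, measure_before_frame=False))  # v0.2.x semantics
print(frames_until_close(100, measure_before_frame=True))   # upstream semantics
print(frames_until_close(132, measure_before_frame=False))  # v0.2.x with the override
```

Under these assumptions the three calls print 4, 5 and 5, matching the counts described in the commit messages and showing why the 132 ms override masked the difference.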

Audit existing tests:

middle_band_frames_do_not_reset_tentative_end and min_speech_duration_is_checked_before_padding extended their trailing silence runs by one frame so the close still fires via push_probability under the corrected counter.
force_split_during_silence_closes_without_restarting raised its max_speech_duration ceiling by one frame so the max-speech split still fires after max_split_end has been recorded by the silence-counter logic.
All updated tests document the change in their docstrings, citing the parity harness in tests/parity/ as motivation.

Add two new regression tests:

four_frame_silence_dip_does_not_close_segment_at_default_min_silence pins that a 4-frame (128 ms) silence dip is now tolerated.
five_frame_silence_dip_closes_segment_at_default_min_silence pins that 5 consecutive low-prob frames still close, matching upstream.

Discovered by the parity harness in commits dd64c35 / 003e8b6 / da8c0de.

test(parity): drop now-unneeded min-silence override

The --min-silence-ms 132 workaround in run.sh was compensating for an off-by-one in SpeechSegmenter::push_probability's silence counter (silero v0.2.x evaluated silence_samples = current_sample - silence_start after the current frame's increment, while Python evaluates the equivalent cur_sample - temp_end before the current frame is consumed). The crate fix in the previous commit aligns the two semantics, so the runner now uses upstream silero-vad defaults verbatim and parity numbers are unchanged.

Verified on all 5 short fixtures (01_dialogue, 02_pyannote_sample, 03_dual_speaker, 04_three_speaker, 05_four_speaker): median IoU 1.0000 and segment counts match exactly (51/51, 4/4, 14/14, 6/6, 14/14) without the override — same numbers the override produced pre-fix.

The --min-silence-ms flag remains on the runner CLI for advanced users who want to override per-run; only run.sh no longer applies it. README updated to mark the off-by-one finding as fixed in v0.3.0 and preserve the previous analysis as historical context.

chore: bump to 0.3.0 with CHANGELOG entry for the silence threshold fix

The silence-counter fix in SpeechSegmenter::push_probability is a behaviour change for any caller that hand-tuned min_silence_duration_ms against v0.2.x's response curve. Bumping the minor version (0.2.x → 0.3.0) signals that even though it's strictly a bug fix, the new response curve may require re-tuning at the call site. Default callers do not need to change anything.

CHANGELOG entry covers what changed (silence-counter semantics now match upstream Python silero-vad), why (parity harness uncovered an off-by-one), and the migration note (subtract ~32 ms from hand-tuned min_silence_duration_ms overrides if you want to keep the v0.2.x effective behaviour).
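The "subtract ~32 ms" figure follows from a short derivation. It assumes the close condition is a greater-than-or-equal comparison on sample counts, that the threshold converts exactly from milliseconds to samples at 16 kHz, and that the tuned threshold is at least one frame long. The symbols $F$ (frame length in samples), $S$ (v0.2.x threshold in samples), $S'$ (v0.3.0 threshold in samples) and $k$ (number of consecutive low-probability frames) are introduced here for illustration.

```latex
\begin{aligned}
\text{v0.2.x closes at the smallest } k \text{ with}\quad & kF \ge S, \\
\text{v0.3.0 closes at the smallest } k \text{ with}\quad & (k-1)F \ge S'. \\
\text{Choosing } S' = S - F:\quad & (k-1)F \ge S - F \iff kF \ge S, \\
\text{and with } F = 512 \text{ samples at } 16\,\text{kHz}:\quad & \frac{512}{16000}\,\text{s} = 32\,\text{ms}.
\end{aligned}
```

For example, a caller who relied on the v0.2.x behaviour at 100 ms (close after 4 frames) would get the same frame count from 68 ms under v0.3.0, since $3 \cdot 512 = 1536 \ge 1088$ while $2 \cdot 512 = 1024 < 1088$.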

Copilot

Pull request overview

Adds a manual parity harness (Rust + Python) to compare silero’s VAD segmentation against upstream silero-vad, and aligns the Rust segmenter’s silence counter semantics with upstream Python (shipping as v0.3.0).

Changes:

Introduce tests/parity/ runner tooling: Rust silero-parity-runner, Python reference runner + IoU scorer, plus a driver script and README.
Fix an off-by-one in SpeechSegmenter::push_probability silence accounting and update/add regression tests.
Bump crate version to 0.3.0 and document the behavior change in CHANGELOG.md.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| tests/parity/src/main.rs | New Rust parity runner: ffmpeg decode → `detect_speech` → JSON output |
| tests/parity/Cargo.toml | New standalone Cargo package for the parity runner |
| tests/parity/run.sh | Driver script to run Rust + Python runners and score results |
| tests/parity/python/pyproject.toml | Python env definition for upstream `silero-vad` reference runner |
| tests/parity/python/silero_vad_runner.py | Python reference runner emitting the same JSON schema as Rust |
| tests/parity/python/score.py | IoU scoring and pass/fail logic for runner outputs |
| tests/parity/README.md | Harness documentation, parameter alignment, and historical off-by-one notes |
| src/detector.rs | Silence-counter semantic fix + updated and new regression tests |
| Cargo.toml | Version bump to `0.3.0` |
| CHANGELOG.md | Release notes for the silence-counter behavior change |
| .gitignore | Ignore parity harness outputs and Python venv artifacts |

feat(lib): expose `silero::VERSION` as a public constant

Out-of-tree harnesses (the parity runner being the immediate caller) need a way to record the silero crate version they're exercising. Using `env!("CARGO_PKG_VERSION")` in those harnesses resolves to the harness binary's own version, not silero's. Re-exporting the version string from the library lets callers depend on `silero::VERSION` and get the value that actually matches the running detector logic.

Surfaced by Copilot PR review of #5 (#5 (comment)).

fix(parity): record silero crate version + persist ffmpeg init outcome

Two issues from Copilot's PR review of #5:

1. `SILERO_CRATE_VERSION` was sourced from `env!("CARGO_PKG_VERSION")`, which in this binary resolves to the parity runner's own version (`0.0.0`), not the silero crate version under test. The JSON output was misreporting `silero_crate_version`. Now uses the newly-exposed `silero::VERSION` constant so the JSON records the actual crate version being validated. (#5 (comment))
2. `ffmpeg_init` stored its initialisation error in a stack-local that was only assigned inside the `Once::call_once` closure. After the first failed call, subsequent invocations would silently return `Ok(())` because the closure no longer ran and the local was always `None`. Switched to a static `OnceLock<Result<(), String>>` so the init outcome is captured once and re-surfaced on every subsequent call; the function now actually behaves idempotently. (#5 (comment))

Verified: `cargo build --release` clean; parity smoke test on `02_pyannote_sample` reports `silero_crate_version=0.3.0`, median IoU 1.0000.

fix(parity): record effective max_speech_duration_s in Python output

The runner explicitly sets `max_speech_duration_s = math.inf` when `--max-speech-s` is omitted (so the call site records the value actually passed to `get_speech_timestamps`), but the emitted JSON was reading `args.max_speech_s` (that is, `null`), which contradicted the inline comment and made the output non-self-describing. Switched to writing `kwargs["max_speech_duration_s"]` so the JSON records the effective value (the float when the user provided one, otherwise `math.inf`). Python's `json.dumps` emits the latter as `Infinity`; documented inline that downstream parsers may need non-strict JSON support if they read this field from outside the Python ecosystem.

Surfaced by Copilot PR review of #5 (#5 (comment)).

Verified: parity smoke test on `02_pyannote_sample` shows `"max_speech_s": Infinity` in the Python JSON; median IoU still 1.0000.
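The `Infinity` caveat is easy to see with the standard library alone. The sketch below uses a stand-in payload with a single `max_speech_s` key rather than the runner's full output, and the helper name `dumps_strict` is hypothetical.

```python
import json
import math

# Stand-in for the runner's JSON payload; only the relevant field is shown.
payload = {"max_speech_s": math.inf}

# Default json.dumps emits the non-standard token `Infinity`.
text = json.dumps(payload)
print(text)

# Python's own parser accepts that token and restores the float.
roundtrip = json.loads(text)


def dumps_strict(obj):
    """Serialise with strict JSON rules; return None when the value is not representable."""
    try:
        return json.dumps(obj, allow_nan=False)
    except ValueError:
        return None
```

Consumers outside Python that use a strict JSON parser are likely to reject the `Infinity` token, which is the situation the inline note in the runner warns about.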

Folds the post-review additions into the unreleased 0.3.0 entry:

- `### Added`: `silero::VERSION` public constant.
- `### Added`: `tests/parity/` harness (was previously only described under Verified, not Added).
- `### Fixed`: three parity-harness bugs surfaced by the Copilot PR review on #5: ffmpeg-init error swallowing, parity runner reporting its own version instead of silero's, Python runner emitting `null` instead of the effective `max_speech_duration_s`.

al8n requested a review from Copilot May 3, 2026 06:52

Copilot started reviewing on behalf of al8n May 3, 2026 06:55

Copilot AI reviewed May 3, 2026

Comment thread tests/parity/src/main.rs Outdated

Comment thread tests/parity/src/main.rs

Comment thread tests/parity/python/silero_vad_runner.py Outdated

uqio added 3 commits May 3, 2026 19:07

uqio merged commit af48dc3 into main May 3, 2026
8 of 11 checks passed

uqio deleted the test/parity branch May 3, 2026 07:20

uqio restored the test/parity branch May 3, 2026 08:06

uqio merged 4 commits into main from test/parity
