Fast streaming audio resampling for Rust, focused on x86/x86_64, AArch64 ARM, and RISC-V CPUs.
The crate exposes a reusable library by default. WAV CLI support is optional and gated behind the `cli` feature, so library users do not pull in `hound`.

```rust
use fast_audio_resampler::{FirBackend, Quality, Resampler, ResamplerConfig};

// Placeholder input: one second of silence for two channels at 44.1 kHz.
let input_samples = vec![0.0_f32; 44_100 * 2];

let config = ResamplerConfig {
    input_rate: 44_100,
    output_rate: 48_000,
    channels: 2,
    quality: Quality::Balanced,
    backend: FirBackend::Auto,
    max_input_frames_per_chunk: None,
};

let mut resampler = Resampler::<f32>::new(config)?;
let mut output = Vec::new();
// Resample the input and append the result to `output`.
resampler.process(&input_samples, &mut output)?;
// Flush whatever the resampler still holds once the input is exhausted.
resampler.finish(&mut output)?;
# Ok::<(), Box<dyn std::error::Error>>(())
```

CLI:

```sh
cargo run --features cli -- --in input.wav --out output.wav --rate 48000
```

Python bindings are available through PyO3 for Python 3.9+:

```sh
maturin develop --features python-extension
```

```python
from fast_audio_resampler import F32Resampler

# `input_samples` is the audio supplied by the caller.
resampler = F32Resampler(44_100, 48_000, 2, quality="balanced", backend="auto")
output, stats = resampler.process(input_samples)
# Flush the tail left in the resampler and append it to the output.
tail, tail_stats = resampler.finish()
output.extend(tail)
```

Criterion benchmark results from `cargo bench --bench resampler`. Each one-shot case processes one second of input audio. Times are medians; lower is better.

Exact `8k <-> 16k` conversions use two different engines depending on quality:

- `Quality::Fast`: polyphase IIR all-pass path.
- `Quality::Balanced` and `Quality::Best`: FIR half-band path.
- Other ratios: windowed-sinc polyphase FIR path.
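
The selection rule above can be sketched as a small pure function. This is only an illustration: the `Quality` enum is redefined locally so the snippet compiles on its own, and `Engine` and `select_engine` are hypothetical names, not part of the crate's API.

```rust
/// Local stand-in for the crate's quality levels (redefined for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quality {
    Fast,
    Balanced,
    Best,
}

/// Hypothetical label for the three resampling paths described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Engine {
    IirAllPass,
    FirHalfBand,
    PolyphaseFir,
}

/// Pick an engine from the rate pair and the requested quality.
pub fn select_engine(input_rate: u32, output_rate: u32, quality: Quality) -> Engine {
    // Only the exact 8 kHz <-> 16 kHz pairs take the specialised paths.
    let exact_8k_16k = matches!(
        (input_rate, output_rate),
        (8_000, 16_000) | (16_000, 8_000)
    );
    match (exact_8k_16k, quality) {
        (true, Quality::Fast) => Engine::IirAllPass,
        (true, Quality::Balanced | Quality::Best) => Engine::FirHalfBand,
        (false, _) => Engine::PolyphaseFir,
    }
}
```

Under this rule, `44_100 -> 48_000` maps to the polyphase FIR path at every quality level.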

`Quality::Fast` IIR results on x86_64:

| Format | Ratio | Channels | FIR Backend | Mode | Median |
|---|---|---|---|---|---|
| `f32` | `8k -> 16k` | 1 | scalar | one-shot | 72.741 us |
| `i16` | `8k -> 16k` | 1 | scalar | one-shot | 122.27 us |
| `f32` | `8k -> 16k` | 1 | auto | one-shot | 75.437 us |
| `i16` | `8k -> 16k` | 1 | auto | one-shot | 122.11 us |
| `f32` | `8k -> 16k` | 2 | scalar | one-shot | 113.57 us |
| `i16` | `8k -> 16k` | 2 | scalar | one-shot | 215.01 us |
| `f32` | `8k -> 16k` | 2 | auto | one-shot | 114.21 us |
| `i16` | `8k -> 16k` | 2 | auto | one-shot | 209.43 us |
| `f32` | `16k -> 8k` | 1 | scalar | one-shot | 132.63 us |
| `i16` | `16k -> 8k` | 1 | scalar | one-shot | 187.26 us |
| `f32` | `16k -> 8k` | 1 | auto | one-shot | 133.90 us |
| `i16` | `16k -> 8k` | 1 | auto | one-shot | 177.19 us |
| `f32` | `16k -> 8k` | 2 | scalar | one-shot | 190.97 us |
| `i16` | `16k -> 8k` | 2 | scalar | one-shot | 264.89 us |
| `f32` | `16k -> 8k` | 2 | auto | one-shot | 192.58 us |
| `i16` | `16k -> 8k` | 2 | auto | one-shot | 241.18 us |
| `f32` | `8k -> 16k` | 2 | auto | streaming, 64-frame chunks | 183.40 us |
| `i16` | `8k -> 16k` | 2 | auto | streaming, 64-frame chunks | 273.88 us |
| `f32` | `16k -> 8k` | 2 | auto | streaming, 64-frame chunks | 288.59 us |
| `i16` | `16k -> 8k` | 2 | auto | streaming, 64-frame chunks | 381.17 us |

`Quality::Balanced` FIR half-band results on x86_64:

| Format | Ratio | Channels | FIR Backend | Mode | Median |
|---|---|---|---|---|---|
| `f32` | `8k -> 16k` | 1 | scalar | one-shot | 410.61 us |
| `i16` | `8k -> 16k` | 1 | scalar | one-shot | 422.92 us |
| `f32` | `8k -> 16k` | 1 | auto | one-shot | 407.42 us |
| `i16` | `8k -> 16k` | 1 | auto | one-shot | 411.30 us |
| `f32` | `8k -> 16k` | 2 | scalar | one-shot | 784.18 us |
| `i16` | `8k -> 16k` | 2 | scalar | one-shot | 854.23 us |
| `f32` | `8k -> 16k` | 2 | auto | one-shot | 804.98 us |
| `i16` | `8k -> 16k` | 2 | auto | one-shot | 890.49 us |
| `f32` | `16k -> 8k` | 1 | scalar | one-shot | 403.95 us |
| `i16` | `16k -> 8k` | 1 | scalar | one-shot | 417.35 us |
| `f32` | `16k -> 8k` | 1 | auto | one-shot | 427.03 us |
| `i16` | `16k -> 8k` | 1 | auto | one-shot | 412.09 us |
| `f32` | `16k -> 8k` | 2 | scalar | one-shot | 794.36 us |
| `i16` | `16k -> 8k` | 2 | scalar | one-shot | 861.91 us |
| `f32` | `16k -> 8k` | 2 | auto | one-shot | 834.62 us |
| `i16` | `16k -> 8k` | 2 | auto | one-shot | 853.38 us |
| `f32` | `8k -> 16k` | 2 | auto | streaming, 64-frame chunks | 1.0462 ms |
| `i16` | `8k -> 16k` | 2 | auto | streaming, 64-frame chunks | 1.2621 ms |
| `f32` | `16k -> 8k` | 2 | auto | streaming, 64-frame chunks | 1.1097 ms |
| `i16` | `16k -> 8k` | 2 | auto | streaming, 64-frame chunks | 1.0886 ms |

General-ratio `Quality::Balanced` FIR results on x86_64:

| Format | Ratio | Channels | FIR Backend | Mode | Median |
|---|---|---|---|---|---|
| `f32` | `44.1k -> 48k` | 1 | scalar | one-shot | 6.3880 ms |
| `i16` | `44.1k -> 48k` | 1 | scalar | one-shot | 6.4852 ms |
| `f32` | `44.1k -> 48k` | 1 | auto | one-shot | 5.8639 ms |
| `i16` | `44.1k -> 48k` | 1 | auto | one-shot | 5.6396 ms |
| `f32` | `44.1k -> 48k` | 2 | scalar | one-shot | 13.026 ms |
| `i16` | `44.1k -> 48k` | 2 | scalar | one-shot | 14.342 ms |
| `f32` | `44.1k -> 48k` | 2 | auto | one-shot | 14.945 ms |
| `i16` | `44.1k -> 48k` | 2 | auto | one-shot | 10.145 ms |
| `f32` | `48k -> 44.1k` | 1 | scalar | one-shot | 5.6738 ms |
| `i16` | `48k -> 44.1k` | 1 | scalar | one-shot | 6.0230 ms |
| `f32` | `48k -> 44.1k` | 1 | auto | one-shot | 5.2813 ms |
| `i16` | `48k -> 44.1k` | 1 | auto | one-shot | 6.0168 ms |
| `f32` | `48k -> 44.1k` | 2 | scalar | one-shot | 12.333 ms |
| `i16` | `48k -> 44.1k` | 2 | scalar | one-shot | 10.919 ms |
| `f32` | `48k -> 44.1k` | 2 | auto | one-shot | 10.069 ms |
| `i16` | `48k -> 44.1k` | 2 | auto | one-shot | 9.4516 ms |
| `f32` | `48k -> 44.1k` | 2 | auto | streaming, 64-frame chunks | 13.213 ms |

AArch64 ARM benchmarks should be regenerated with the current IIR/FIR split. The crate includes NEON FIR kernels and an IIR stereo NEON backend path, both selected where supported.

- Uses windowed-sinc polyphase FIR resampling for arbitrary sample-rate ratios.
- Uses FIR half-band filtering for exact `8000 <-> 16000` at `Quality::Balanced` and `Quality::Best`.
- Uses a polyphase IIR all-pass path for exact `8000 <-> 16000` at `Quality::Fast`.
- Supports `f32` and `i16` sample paths.
- Uses runtime CPU feature detection instead of CPU vendor checks.
- Uses AVX2/FMA, AVX-512, and AArch64 NEON intrinsics for FIR `f32` where available.
- Uses RISC-V RVV 1.0 FIR kernels on `riscv64` builds compiled with `-C target-feature=+v`.
- Uses a Q15 fixed-point `i16` path with AVX2 `_mm256_madd_epi16`, AArch64 NEON widening multiply, or RISC-V RVV widening multiply-accumulate on supported CPUs.
- Keeps FIR backend naming explicit with `FirBackend` and `SelectedFirBackend`; deprecated `Backend` aliases remain for compatibility.
- Keeps IIR backend selection separate from FIR backend selection. IIR currently has scalar, x86 SSE2, AArch64 NEON, and RVV-gated stereo all-pass kernels, with conservative auto-selection based on benchmark behavior.
- Keeps RISC-V RVV selection compile-time gated because stable Rust does not yet provide portable runtime detection for the vector extension.
- Stores FIR coefficients in phase-major aligned storage for cache-friendly reads.
- Uses per-channel ring buffers and IIR state for streaming history, avoiding steady-state buffer shifting.
- Keeps the public API stable while hiding FIR backend and buffer details internally.
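
To make the Q15 `i16` path in the list above more concrete, here is a minimal scalar sketch of a Q15 dot product. It assumes coefficients in the range [-1.0, 1.0) scaled by 2^15, and accumulates in `i64` for simplicity; the SIMD kernels named above instead multiply `i16` pairs into wider lanes (for example, `_mm256_madd_epi16` sums adjacent products into `i32` lanes). The function names `to_q15` and `dot_q15` are hypothetical, and the crate's actual rounding and saturation rules may differ.

```rust
/// Convert a float coefficient to Q15 (scale by 2^15), saturating at the
/// `i16` limits. Hypothetical helper for this sketch.
pub fn to_q15(x: f32) -> i16 {
    (x * 32_768.0).round().clamp(-32_768.0, 32_767.0) as i16
}

/// Scalar Q15 dot product: widen each `i16` product, accumulate, then round
/// and shift back to the `i16` sample range with saturation.
pub fn dot_q15(samples: &[i16], coeffs: &[i16]) -> i16 {
    assert_eq!(samples.len(), coeffs.len(), "length mismatch");
    let mut acc: i64 = 0;
    for (&s, &c) in samples.iter().zip(coeffs) {
        // Widening multiply: the product of two i16 values needs more than 16 bits.
        acc += i64::from(s) * i64::from(c);
    }
    // Add half an output step for round-to-nearest, then drop the 15 fraction bits.
    let rounded = (acc + (1 << 14)) >> 15;
    rounded.clamp(i64::from(i16::MIN), i64::from(i16::MAX)) as i16
}
```

With two taps of 0.5 and 0.25, samples of 1000 and 2000 produce 1000, matching the floating-point result `1000 * 0.5 + 2000 * 0.25`.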

Let:

- `N` = input frames processed
- `M` = output frames produced
- `C` = channel count
- `T` = FIR tap count selected by `Quality`
- `S` = fixed IIR all-pass stage count for the exact-ratio fast path

Construction:

- Time: `O(P * T)`, where `P` is the number of polyphase coefficient phases.
- Space: `O(P * T + C * T)` for FIR coefficient tables and per-channel history.
- Exact `8k <-> 16k` `Quality::Fast` IIR construction uses fixed coefficient/state storage per channel instead of a phase table.
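
The `O(P * T)` construction cost follows from filling one row of `T` taps per phase. The sketch below builds such a table in phase-major order (`table[phase * taps + tap]`). It is only an illustration: the Hann window, the per-phase normalization to unity DC gain, and the name `build_phase_table` are assumptions, not the crate's actual filter design, and the alignment of the storage is not modelled.

```rust
use std::f64::consts::PI;

/// Build a windowed-sinc polyphase table in phase-major order.
/// `cutoff` is the normalised cutoff in (0, 1], where 1.0 is Nyquist.
pub fn build_phase_table(phases: usize, taps: usize, cutoff: f64) -> Vec<f32> {
    assert!(phases > 0 && taps > 0, "phases and taps must be non-zero");
    let centre = (taps as f64 - 1.0) / 2.0;
    // Window half-width chosen so every tap of every phase lies strictly inside it.
    let half_width = (taps as f64 + 1.0) / 2.0;
    let mut table = Vec::with_capacity(phases * taps);
    let mut row = vec![0.0_f64; taps];
    for phase in 0..phases {
        // Fractional delay of this phase, in input samples.
        let frac = phase as f64 / phases as f64;
        for (tap, slot) in row.iter_mut().enumerate() {
            let x = tap as f64 - centre - frac;
            let arg = PI * cutoff * x;
            let sinc = if arg.abs() < 1e-12 { 1.0 } else { arg.sin() / arg };
            let window = 0.5 * (1.0 + (PI * x / half_width).cos()); // Hann
            *slot = cutoff * sinc * window;
        }
        // Normalise the row so each phase has unity gain at DC.
        let gain: f64 = row.iter().sum();
        table.extend(row.iter().map(|&c| (c / gain) as f32));
    }
    table
}
```

Because each phase occupies a contiguous run of `taps` coefficients, the inner FIR loop reads memory sequentially for whichever phase an output sample needs.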

Processing:

- FIR time: `O(M * C * T)` scalar work.
- IIR exact-ratio fast time: `O(M * C * S)`, where `S` is small and fixed.
- FIR SIMD reduces the constant factor by processing multiple taps per instruction.
- IIR stereo backends can process left/right all-pass lanes together where the target architecture supports it.
- Streaming append is `O(N * C)`.
- Ring-buffer history discard is `O(C)` and does not move sample data.
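
The constant-time discard per channel can be sketched with a power-of-two ring buffer that only advances a read index. `ChannelRing` and `discard_all` are hypothetical names used for illustration; the crate's internal buffer layout is not documented here and may differ.

```rust
/// Hypothetical per-channel history buffer with a power-of-two capacity.
pub struct ChannelRing {
    data: Vec<f32>,
    mask: usize,  // capacity - 1
    read: usize,  // absolute index of the oldest retained sample
    write: usize, // absolute index one past the newest sample
}

impl ChannelRing {
    pub fn with_capacity(min_capacity: usize) -> Self {
        let capacity = min_capacity.max(1).next_power_of_two();
        Self { data: vec![0.0; capacity], mask: capacity - 1, read: 0, write: 0 }
    }

    /// Number of samples currently retained.
    pub fn len(&self) -> usize {
        self.write - self.read
    }

    /// Append is linear in the number of new samples.
    pub fn append(&mut self, samples: &[f32]) {
        assert!(self.len() + samples.len() <= self.data.len(), "ring overflow");
        for &s in samples {
            self.data[self.write & self.mask] = s;
            self.write += 1;
        }
    }

    /// Discard is O(1): only the read index moves, no samples are copied.
    pub fn discard(&mut self, count: usize) {
        assert!(count <= self.len(), "cannot discard more than is stored");
        self.read += count;
    }

    /// Read the sample `offset` positions after the oldest retained one.
    pub fn get(&self, offset: usize) -> f32 {
        assert!(offset < self.len(), "offset out of range");
        self.data[(self.read + offset) & self.mask]
    }
}

/// Discarding from every channel is one index update each, hence O(C).
pub fn discard_all(channels: &mut [ChannelRing], count: usize) {
    for ch in channels.iter_mut() {
        ch.discard(count);
    }
}
```

A shifting buffer would instead copy the retained history to the front after every chunk, which costs time proportional to the history length per channel.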

Output size:

- Approximately `ceil(N * output_rate / input_rate)` frames after `finish`.
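
The estimate above is easy to compute with integer arithmetic, for example to reserve output capacity up front. The helper name `expected_output_frames` is hypothetical, and since the documented size is approximate, the result is a sizing hint and not an exact count.

```rust
/// Estimate `ceil(N * output_rate / input_rate)` without floating point.
pub fn expected_output_frames(input_frames: u64, input_rate: u32, output_rate: u32) -> u64 {
    assert!(input_rate > 0, "input_rate must be non-zero");
    // Widen to u128 so the product cannot overflow.
    let num = u128::from(input_frames) * u128::from(output_rate);
    let den = u128::from(input_rate);
    // Integer ceiling division.
    ((num + den - 1) / den) as u64
}
```

For interleaved or per-channel buffers, multiply the frame count by the channel count to get a sample count, for example `Vec::with_capacity(frames as usize * channels)`.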

- Default: library only, no WAV dependency.
- `cli`: enables the WAV command-line tool and the optional `hound` dependency.
- `python`: enables testable PyO3 bindings with the Python 3.9 stable ABI.
- `python-extension`: enables `python` plus PyO3 extension-module linking for Python package builds.