fast-audio-resampler

Fast streaming audio resampling for Rust, focused on x86/x86_64, AArch64 ARM, and RISC-V CPUs.

The crate exposes a reusable library by default. WAV CLI support is optional and gated behind the cli feature so library users do not pull hound.

Usage

use fast_audio_resampler::{FirBackend, Quality, Resampler, ResamplerConfig};

let config = ResamplerConfig {
    input_rate: 44_100,
    output_rate: 48_000,
    channels: 2,
    quality: Quality::Balanced,
    backend: FirBackend::Auto,
    max_input_frames_per_chunk: None,
};

let mut resampler = Resampler::<f32>::new(config)?;
let mut output = Vec::new();
resampler.process(&input_samples, &mut output)?;
resampler.finish(&mut output)?;
# Ok::<(), Box<dyn std::error::Error>>(())

CLI:

cargo run --features cli -- --in input.wav --out output.wav --rate 48000

Python bindings are available through PyO3 for Python 3.9+:

maturin develop --features python-extension

from fast_audio_resampler import F32Resampler

resampler = F32Resampler(44_100, 48_000, 2, quality="balanced", backend="auto")
output, stats = resampler.process(input_samples)
tail, tail_stats = resampler.finish()
output.extend(tail)

Benchmarks

Criterion benchmark results from cargo bench --bench resampler. Each one-shot case processes one second of input audio. Times are medians; lower is better.

Exact 8k <-> 16k conversions use two different engines depending on quality:

Quality::Fast: polyphase IIR all-pass path.
Quality::Balanced and Quality::Best: FIR half-band path.
Other ratios: windowed-sinc polyphase FIR path.

x86_64

Quality::Fast IIR results on x86_64:

Format	Ratio	Channels	FIR Backend	Mode	Median
`f32`	8k -> 16k	1	scalar	one-shot	72.741 us
`i16`	8k -> 16k	1	scalar	one-shot	122.27 us
`f32`	8k -> 16k	1	auto	one-shot	75.437 us
`i16`	8k -> 16k	1	auto	one-shot	122.11 us
`f32`	8k -> 16k	2	scalar	one-shot	113.57 us
`i16`	8k -> 16k	2	scalar	one-shot	215.01 us
`f32`	8k -> 16k	2	auto	one-shot	114.21 us
`i16`	8k -> 16k	2	auto	one-shot	209.43 us
`f32`	16k -> 8k	1	scalar	one-shot	132.63 us
`i16`	16k -> 8k	1	scalar	one-shot	187.26 us
`f32`	16k -> 8k	1	auto	one-shot	133.90 us
`i16`	16k -> 8k	1	auto	one-shot	177.19 us
`f32`	16k -> 8k	2	scalar	one-shot	190.97 us
`i16`	16k -> 8k	2	scalar	one-shot	264.89 us
`f32`	16k -> 8k	2	auto	one-shot	192.58 us
`i16`	16k -> 8k	2	auto	one-shot	241.18 us
`f32`	8k -> 16k	2	auto	streaming, 64-frame chunks	183.40 us
`i16`	8k -> 16k	2	auto	streaming, 64-frame chunks	273.88 us
`f32`	16k -> 8k	2	auto	streaming, 64-frame chunks	288.59 us
`i16`	16k -> 8k	2	auto	streaming, 64-frame chunks	381.17 us

Quality::Balanced FIR half-band results on x86_64:

Format	Ratio	Channels	FIR Backend	Mode	Median
`f32`	8k -> 16k	1	scalar	one-shot	410.61 us
`i16`	8k -> 16k	1	scalar	one-shot	422.92 us
`f32`	8k -> 16k	1	auto	one-shot	407.42 us
`i16`	8k -> 16k	1	auto	one-shot	411.30 us
`f32`	8k -> 16k	2	scalar	one-shot	784.18 us
`i16`	8k -> 16k	2	scalar	one-shot	854.23 us
`f32`	8k -> 16k	2	auto	one-shot	804.98 us
`i16`	8k -> 16k	2	auto	one-shot	890.49 us
`f32`	16k -> 8k	1	scalar	one-shot	403.95 us
`i16`	16k -> 8k	1	scalar	one-shot	417.35 us
`f32`	16k -> 8k	1	auto	one-shot	427.03 us
`i16`	16k -> 8k	1	auto	one-shot	412.09 us
`f32`	16k -> 8k	2	scalar	one-shot	794.36 us
`i16`	16k -> 8k	2	scalar	one-shot	861.91 us
`f32`	16k -> 8k	2	auto	one-shot	834.62 us
`i16`	16k -> 8k	2	auto	one-shot	853.38 us
`f32`	8k -> 16k	2	auto	streaming, 64-frame chunks	1.0462 ms
`i16`	8k -> 16k	2	auto	streaming, 64-frame chunks	1.2621 ms
`f32`	16k -> 8k	2	auto	streaming, 64-frame chunks	1.1097 ms
`i16`	16k -> 8k	2	auto	streaming, 64-frame chunks	1.0886 ms

General-ratio Quality::Balanced FIR results on x86_64:

Format	Ratio	Channels	FIR Backend	Mode	Median
`f32`	44.1k -> 48k	1	scalar	one-shot	6.3880 ms
`i16`	44.1k -> 48k	1	scalar	one-shot	6.4852 ms
`f32`	44.1k -> 48k	1	auto	one-shot	5.8639 ms
`i16`	44.1k -> 48k	1	auto	one-shot	5.6396 ms
`f32`	44.1k -> 48k	2	scalar	one-shot	13.026 ms
`i16`	44.1k -> 48k	2	scalar	one-shot	14.342 ms
`f32`	44.1k -> 48k	2	auto	one-shot	14.945 ms
`i16`	44.1k -> 48k	2	auto	one-shot	10.145 ms
`f32`	48k -> 44.1k	1	scalar	one-shot	5.6738 ms
`i16`	48k -> 44.1k	1	scalar	one-shot	6.0230 ms
`f32`	48k -> 44.1k	1	auto	one-shot	5.2813 ms
`i16`	48k -> 44.1k	1	auto	one-shot	6.0168 ms
`f32`	48k -> 44.1k	2	scalar	one-shot	12.333 ms
`i16`	48k -> 44.1k	2	scalar	one-shot	10.919 ms
`f32`	48k -> 44.1k	2	auto	one-shot	10.069 ms
`i16`	48k -> 44.1k	2	auto	one-shot	9.4516 ms
`f32`	48k -> 44.1k	2	auto	streaming, 64-frame chunks	13.213 ms

AArch64 ARM

AArch64 ARM benchmarks should be regenerated with the current IIR/FIR split. The crate includes NEON FIR kernels and an IIR stereo NEON backend path, both selected where supported.

Design Choices

Uses windowed-sinc polyphase FIR resampling for arbitrary sample-rate ratios.
Uses FIR half-band filtering for exact 8000 <-> 16000 at Quality::Balanced and Quality::Best.
Uses a polyphase IIR all-pass path for exact 8000 <-> 16000 at Quality::Fast.
Supports f32 and i16 sample paths.
Uses runtime CPU feature detection instead of CPU vendor checks.
Uses AVX2/FMA, AVX-512, and AArch64 NEON intrinsics for FIR f32 where available.
Uses RISC-V RVV 1.0 FIR kernels on riscv64 builds compiled with -C target-feature=+v.
Uses a Q15 fixed-point i16 path with AVX2 _mm256_madd_epi16, AArch64 NEON widening multiply, or RISC-V RVV widening multiply-accumulate on supported CPUs.
Keeps FIR backend naming explicit with FirBackend and SelectedFirBackend; deprecated Backend aliases remain for compatibility.
Keeps IIR backend selection separate from FIR backend selection. IIR currently has scalar, x86 SSE2, AArch64 NEON, and RVV-gated stereo all-pass kernels, with conservative auto-selection based on benchmark behavior.
Keeps RISC-V RVV selection compile-time gated because stable Rust does not yet provide portable runtime detection for the vector extension.
Stores FIR coefficients in phase-major aligned storage for cache-friendly reads.
Uses per-channel ring buffers and IIR state for streaming history, avoiding steady-state buffer shifting.
Keeps the public API stable while hiding FIR backend and buffer details internally.

Complexity

Let:

N = input frames processed
M = output frames produced
C = channel count
T = FIR tap count selected by Quality
S = fixed IIR all-pass stage count for the exact-ratio fast path

Construction:

Time: O(P * T), where P is the number of polyphase coefficient phases.
Space: O(P * T + C * T) for FIR coefficient tables and per-channel history.
Exact 8k <-> 16k Quality::Fast IIR construction uses fixed coefficient/state storage per channel instead of a phase table.

Processing:

FIR time: O(M * C * T) scalar work.
IIR exact-ratio fast time: O(M * C * S), where S is small and fixed.
FIR SIMD reduces the constant factor by processing multiple taps per instruction.
IIR stereo backends can process left/right all-pass lanes together where the target architecture supports it.
Streaming append is O(N * C).
Ring-buffer history discard is O(C) and does not move sample data.

Output size:

Approximately ceil(N * output_rate / input_rate) frames after finish.

Features

Default: library only, no WAV dependency.
cli: enables the WAV command-line tool and the optional hound dependency.
python: enables testable PyO3 bindings with the Python 3.9 stable ABI.
python-extension: enables python plus PyO3 extension-module linking for Python package builds.

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.github/workflows		.github/workflows
benches		benches
docs		docs
src		src
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

fast-audio-resampler

Usage

Benchmarks

x86_64

AArch64 ARM

Design Choices

Complexity

Features

About

Uh oh!

Releases 1

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

fast-audio-resampler

Usage

Benchmarks

x86_64

AArch64 ARM

Design Choices

Complexity

Features

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Contributors

Uh oh!

Languages