# wavekat-core

Shared types for the WaveKat audio processing ecosystem.
> [!WARNING]
> Early development. The API may change.
| Type | Description |
|---|---|
| `AudioFrame` | Audio samples plus a sample rate; accepts `i16` and `f32` in slice, `Vec`, or array form |
| `IntoSamples` | Trait for transparent sample-format conversion |
| `AudioSource` / `AudioSink` | Async producer/consumer traits — the seam every WaveKat audio pipeline composes against |
| `codec::g711` | G.711 μ-law (PCMU) and A-law (PCMA) — telephony codecs for SIP/RTP |
## AudioFrame

```sh
cargo add wavekat-core
```

```rust
use wavekat_core::AudioFrame;

// From f32 — zero-copy (slice, &Vec<f32>, or array)
let frame = AudioFrame::new(&f32_samples, 16000);

// From i16 — normalizes to f32 [-1.0, 1.0] automatically
let frame = AudioFrame::new(&i16_samples, 16000);

// From an owned Vec — zero-copy, produces AudioFrame<'static>
let frame = AudioFrame::from_vec(vec![0.0f32; 160], 16000);

// Inspect the frame
let samples: &[f32] = frame.samples();
let rate: u32 = frame.sample_rate();
let n: usize = frame.len();
let empty: bool = frame.is_empty();
let secs: f64 = frame.duration_secs();

// Convert a borrowed frame to owned
let owned: AudioFrame<'static> = frame.into_owned();
```

The WaveKat ecosystem standardizes on 16 kHz, mono, f32 in [-1.0, 1.0].
AudioFrame handles the conversion so downstream crates don't have to.
```text
Your audio (any format)
        |
        v
AudioFrame::new(samples, sample_rate)
        |
        +---> wavekat-vad
        +---> wavekat-turn
        +---> wavekat-asr (future)
```
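The i16 → f32 normalization can be sketched independently of the crate. The divisor 32768 is an assumption here (the common symmetric convention, so `i16::MIN` maps to exactly -1.0; `wavekat-core` may differ), and `normalize_i16` is a hypothetical helper, not crate API:

```rust
// Hypothetical stand-in for the i16 -> f32 conversion AudioFrame performs.
// Dividing by 32768 maps i16::MIN to exactly -1.0; positive full scale
// lands just below 1.0 (32767 / 32768).
fn normalize_i16(samples: &[i16]) -> Vec<f32> {
    samples.iter().map(|&s| s as f32 / 32768.0).collect()
}

fn main() {
    let norm = normalize_i16(&[0, 16384, -32768]);
    assert_eq!(norm, vec![0.0, 0.5, -1.0]);

    // Frame duration follows from the sample rate:
    // a 160-sample frame at 16 kHz is 10 ms of audio.
    let secs = 160.0_f64 / 16000.0;
    assert!((secs - 0.010).abs() < 1e-12);

    println!("{:?}", norm);
}
```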
## AudioSource / AudioSink

`AudioSource` and `AudioSink` are the producer/consumer seam every WaveKat
audio pipeline composes against. Concrete impls (cpal-backed mic/speaker,
agent-driven sources, RTP-driven sinks, …) live in the consuming crates so
that adding a new producer or consumer is "implement the trait" rather than
"rewrite the RTP path."
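In use, a pipeline reduces to a pump loop over these two traits. The sketch below is self-contained so it compiles without the crate: `AudioSource`/`AudioSink` are redefined locally over plain `Vec<f32>` (stand-ins for the real `AudioFrame`-based traits), and `VecSource`, `CollectSink`, `pump`, and `block_on` are hypothetical illustrations, not crate API:

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Local stand-ins for the wavekat-core traits; Vec<f32> replaces AudioFrame.
trait AudioSource {
    async fn next_frame(&mut self) -> Option<Vec<f32>>;
}
trait AudioSink {
    async fn write_frame(&mut self, frame: Vec<f32>);
}

struct VecSource { frames: Vec<Vec<f32>> }
struct CollectSink { received: Vec<Vec<f32>> }

impl AudioSource for VecSource {
    async fn next_frame(&mut self) -> Option<Vec<f32>> {
        // Pop frames front-to-back; None once drained (end-of-stream).
        if self.frames.is_empty() { None } else { Some(self.frames.remove(0)) }
    }
}

impl AudioSink for CollectSink {
    async fn write_frame(&mut self, frame: Vec<f32>) {
        self.received.push(frame);
    }
}

// The pump loop every pipeline reduces to: drain source into sink until EOF.
async fn pump(src: &mut impl AudioSource, dst: &mut impl AudioSink) {
    while let Some(frame) = src.next_frame().await {
        dst.write_frame(frame).await;
    }
}

// Minimal executor: these futures never suspend, so polling with a no-op
// waker until Ready is enough — no async runtime needed for the sketch.
fn block_on<F: Future>(fut: F) -> F::Output {
    unsafe fn clone(_: *const ()) -> RawWaker { raw() }
    unsafe fn noop(_: *const ()) {}
    fn raw() -> RawWaker {
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut src = VecSource { frames: vec![vec![0.0; 160], vec![0.1; 160]] };
    let mut dst = CollectSink { received: Vec::new() };
    block_on(pump(&mut src, &mut dst));
    assert_eq!(dst.received.len(), 2);
    println!("pumped {} frames", dst.received.len());
}
```

Because the pump only touches the two traits, swapping a file source for a microphone source (or an RTP sink for a speaker sink) leaves the loop untouched.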
```rust
use wavekat_core::{AudioFrame, AudioSink, AudioSource};

struct FileSource { /* … */ }
struct RtpSink { /* … */ }

impl AudioSource for FileSource {
    async fn next_frame(&mut self) -> Option<AudioFrame<'static>> {
        // None signals end-of-stream; callers stop draining.
        todo!()
    }
}

impl AudioSink for RtpSink {
    async fn write_frame(&mut self, frame: AudioFrame<'_>) {
        // Implementations may drop on backpressure rather than block —
        // stalling the RTP path is worse than dropping a frame.
        todo!()
    }
}
```

## codec::g711

PCMU (μ-law) and PCMA (A-law) — the two static codecs every SIP endpoint speaks. One 16-bit PCM sample ↔ one 8-bit codeword; a 20 ms RTP frame at 8 kHz is 160 samples / 160 bytes.
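The μ-law companding itself is small enough to sketch in pure Rust. This is the standard G.711 bias-132 segment/mantissa algorithm, written here for illustration — it is not `wavekat-core`'s implementation:

```rust
/// Encode one linear 16-bit PCM sample to a G.711 mu-law codeword.
fn mulaw_encode(sample: i16) -> u8 {
    const BIAS: i32 = 0x84; // 132
    const CLIP: i32 = 32635;
    let mut s = sample as i32;
    let sign: u8 = if s < 0 { s = -s; 0x80 } else { 0 };
    if s > CLIP { s = CLIP; }
    s += BIAS;
    // Segment number = position of the highest set bit above bit 7.
    let mut exponent = 7u32;
    let mut mask = 0x4000;
    while exponent > 0 && s & mask == 0 {
        exponent -= 1;
        mask >>= 1;
    }
    let mantissa = ((s >> (exponent + 3)) & 0x0F) as u8;
    // G.711 transmits the bitwise complement of sign|segment|mantissa.
    !(sign | ((exponent as u8) << 4) | mantissa)
}

/// Decode a mu-law codeword back to linear 16-bit PCM.
fn mulaw_decode(code: u8) -> i16 {
    let c = !code;
    let exponent = (c >> 4) & 0x07;
    let mantissa = (c & 0x0F) as i32;
    let mut s = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    if c & 0x80 != 0 { s = -s; }
    s as i16
}

fn main() {
    // Silence encodes to 0xFF and round-trips exactly.
    assert_eq!(mulaw_encode(0), 0xFF);
    assert_eq!(mulaw_decode(0xFF), 0);

    // 1 sample <-> 1 byte: a 20 ms frame at 8 kHz is 160 of each.
    let pcm = vec![0i16; 8000 / 50];
    let bytes: Vec<u8> = pcm.iter().map(|&s| mulaw_encode(s)).collect();
    assert_eq!((pcm.len(), bytes.len()), (160, 160));

    // mu-law is lossy; a round trip stays within the segment's step size.
    let x = 1234i16;
    let y = mulaw_decode(mulaw_encode(x));
    assert!((x as i32 - y as i32).abs() < 64);
    println!("{x} -> {:#04x} -> {y}", mulaw_encode(x));
}
```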
```rust
use wavekat_core::codec::g711::{
    G711Codec, G711_FRAME_SAMPLES, G711_SAMPLE_RATE,
};

// Resolve the codec from a SIP/RTP payload type.
let codec = G711Codec::from_payload_type(0).unwrap(); // 0 = PCMU, 8 = PCMA

// Encode a 20 ms frame of PCM into G.711 bytes.
let pcm: Vec<i16> = vec![0; G711_FRAME_SAMPLES];
let mut bytes = Vec::with_capacity(G711_FRAME_SAMPLES);
codec.encode(&pcm, &mut bytes);
assert_eq!(bytes.len(), G711_FRAME_SAMPLES);

// Decode the other direction.
let mut decoded = Vec::with_capacity(bytes.len());
codec.decode(&bytes, &mut decoded);

assert_eq!(G711_SAMPLE_RATE, 8000);
```

## Feature: `wav`

Adds WAV file I/O via `hound`.
```sh
cargo add wavekat-core --features wav
```

```rust
use wavekat_core::AudioFrame;

// Read a WAV file (f32 or i16, normalized automatically)
let frame = AudioFrame::from_wav("input.wav")?;
println!("{} Hz, {} samples", frame.sample_rate(), frame.len());

// Write a frame to a WAV file (mono f32 PCM)
frame.write_wav("output.wav")?;
```

## Feature: `resample`

Adds sample-rate conversion via `rubato` (high-quality sinc interpolation).
```sh
cargo add wavekat-core --features resample
```

```rust
use wavekat_core::AudioFrame;

// Resample a 44.1 kHz frame to 24 kHz for TTS
let frame = AudioFrame::from_vec(samples, 44100);
let frame = frame.resample(24000)?;
assert_eq!(frame.sample_rate(), 24000);

// No-op if already at the target rate
let same = frame.resample(24000)?;
```

## License

Licensed under Apache 2.0.
Copyright 2026 WaveKat.