src/sounddiff/
    types.py       Frozen dataclasses for all analysis results
    formats.py     Audio I/O and format detection (soundfile)
    loudness.py    LUFS, true peak, loudness range (pyloudnorm)
    spectral.py    Frequency band energy comparison (numpy FFT)
    temporal.py    Cross-correlation and segment detection
    detection.py   Clipping and silence detection
    core.py        Pipeline orchestration
    cli.py         Click CLI entry point
    report.py      Output formatters (terminal, JSON, HTML)
CLI (click)
    -> core.diff(path_a, path_b)
        -> formats.load_audio(path_a) -> ndarray + AudioMetadata
        -> formats.load_audio(path_b) -> ndarray + AudioMetadata
        -> loudness.compare_loudness(data_a, data_b, ...) -> LoudnessComparison
        -> spectral.compare_spectral(data_a, data_b, ...) -> SpectralComparison
        -> temporal.compare_temporal(data_a, data_b, ...) -> TemporalComparison
        -> detection.compare_detection(data_a, data_b, ...) -> DetectionResult
        -> DiffResult (all results combined)
    -> report.render(result, format)
        -> stdout or file
Every result type is a frozen (immutable) dataclass. Once a DiffResult is created, it can't be modified. Computed values like lufs_delta are properties that derive from the stored measurements. This keeps the data model predictable and easy to test.
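As a minimal sketch of the pattern described above, the following shows a frozen dataclass with a derived property. The class name LoudnessComparison and the lufs_delta property come from this document; the field names lufs_a and lufs_b and the sign convention (b minus a) are assumptions made for illustration and may not match the actual definitions in types.py.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class LoudnessComparison:
    # Field names are illustrative assumptions, not the project's real schema.
    lufs_a: float
    lufs_b: float

    @property
    def lufs_delta(self) -> float:
        """Derived on access from the stored measurements; never stored itself."""
        # Assumed convention: positive means file B is louder than file A.
        return self.lufs_b - self.lufs_a


result = LoudnessComparison(lufs_a=-14.0, lufs_b=-16.5)
print(result.lufs_delta)  # -2.5

try:
    result.lufs_a = -10.0  # any assignment on a frozen instance raises
except FrozenInstanceError:
    print("frozen: assignment rejected")
```

Because the delta is computed from the stored fields, it cannot drift out of sync with them, which is one reason this layout tends to be easy to test.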
The four analysis modules (loudness, spectral, temporal, detection) don't import from each other. Each one takes raw audio data and returns a typed result. The core module orchestrates them in sequence. This means you can add a new analysis module without touching the existing ones, and you can run any analysis in isolation.
formats.load_audio() always returns multi-channel data as a 2D array (frames x channels). Each analysis module decides how to handle channels:
- Loudness: passes multi-channel data directly to pyloudnorm
- Spectral: mixes to mono for FFT analysis
- Temporal: mixes to mono for cross-correlation
- Detection (clipping): analyzes each channel independently
- Detection (silence): mixes to mono for RMS calculation
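The two channel strategies in the list above can be sketched with plain numpy. The helper names mix_to_mono and clipped_samples_per_channel, and the 0.999 threshold, are hypothetical and chosen only for this example; the project's own functions may differ.

```python
import numpy as np


def mix_to_mono(data: np.ndarray) -> np.ndarray:
    """Average the channels of a (frames, channels) array into a 1D signal."""
    return data.mean(axis=1)


def clipped_samples_per_channel(data: np.ndarray, threshold: float = 0.999) -> np.ndarray:
    """Count samples whose magnitude reaches the threshold, separately per channel."""
    return (np.abs(data) >= threshold).sum(axis=0)


# Four frames of stereo audio: the left channel hits full scale, the right does not.
stereo = np.array([
    [1.0, 0.1],
    [-1.0, 0.2],
    [0.5, 0.3],
    [0.0, 0.4],
])

mono = mix_to_mono(stereo)
clipped = clipped_samples_per_channel(stereo)

print(mono.shape)            # (4,)
print(clipped.tolist())      # [2, 0]
print(np.abs(mono).max())    # about 0.55, well below full scale
```

In this example the mono mix peaks at roughly 0.55 even though the left channel is clipped, which illustrates one plausible reason to check clipping per channel rather than on a mixdown.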
Every function takes its inputs as arguments and returns its outputs. No module-level caches, no singletons, no configuration objects that need to be initialized. This makes the code straightforward to test and reason about.
File validation and error handling happen in formats.py (file loading) and cli.py (user-facing errors). The analysis modules assume they receive valid numpy arrays and focus on computation.
Tests live in tests/ with one file per module. Test audio is generated deterministically by scripts/generate_test_audio.py using fixed parameters (specific frequencies, amplitudes, and durations). This means tests are reproducible across machines without committing audio files to the repo.
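A rough sketch of what deterministic generation can look like is shown below. It is not the contents of scripts/generate_test_audio.py; the function name sine_wave, its parameters, and the 44100 Hz default are assumptions for illustration.

```python
import numpy as np


def sine_wave(
    frequency_hz: float,
    amplitude: float,
    duration_s: float,
    sample_rate: int = 44100,
) -> np.ndarray:
    """Return a mono sine tone built only from its arguments (no random state)."""
    n_frames = int(round(duration_s * sample_rate))
    t = np.arange(n_frames) / sample_rate  # time of each sample in seconds
    return (amplitude * np.sin(2.0 * np.pi * frequency_hz * t)).astype(np.float32)


tone_a = sine_wave(440.0, 0.5, 1.0)
tone_b = sine_wave(440.0, 0.5, 1.0)

# Same parameters in, same samples out: nothing depends on a seed or the clock.
print(np.array_equal(tone_a, tone_b))  # True
print(tone_a.shape)                    # (44100,)
```

A "loud/quiet" pair for a fixture could then be two calls that differ only in amplitude.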
The test suite uses:
- pytest: standard test runner and fixtures
- hypothesis: property-based testing for DSP edge cases (e.g., "the normalized correlation of a nonzero signal with itself is always 1.0")
- click.testing.CliRunner: CLI integration tests without spawning subprocesses
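The self-correlation property mentioned for hypothesis can be sketched as follows. The helper normalized_correlation is a hypothetical stand-in written for this example, not a function from temporal.py, and returning 0.0 for silent input is an assumed convention. The strategy prepends one sample of magnitude at least 0.1 so that every generated signal is nonzero.

```python
import numpy as np
from hypothesis import given, strategies as st


def normalized_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-lag correlation scaled by both signal norms (hypothetical helper)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # assumed convention for silent input
    return float(np.dot(a, b) / denom)


@given(
    body=st.lists(st.floats(min_value=-1.0, max_value=1.0), min_size=7, max_size=255),
    anchor=st.floats(min_value=0.1, max_value=1.0),
)
def test_self_correlation_is_one(body: list, anchor: float) -> None:
    # The anchor sample guarantees the signal has nonzero energy.
    signal = np.asarray([anchor, *body], dtype=np.float64)
    assert np.isclose(normalized_correlation(signal, signal), 1.0)
```

Under pytest the decorated function is collected like any other test, and hypothesis supplies many generated signals per run instead of a single hand-picked one.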
Key fixtures in conftest.py provide pre-loaded audio pairs (identical, loud/quiet, clipped/clean, with/without silence, different lengths, mono) so tests can focus on assertions rather than setup.
To add a new analysis module:
- Create src/sounddiff/your_module.py with a compare_* function that takes audio data and returns a result dataclass
- Add the result dataclass to types.py
- Call your compare_* function from core.py and include the result in DiffResult
- Add rendering logic to report.py for terminal, JSON, and HTML output
- Write tests in tests/test_your_module.py
- Add test fixtures to conftest.py if needed
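As a hedged illustration of the first two steps in the list above, here is what a small new analysis might look like. Everything in it is hypothetical: the DC-offset analysis, the names DcOffsetComparison and compare_dc_offset, and the field names are invented for this sketch and are not part of sounddiff.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class DcOffsetComparison:
    """Hypothetical result type; in the real layout it would live in types.py."""
    offset_a: float
    offset_b: float

    @property
    def offset_delta(self) -> float:
        # Derived from the stored measurements, like the other result types.
        return self.offset_b - self.offset_a


def compare_dc_offset(data_a: np.ndarray, data_b: np.ndarray) -> DcOffsetComparison:
    """Hypothetical analysis: mean sample value of each file after a mono mixdown.

    Both inputs are assumed to be valid (frames, channels) arrays.
    """
    mono_a = data_a.mean(axis=1)
    mono_b = data_b.mean(axis=1)
    return DcOffsetComparison(
        offset_a=float(mono_a.mean()),
        offset_b=float(mono_b.mean()),
    )


# Usage: a centered signal compared against one shifted up by 0.25.
centered = np.zeros((100, 2))
shifted = np.full((100, 2), 0.25)
result = compare_dc_offset(centered, shifted)
print(result.offset_delta)  # 0.25
```

The function takes only arrays and returns only a frozen result, so it can be tested in isolation before it is wired into core.py and report.py.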