Fix audio interleaving and thread safety in screencapture-audio #47

Open
pasrom wants to merge 3 commits into m96-chan:main from pasrom:fix/screencapture-audio-interleaving

pasrom commented Feb 26, 2026

Summary

  • Fix planar audio misinterpreted as interleaved: ScreenCaptureKit on macOS 13+ delivers non-interleaved (planar) float32 audio by default ([L0..Ln, R0..Rn]). The previous code read the block buffer through CMBlockBufferGetDataPointer and wrote its raw bytes to stdout in storage order, so planar data reached consumers that expect interleaved frames. Those consumers then average adjacent same-channel samples, destroying high-frequency content and producing metallic/robotic-sounding audio.
  • Use serial dispatch queue: Replace the concurrent global queue with a dedicated serial DispatchQueue for audio output handling, so that concurrent callbacks can no longer overlap their writes to stdout.
  • Use POSIX write() with EINTR handling: Replace FileHandle.standardOutput.write(Data(...)) with direct POSIX write() calls to avoid per-callback Data allocation in the real-time audio path, and disable stdout buffering for immediate pipe delivery.

Root Cause

ScreenCaptureKit delivers audio in non-interleaved (planar) format:

Buffer 0: [L0, L1, ..., Ln]   (left channel)
Buffer 1: [R0, R1, ..., Rn]   (right channel)

The old code dumped these bytes sequentially via CMBlockBufferGetDataPointer, producing [L0..Ln, R0..Rn] on stdout. Any consumer interpreting this as interleaved [L0, R0, L1, R1, ...] would treat adjacent samples of the same channel (L0 and L1, L2 and L3, ...) as left/right pairs and mix them, causing metallic audio.
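The effect can be reproduced without any audio APIs. The following is a minimal sketch, assuming a hypothetical 4-frame stereo buffer and a consumer that downmixes by averaging each consecutive pair of samples; `downmixAssumingInterleaved` and the sample values are illustrative and not taken from the PR.

```swift
// Hypothetical 4-frame stereo buffer in planar order.
let left: [Float] = [1, 2, 3, 4]
let right: [Float] = [10, 20, 30, 40]

// What the old code put on stdout: all left samples, then all right samples.
let planarStream = left + right

// What an interleaved consumer expects: one (L, R) pair per frame.
let interleavedStream: [Float] = [1, 10, 2, 20, 3, 30, 4, 40]

/// Downmixes to mono the way a consumer would if it assumed interleaved
/// stereo: each consecutive pair is treated as one (L, R) frame and averaged.
func downmixAssumingInterleaved(_ samples: [Float]) -> [Float] {
    return stride(from: 0, to: samples.count - 1, by: 2).map { index in
        (samples[index] + samples[index + 1]) / 2
    }
}

// Pairs are (L0, L1), (L2, L3), (R0, R1), (R2, R3): same-channel neighbours.
let wrong = downmixAssumingInterleaved(planarStream)      // 1.5, 3.5, 15, 35

// Pairs are (L0, R0), (L1, R1), ...: a true per-frame downmix.
let correct = downmixAssumingInterleaved(interleavedStream)  // 5.5, 11, 16.5, 22

print(wrong, correct)
```

In the planar case each output sample is the mean of two neighbouring samples from one channel, which acts as a crude low-pass filter and also plays the left half of the buffer before the right half.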

Fix

  1. Inspect AudioStreamBasicDescription.mFormatFlags for kAudioFormatFlagIsNonInterleaved
  2. When planar: use the two-pass CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer pattern (first call to query the required AudioBufferList size, second call to fill it) to read per-channel AudioBuffers, then interleave to [L0, R0, L1, R1, ...]
  3. When interleaved: pass through directly (fallback for future macOS versions)
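The flag check and the interleaving step can be sketched in plain Swift as follows. This is not the PR's code: it works on ordinary arrays rather than an AudioBufferList, the names `isPlanar` and `interleave` are hypothetical, and the flag constant is redefined locally (CoreAudio defines kAudioFormatFlagIsNonInterleaved as 1 << 5) so the sketch compiles without CoreAudio.

```swift
/// Local copy of CoreAudio's kAudioFormatFlagIsNonInterleaved bit (1 << 5),
/// redefined here only so the sketch has no framework dependency.
let nonInterleavedFlag: UInt32 = 1 << 5

/// Returns true when the format flags describe planar (non-interleaved) audio.
func isPlanar(formatFlags: UInt32) -> Bool {
    return (formatFlags & nonInterleavedFlag) != 0
}

/// Interleaves per-channel sample arrays into frame order:
/// [[L0, L1, ...], [R0, R1, ...]] becomes [L0, R0, L1, R1, ...].
/// If the channels differ in length, the shortest one bounds the frame count.
func interleave(_ channels: [[Float]]) -> [Float] {
    guard let frameCount = channels.map({ $0.count }).min(), frameCount > 0 else {
        return []
    }
    var output = [Float]()
    output.reserveCapacity(frameCount * channels.count)
    for frame in 0..<frameCount {
        for channel in channels {
            output.append(channel[frame])
        }
    }
    return output
}
```

In a real capture callback the per-channel data would come from the AudioBuffers in the AudioBufferList, and a preallocated output buffer would likely be preferable to building a new array on every callback.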

Test plan

  • Build with swift build -c release on macOS 15 (Apple Silicon)
  • Capture audio from Microsoft Teams and verify clean output (no metallic quality, no crackling)
  • Verify stderr shows Audio format: 48000 Hz, 2ch, 32-bit, flags=0x... (nonInterleaved=true)

Replace the concurrent global queue (.global(qos: .userInteractive)) with
a dedicated serial DispatchQueue for SCStream audio output handling.

The concurrent queue allows multiple audio callbacks to execute
simultaneously, causing interleaved writes to stdout that corrupt the
PCM byte stream and produce crackling artifacts.

A serial queue with explicit .userInteractive QoS ensures callbacks
execute one at a time, eliminating byte-level interleaving without
requiring locks.
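
The ordering guarantee described in this commit can be illustrated with a small sketch. The queue label, the `CallbackLog` class, and `runCallbacks` are hypothetical names for illustration; the closures stand in for audio callbacks.

```swift
import Dispatch

/// Records the order in which simulated callbacks run. It is only ever
/// touched on the serial queue, which is what makes the unchecked
/// Sendable conformance acceptable in this sketch.
final class CallbackLog: @unchecked Sendable {
    var entries: [Int] = []
}

/// Enqueues `count` simulated callbacks on a dedicated serial queue and
/// returns the order in which they executed.
func runCallbacks(count: Int) -> [Int] {
    // A DispatchQueue is serial unless created with the .concurrent attribute.
    let queue = DispatchQueue(label: "example.audio-output", qos: .userInteractive)
    let log = CallbackLog()
    for index in 0..<count {
        queue.async { log.entries.append(index) }
    }
    // sync runs after all previously enqueued blocks, so it drains the queue.
    return queue.sync { log.entries }
}

print(runCallbacks(count: 5))
```

Because the blocks run one at a time in submission order, no lock is needed around the shared state. In the capture tool, a queue like this would presumably be passed as the sampleHandlerQueue argument when registering the SCStream audio output.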

ScreenCaptureKit on macOS 13+ delivers non-interleaved (planar) float32
audio by default, one buffer per channel:
  Buffer 0: [L0, L1, ..., Ln]
  Buffer 1: [R0, R1, ..., Rn]

The previous code used CMBlockBufferGetDataPointer, which exposes the raw
bytes in storage order, and wrote them out as if they were interleaved.
Python consumers doing reshape(-1, 2).mean(axis=1) then average adjacent
same-channel samples, destroying high-frequency content and producing
metallic/robotic-sounding audio.

Fix: inspect the AudioStreamBasicDescription format flags. When
kAudioFormatFlagIsNonInterleaved is set, use the two-pass
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer pattern to
properly read per-channel AudioBuffers and interleave them to
[L0, R0, L1, R1, ...] before writing to stdout.

The interleaved path is preserved as a fallback for future macOS
versions that may change the default format.

Replace FileHandle.standardOutput.write(Data(...)) with direct POSIX
write() calls to avoid per-callback Data allocation and Foundation
overhead in the real-time audio path.

- Add writeAllToStdout() helper that loops until all bytes are written,
  handles partial writes, and retries on EINTR (signal interruption)
- Disable C stdout buffering with setbuf(stdout, nil) so PCM data
  reaches the pipe consumer immediately
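
A write loop of the kind described above might look like the following sketch. It is not the PR's writeAllToStdout(); `writeAll(_:to:)` is a hypothetical generalization that takes the file descriptor as a parameter so it can be exercised against a pipe.

```swift
#if canImport(Darwin)
import Darwin
#else
import Glibc
#endif

/// Writes every byte of `buffer` to `fd`, looping over partial writes and
/// retrying when write() is interrupted by a signal (EINTR).
/// Returns false on any other error.
func writeAll(_ buffer: UnsafeRawBufferPointer, to fd: Int32) -> Bool {
    // An empty buffer has no base address; there is nothing to write.
    guard var cursor = buffer.baseAddress else { return true }
    var remaining = buffer.count
    while remaining > 0 {
        let written = write(fd, cursor, remaining)
        if written < 0 {
            if errno == EINTR { continue }  // interrupted: retry the same span
            return false                    // real error, e.g. EPIPE
        }
        if written == 0 { return false }    // no progress: avoid spinning
        cursor += written                   // advance past the bytes accepted
        remaining -= written
    }
    return true
}
```

A caller holding interleaved samples in an array could invoke it as `samples.withUnsafeBytes { writeAll($0, to: STDOUT_FILENO) }`, which passes the array's storage directly to write() without creating a Data value.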