Capture microphone by polling the clip directly (no AudioSource/OnAudioFilterRead) #305
Open
MaxHeimbrock wants to merge 2 commits into
MicrophoneSource captured audio by playing the looping mic AudioClip through an AudioSource and tapping the DSP output in AudioProbe.OnAudioFilterRead. That path has two unsynchronized clocks, the mic hardware clock (which fills the clip) and the audio-output clock (which plays it), so the read cursor drifts against the write cursor and produces periodic gaps (choppy audio). It also resampled the mic to the output rate and ran capture work on the real-time audio thread.

Read the mic clip's ring buffer directly instead: each frame, read the new samples between the last read position and Microphone.GetPosition (splitting at the ring wrap), downmix to mono, and push to the native source. No AudioSource, no playback cursor, no drift. Capture runs on the main thread; the native source's queue absorbs the per-frame pacing.

The capture rate is resolved before start from Microphone.GetDeviceCaps (DefaultMicrophoneSampleRate clamped to the device's supported range), and the native source is created at that rate, in mono, via a new explicit-format RtcAudioSource constructor, so pushed frames always match it. If the device opens at a different rate than expected, capture is skipped with a warning rather than pushing mismatched frames.

The MicrophoneSource(deviceName, sourceObject) signature is kept for compatibility; sourceObject is no longer used.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
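The polling step in this commit message can be sketched without any Unity dependency. The following is a minimal illustration, not the PR's code: `MicRingReader`, `NewSpans`, `DownmixToMono`, and `ResolveCaptureRate` are hypothetical names, and the sketch assumes the writer never laps the reader by a full ring between polls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, written for illustration only.
public static class MicRingReader
{
    // Returns the contiguous (Offset, Length) spans, in frames, written since lastPos.
    // writePos stands in for the value Microphone.GetPosition would report;
    // ringFrames is the clip length in frames.
    public static List<(int Offset, int Length)> NewSpans(int lastPos, int writePos, int ringFrames)
    {
        var spans = new List<(int Offset, int Length)>();
        if (writePos == lastPos) return spans;               // nothing new (assumes no full lap)
        if (writePos > lastPos)
        {
            spans.Add((lastPos, writePos - lastPos));        // no wrap: a single span
        }
        else
        {
            spans.Add((lastPos, ringFrames - lastPos));      // tail, up to the wrap point
            if (writePos > 0) spans.Add((0, writePos));      // head, after the wrap
        }
        return spans;
    }

    // Averages interleaved channels into one mono sample per frame.
    public static float[] DownmixToMono(float[] interleaved, int channels)
    {
        if (channels <= 1) return interleaved;
        int frames = interleaved.Length / channels;
        var mono = new float[frames];
        for (int f = 0; f < frames; f++)
        {
            float sum = 0f;
            for (int c = 0; c < channels; c++) sum += interleaved[f * channels + c];
            mono[f] = sum / channels;
        }
        return mono;
    }

    // Clamps the preferred rate into [minFreq, maxFreq]. Unity's GetDeviceCaps
    // reports 0 for both bounds when the device accepts any rate.
    public static int ResolveCaptureRate(int preferred, int minFreq, int maxFreq)
    {
        if (minFreq == 0 && maxFreq == 0) return preferred;
        return Math.Max(minFreq, Math.Min(preferred, maxFreq));
    }
}
```

In a Unity script, each span would presumably map to one `clip.GetData(buffer, span.Offset)` call with `buffer` sized to `span.Length * clip.channels`, followed by the downmix and a push to the native source.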
In some device states (observed with a Bluetooth headset) Unity misreports the microphone clip's sample rate: clip.frequency/clip.samples claim 16000 while Microphone.GetPosition advances at the device's true ~51kHz, so the clip is filled ~3x faster than its label. Pushing those samples labeled with the (wrong) declared rate overfed the native source's 1-second buffer, which then rejected ~2/3 of frames with "InvalidState - failed to capture frame".

Stop trusting clip.frequency. Configure the native source at a fixed 48kHz mono, measure the true capture rate at startup from how fast GetPosition advances (refined with an EMA to track slow drift), and resample the captured audio from that measured rate to 48kHz with a streaming linear resampler before pushing. We then push exactly 48,000 samples per second, matching the native drain rate, so the buffer no longer overruns and the audio is correctly pitched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
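The two pieces this commit describes, an EMA-refined rate estimate and a streaming linear resampler, might look roughly like the sketch below. The class names, the smoothing factor `alpha`, and the chunking behavior are assumptions made for illustration; the PR's actual implementation is not shown on this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: tracks the measured capture rate with an exponential moving average.
public sealed class CaptureRateEstimator
{
    private readonly double _alpha;              // smoothing factor (assumed value)
    public double RateHz { get; private set; }

    public CaptureRateEstimator(double initialRateHz, double alpha = 0.05)
    {
        RateHz = initialRateHz;
        _alpha = alpha;
    }

    // framesAdvanced: how far the write position moved; elapsedSeconds: wall-clock time.
    public double Update(long framesAdvanced, double elapsedSeconds)
    {
        if (elapsedSeconds <= 0) return RateHz;
        double instant = framesAdvanced / elapsedSeconds;
        RateHz += _alpha * (instant - RateHz);
        return RateHz;
    }
}

// Hypothetical: linear-interpolating mono resampler that carries state across chunks.
public sealed class StreamingLinearResampler
{
    private readonly double _outRateHz;
    private double _pos;     // fractional read position in the current chunk; -1 refers to _prev
    private float _prev;     // last sample of the previous chunk

    public StreamingLinearResampler(double outRateHz) { _outRateHz = outRateHz; }

    public float[] Process(float[] input, double inRateHz)
    {
        if (inRateHz <= 0) throw new ArgumentOutOfRangeException(nameof(inRateHz));
        var output = new List<float>();
        if (input.Length == 0) return output.ToArray();
        double step = inRateHz / _outRateHz;     // input frames consumed per output frame
        while (_pos < input.Length - 1)
        {
            int i = (int)Math.Floor(_pos);
            float a = i < 0 ? _prev : input[i];  // left neighbor may be in the previous chunk
            float b = input[i + 1];
            float t = (float)(_pos - i);
            output.Add(a + (b - a) * t);
            _pos += step;
        }
        _prev = input[input.Length - 1];
        _pos -= input.Length;                    // rebase onto the next chunk
        return output.ToArray();
    }
}
```

Because `Process` takes the input rate on every call, the estimator's latest `RateHz` can be passed in each frame, which is one way to follow slow drift without resetting the resampler state.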
Problem
MicrophoneSource captured the mic by playing its looping AudioClip through an AudioSource and tapping the DSP output in AudioProbe.OnAudioFilterRead. That path has two unsynchronized clocks: the mic hardware clock (fills the clip) and the audio-output/DSP clock (plays it back). With no sync between the write cursor and the playback read cursor, they drift, and when they cross you get periodic gaps, heard as choppy audio. It also resampled the mic to the output rate and ran the capture/encode work on the real-time audio thread.
Change
Read the mic clip's ring buffer directly instead of playing it:
- Each frame, MicrophoneSource reads the new samples between the last read position and Microphone.GetPosition (splitting at the ring-buffer wrap so each AudioClip.GetData read is contiguous), downmixes to mono if needed, and pushes to the native source. No AudioSource, no AudioProbe, no playback cursor → no drift.
- The capture rate is resolved before start from Microphone.GetDeviceCaps (DefaultMicrophoneSampleRate clamped into the device's supported range), and the native source is created at that rate / mono via a new explicit-format RtcAudioSource(type, sampleRate, channels) constructor, so pushed frames always match the native source and never trip a rate/channel mismatch. We don't assume the requested rate is honored: we read clip.frequency and, if it differs (e.g. the device changed since construction), skip capture with a warning instead of resampling.
- RtcAudioSource's existing (int channels, RtcAudioSourceType) constructor now delegates to the new explicit-format one; no behavior change for other sources. The MicrophoneSource(deviceName, sourceObject) signature is kept for compatibility (sourceObject is no longer used).
Scope / trade-offs
- MicrophoneSource.cs and a small additive constructor in RtcAudioSource.cs. Track.cs / Participant.cs / MeetManager.cs unchanged.
- clip.frequency won't match and capture is skipped with a warning until the track is restarted. (Separate concern from this PR.)
Verification
- Assembly-CSharp all compile clean.
- Utils.Info log reports clip.frequency / clip.channels / native rate for confirmation.
🤖 Generated with Claude Code