fix: eliminate sounddevice race condition in audio playback#195
fix: eliminate sounddevice race condition in audio playback#195octo-patch wants to merge 1 commit intodnhkng:mainfrom
Conversation
…hkng#177) Replace the dual-stream approach (sd.play() + separate monitoring OutputStream) with a single OutputStream that both plays audio and tracks progress. The previous design started playback via sd.play() in start_speaking() and then created a second OutputStream in measure_percentage_spoken() purely for progress monitoring. This caused a race condition in sounddevice's internal finished_callback_wrapper where del self.out fails if the _CallbackContext is partially cleaned up -- producing the AttributeError: _CallbackContext object has no attribute out exception on every startup announcement. The fix: - start_speaking() stores audio data and creates a fresh per-session stop event, but no longer calls sd.play() - measure_percentage_spoken() plays the stored audio through a single OutputStream callback that also tracks position - stop_speaking() sets the stop event instead of calling sd.stop(), allowing the active callback to raise CallbackStop and exit cleanly - Each start_speaking() call gets a new threading.Event, so concurrent playback sessions do not interfere with each other
📝 WalkthroughWalkthroughThe audio I/O module refactors playback from immediate execution to a deferred callback-based approach. Audio is now queued in Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application
participant SDK as SoundDevice I/O
participant CB as Stream Callback
participant SD as sounddevice
rect rgba(255, 200, 100, 0.5)
Note over App,SDK: New Callback-Based Flow
end
App->>SDK: start_speaking(audio_data, sample_rate)
SDK->>SDK: Store in _pending_audio, _pending_sample_rate
SDK->>SDK: Replace _stop_event with new Event()
App->>SDK: measure_percentage_spoken()
SDK->>SDK: Capture current session stop_event
SDK->>SD: Create OutputStream with callback
SD->>CB: Invoke callback (frames_per_chunk)
loop Playback Progress
CB->>CB: Read chunk from _pending_audio
CB->>CB: Increment position counter
CB->>SD: Write to outdata buffer
CB->>CB: Compute percentage = position/total_samples
end
App->>SDK: stop_speaking() [during playback]
SDK->>SDK: Set captured stop_event
CB->>CB: Check stop_event, raise CallbackStop
CB->>SD: Fill outdata with silence
SD->>App: Stream terminates
SDK->>SDK: Clear _pending_audio, reset _is_playing
SDK->>App: Return (interrupted, percentage)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/audio_io/sounddevice_io.py`:
- Line 210: Guard the math that computes max_timeout and progress percentages by
validating sample_rate and total_samples before use and falling back to safe
defaults or using the actual queued audio length; specifically, in the code that
assigns max_timeout (variable max_timeout) and the later percent calculation
that uses sample_rate/total_samples, check that sample_rate > 0 and
total_samples > 0 and if not set max_timeout to a sensible default (e.g., 1.0s
or compute from queued_samples_count / sample_rate after validating both
values), and when possible compute total_samples from the actual queued buffer
length (e.g., queued_samples_count) to derive time and percentage so mismatched
caller values don't misreport progress. Ensure you clamp percentage to [0,100]
and avoid division when denominator is zero by returning 0% or using the queued
length fallback.
- Around line 226-227: The finalization in measure_percentage_spoken()
unconditionally clears shared playback state (_pending_audio and _is_playing)
which can clobber a newer start_speaking() session; fix by tracking a session
token or reference: when start_speaking() enqueues audio, assign a monotonic
session_id or attach the audio object to a local_session variable, pass that
id/reference into measure_percentage_spoken(), and only clear _pending_audio and
set _is_playing = False if the current global session id/reference still matches
the passed one (i.e., guard the cleanup with a equality check against the stored
session id/reference) so old sessions cannot wipe out newer queued sessions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4e5e4257-bedc-4749-960f-9b71329aa2d7
📒 Files selected for processing (1)
src/glados/audio_io/sounddevice_io.py
| try: | ||
| logger.debug(f"Using sample rate: {sample_rate} Hz, total samples: {total_samples}") | ||
| stream = sd.OutputStream( | ||
| max_timeout = total_samples / sample_rate + 1 |
There was a problem hiding this comment.
Guard sample_rate/total_samples to prevent invalid math and wrong percentages
Line 210 and Line 228 trust caller-provided values. sample_rate <= 0 or total_samples <= 0 can break timeout/percentage math, and mismatches with actual queued audio length can misreport progress.
Suggested defensive calculation using queued audio length
- max_timeout = total_samples / sample_rate + 1
+ if sample_rate <= 0:
+ raise ValueError("sample_rate must be > 0")
+ effective_total = len(audio_data)
+ max_timeout = effective_total / sample_rate + 1
@@
- percentage_played = min(int(position / total_samples * 100), 100)
+ effective_total = len(audio_data)
+ percentage_played = 100 if effective_total == 0 else min(int(position / effective_total * 100), 100)Also applies to: 228-228
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/glados/audio_io/sounddevice_io.py` at line 210, Guard the math that
computes max_timeout and progress percentages by validating sample_rate and
total_samples before use and falling back to safe defaults or using the actual
queued audio length; specifically, in the code that assigns max_timeout
(variable max_timeout) and the later percent calculation that uses
sample_rate/total_samples, check that sample_rate > 0 and total_samples > 0 and
if not set max_timeout to a sensible default (e.g., 1.0s or compute from
queued_samples_count / sample_rate after validating both values), and when
possible compute total_samples from the actual queued buffer length (e.g.,
queued_samples_count) to derive time and percentage so mismatched caller values
don't misreport progress. Ensure you clamp percentage to [0,100] and avoid
division when denominator is zero by returning 0% or using the queued length
fallback.
| self._pending_audio = None | ||
| self._is_playing = False |
There was a problem hiding this comment.
Critical: old session cleanup can wipe a newer queued session
Line 226 and Line 227 unconditionally clear shared playback state. If a new start_speaking() runs before the previous measure_percentage_spoken() exits, the old session can null out the new _pending_audio and flip _is_playing to False.
Suggested guard to avoid cross-session clobber
- self._pending_audio = None
- self._is_playing = False
+ # Only clear shared state if this invocation still owns it
+ if self._pending_audio is audio_data:
+ self._pending_audio = None
+ if self._stop_event is stop_event:
+ self._is_playing = False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/glados/audio_io/sounddevice_io.py` around lines 226 - 227, The
finalization in measure_percentage_spoken() unconditionally clears shared
playback state (_pending_audio and _is_playing) which can clobber a newer
start_speaking() session; fix by tracking a session token or reference: when
start_speaking() enqueues audio, assign a monotonic session_id or attach the
audio object to a local_session variable, pass that id/reference into
measure_percentage_spoken(), and only clear _pending_audio and set _is_playing =
False if the current global session id/reference still matches the passed one
(i.e., guard the cleanup with a equality check against the stored session
id/reference) so old sessions cannot wipe out newer queued sessions.
Fixes #177
Problem
On startup, GLaDOS prints an exception from sounddevice's internal CFFI callback:
This is a race condition caused by running two concurrent output streams:
start_speaking()callssd.play()to start audio playback (which creates sounddevice's internal stream with afinished_callbackthat doesdel self.out)measure_percentage_spoken()immediately creates a secondsd.OutputStreampurely for progress monitoring (playing silence viaoutdata.fill(0))When the
sd.play()stream finishes and itsfinished_callback_wrapperfires, sounddevice's_CallbackContextobject can be in a partially-cleaned-up state (racing with the second stream's teardown), causingdel self.outto fail withAttributeError.Solution
Consolidate into a single
sd.OutputStreamthat both plays the audio and tracks progress, eliminating the race condition entirely:start_speaking()now stores the audio data and creates a fresh per-sessionthreading.Eventstop signal, but no longer callssd.play()measure_percentage_spoken()plays the stored audio through a singleOutputStreamcallback that also tracks the playback positionstop_speaking()sets the per-session stop event instead of callingsd.stop(), so the active callback raisesCallbackStopand the stream exits cleanlystart_speaking()call gets its ownthreading.Event, so back-to-back playback sessions cannot interfere with each otherTesting
glados tuion Ubuntu 22.04 — startup announcement plays cleanly with noAttributeErrorin the outputinput_mode: audiostill captures microphone input normally alongside playbackSummary by CodeRabbit