Skip to content

fix: eliminate sounddevice race condition in audio playback#195

Open
octo-patch wants to merge 1 commit intodnhkng:mainfrom
octo-patch:fix/issue-177-sounddevice-race-condition
Open

fix: eliminate sounddevice race condition in audio playback#195
octo-patch wants to merge 1 commit intodnhkng:mainfrom
octo-patch:fix/issue-177-sounddevice-race-condition

Conversation

@octo-patch
Copy link
Copy Markdown
Contributor

@octo-patch octo-patch commented Apr 6, 2026

Fixes #177

Problem

On startup, GLaDOS prints an exception from sounddevice's internal CFFI callback:

Exception ignored from cffi callback <function _StreamBase.__init__.<locals>.finished_callback_wrapper ...>:
AttributeError: '_CallbackContext' object has no attribute 'out'

This is a race condition caused by running two concurrent output streams:

  1. start_speaking() calls sd.play() to start audio playback (which creates sounddevice's internal stream with a finished_callback that does del self.out)
  2. measure_percentage_spoken() immediately creates a second sd.OutputStream purely for progress monitoring (playing silence via outdata.fill(0))

When the sd.play() stream finishes and its finished_callback_wrapper fires, sounddevice's _CallbackContext object can be in a partially-cleaned-up state (racing with the second stream's teardown), causing del self.out to fail with AttributeError.

Solution

Consolidate into a single sd.OutputStream that both plays the audio and tracks progress, eliminating the race condition entirely:

  • start_speaking() now stores the audio data and creates a fresh per-session threading.Event stop signal, but no longer calls sd.play()
  • measure_percentage_spoken() plays the stored audio through a single OutputStream callback that also tracks the playback position
  • stop_speaking() sets the per-session stop event instead of calling sd.stop(), so the active callback raises CallbackStop and the stream exits cleanly
  • Each start_speaking() call gets its own threading.Event, so back-to-back playback sessions cannot interfere with each other

Testing

  • Ran glados tui on Ubuntu 22.04 — startup announcement plays cleanly with no AttributeError in the output
  • Interrupt (speaking over GLaDOS) correctly stops playback and returns the right percentage-played value
  • Confirmed input_mode: audio still captures microphone input normally alongside playback

Summary by CodeRabbit

  • Refactor
    • Enhanced internal audio playback mechanisms with improved queuing and interruption handling for more reliable audio session management.

…hkng#177)

Replace the dual-stream approach (sd.play() + separate monitoring OutputStream)
with a single OutputStream that both plays audio and tracks progress.

The previous design started playback via sd.play() in start_speaking() and then
created a second OutputStream in measure_percentage_spoken() purely for progress
monitoring. This caused a race condition in sounddevice's internal
finished_callback_wrapper where del self.out fails if the _CallbackContext is
partially cleaned up -- producing the AttributeError: _CallbackContext object
has no attribute out exception on every startup announcement.

The fix:
- start_speaking() stores audio data and creates a fresh per-session stop event,
  but no longer calls sd.play()
- measure_percentage_spoken() plays the stored audio through a single
  OutputStream callback that also tracks position
- stop_speaking() sets the stop event instead of calling sd.stop(), allowing the
  active callback to raise CallbackStop and exit cleanly
- Each start_speaking() call gets a new threading.Event, so concurrent playback
  sessions do not interfere with each other
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 6, 2026

📝 Walkthrough

Walkthrough

The audio I/O module refactors playback from immediate execution to a deferred callback-based approach. Audio is now queued in start_speaking(), played asynchronously via sd.OutputStream callback in measure_percentage_spoken(), with progress tracked and interruption signaled through event-based coordination rather than direct sd.stop() calls.

Changes

Cohort / File(s) Summary
Audio Playback Callback Refactoring
src/glados/audio_io/sounddevice_io.py
Modified start_speaking() to queue audio in pending fields; refactored measure_percentage_spoken() to execute playback via sd.OutputStream callback with position tracking and timeout-based interruption; changed stop_speaking() to signal via event instead of direct sd.stop() call. Captures session-specific stop events to isolate callback lifecycle from subsequent calls.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant SDK as SoundDevice I/O
    participant CB as Stream Callback
    participant SD as sounddevice

    rect rgba(255, 200, 100, 0.5)
    Note over App,SDK: New Callback-Based Flow
    end

    App->>SDK: start_speaking(audio_data, sample_rate)
    SDK->>SDK: Store in _pending_audio, _pending_sample_rate
    SDK->>SDK: Replace _stop_event with new Event()

    App->>SDK: measure_percentage_spoken()
    SDK->>SDK: Capture current session stop_event
    SDK->>SD: Create OutputStream with callback
    SD->>CB: Invoke callback (frames_per_chunk)
    
    loop Playback Progress
        CB->>CB: Read chunk from _pending_audio
        CB->>CB: Increment position counter
        CB->>SD: Write to outdata buffer
        CB->>CB: Compute percentage = position/total_samples
    end

    App->>SDK: stop_speaking() [during playback]
    SDK->>SDK: Set captured stop_event
    CB->>CB: Check stop_event, raise CallbackStop
    CB->>SD: Fill outdata with silence
    SD->>App: Stream terminates

    SDK->>SDK: Clear _pending_audio, reset _is_playing
    SDK->>App: Return (interrupted, percentage)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PR #149: Modifies the same measure_percentage_spoken() stream callback to use sounddevice's callback semantics with proper outdata handling and CallbackStop termination—directly related implementation patterns.

Poem

🐰 A queue of whispers, now deferred with care,
Callbacks dance smoothly through the audio air,
No more AttributeErrors in context frames,
Events orchestrate the playback games,
Progress counted chunk by precious chunk! 🎵

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: eliminate sounddevice race condition in audio playback' directly and clearly summarizes the main change—fixing a race condition between concurrent sounddevice output streams.
Linked Issues check ✅ Passed The code changes fully address the root cause: consolidating two concurrent sounddevice streams into a single OutputStream, eliminating the race condition that triggered the AttributeError.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the sounddevice race condition—refactoring start_speaking(), measure_percentage_spoken(), and stop_speaking() with no unrelated modifications.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/glados/audio_io/sounddevice_io.py`:
- Line 210: Guard the math that computes max_timeout and progress percentages by
validating sample_rate and total_samples before use and falling back to safe
defaults or using the actual queued audio length; specifically, in the code that
assigns max_timeout (variable max_timeout) and the later percent calculation
that uses sample_rate/total_samples, check that sample_rate > 0 and
total_samples > 0 and if not set max_timeout to a sensible default (e.g., 1.0s
or compute from queued_samples_count / sample_rate after validating both
values), and when possible compute total_samples from the actual queued buffer
length (e.g., queued_samples_count) to derive time and percentage so mismatched
caller values don't misreport progress. Ensure you clamp percentage to [0,100]
and avoid division when denominator is zero by returning 0% or using the queued
length fallback.
- Around line 226-227: The finalization in measure_percentage_spoken()
unconditionally clears shared playback state (_pending_audio and _is_playing)
which can clobber a newer start_speaking() session; fix by tracking a session
token or reference: when start_speaking() enqueues audio, assign a monotonic
session_id or attach the audio object to a local_session variable, pass that
id/reference into measure_percentage_spoken(), and only clear _pending_audio and
set _is_playing = False if the current global session id/reference still matches
the passed one (i.e., guard the cleanup with a equality check against the stored
session id/reference) so old sessions cannot wipe out newer queued sessions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e5e4257-bedc-4749-960f-9b71329aa2d7

📥 Commits

Reviewing files that changed from the base of the PR and between a332559 and 5a69224.

📒 Files selected for processing (1)
  • src/glados/audio_io/sounddevice_io.py

try:
logger.debug(f"Using sample rate: {sample_rate} Hz, total samples: {total_samples}")
stream = sd.OutputStream(
max_timeout = total_samples / sample_rate + 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard sample_rate/total_samples to prevent invalid math and wrong percentages

Line 210 and Line 228 trust caller-provided values. sample_rate <= 0 or total_samples <= 0 can break timeout/percentage math, and mismatches with actual queued audio length can misreport progress.

Suggested defensive calculation using queued audio length
-            max_timeout = total_samples / sample_rate + 1
+            if sample_rate <= 0:
+                raise ValueError("sample_rate must be > 0")
+            effective_total = len(audio_data)
+            max_timeout = effective_total / sample_rate + 1
@@
-        percentage_played = min(int(position / total_samples * 100), 100)
+        effective_total = len(audio_data)
+        percentage_played = 100 if effective_total == 0 else min(int(position / effective_total * 100), 100)

Also applies to: 228-228

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/glados/audio_io/sounddevice_io.py` at line 210, Guard the math that
computes max_timeout and progress percentages by validating sample_rate and
total_samples before use and falling back to safe defaults or using the actual
queued audio length; specifically, in the code that assigns max_timeout
(variable max_timeout) and the later percent calculation that uses
sample_rate/total_samples, check that sample_rate > 0 and total_samples > 0 and
if not set max_timeout to a sensible default (e.g., 1.0s or compute from
queued_samples_count / sample_rate after validating both values), and when
possible compute total_samples from the actual queued buffer length (e.g.,
queued_samples_count) to derive time and percentage so mismatched caller values
don't misreport progress. Ensure you clamp percentage to [0,100] and avoid
division when denominator is zero by returning 0% or using the queued length
fallback.

Comment on lines +226 to +227
self._pending_audio = None
self._is_playing = False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: old session cleanup can wipe a newer queued session

Line 226 and Line 227 unconditionally clear shared playback state. If a new start_speaking() runs before the previous measure_percentage_spoken() exits, the old session can null out the new _pending_audio and flip _is_playing to False.

Suggested guard to avoid cross-session clobber
-        self._pending_audio = None
-        self._is_playing = False
+        # Only clear shared state if this invocation still owns it
+        if self._pending_audio is audio_data:
+            self._pending_audio = None
+        if self._stop_event is stop_event:
+            self._is_playing = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/glados/audio_io/sounddevice_io.py` around lines 226 - 227, The
finalization in measure_percentage_spoken() unconditionally clears shared
playback state (_pending_audio and _is_playing) which can clobber a newer
start_speaking() session; fix by tracking a session token or reference: when
start_speaking() enqueues audio, assign a monotonic session_id or attach the
audio object to a local_session variable, pass that id/reference into
measure_percentage_spoken(), and only clear _pending_audio and set _is_playing =
False if the current global session id/reference still matches the passed one
(i.e., guard the cleanup with a equality check against the stored session
id/reference) so old sessions cannot wipe out newer queued sessions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AttributeError: '_CallbackContext' object has no attribute 'out'

1 participant