docs(mod3): scrub participant_id examples (PR #6 ancillary) #9
Merged
chazmaniandinkle merged 8 commits into cogos-dev:main on Apr 24, 2026
Introduces SessionRegistry + GlobalSerializer + live output-device
resolution so multiple concurrent agents/users can share one Mod3
instance without colliding on voice, queue, or speaker.
- session_registry.py: SessionChannel, voice-pool greedy allocation,
per-session queues, round-robin/priority/fifo-global policies, live
device re-query per playback (ADR-082 2026-04-22 amendment - no
caching, macOS CoreAudio default tracked live).
- http_api.py: POST /v1/sessions/register, POST /v1/sessions/{id}/deregister,
GET /v1/sessions, GET /v1/sessions/{id}. Synthesize honors the
session's assigned voice when unspecified.
- server.py + mcp_shim.py: mirrored MCP tools (register_session,
deregister_session, list_sessions) so stdio MCP callers get the
same surface.
- Backward-compat: legacy callers without a session_id route to an
implicit "default" session.
Out of scope (later ADR phases): input routing, barge-in state
machine, native input provider.
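A minimal sketch of the greedy voice-pool allocation and the implicit "default" session described above. This is not the code in session_registry.py: the method names, the voice names, and the behavior when the pool is exhausted (assigning no voice) are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

DEFAULT_SESSION_ID = "default"  # implicit session for legacy callers


@dataclass
class SessionChannel:
    session_id: str
    assigned_voice: Optional[str]
    queue: deque = field(default_factory=deque)  # per-session job queue


class SessionRegistry:
    """Hypothetical registry: hands out voices greedily from a shared pool."""

    def __init__(self, voice_pool):
        self._free_voices = list(voice_pool)  # list order is preference order
        self._sessions = {}

    def register(self, session_id, preferred_voice=None):
        if session_id in self._sessions:
            return self._sessions[session_id]  # re-registering is idempotent
        channel = SessionChannel(session_id, self._allocate(preferred_voice))
        self._sessions[session_id] = channel
        return channel

    def _allocate(self, preferred):
        # Greedy: honor the preference if it is free, else take the first free voice.
        if preferred in self._free_voices:
            self._free_voices.remove(preferred)
            return preferred
        if self._free_voices:
            return self._free_voices.pop(0)
        return None  # assumption: an exhausted pool yields no assigned voice

    def deregister(self, session_id):
        channel = self._sessions.pop(session_id, None)
        if channel is not None and channel.assigned_voice is not None:
            self._free_voices.append(channel.assigned_voice)  # return voice to pool
        return channel is not None

    def resolve(self, session_id=None):
        # Legacy callers pass no session_id and land on the implicit default session.
        return self.register(session_id or DEFAULT_SESSION_ID)
```

With a two-voice pool, the first two sessions each get a distinct voice, and a deregistered session's voice becomes available to the next caller, including a legacy caller routed through resolve().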
Regression: with MOD3_USE_COGOS_AGENT=1, agent_loop's success path
returns before the local-inference path's send_response_complete,
leaving the dashboard's isResponding spinner hung indefinitely.
- channels.py: BrowserChannel.broadcast_response_complete(metrics,
session_id), a thread-safe companion to broadcast_response_text that
routes to the same channel that received the text frames.
- cogos_agent_bridge.py: on agent_response receipt, emit the complete
frame after the text frame.
- demo/e2e_dashboard_harness.py + tests: updated to assert the
completion frame fires on both code paths.
Wave 4.1 + 4.2 of the mod3-kernel integration. The dashboard now owns its
own bus identity instead of being an anonymous WebSocket client.
On page load:
1. Reuse a session_id from sessionStorage (refreshes stay on the same
identity).
2. Otherwise POST to the kernel's /v1/channel-sessions/register — ADR-082
Wave 3.5 says session-id minting is kernel-owned. On CORS / kernel-down
the JS falls back to mod3's /v1/sessions/register direct so the
dashboard keeps working in a mod3-only deployment.
3. Poll GET /v1/sessions every 4s, render the live roster.
4. On beforeunload, navigator.sendBeacon a best-effort deregister so the
voice returns to the pool without waiting for a sweep.
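The resolution order in steps 1 and 2 can be sketched as follows. The dashboard implements this in browser JavaScript; this Python version only illustrates the control flow. The storage key and the two registration callables are hypothetical stand-ins for sessionStorage and the two POST requests, and treating every kernel failure as an OSError is an assumption.

```python
def resolve_session_id(storage, register_with_kernel, register_with_mod3):
    """Return (session_id, source) using the reuse -> kernel -> mod3 order.

    storage              dict-like stand-in for sessionStorage
    register_with_kernel callable that mints an id via the kernel, raising
                         OSError on CORS or kernel-down (assumption)
    register_with_mod3   callable that registers directly with mod3
    """
    cached = storage.get("session_id")  # hypothetical storage key
    if cached:
        return cached, "storage"  # a refresh keeps the same identity

    try:
        session_id, source = register_with_kernel(), "kernel"
    except OSError:
        # Kernel unreachable: fall back so a mod3-only deployment still works.
        session_id, source = register_with_mod3(), "mod3"

    storage["session_id"] = session_id
    return session_id, source
```

Passing the two registration calls in as callables keeps the ordering logic testable without a running kernel or mod3 instance.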
The participant panel is a collapsible drawer keyed off a header pill
(count + plural). Rows show participant_id, assigned_voice, session_id
prefix, age, and participant_type badge. The "self" row is pulled to the
top and highlighted with a green left-border + "you" pill.
window.__mod3Session is exposed for Wave 4.3 — the audio WebSocket
subscription will key off its session_id, and a "mod3-session-registered"
CustomEvent fires when registration completes so late-loaded scripts can
subscribe without polling.
Branching: stacked on feat/session-registry-adr-082-phase1 because the
/v1/sessions endpoints this UI depends on only exist on that branch
(Phase 1 of the session registry).
…back
Wave 4.3 mod3 side — route synthesized audio to the dashboard via a
per-session WebSocket instead of (or in addition to) the server's
sounddevice / afplay fallback.
New module: audio_subscribers.py
AudioSubscriberRegistry holds session_id → [subscriber] with
register/unregister/has_subscribers/count/emit_wav. emit_wav pushes
a JSON header frame + binary WAV frame through each subscriber's
WebSocket via run_coroutine_threadsafe on the socket's event loop,
matching the BrowserChannel.broadcast_trace_event pattern.
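A rough sketch of how such a registry could look, assuming each subscriber is stored together with the event loop that owns its socket. The header field names other than the audio_header frame type, the register signature, and the send_text/send_bytes socket methods (as on a Starlette-style WebSocket) are assumptions, not the module's confirmed API.

```python
import asyncio
import json
import threading


class AudioSubscriberRegistry:
    """Hypothetical session_id -> [(websocket, loop)] registry."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subs = {}

    def register(self, session_id, websocket, loop):
        with self._lock:
            self._subs.setdefault(session_id, []).append((websocket, loop))

    def unregister(self, session_id, websocket):
        with self._lock:
            kept = [s for s in self._subs.get(session_id, []) if s[0] is not websocket]
            if kept:
                self._subs[session_id] = kept
            else:
                self._subs.pop(session_id, None)  # prune empty buckets

    def count(self, session_id):
        with self._lock:
            return len(self._subs.get(session_id, []))

    def has_subscribers(self, session_id):
        return self.count(session_id) > 0

    def emit_wav(self, session_id, wav_bytes):
        """Schedule header + binary frames; returns how many sends were scheduled."""
        with self._lock:
            targets = list(self._subs.get(session_id, []))
        header = json.dumps(
            {"type": "audio_header", "session_id": session_id, "bytes": len(wav_bytes)}
        )
        scheduled = 0
        for websocket, loop in targets:
            coro = self._send(websocket, header, wav_bytes)
            try:
                # Hand the send to the socket's own loop; the caller never blocks.
                asyncio.run_coroutine_threadsafe(coro, loop)
                scheduled += 1
            except RuntimeError:
                coro.close()  # loop already closed: drop the frame
        return scheduled

    @staticmethod
    async def _send(websocket, header, wav_bytes):
        try:
            await websocket.send_text(header)
            await websocket.send_bytes(wav_bytes)
        except Exception:
            pass  # best-effort: a disconnect mid-send just drops the frame
```

Copying the target list under the lock and sending outside it keeps a slow or dead socket from blocking register and unregister calls on other threads.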
New endpoints (http_api.py):
WS /ws/audio/{session_id} — accept + register + hold open
GET /v1/sessions/{session_id}/subscribers — returns
{"session_id": ..., "subscribed": bool, "count": N}. Unknown
session_ids intentionally return subscribed=false instead of 404
so the kernel's pre-afplay check stays a single predicate.
/v1/synthesize now also emits the generated WAV over the WebSocket
when the request names a session and at least one subscriber is
attached. Emit is best-effort (disconnect mid-send just drops the
frame) and non-blocking on the HTTP path. A new X-Mod3-WS-Subscribers
response header reports how many subscribers received the blob;
callers use this to skip their local playback.
mcp_shim._play_wav_bytes gains a pre-check (_session_has_ws_subscriber)
that GETs /v1/sessions/{id}/subscribers with a 1.5s timeout. When
subscribed=true we skip sounddevice entirely and record
status=routed_ws in the job ledger. Keeps the legacy path unchanged
when no session is attached or the HTTP check fails.
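The pre-check can be sketched along these lines, assuming the standard library's urllib is used for the GET. The base URL, the injectable opener parameter, the local_play callback, and the played_local status string are hypothetical; only the endpoint path, the 1.5s timeout, the subscribed field, and the routed_ws status come from the description above.

```python
import json
import urllib.request

MOD3_BASE_URL = "http://localhost:7860"  # assumption: same host/port as the dashboard


def session_has_ws_subscriber(session_id, base_url=MOD3_BASE_URL,
                              timeout=1.5, opener=urllib.request.urlopen):
    """Return True only when mod3 reports subscribed=true for this session."""
    if not session_id:
        return False  # no session attached: legacy path
    url = f"{base_url}/v1/sessions/{session_id}/subscribers"
    try:
        with opener(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection errors, timeouts, and bad JSON all mean "play locally".
        return False
    return isinstance(payload, dict) and payload.get("subscribed") is True


def play_wav_bytes(wav_bytes, session_id=None, local_play=None,
                   has_subscriber=session_has_ws_subscriber):
    """Return the status that would be recorded in the job ledger."""
    if has_subscriber(session_id):
        return "routed_ws"  # the browser plays it; skip the local device
    if local_play is not None:
        local_play(wav_bytes)  # hypothetical hook for the sounddevice path
    return "played_local"  # hypothetical status name
```

Treating every failure as "not subscribed" means the worst case is unchanged legacy behavior rather than silence.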
Dashboard wiring (dashboard/index.html):
A new IIFE opens ws://host:7860/ws/audio/<session_id> after the
session-registered event fires, listens for audio_header + binary
frames, and plays the WAV through AudioContext.decodeAudioData.
Reconnect on close with exponential backoff up to 30s. The self-row
audio-dot indicator flips green while the WS is up. AudioContext is
resumed on first user gesture to satisfy the autoplay policy.
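The reconnect schedule can be illustrated with a small generator. The dashboard itself implements this in JavaScript; this Python sketch only shows the delay sequence. The 30s cap comes from the description above, while the 1s starting delay and the doubling factor are assumptions.

```python
from itertools import islice


def reconnect_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield successive reconnect delays in seconds, never exceeding cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First seven delays under the assumed base and factor.
first_delays = list(islice(reconnect_delays(), 7))
```

A client would typically restart the generator after a successful connection so the next outage begins again from the short delay.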
Tests (tests/test_audio_subscribers.py):
- AudioSubscriberRegistry unit tests: register/unregister, multiple
subscribers per session, empty-bucket pruning, emit_wav delivers
header+bytes, no-subscriber returns zero, default registry is a
shared singleton.
- HTTP tests via FastAPI TestClient: /subscribers returns false for
unknown sessions, /ws/audio upgrade flips subscribed=true and
disconnect flips it back to false.
- Integration test (guarded by SKIP_TTS_TESTS env var because it
loads Kokoro): /v1/synthesize with a session + subscriber pushes a
RIFF/WAVE binary frame through the WebSocket and reports
X-Mod3-WS-Subscribers: 1.
All 32 existing session-registry tests + 9 new tests pass (1 skipped
for Kokoro cold-start). Ruff clean.
Branch: feat/dashboard-wave4, stacked on
feat/session-registry-adr-082-phase1 (Phase 1 /v1/sessions surface).
The playWavBlob ArrayBuffer-extraction ternary had one branch that used Uint8Array.slice() — which returns another Uint8Array, not an ArrayBuffer. decodeAudioData then throws "parameter 1 is not of type 'ArrayBuffer'". The fix: use buffer.buffer.slice(0) in the "view covers whole buffer" branch so both branches emit an ArrayBuffer copy. Found via live smoke test: dashboard connected cleanly, WebSocket received WAV frames, but every playback failed silently in the console while the kernel and mod3 both reported routed_ws success. The server side was correct; this was purely a browser-side type error.
…evice
The output-device <select> at the top of the dashboard was only calling
_playback.setOutputDevice(), the legacy chat-path router. The Wave 4
per-session WebSocket path owns a separate AudioContext in the
setupAudioSubscription IIFE and was ignoring the selection, so every
session-routed playback landed on the system default even when the user
had picked a different device in the UI.
Fix:
- setupAudioSubscription exposes window.__mod3AudioSink(deviceId). It
remembers the selection in a closure and calls AudioContext.setSinkId
once the context exists (Chrome 110+; older browsers silently skip).
- ensureAudioCtx now applies the pending sink or the current dropdown
value at creation time, so the first playback after a fresh page load
already honors the selection.
- The <select> change handler calls both _playback.setOutputDevice and
window.__mod3AudioSink, keeping the legacy path and the WS path in sync.
- window.__mod3AudioCtx is set for diagnostic access (evaluate_script,
debugger probes).
Found by Chaz during the Wave 4 smoke test: he set the dashboard output
to MacBook Pro Speakers, but mod3_speak audio still played through the
system-default Dell USB Audio. BrowserOS console confirmed the context
had no sinkId set and the UI handler only logged.
Two linked fixes on top of the setSinkId wiring:
1. Timing race: populateOutputDevices is async; if the first mod3_speak
arrives before enumeration completes, ensureAudioCtx reads a half-built
dropdown and applies "default" even when the user had previously picked
a different device. populateOutputDevices now pushes the resolved sink
into window.__mod3AudioSink at the end of its own selection logic, so
the context always gets bound to the final selection regardless of
which side of the race finishes first.
2. Persistence: the selected device now round-trips through localStorage
under key "mod3-output-device". populateOutputDevices consults the saved
value before falling back to the "Default -" heuristic, so a reload
keeps the user's previous choice instead of silently reverting.
Found by Chaz during the Wave 4 smoke test: his MacBook Pro selection
survived reload (browser form-state restoration), but the sink stayed on
"default" because the race fired first. Both paths now converge on the
right sink.
Replaces hardcoded 'slowbro' with generic 'alice' in MCP tool docstrings and JSON schemas (4 occurrences across http_api.py, mcp_shim.py, server.py). Designed to land alongside PR #6.