Dashboard: participant panel + /ws/audio per-session playback (Wave 4) by chazmaniandinkle · Pull Request #6 · cogos-dev/mod3

chazmaniandinkle · 2026-04-23T18:49:59Z

Summary

Makes the mod3 dashboard the primary audio interface. Two stacked commits.

Depends on: cogos-dev/mod3#5 (ADR-082 Phase 1). Diff will be large until #5 merges; please review the top two commits (69dd70d, a5321ee) individually.

Commits

`69dd70d` — feat(dashboard): participant panel + auto-register on page load

- Polls GET /v1/sessions every 4s and renders participant badges with session_id, voice, and last_active
- Auto-registers on load via the kernel's /v1/channel-sessions/register (kernel-owned authority); falls back to mod3's direct /v1/sessions/register when the kernel is unreachable or the request is blocked by CORS (the expected path today, since the kernel has no CORS support yet)
- Stores session_id in sessionStorage so a refresh reuses it; best-effort deregister on beforeunload
- Self row highlighted as "you" in green
- Logs to the console and exposes window.__mod3Session for observability; emits a mod3-session-registered CustomEvent
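
The registration order described above (reuse a stored id, then the kernel, then mod3 direct) can be sketched as follows. The real logic lives in the dashboard's browser JS; this is a Python illustration of the decision order only. The storage key `mod3-session-id` and the two injected callables are hypothetical names, not taken from the PR.

```python
def resolve_session(storage, register_with_kernel, register_with_mod3):
    """Return (session_id, source) following the reuse -> kernel -> mod3 order.

    storage is any dict-like stand-in for sessionStorage; the two callables
    stand in for the POSTs to /v1/channel-sessions/register (kernel) and
    /v1/sessions/register (mod3). All three are assumptions for illustration.
    """
    key = "mod3-session-id"  # hypothetical storage key
    cached = storage.get(key)
    if cached:
        # A refresh stays on the same identity.
        return cached, "storage"
    try:
        session_id = register_with_kernel()
        source = "kernel"
    except Exception:
        # CORS failure or kernel down: fall back so a mod3-only setup still works.
        session_id = register_with_mod3()
        source = "mod3-direct"
    storage[key] = session_id
    return session_id, source
```

A second call with the same `storage` returns the cached id without contacting either service.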

`a5321ee` — feat(channels): /ws/audio/{session_id} WebSocket for per-session playback

- New audio_subscribers.py module: thread-safe subscriber registry + emit_wav(session_id, payload, header) fanout helper
- New GET /v1/sessions/{id}/subscribers endpoint returning {subscribed: bool, count: N}; the kernel's pre-afplay check uses this
- POST /v1/synthesize emits WAV over WebSocket to subscribers of the caller's session_id and adds an X-Mod3-WS-Subscribers response header; callers (kernel, MCP shim) honor it to avoid double-play
- Wire contract: one JSON audio_header frame followed by a single binary WAV frame per job. No chunking for Kokoro-length outputs.
- MCP shim gets a pre-play subscriber check; skips local sd.play when a WS subscriber exists
- Browser JS opens ws://host:7860/ws/audio/<sid>, decodes WAV via AudioContext.decodeAudioData, plays on first user gesture (autoplay policy)
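
To make the wire contract concrete, here is a minimal sketch of one job's two frames and a receiver that pairs them. Only the frame name `audio_header` and the header-then-binary ordering come from this PR; the header keys `session_id`, `job_id`, and `byte_length`, and the helper names, are assumptions for illustration.

```python
import io
import json
import wave


def make_wav(num_samples: int = 2400, sample_rate: int = 24000) -> bytes:
    """Build a short silent mono 16-bit WAV blob in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_samples)
    return buf.getvalue()


def frames_for_job(session_id: str, job_id: str, wav: bytes) -> list:
    """Return one job's frames: a JSON text frame, then one binary frame."""
    header = {
        "type": "audio_header",   # frame name from the PR
        "session_id": session_id,  # assumed key
        "job_id": job_id,          # assumed key
        "byte_length": len(wav),   # assumed key
    }
    return [json.dumps(header), wav]


class Receiver:
    """Pairs each audio_header text frame with the binary frame that follows."""

    def __init__(self):
        self.pending = None
        self.jobs = []

    def on_frame(self, frame):
        if isinstance(frame, str):
            msg = json.loads(frame)
            if msg.get("type") == "audio_header":
                self.pending = msg
            return
        if self.pending is None:
            return  # binary frame without a header: drop it
        header, self.pending = self.pending, None
        is_wav = frame[:4] == b"RIFF" and frame[8:12] == b"WAVE"
        if is_wav and len(frame) == header["byte_length"]:
            self.jobs.append((header, frame))
```

Because there is no chunking, the receiver needs no reassembly state beyond the single pending header.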

Architectural effect

- Dashboard is the primary audio sink for its own session when open
- Server-side afplay is now a fallback, not the default
- Per-session routing: mod3_speak(session_id=...) lands on the right browser
- Forward-compatible with Discord/REPL: the same subscriber pattern can add a Discord voice subscriber at /v1/sessions/{id}/subscribers/discord later

Known caveats (noted in commit bodies)

- Kernel CORS missing: the browser hits the mod3-direct register path today (still works; flagged for Wave 5)
- AudioContext needs a first user gesture to resume (one-shot click/keydown listeners installed)
- _session_has_ws_subscriber has a 1.5s timeout: if mod3's HTTP server wedges, each speak waits up to 1.5s before falling back to local playback
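
The last caveat is easier to reason about with the check written out. This is a hedged sketch of a pre-play check with a 1.5s timeout, not the code in mcp_shim. The base URL and function names are assumptions (the port is taken from the ws:// URL above), and the fetch is injectable so the fallback behaviour can be exercised without a server.

```python
import json
import urllib.request

MOD3_BASE_URL = "http://localhost:7860"  # assumed host; port from the PR's ws:// URL
CHECK_TIMEOUT_S = 1.5                    # timeout stated in the PR


def fetch_subscribers(session_id, timeout=CHECK_TIMEOUT_S):
    """GET /v1/sessions/{id}/subscribers and return the decoded JSON body."""
    url = f"{MOD3_BASE_URL}/v1/sessions/{session_id}/subscribers"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def session_has_ws_subscriber(session_id, fetch=fetch_subscribers):
    """True only when the endpoint positively reports subscribed=true.

    Any failure (no session, timeout, connection error, bad JSON) returns
    False so the caller keeps its legacy local-playback path.
    """
    if not session_id:
        return False
    try:
        body = fetch(session_id)
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        return False
    return isinstance(body, dict) and bool(body.get("subscribed"))
```

The cost of the failure path is bounded by the timeout, which is where the "1.5s per speak" figure comes from.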

Test plan

- 41 mod3 tests green (new: TestAudioSubscriberRegistry 6 cases, TestSubscribersEndpoint 2 cases, TestSynthesizeEmitsOverWS 2 cases)
- ruff clean
- Manual smoke test (documented in commit bodies): open dashboard, verify participant badge, trigger speak with session_id, confirm audio plays in the browser and not the speakers; close tab, confirm afplay fallback

Introduces SessionRegistry + GlobalSerializer + live output-device resolution so multiple concurrent agents/users can share one Mod3 instance without colliding on voice, queue, or speaker.

- session_registry.py: SessionChannel, voice-pool greedy allocation, per-session queues, round-robin/priority/fifo-global policies, live device re-query per playback (ADR-082 2026-04-22 amendment - no caching, macOS CoreAudio default tracked live).
- http_api.py: POST /v1/sessions/register, POST /v1/sessions/{id}/deregister, GET /v1/sessions, GET /v1/sessions/{id}. Synthesize honors the session's assigned voice when unspecified.
- server.py + mcp_shim.py: mirrored MCP tools (register_session, deregister_session, list_sessions) so stdio MCP callers get the same surface.
- Backward-compat: legacy callers without a session_id route to an implicit "default" session.

Out of scope (later ADR phases): input routing, barge-in state machine, native input provider.
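
As an illustration of what "voice-pool greedy allocation" could look like, here is a minimal sketch: each session takes its preferred voice if that voice is free, otherwise the first free voice in pool order. The class name, the `preferred` parameter, and the choice to return `None` when the pool is exhausted are all assumptions; the PR does not spell out session_registry.py at this level.

```python
import threading


class VoicePool:
    """Greedy voice allocation: preferred voice if free, else first free voice."""

    def __init__(self, voices):
        self._voices = list(voices)   # pool order defines greedy priority
        self._owner_by_voice = {}     # voice -> session_id
        self._lock = threading.Lock()

    def allocate(self, session_id, preferred=None):
        with self._lock:
            # Re-registering an existing session returns its current voice.
            for voice, owner in self._owner_by_voice.items():
                if owner == session_id:
                    return voice
            candidates = [preferred] if preferred in self._voices else []
            for voice in candidates + self._voices:
                if voice not in self._owner_by_voice:
                    self._owner_by_voice[voice] = session_id
                    return voice
            return None  # pool exhausted (assumed behaviour)

    def release(self, session_id):
        """Return the session's voice to the pool, e.g. on deregister."""
        with self._lock:
            for voice, owner in list(self._owner_by_voice.items()):
                if owner == session_id:
                    del self._owner_by_voice[voice]
                    return voice
            return None
```

Releasing on deregister is what lets a closed dashboard tab hand its voice back without waiting for a sweep.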

Regression: with MOD3_USE_COGOS_AGENT=1, agent_loop's success path returns before the local-inference path's send_response_complete, leaving the dashboard's isResponding spinner hung forever.

- channels.py: BrowserChannel.broadcast_response_complete(metrics, session_id) - thread-safe companion to broadcast_response_text, routes to the same channel that received the text frames.
- cogos_agent_bridge.py: on agent_response receipt, emit the complete frame after the text frame.
- demo/e2e_dashboard_harness.py + tests: updated to assert the completion frame fires on both code paths.

Wave 4.1 + 4.2 of the mod3-kernel integration. The dashboard now owns its own bus identity instead of being an anonymous WebSocket client.

On page load:

1. Reuse a session_id from sessionStorage (refreshes stay on the same identity).
2. Otherwise POST to the kernel's /v1/channel-sessions/register; ADR-082 Wave 3.5 says session-id minting is kernel-owned. On CORS / kernel-down the JS falls back to mod3's /v1/sessions/register direct so the dashboard keeps working in a mod3-only deployment.
3. Poll GET /v1/sessions every 4s, render the live roster.
4. On beforeunload, navigator.sendBeacon a best-effort deregister so the voice returns to the pool without waiting for a sweep.

The participant panel is a collapsible drawer keyed off a header pill (count + plural). Rows show participant_id, assigned_voice, session_id prefix, age, and participant_type badge. The "self" row is pulled to the top and highlighted with a green left-border + "you" pill.

window.__mod3Session is exposed for Wave 4.3: the audio WebSocket subscription will key off its session_id, and a "mod3-session-registered" CustomEvent fires when registration completes so late-loaded scripts can subscribe without polling.

Branching: stacked on feat/session-registry-adr-082-phase1 because the /v1/sessions endpoints this UI depends on only exist on that branch (Phase 1 of the session registry).

…back

Wave 4.3 mod3 side: route synthesized audio to the dashboard via a per-session WebSocket instead of (or in addition to) the server's sounddevice / afplay fallback.

New module: audio_subscribers.py

AudioSubscriberRegistry holds session_id → [subscriber] with register/unregister/has_subscribers/count/emit_wav. emit_wav pushes a JSON header frame + binary WAV frame through each subscriber's WebSocket via run_coroutine_threadsafe on the socket's event loop, matching the BrowserChannel.broadcast_trace_event pattern.

New endpoints (http_api.py):

- WS /ws/audio/{session_id}: accept + register + hold open
- GET /v1/sessions/{session_id}/subscribers: returns {"session_id": ..., "subscribed": bool, "count": N}. Unknown session_ids intentionally return subscribed=false instead of 404 so the kernel's pre-afplay check stays a single predicate.

/v1/synthesize now also emits the generated WAV over the WebSocket when the request names a session and at least one subscriber is attached. Emit is best-effort (disconnect mid-send just drops the frame) and non-blocking on the HTTP path. A new X-Mod3-WS-Subscribers response header reports how many subscribers received the blob; callers use this to skip their local playback.

mcp_shim._play_wav_bytes gains a pre-check (_session_has_ws_subscriber) that GETs /v1/sessions/{id}/subscribers with a 1.5s timeout. When subscribed=true we skip sounddevice entirely and record status=routed_ws in the job ledger. Keeps the legacy path unchanged when no session is attached or the HTTP check fails.

Dashboard wiring (dashboard/index.html):

A new IIFE opens ws://host:7860/ws/audio/<session_id> after the session-registered event fires, listens for audio_header + binary frames, and plays the WAV through AudioContext.decodeAudioData. Reconnect on close with exponential backoff up to 30s. The self-row audio-dot indicator flips green while the WS is up. AudioContext is resumed on first user gesture to satisfy the autoplay policy.
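
The reconnect schedule in the dashboard wiring can be sketched as a delay generator. Only the 30s cap comes from the commit message; the 1s starting delay and the doubling factor are assumptions, and the real implementation is browser JS.

```python
def reconnect_delays(base_s=1.0, cap_s=30.0, factor=2.0):
    """Yield an endless exponential backoff schedule, clamped at cap_s."""
    delay = base_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * factor, cap_s)
```

With these defaults the schedule climbs 1, 2, 4, 8, 16 seconds and then stays at 30. A caller would restart the generator after a successful connection so the next drop begins from the short delay again.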
Tests (tests/test_audio_subscribers.py):

- AudioSubscriberRegistry unit tests: register/unregister, multiple subscribers per session, empty-bucket pruning, emit_wav delivers header+bytes, no-subscriber returns zero, default registry is a shared singleton.
- HTTP tests via FastAPI TestClient: /subscribers returns false for unknown sessions, /ws/audio upgrade flips subscribed=true and disconnect flips it back to false.
- Integration test (guarded by SKIP_TTS_TESTS env var because it loads Kokoro): /v1/synthesize with a session + subscriber pushes a RIFF/WAVE binary frame through the WebSocket and reports X-Mod3-WS-Subscribers: 1.

All 32 existing session-registry tests + 9 new tests pass (1 skipped for Kokoro cold-start). Ruff clean.

Branch: feat/dashboard-wave4, stacked on feat/session-registry-adr-082-phase1 (Phase 1 /v1/sessions surface).
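
The registry shape and the cross-thread emit described in this commit can be sketched as below. The method names follow the commit message; the internals, the `FakeSocket` stand-in, and `demo()` are assumptions written for illustration, not the contents of audio_subscribers.py. `send_text` and `send_bytes` mirror the method names on Starlette's WebSocket.

```python
import asyncio
import json
import threading


async def _send(ws, header, payload):
    try:
        await ws.send_text(json.dumps(header))
        await ws.send_bytes(payload)
    except Exception:
        pass  # best-effort: a disconnect mid-send just drops the frame


class AudioSubscriberRegistry:
    """session_id -> list of (websocket, event loop), guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subs = {}

    def register(self, session_id, websocket, loop):
        with self._lock:
            self._subs.setdefault(session_id, []).append((websocket, loop))

    def unregister(self, session_id, websocket):
        with self._lock:
            kept = [s for s in self._subs.get(session_id, []) if s[0] is not websocket]
            if kept:
                self._subs[session_id] = kept
            else:
                self._subs.pop(session_id, None)  # prune empty buckets

    def count(self, session_id):
        with self._lock:
            return len(self._subs.get(session_id, []))

    def has_subscribers(self, session_id):
        return self.count(session_id) > 0

    def emit_wav(self, session_id, payload, header):
        """Schedule header + WAV on each subscriber's loop; return how many."""
        with self._lock:
            targets = list(self._subs.get(session_id, []))
        for ws, loop in targets:
            # Safe to call from a non-async thread such as an HTTP worker.
            asyncio.run_coroutine_threadsafe(_send(ws, header, payload), loop)
        return len(targets)


class FakeSocket:
    """Stand-in for a WebSocket; records frames instead of sending them."""

    def __init__(self):
        self.frames = []
        self.done = threading.Event()

    async def send_text(self, text):
        self.frames.append(text)

    async def send_bytes(self, data):
        self.frames.append(data)
        self.done.set()


def demo():
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    registry, sock = AudioSubscriberRegistry(), FakeSocket()
    registry.register("s1", sock, loop)
    delivered = registry.emit_wav("s1", b"RIFF....WAVE", {"type": "audio_header"})
    sock.done.wait(timeout=2.0)
    registry.unregister("s1", sock)
    loop.call_soon_threadsafe(loop.stop)
    return delivered, sock.frames, registry.count("s1")
```

Copying the target list under the lock and scheduling outside it keeps a slow or dead socket from blocking register/unregister calls on other threads.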

The playWavBlob ArrayBuffer-extraction ternary had one branch that used Uint8Array.slice() — which returns another Uint8Array, not an ArrayBuffer. decodeAudioData then throws "parameter 1 is not of type 'ArrayBuffer'". The fix: use buffer.buffer.slice(0) in the "view covers whole buffer" branch so both branches emit an ArrayBuffer copy. Found via live smoke test: dashboard connected cleanly, WebSocket received WAV frames, but every playback failed silently in the console while the kernel and mod3 both reported routed_ws success. The server side was correct; this was purely a browser-side type error.

…evice

The output-device <select> at the top of the dashboard was only calling _playback.setOutputDevice(), the legacy chat-path router. The Wave 4 per-session WebSocket path owns a separate AudioContext in the setupAudioSubscription IIFE and was ignoring the selection, so every session-routed playback landed on the system default even when the user had picked a different device in the UI.

Fix:

- setupAudioSubscription exposes window.__mod3AudioSink(deviceId). It remembers the selection in a closure and calls AudioContext.setSinkId once the context exists (Chrome 110+; older browsers silently skip).
- ensureAudioCtx now applies the pending sink or the current dropdown value at creation time, so the first playback after a fresh page load already honors the selection.
- The <select> change handler calls both _playback.setOutputDevice and window.__mod3AudioSink, keeping the legacy path and the WS path in sync.
- window.__mod3AudioCtx is set for diagnostic access (evaluate_script, debugger probes).

Found by Chaz during the Wave 4 smoke test: he set the dashboard output to MacBook Pro Speakers, but mod3_speak audio still played through the system-default Dell USB Audio. BrowserOS console confirmed the context had no sinkId set and the UI handler only logged.

Two linked fixes on top of the setSinkId wiring:

1. Timing race: populateOutputDevices is async; if the first mod3_speak arrives before enumeration completes, ensureAudioCtx reads a half-built dropdown and applies "default" even when the user had previously picked a different device. populateOutputDevices now pushes the resolved sink into window.__mod3AudioSink at the end of its own selection logic, so the context always gets bound to the final selection regardless of which race happens first.
2. Persistence: the selected device now round-trips through localStorage under key "mod3-output-device". populateOutputDevices consults the saved value before falling back to the "Default -" heuristic, so a reload keeps the user's previous choice instead of silently reverting.

Found by Chaz during the Wave 4 smoke test: his MacBook Pro selection survived reload (browser form-state restoration), but the sink stayed on "default" because the race fired first. Both paths now converge on the right sink.
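
The selection precedence in the persistence fix can be sketched as a pure function. The real code is browser JS inside populateOutputDevices; this Python version assumes the "Default -" heuristic means matching a device label that starts with "Default -", and the function name and (device_id, label) tuple shape are hypothetical.

```python
def pick_output_device(devices, saved_id=None):
    """Choose a sink id from (device_id, label) pairs.

    Precedence: saved choice if still present, then the first label that
    starts with "Default -" (assumed heuristic), then the literal "default".
    """
    ids = [device_id for device_id, _ in devices]
    if saved_id in ids:
        return saved_id
    for device_id, label in devices:
        if label.startswith("Default -"):
            return device_id
    return "default"
```

Checking that the saved id is still in the enumerated list covers the case where the device was unplugged between sessions.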

…examples-pr6 docs(mod3): scrub participant_id examples (PR #6 ancillary)

chazmaniandinkle added 7 commits April 23, 2026 11:46

This was referenced Apr 24, 2026

chore: add CONTRIBUTING.md #7

Draft

docs(mod3): scrub participant_id examples (PR #6 ancillary) #9

Merged

chazmaniandinkle added a commit that referenced this pull request Apr 24, 2026

Merge pull request #9 from chazmaniandinkle/fix/scrub-participant-id-…

1462de2

…examples-pr6 docs(mod3): scrub participant_id examples (PR #6 ancillary)

chazmaniandinkle merged commit 08d679f into cogos-dev:main Apr 24, 2026
7 checks passed

chazmaniandinkle merged 7 commits into cogos-dev:main from chazmaniandinkle:feat/dashboard-wave4