docs(mod3): scrub participant_id examples (PR #6 ancillary) by chazmaniandinkle · Pull Request #9 · cogos-dev/mod3

chazmaniandinkle · 2026-04-24T18:44:12Z

Replaces hardcoded 'slowbro' with generic 'alice' in MCP tool docstrings and JSON schemas (4 occurrences across http_api.py, mcp_shim.py, server.py). Designed to land alongside PR #6.

Introduces SessionRegistry + GlobalSerializer + live output-device resolution so multiple concurrent agents/users can share one Mod3 instance without colliding on voice, queue, or speaker.

- session_registry.py: SessionChannel, voice-pool greedy allocation, per-session queues, round-robin/priority/fifo-global policies, live device re-query per playback (ADR-082 2026-04-22 amendment: no caching, macOS CoreAudio default tracked live).
- http_api.py: POST /v1/sessions/register, POST /v1/sessions/{id}/deregister, GET /v1/sessions, GET /v1/sessions/{id}. Synthesize honors the session's assigned voice when unspecified.
- server.py + mcp_shim.py: mirrored MCP tools (register_session, deregister_session, list_sessions) so stdio MCP callers get the same surface.
- Backward-compat: legacy callers without a session_id route to an implicit "default" session.

Out of scope (later ADR phases): input routing, barge-in state machine, native input provider.
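The greedy voice-pool allocation and the implicit "default" session described above can be sketched roughly as follows. This is a minimal illustration, not the code in session_registry.py: the method names, the voice names, and the choice to assign no voice when the pool is exhausted are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional

DEFAULT_SESSION_ID = "default"  # name taken from the commit text


@dataclass
class SessionChannel:
    session_id: str
    participant_id: str
    assigned_voice: Optional[str]
    queue: Deque[str] = field(default_factory=deque)  # per-session job queue


class SessionRegistry:
    """Hypothetical sketch: the real class may differ in every signature."""

    def __init__(self, voice_pool: List[str]) -> None:
        self._voice_pool = list(voice_pool)
        self._sessions: Dict[str, SessionChannel] = {}

    def _allocate_voice(self) -> Optional[str]:
        # Greedy: hand out the first pool voice that no live session holds.
        in_use = {s.assigned_voice for s in self._sessions.values()}
        for voice in self._voice_pool:
            if voice not in in_use:
                return voice
        return None  # assumption: an exhausted pool yields no assigned voice

    def register(self, session_id: str, participant_id: str) -> SessionChannel:
        existing = self._sessions.get(session_id)
        if existing is not None:
            return existing  # re-registering is treated as idempotent here
        channel = SessionChannel(session_id, participant_id, self._allocate_voice())
        self._sessions[session_id] = channel
        return channel

    def deregister(self, session_id: str) -> bool:
        # Dropping the session implicitly returns its voice to the pool.
        return self._sessions.pop(session_id, None) is not None

    def resolve(self, session_id: Optional[str]) -> SessionChannel:
        # Legacy callers pass no session_id and land on the implicit default.
        sid = session_id or DEFAULT_SESSION_ID
        if sid not in self._sessions:
            if sid != DEFAULT_SESSION_ID:
                raise KeyError(sid)
            return self.register(DEFAULT_SESSION_ID, "legacy")
        return self._sessions[sid]
```

In this sketch a deregistered session's voice becomes available to the next caller immediately, because allocation is recomputed from the live session table instead of from a separate free list.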

Regression: with MOD3_USE_COGOS_AGENT=1, agent_loop's success path returns before the local-inference path's send_response_complete, leaving the dashboard's isResponding spinner hung forever.

- channels.py: BrowserChannel.broadcast_response_complete(metrics, session_id), a thread-safe companion to broadcast_response_text that routes to the same channel that received the text frames.
- cogos_agent_bridge.py: on agent_response receipt, emit the complete frame after the text frame.
- demo/e2e_dashboard_harness.py + tests: updated to assert the completion frame fires on both code paths.

Wave 4.1 + 4.2 of the mod3-kernel integration. The dashboard now owns its own bus identity instead of being an anonymous WebSocket client. On page load:

1. Reuse a session_id from sessionStorage (refreshes stay on the same identity).
2. Otherwise POST to the kernel's /v1/channel-sessions/register; ADR-082 Wave 3.5 says session-id minting is kernel-owned. On CORS / kernel-down the JS falls back to mod3's /v1/sessions/register direct so the dashboard keeps working in a mod3-only deployment.
3. Poll GET /v1/sessions every 4s, render the live roster.
4. On beforeunload, navigator.sendBeacon a best-effort deregister so the voice returns to the pool without waiting for a sweep.

The participant panel is a collapsible drawer keyed off a header pill (count + plural). Rows show participant_id, assigned_voice, session_id prefix, age, and participant_type badge. The "self" row is pulled to the top and highlighted with a green left-border + "you" pill.

window.__mod3Session is exposed for Wave 4.3: the audio WebSocket subscription will key off its session_id, and a "mod3-session-registered" CustomEvent fires when registration completes so late-loaded scripts can subscribe without polling.

Branching: stacked on feat/session-registry-adr-082-phase1 because the /v1/sessions endpoints this UI depends on only exist on that branch (Phase 1 of the session registry).

…back

Wave 4.3 mod3 side: route synthesized audio to the dashboard via a per-session WebSocket instead of (or in addition to) the server's sounddevice / afplay fallback.

New module: audio_subscribers.py

AudioSubscriberRegistry holds session_id → [subscriber] with register/unregister/has_subscribers/count/emit_wav. emit_wav pushes a JSON header frame + binary WAV frame through each subscriber's WebSocket via run_coroutine_threadsafe on the socket's event loop, matching the BrowserChannel.broadcast_trace_event pattern.

New endpoints (http_api.py):

- WS /ws/audio/{session_id}: accept + register + hold open
- GET /v1/sessions/{session_id}/subscribers: returns {"session_id": ..., "subscribed": bool, "count": N}. Unknown session_ids intentionally return subscribed=false instead of 404 so the kernel's pre-afplay check stays a single predicate.

/v1/synthesize now also emits the generated WAV over the WebSocket when the request names a session and at least one subscriber is attached. Emit is best-effort (disconnect mid-send just drops the frame) and non-blocking on the HTTP path. A new X-Mod3-WS-Subscribers response header reports how many subscribers received the blob; callers use this to skip their local playback.

mcp_shim._play_wav_bytes gains a pre-check (_session_has_ws_subscriber) that GETs /v1/sessions/{id}/subscribers with a 1.5s timeout. When subscribed=true we skip sounddevice entirely and record status=routed_ws in the job ledger. Keeps the legacy path unchanged when no session is attached or the HTTP check fails.

Dashboard wiring (dashboard/index.html):

A new IIFE opens ws://host:7860/ws/audio/<session_id> after the session-registered event fires, listens for audio_header + binary frames, and plays the WAV through AudioContext.decodeAudioData. Reconnect on close with exponential backoff up to 30s. The self-row audio-dot indicator flips green while the WS is up. AudioContext is resumed on first user gesture to satisfy the autoplay policy.

Tests (tests/test_audio_subscribers.py):

- AudioSubscriberRegistry unit tests: register/unregister, multiple subscribers per session, empty-bucket pruning, emit_wav delivers header+bytes, no-subscriber returns zero, default registry is a shared singleton.
- HTTP tests via FastAPI TestClient: /subscribers returns false for unknown sessions, /ws/audio upgrade flips subscribed=true and disconnect flips it back to false.
- Integration test (guarded by SKIP_TTS_TESTS env var because it loads Kokoro): /v1/synthesize with a session + subscriber pushes a RIFF/WAVE binary frame through the WebSocket and reports X-Mod3-WS-Subscribers: 1.

All 32 existing session-registry tests + 9 new tests pass (1 skipped for Kokoro cold-start). Ruff clean.

Branch: feat/dashboard-wave4, stacked on feat/session-registry-adr-082-phase1 (Phase 1 /v1/sessions surface).
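To make the registry behaviour described above concrete, here is a rough sketch of a session_id → [subscriber] map whose emit_wav schedules a header frame and a binary frame on each subscriber's event loop from a worker thread. It is an illustration under assumptions, not the contents of audio_subscribers.py: the Subscriber wrapper, the header fields other than the audio_header type, and the return value (number of sends scheduled) are hypothetical. send_json and send_bytes are the Starlette/FastAPI WebSocket methods; any object exposing the same two coroutines works.

```python
import asyncio
import threading
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(eq=False)  # identity comparison: two wrappers are never "equal"
class Subscriber:
    ws: Any                          # object with async send_json / send_bytes
    loop: asyncio.AbstractEventLoop  # loop that owns the WebSocket


async def _push(ws: Any, header: dict, wav_bytes: bytes) -> None:
    # Header first, then the binary payload, so the client can pair them.
    await ws.send_json(header)
    await ws.send_bytes(wav_bytes)


class AudioSubscriberRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._subs: Dict[str, List[Subscriber]] = {}

    def register(self, session_id: str, sub: Subscriber) -> None:
        with self._lock:
            self._subs.setdefault(session_id, []).append(sub)

    def unregister(self, session_id: str, sub: Subscriber) -> None:
        with self._lock:
            bucket = self._subs.get(session_id, [])
            if sub in bucket:
                bucket.remove(sub)
            if not bucket:
                self._subs.pop(session_id, None)  # prune the empty bucket

    def count(self, session_id: str) -> int:
        with self._lock:
            return len(self._subs.get(session_id, []))

    def has_subscribers(self, session_id: str) -> bool:
        return self.count(session_id) > 0

    def emit_wav(self, session_id: str, wav_bytes: bytes) -> int:
        # Snapshot under the lock, send outside it so callers never block.
        with self._lock:
            targets = list(self._subs.get(session_id, []))
        # Assumption: only the "audio_header" type comes from the PR text.
        header = {"type": "audio_header", "bytes": len(wav_bytes)}
        scheduled = 0
        for sub in targets:
            coro = _push(sub.ws, header, wav_bytes)
            try:
                asyncio.run_coroutine_threadsafe(coro, sub.loop)
                scheduled += 1
            except RuntimeError:
                coro.close()  # loop already closed: drop the frame
        return scheduled
```

Because emit_wav only schedules the sends and never waits on the returned futures, a subscriber that disconnects mid-send fails inside its own loop and the calling HTTP path is unaffected, which mirrors the best-effort, non-blocking behaviour the commit describes.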

The playWavBlob ArrayBuffer-extraction ternary had one branch that used Uint8Array.slice() — which returns another Uint8Array, not an ArrayBuffer. decodeAudioData then throws "parameter 1 is not of type 'ArrayBuffer'". The fix: use buffer.buffer.slice(0) in the "view covers whole buffer" branch so both branches emit an ArrayBuffer copy. Found via live smoke test: dashboard connected cleanly, WebSocket received WAV frames, but every playback failed silently in the console while the kernel and mod3 both reported routed_ws success. The server side was correct; this was purely a browser-side type error.

…evice

The output-device <select> at the top of the dashboard was only calling _playback.setOutputDevice(), the legacy chat-path router. The Wave 4 per-session WebSocket path owns a separate AudioContext in the setupAudioSubscription IIFE and was ignoring the selection, so every session-routed playback landed on the system default even when the user had picked a different device in the UI.

Fix:

- setupAudioSubscription exposes window.__mod3AudioSink(deviceId). It remembers the selection in a closure and calls AudioContext.setSinkId once the context exists (Chrome 110+; older browsers silently skip).
- ensureAudioCtx now applies the pending sink or the current dropdown value at creation time, so the first playback after a fresh page load already honors the selection.
- The <select> change handler calls both _playback.setOutputDevice and window.__mod3AudioSink, keeping the legacy path and the WS path in sync.
- window.__mod3AudioCtx is set for diagnostic access (evaluate_script, debugger probes).

Found by Chaz during the Wave 4 smoke test: he set the dashboard output to MacBook Pro Speakers, but mod3_speak audio still played through the system-default Dell USB Audio. BrowserOS console confirmed the context had no sinkId set and the UI handler only logged.

Two linked fixes on top of the setSinkId wiring:

1. Timing race: populateOutputDevices is async; if the first mod3_speak arrives before enumeration completes, ensureAudioCtx reads a half-built dropdown and applies "default" even when the user had previously picked a different device. populateOutputDevices now pushes the resolved sink into window.__mod3AudioSink at the end of its own selection logic, so the context always gets bound to the final selection regardless of which race happens first.
2. Persistence: the selected device now round-trips through localStorage under key "mod3-output-device". populateOutputDevices consults the saved value before falling back to the "Default -" heuristic, so a reload keeps the user's previous choice instead of silently reverting.

Found by Chaz during the Wave 4 smoke test: his MacBook Pro selection survived reload (browser form-state restoration), but the sink stayed on "default" because the race fired first. Both paths now converge on the right sink.

chazmaniandinkle added 8 commits April 23, 2026 11:46

docs(mod3): use generic example identifier in MCP tool schemas

ae167d6

chazmaniandinkle marked this pull request as ready for review April 24, 2026 22:31

chazmaniandinkle merged commit 1462de2 into cogos-dev:main Apr 24, 2026
7 checks passed

chazmaniandinkle merged 8 commits into cogos-dev:main from chazmaniandinkle:fix/scrub-participant-id-examples-pr6
