Make Call object reusable after leave()#1605
Make Call object reusable after leave()#1605aleksandar-apostolov wants to merge 13 commits intodevelopfrom
Conversation
Enable Call objects to be rejoined after calling leave() by making coroutine scopes recreatable. This fixes the "scope canceled" error that occurred when integrations cached Call objects and tried to rejoin after leaving. Changes: - Make scopes recreatable: Change scope, supervisorJob, scopeProvider from val to var in Call.kt - Add reset() method to ScopeProvider interface and implementation - Add resetScopes() method called after cleanup() to recreate scopes - Reset isDestroyed flag to allow rejoin - Keep Call objects in StreamVideoClient.calls map after leave() - Remove destroyedCalls workaround mechanism (no longer needed) - Remove deliverIntentToDestroyedCalls() and notifyDestroyedCalls() - Deprecate callUpdatesAfterLeave parameter (now always enabled) Benefits: - Call objects can be reused: join() -> leave() -> join() works - Fixes duplicate Call object creation issue - Eliminates need for workaround code (destroyedCalls, etc.) - CallState continues receiving updates after leave() - Backwards compatible: deprecated parameter kept for binary compat 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
… rejoin MediaManager was storing a reference to the scope passed at initialization, which became stale after scope recreation in resetScopes(). This caused video and audio to not work when rejoining after leave(). Changes: - Remove scope parameter from MediaManagerImpl constructor - Add scope as a computed property that accesses call.scope dynamically - Update Call.kt to not pass scope when creating MediaManagerImpl This ensures MediaManager always uses the current active scope, even after it's been recreated by resetScopes(). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When rejoining after leave(), the Call was reusing the same sessionId and unifiedSessionId. The SFU server treats these as identifying a specific session, so reusing them prevented proper track subscription and RTC connection establishment. Changes: - Change unifiedSessionId from val to var to allow regeneration - Reset both sessionId and unifiedSessionId in resetScopes() - Generate new UUIDs to ensure fresh session identity on rejoin This ensures the server treats each rejoin as a new session, allowing proper peer connection setup and track subscription. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
After leave(), device managers (camera, microphone, speaker, screen share) retained their enabled/disabled status. On rejoin, updateMediaManagerFromSettings() only initializes devices with NotSelected status, so video/audio wasn't set up. Changes: - Add reset() method to MediaManagerImpl to reset all device statuses - Change device _status fields from private to internal for reset access - Call mediaManager.reset() in resetScopes() after cleanup - Add reset methods to SpeakerManager and CameraManager for consistency This ensures video and audio tracks get properly recreated and configured when rejoining after leave(). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
After leave(), participant state with disposed video/audio tracks remained in CallState.internalParticipants. On rejoin, new tracks were received but the UI was still observing old ParticipantState objects with disposed tracks, causing remote participants to not be visible. Changes: - Call state.clearParticipants() in resetScopes() to remove stale participants - Ensures fresh ParticipantState objects are created on rejoin with new tracks - Remote participant video and audio now properly render after rejoin 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
cleanup() and resetScopes() were running asynchronously in a launched coroutine. If join() was called immediately after leave(), it could run before cleanup/reset completed, causing stale state and preventing proper subscriber setup. Changes: - Add cleanupJob to track the async cleanup coroutine - Make _join() wait for cleanupJob to complete before proceeding - Clear cleanupJob after waiting This ensures all cleanup and reset completes before allowing rejoin, fixing audio subscription issues. Video tracks still need investigation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Restore deprecated `scope` parameter in MediaManagerImpl constructor - Make reset() methods internal to avoid public API additions - Remove empty shutDownJobsGracefully() function - Fix documentation comment in MicrophoneManager Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PR checklist ✅All required conditions are satisfied:
🎉 Great job! This PR is ready for review. |
WalkthroughThe changes refactor Call lifecycle management to support call object reuse across multiple sessions. Key modifications include making core scope properties mutable, introducing cleanup job tracking, reworking cleanup and reset semantics to preserve StateFlow lifetimes, adding reset methods to media managers and scope providers, and removing LruCache-based call destruction tracking. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
stream-video-android-core/src/main/kotlin/io/getstream/video/android/core/Call.kt (1)
974-985: Ensure resetScopes always runs, even if cleanup fails.If
cleanup()throws or the coroutine is cancelled mid‑cleanup,resetScopes()won’t run, leavingisDestroyedtrue and potentially blocking future leaves/rejoins. Atry/finallyhere keeps the object reusable even on partial failures.🐛 Suggested fix
cleanupJob = clientImpl.scope.launch { - safeCall { - session?.sfuTracer?.trace( - "leave-call", - "[reason=$reason, error=${disconnectionReason?.message}]", - ) - val stats = collectStats() - session?.sendCallStats(stats) - } - cleanup() - resetScopes() + try { + safeCall { + session?.sfuTracer?.trace( + "leave-call", + "[reason=$reason, error=${disconnectionReason?.message}]", + ) + val stats = collectStats() + session?.sendCallStats(stats) + } + cleanup() + } finally { + resetScopes() + } }As per coding guidelines, ensure cleanup/teardown paths handle cancellation and failure.
Also applies to: 1512-1546
🧹 Nitpick comments (2)
stream-video-android-core/src/main/kotlin/io/getstream/video/android/core/MediaManager.kt (2)
553-556: Reset should clear stale microphone state beyond status.Consider also clearing
priorStatusand device selections so rejoin doesn’t inherit old pause/resume or routing state.♻️ Suggested tweak
internal fun reset() { _status.value = DeviceStatus.NotSelected + priorStatus = null + _selectedDevice.value = null + nonHeadsetFallbackDevice = null }Also applies to: 722-727
1401-1410: Delegate to the new reset helpers to avoid duplication.Now that each manager exposes
reset(), using them here keeps reset behavior centralized and less error‑prone if those methods evolve.♻️ Suggested tweak
internal fun reset() { - camera._status.value = DeviceStatus.NotSelected - microphone._status.value = DeviceStatus.NotSelected - speaker._status.value = DeviceStatus.NotSelected - screenShare._status.value = DeviceStatus.NotSelected + camera.reset() + microphone.reset() + screenShare.reset() + speaker._status.value = DeviceStatus.NotSelected }
SDK Size Comparison 📏
|
| * StateFlows depend on the original scope. The scope lives for the entire lifetime of | ||
| * the Call object. | ||
| */ | ||
| private fun resetScopes() { |
There was a problem hiding this comment.
I think we should use mutex to re-init scopes and update isDestroyed to prevent race-condition
There was a problem hiding this comment.
Yes, we should
- Add lifecycleLock to synchronize join/leave lifecycle operations - Protect isDestroyed and cleanupJob access with synchronized blocks - Move atomicLeave reset to after successful join to prevent race conditions - Fix MicrophoneManager.setup() to restart audioHandler after cleanup - Fix AudioSwitchHandler.stop() to reset isAudioSwitchInitScheduled flag Fixes: speakerphone not working after rejoin Fixes: CoroutineCancellationException during rapid join/leave cycles Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
This pull request has been automatically marked as stale because it has been inactive for 14 days. It will be closed in 7 days if no further activity occurs. |
Fixes a race condition where join() could proceed before cleanup completed: 1. Merged two separate synchronized blocks in internalLeave() into one. Previously, cleanupJob was assigned in a second synchronized block, creating a window where join() could read cleanupJob as null and proceed without waiting for cleanup. 2. Added robust while-loop in _join() to handle edge case where resetScopes() clears cleanupJob but a new leave() sets it again. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
If cleanup is stuck or takes too long, _join() will now fail after ~10 seconds with a clear error instead of hanging forever. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
| /** | ||
| * Resets state to allow the Call to be reusable after leave(). | ||
| * Generates new session IDs, resets the scopeProvider, clears participants, and resets device statuses. | ||
| * | ||
| * IMPORTANT: We do NOT recreate [scope] or [supervisorJob] because [CallState] and its | ||
| * StateFlows depend on the original scope. The scope lives for the entire lifetime of | ||
| * the Call object. |
There was a problem hiding this comment.
Risk Analysis: Not cancelling call.scope in call.leave()
The problem
We want to make the Call class reusable after leave(). But scope.cancel() in shutDownJobsGracefully() cancels all Flow.stateIn() operators — they cannot be restarted.
If we remove scope.cancel(), we introduce risks in already stable code.
Risk: Zombie coroutines
call.scope is not just used in Call.kt. It is passed down to multiple components that internally launch their own coroutines:
If scope is not cancelled, we need to micromanage every usage of scope running after leave() — Coordinator Event processing/SFU event processing, SFU event listeners, ICE monitoring, stats reporting, audio processing, and more. I am aware we are handling them but introducing micromanag does not not appear scalable
These are zombie operations on a dead session.
Approaches to solve this
Approach 1: Manually cancel each scope.launch
Go through every scope.launch{} usage and cancel them individually in leave().
Downside: Fragile. Easy to miss one. Every new feature that adds a scope.launch needs to remember to add cancellation. Not scalable.
Approach 2: Create a child scope for session work, keep scope for stateIn
Create a new sessionScope (child of scope). Pass it to all components. Cancel only sessionScope in leave(). stateIn flows stay on scope and survive.
Downside: We need to audit every usage of scope and decide — does this go on scope or sessionScope? High risk of miscategorizing. Changes the cancellation behavior of existing stable code.
Approach 3: Create a new scope just for stateIn flows ✅
Keep scope exactly as-is. Create a new stateScope only for the stateIn calls that need to survive.
scope — untouched. Cancelled in leave(). All existing scope.launch code works exactly as today.
stateScope — new. Only the ~6 stateIn calls move here.
Benefit: Existing code path has zero behavior change. If we miss moving a stateIn, it stays on scope and gets cancelled like today — safe default.
Trade-off: stateScope stays alive after leave(). But it only holds StateFlows with their last emitted value — no active work, no upstream collection, no side effects. For non-reusable calls, we can cancel it in cleanup().
Approach 3b: Replace stateIn with a restartable operator ✅
Instead of managing scope lifetimes, replace stateIn with RestartableStateFlow + RestartableScope:
RestartableScope — wraps a coroutine scope. Can be cancelled and restarted.
RestartableStateFlow — registers itself with RestartableScope. On restart, automatically re-subscribes to upstream. Preserves last emitted value across cancel/restart.
Benefit:
shutDownJobsGracefully() still cancels everything (including RestartableScope) — existing behavior preserved
On join(), calling restartableScope.restart() brings all stateIn flows back to life automatically
Change surface is small — only the ~6 stateIn call sites change
No new scope that lives forever
Trade-off: Introduces a new abstraction (RestartableStateFlow). Needs to be well-tested.
Recommendation
I'd lean towards Approach 3b (restartable operator). It keeps the existing cancel-everything behavior intact, isolates the change to a small surface area, and solves the reuse problem cleanly.
Happy to discuss implementation details in a separate thread.
There was a problem hiding this comment.
Approach 3b cannot work because the state flows need to be observable even if the call is joined/left multiple times even if you are not in a session you still need to be able to get updates via call.get() or client.queryCalls.


Goal
Make
Callobjects reusable afterleave()so integrations that cache Call objects can rejoin without creating new instances. Currently, callingleave()cancels coroutine scopes making the Call unusable.Implementation
resetScopes()method called after cleanup to reset state for rejoin:ScopeProviderto allow executor recreationcleanupJobto prevent race conditions between leave cleanup and rapid rejoincallsmap after leave (removedestroyedCallscache)MediaManager.scopea dynamic getter fromcall.scopescopeparameter inMediaManagerImplTesting
./gradlew apiCheckto confirm no breaking API changes