wslc: idle-terminate per-user session VMs when inactive #40781
benhillis wants to merge 3 commits into
Pull request overview
This PR adds on-demand creation and idle-termination of per-user WSLC session VMs (for sessions with persistent storage), so memory can be reclaimed while keeping the session object and storage intact. It also introduces VM-liveness/activity bookkeeping to prevent teardown during in-flight operations and adds new E2E coverage around VM lifecycle behavior.
Changes:
- Implement lazy VM bring-up and idle shutdown in `wslcsession` via an idle worker, activity counting/tokens, and a `VmLease` used by VM-requiring operations.
- Add client-side “operation keep-alive” usage in `wslc.exe` container operations to prevent VM teardown between `OpenContainer` and subsequent calls/streaming.
- Add a new E2E test suite validating lazy start, idle stop, persistence across restarts, keep-alive for root-namespace processes, and teardown/recreate races.
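The activity counting/token idea in the first bullet can be illustrated with a minimal RAII sketch. This is not the PR's code: `ActivityTracker`, `Token`, `Acquire`, and `IsIdle` are hypothetical names, and the real implementation presumably coordinates the idle check with the session lock, which this sketch omits.

```cpp
#include <atomic>

// Hypothetical sketch: counts in-flight activity so an idle worker can
// decide whether the VM may be torn down. Names are illustrative only.
class ActivityTracker {
public:
    // Move-only token: holding one keeps the tracker non-idle.
    class Token {
    public:
        explicit Token(ActivityTracker& tracker) : m_tracker(&tracker) {
            m_tracker->m_count.fetch_add(1);
        }
        Token(Token&& other) noexcept : m_tracker(other.m_tracker) {
            other.m_tracker = nullptr; // moved-from token releases nothing
        }
        Token(const Token&) = delete;
        Token& operator=(const Token&) = delete;
        Token& operator=(Token&&) = delete;
        ~Token() {
            if (m_tracker != nullptr) {
                m_tracker->m_count.fetch_sub(1);
            }
        }

    private:
        ActivityTracker* m_tracker;
    };

    Token Acquire() { return Token(*this); }

    // True when no tokens are outstanding.
    bool IsIdle() const { return m_count.load() == 0; }

private:
    std::atomic<int> m_count{0};
};
```

A caller would hold the token for the duration of a VM-requiring operation; once the last token is destroyed, `IsIdle()` returns true again and an idle worker could start its grace period.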
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/windows/wslc/e2e/WSLCE2EVmIdleTests.cpp | New E2E tests covering lazy VM start, idle stop, persistence, keep-alive, and race scenarios. |
| test/windows/wslc/e2e/WSLCE2EHelpers.h | Exposes the underlying IWSLCSession* for diagnostics/test-only calls. |
| src/windows/wslcsession/WSLCSession.h | Adds VM lifecycle state, idle worker/tokens/lease declarations, and new session methods. |
| src/windows/wslcsession/WSLCSession.cpp | Implements lazy VM creation, idle teardown, activity tokens, and VM diagnostics reporting. |
| src/windows/wslcsession/WSLCProcessControl.cpp | Preserves a real exit code when signaling container release, only synthesizing SIGKILL when needed. |
| src/windows/wslcsession/WSLCProcess.h | Stores a keep-alive token on root-namespace processes to keep the VM alive for their lifetime. |
| src/windows/wslcsession/WSLCContainer.cpp | Signals idle re-checks on terminal container transitions; holds a VM lease during delete. |
| src/windows/wslcsession/IORelay.h | Adds IsRelayThread() to safely avoid destroying the relay on its own thread. |
| src/windows/wslcsession/IORelay.cpp | Co-initializes the relay thread into the MTA; implements IsRelayThread(). |
| src/windows/wslc/services/SessionModel.h | Adds a helper to acquire/hold a keep-alive token for client-side container operations. |
| src/windows/wslc/services/ContainerService.cpp | Uses the keep-alive token across container operations (attach/start/stop/kill/delete/exec/etc.). |
| src/windows/service/inc/wslc.idl | Adds VM diagnostics type + new session methods for diagnostics and operation keep-alive. |
| src/windows/service/exe/WSLCSessionManager.cpp | Updates comments to reflect on-demand VM creation and recreation after idle termination. |
benhillis left a comment
Reviewed VM-related comments; all have been addressed:
- Comment on WSLCSession.h:84: Already correct. Lines 77-78 say "IWSLCVirtualMachineFactory" and "lazily on first use".
- Comment on WSLCSession.cpp:571: Fixed by AddRef/Release activity tracking. The idle worker checks ActivityCount (line 779), and container proxies increment it on the AddRef 1→2 transition. The VM will not idle-terminate while clients hold container proxies. See the comment at lines 672-674.
- Comment on WSLCSession.cpp:380: Already has exception handling. IdleWorker() is wrapped in CATCH_LOG() at lines 375-379.
- Comment on Session.cpp:65: Already correct. wil::unique_threadpool_wait (line 56) calls WaitForThreadpoolWaitCallbacks in its destructor automatically.
Per-user WSLC container session VMs now idle-terminate when no container is in a non-terminal (Created/Running) state, freeing host memory, and lazily restart on the next operation that needs the VM.
- Centralize VM lifecycle in WSLCSession via TearDownVmLockHeld / StartVmLockHeld and an atomic VmExitDisposition (Active / StopRequested / ExitClaimed) to arbitrate expected stops vs. spontaneous VM exits without a polling thread.
- Gate VM-requiring entrypoints behind AcquireVmLease(), which brings the VM up on demand and keeps it alive for the operation's duration.
- Add IWSLCSession::BeginContainerOperation so a CLI command can hold the VM alive across resolve + operate + streamed output.
- Preserve the session WarningCallback for the lifetime of the session so warnings emitted by the lazy VM start (e.g. resource recovery) are still delivered to the CLI invocation.
- Remove the dtor lock in HcsVirtualMachine; OnExit/OnCrash are lock-free.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
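One way an atomic three-state disposition can arbitrate between an expected stop and a spontaneous exit is a pair of compare-exchange transitions out of `Active`. The sketch below is an assumption about how such arbitration might look, not the PR's implementation: only the enum name and its three values come from the commit message, while `ExitArbiter`, `RequestStop`, `ClaimSpontaneousExit`, and `Reset` are hypothetical.

```cpp
#include <atomic>

// Enum values as named in the commit message; everything else is hypothetical.
enum class VmExitDisposition { Active, StopRequested, ExitClaimed };

class ExitArbiter {
public:
    // Teardown path: succeeds only if the VM has not already exited on its own.
    bool RequestStop() {
        auto expected = VmExitDisposition::Active;
        return m_state.compare_exchange_strong(expected, VmExitDisposition::StopRequested);
    }

    // Exit-callback path: succeeds only if no stop was requested first,
    // i.e. the exit was spontaneous and this caller should handle it.
    bool ClaimSpontaneousExit() {
        auto expected = VmExitDisposition::Active;
        return m_state.compare_exchange_strong(expected, VmExitDisposition::ExitClaimed);
    }

    // Called when a fresh VM is started.
    void Reset() { m_state.store(VmExitDisposition::Active); }

private:
    std::atomic<VmExitDisposition> m_state{VmExitDisposition::Active};
};
```

Because both transitions start from `Active`, exactly one of the two paths can win for a given VM instance, so no polling thread is needed to tell the cases apart.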
Only Running containers now hold an activity reference that keeps the
per-user session VM alive. Previously a container in either Created or
Running state held the reference, so a `create`d-but-never-started
container pinned the VM indefinitely and defeated idle termination.
A created container's metadata persists on the containerd VHD across VM
teardown and is rebuilt by RecoverExistingContainers on the next
VM-requiring operation, so create -> idle-terminate -> start later works;
the 30s grace period covers the common create-then-start gap.
Also fix m_stateChangedAt recovery for created containers: docker inspect
reports FinishedAt as the zero date ("0001-01-01T00:00:00Z") for a
never-started container, which parsed to year 1 and rendered as "created
2026 years ago". Use the container's Created time for the Created state.
This recovery path was previously unreachable, since created containers
never got torn down.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
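The timestamp fix described above amounts to choosing the state-change time by container state rather than always reading FinishedAt. A minimal sketch, assuming hypothetical names (`ContainerState`, `InspectTimes`, `PickStateChangedAt`) and assuming a running container uses its StartedAt time, which the commit message does not state:

```cpp
#include <string>

// Hypothetical types for illustration; not the PR's actual declarations.
enum class ContainerState { Created, Running, Exited };

struct InspectTimes {
    std::string created;    // container creation time
    std::string startedAt;  // zero date if never started
    std::string finishedAt; // zero date ("0001-01-01T00:00:00Z") if never finished
};

// Picks the timestamp that reflects when the container entered its current state.
std::string PickStateChangedAt(ContainerState state, const InspectTimes& times) {
    switch (state) {
    case ContainerState::Created:
        // A never-started container has a zero FinishedAt, so use Created.
        return times.created;
    case ContainerState::Running:
        return times.startedAt; // assumption: not stated in the commit message
    case ContainerState::Exited:
        return times.finishedAt;
    }
    return times.created;
}
```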
        // Copy and invoke outside the lock: OnIdle takes the session lock, and holding this
        // lock across that would invert the session-lock -> idle-lock ordering.
        onIdle = self->m_onIdle;
    }
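The snippet above follows a common pattern: copy the callback while holding the lock, then invoke the copy after releasing it, so the callback is free to take other locks (or re-enter this object) without inverting the lock order. A self-contained sketch of that pattern, using standard-library types and hypothetical names (`IdleNotifier`, `SetOnIdle`, `NotifyIdle`) rather than the PR's own:

```cpp
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical illustration of "copy under lock, invoke outside".
class IdleNotifier {
public:
    void SetOnIdle(std::function<void()> callback) {
        std::lock_guard<std::mutex> lock(m_lock);
        m_onIdle = std::move(callback);
    }

    void NotifyIdle() {
        std::function<void()> onIdle;
        {
            // Hold the lock only long enough to copy the callback.
            std::lock_guard<std::mutex> lock(m_lock);
            onIdle = m_onIdle;
        }
        // Invoked without m_lock held, so the callback may safely call
        // back into this object or acquire locks ordered before m_lock.
        if (onIdle) {
            onIdle();
        }
    }

private:
    std::mutex m_lock;
    std::function<void()> m_onIdle;
};
```

Had `NotifyIdle` invoked `m_onIdle` while still holding `m_lock`, a callback that called `SetOnIdle` would try to re-lock a non-recursive mutex.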
    // Acquires or releases the activity hold so it is held exactly while the container is in an
    // active (Created/Running) state, keeping the session's VM alive across idle teardown.
    __requires_lock_held(m_lock) void UpdateActivityHoldLockHeld() noexcept;
    // Held (non-empty) exactly while the container is Created/Running so the session's VM stays
    // alive even when no client holds the wrapper (e.g. a detached `run -d` container). Maintained
    // by UpdateActivityHoldLockHeld(); released automatically when the container is destroyed.
    ActivityRef m_activityHold;
Addresses review feedback: WSLCContainer.h still described the activity hold as held while Created/Running, but it now only pins the VM while Running. Update the two header comments to match the implementation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Idle-terminates a per-user WSLC session's backing VM when it has been inactive, freeing memory while the session object (and its persistent storage) lives on. The VM is transparently recreated on the next operation.
Builds on #40770 (IWSLCVirtualMachineFactory).
Behavior
Testing
- `WSLCE2EVmIdleTests` E2E suite (5 tests), including `WSLCE2E_VmIdle_RootProcessKeepsVmAlive`.
- `WSLCTests::CreateRootNamespaceProcess` still passes.
Notes / follow-ups (deferred)
Note
Draft for early review.