CLI: Fix hang with exec/run -i with self-exiting command and open stdin#40921
CLI: Fix hang with exec/run -i with self-exiting command and open stdin#40921dkbennett wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a client-side hang in wslc container exec -i/-it when stdin is redirected to a synchronous (non-overlapped) pipe and the remote process exits without consuming stdin. The fix ensures the stdin relay worker can always be interrupted during teardown, matching the expected docker exec -i behavior.
Changes:
- Add a shared teardown helper in
ConsoleServicethat signals the exit event and repeatedly callsCancelSynchronousIobefore joining the stdin relay thread. - Treat
ERROR_OPERATION_ABORTEDfromReadFileas EOF in the relay read path so cancellation cleanly terminates the relay loop. - Add two E2E regression tests covering
exec -iandexec -itwith a self-exiting command while keeping stdin open.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/windows/wslc/e2e/WSLCE2EContainerExecTests.cpp | Adds regression tests asserting exec -i/-it echo hello exits without requiring client stdin closure. |
| src/windows/wslc/services/ConsoleService.cpp | Introduces teardown logic to interrupt synchronous stdin reads (via CancelSynchronousIo) before joining relay worker threads. |
| src/windows/common/relay.cpp | Maps ERROR_OPERATION_ABORTED from ReadFile to EOF (0 bytes) to allow canceled reads to end relay loops cleanly. |
| const auto threadHandle = static_cast<HANDLE>(inputThread.native_handle()); | ||
| DWORD waitResult = WAIT_TIMEOUT; | ||
| while (waitResult == WAIT_TIMEOUT) | ||
| { | ||
| if (!CancelSynchronousIo(threadHandle)) | ||
| { | ||
| // ERROR_NOT_FOUND just means nothing was in flight to cancel (expected); any other error | ||
| // means the cancel cannot make progress and would otherwise spin this loop, so throw. | ||
| const auto cancelError = GetLastError(); | ||
| THROW_WIN32_IF(cancelError, cancelError != ERROR_NOT_FOUND); | ||
| } | ||
|
|
||
| waitResult = WaitForSingleObject(threadHandle, 50); | ||
| } | ||
|
|
||
| // The loop only breaks on a non-timeout result; anything but WAIT_OBJECT_0 (e.g. WAIT_FAILED) is an | ||
| // error -- surface it instead of join()ing on a thread we can't confirm has exited. | ||
| THROW_LAST_ERROR_IF(waitResult != WAIT_OBJECT_0); | ||
|
|
||
| inputThread.join(); | ||
| } |
There was a problem hiding this comment.
Only real thing to do here is change to a failfast but its going to crash either way. It will not "mask the original failure" because the scope exit log will log the failure first, so this is really not a "high" priority issue, it's just a discussion about whether we want a failfast crash or an abort() crash w/ a log message.
Summary of the Pull Request
wslc exec -i <container> echo hellodidn't return until the client's stdin was closed, while docker exec -i returns as soon as the process exits. The hang is entirely client‑side.When wslc's stdin is a pipe (not a console), the stdin relay runs on a worker thread blocked in a synchronous ReadFile . That read can't be interrupted by the relay's exit event. So when the remote process exits and teardown tries to join() the worker, the join blocks until stdin is closed externally — the hang.
The fix
A shared teardown helper, InterruptAndJoinInputThread (used by both the TTY and non‑TTY relay paths), now signals the exit event and then calls CancelSynchronousIo on the worker (retried until it exits) before joining — which aborts the otherwise‑uninterruptible read. A matching one‑line change in relay.cpp treats the resulting ERROR_OPERATION_ABORTED as EOF so the relay loop ends cleanly. Termination is guaranteed: every state the worker can block in is broken by either the cancel or the already‑set exit event.
Two regression tests are also added which exercise this hang and I confirmed these tests fail without the fix, and pass with the fix.
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed
Tests
Two regression tests run a self‑exiting command ( echo hello ) and assert the process exits before stdin is closed — one for exec -i , one for exec -it , since the two paths have separate teardown logic that could regress independently. Both detect the symptom (process must exit within the timeout), so any teardown hang fails the test.
These tests fail without the fix and pass with the fix.