Skip to content

CLI: Fix hang with exec/run -i with self-exiting command and open stdin#40921

Draft
dkbennett wants to merge 3 commits into
masterfrom
user/dkbennett/execstdin
Draft

CLI: Fix hang with exec/run -i with self-exiting command and open stdin#40921
dkbennett wants to merge 3 commits into
masterfrom
user/dkbennett/execstdin

Conversation

@dkbennett

Copy link
Copy Markdown
Member

Summary of the Pull Request

wslc exec -i <container> echo hello  didn't return until the client's stdin was closed, while  docker exec -i  returns as soon as the process exits. The hang is entirely client‑side.

When wslc's stdin is a pipe (not a console), the stdin relay runs on a worker thread blocked in a synchronous  ReadFile . That read can't be interrupted by the relay's exit event. So when the remote process exits and teardown tries to  join()  the worker, the join blocks until stdin is closed externally — the hang.

The fix

A shared teardown helper,  InterruptAndJoinInputThread  (used by both the TTY and non‑TTY relay paths), now signals the exit event and then calls  CancelSynchronousIo  on the worker (retried until it exits) before joining — which aborts the otherwise‑uninterruptible read. A matching one‑line change in  relay.cpp  treats the resulting  ERROR_OPERATION_ABORTED  as EOF so the relay loop ends cleanly. Termination is guaranteed: every state the worker can block in is broken by either the cancel or the already‑set exit event.

Two regression tests are also added which exercise this hang and I confirmed these tests fail without the fix, and pass with the fix.

PR Checklist

  • Closes: Link to issue #xxx
  • Communication: I've discussed this with core contributors already. If work hasn't been agreed, this work might be rejected
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized
  • Dev docs: Added/updated if needed
  • Documentation updated: If checked, please file a pull request on our docs repo and link it here: #xxx

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

Tests

Two regression tests run a self‑exiting command ( echo hello ) and assert the process exits before stdin is closed — one for  exec -i , one for  exec -it , since the two paths have separate teardown logic that could regress independently. Both detect the symptom (process must exit within the timeout), so any teardown hang fails the test.

These tests fail without the fix and pass with the fix.

Copilot AI review requested due to automatic review settings June 25, 2026 22:51

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a client-side hang in wslc container exec -i/-it when stdin is redirected to a synchronous (non-overlapped) pipe and the remote process exits without consuming stdin. The fix ensures the stdin relay worker can always be interrupted during teardown, matching the expected docker exec -i behavior.

Changes:

  • Add a shared teardown helper in ConsoleService that signals the exit event and repeatedly calls CancelSynchronousIo before joining the stdin relay thread.
  • Treat ERROR_OPERATION_ABORTED from ReadFile as EOF in the relay read path so cancellation cleanly terminates the relay loop.
  • Add two E2E regression tests covering exec -i and exec -it with a self-exiting command while keeping stdin open.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
test/windows/wslc/e2e/WSLCE2EContainerExecTests.cpp Adds regression tests asserting exec -i/-it echo hello exits without requiring client stdin closure.
src/windows/wslc/services/ConsoleService.cpp Introduces teardown logic to interrupt synchronous stdin reads (via CancelSynchronousIo) before joining relay worker threads.
src/windows/common/relay.cpp Maps ERROR_OPERATION_ABORTED from ReadFile to EOF (0 bytes) to allow canceled reads to end relay loops cleanly.

Comment thread test/windows/wslc/e2e/WSLCE2EContainerExecTests.cpp Outdated
Comment thread src/windows/wslc/services/ConsoleService.cpp Outdated
Comment thread src/windows/wslc/services/ConsoleService.cpp Outdated
Comment thread src/windows/common/relay.cpp
Comment thread src/windows/wslc/services/ConsoleService.cpp Outdated
Comment thread src/windows/wslc/services/ConsoleService.cpp Outdated
@dkbennett dkbennett changed the title CLI: Fix hang with exec -i with self-exiting command and open stdin CLI: Fix hang with exec/run -i with self-exiting command and open stdin Jun 26, 2026
Copilot AI review requested due to automatic review settings June 26, 2026 19:19

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment on lines +46 to +66
const auto threadHandle = static_cast<HANDLE>(inputThread.native_handle());
DWORD waitResult = WAIT_TIMEOUT;
while (waitResult == WAIT_TIMEOUT)
{
if (!CancelSynchronousIo(threadHandle))
{
// ERROR_NOT_FOUND just means nothing was in flight to cancel (expected); any other error
// means the cancel cannot make progress and would otherwise spin this loop, so throw.
const auto cancelError = GetLastError();
THROW_WIN32_IF(cancelError, cancelError != ERROR_NOT_FOUND);
}

waitResult = WaitForSingleObject(threadHandle, 50);
}

// The loop only breaks on a non-timeout result; anything but WAIT_OBJECT_0 (e.g. WAIT_FAILED) is an
// error -- surface it instead of join()ing on a thread we can't confirm has exited.
THROW_LAST_ERROR_IF(waitResult != WAIT_OBJECT_0);

inputThread.join();
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only real thing to do here is change to a failfast but its going to crash either way. It will not "mask the original failure" because the scope exit log will log the failure first, so this is really not a "high" priority issue, it's just a discussion about whether we want a failfast crash or an abort() crash w/ a log message.

@dkbennett dkbennett marked this pull request as ready for review June 26, 2026 22:50
@dkbennett dkbennett requested a review from a team as a code owner June 26, 2026 22:50
@dkbennett dkbennett marked this pull request as draft June 26, 2026 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants