ThreadingWin: force Thread::suspend to be synchronous#183
Conversation
On multi-core Windows systems, SuspendThread merely requests a suspension from the kernel -- the target thread may continue to execute user-mode instructions briefly after SuspendThread returns, until the kernel reschedules it away. Callers that read registers (MachineStackMarker) happen to force a sync by calling GetThreadContext right after suspend, but callers that only want the target stopped -- notably the libpas scavenger's pasSuspenderBeginSuspend path -- would otherwise race the mutator: the scavenger decommits TLC pages while the mutator is still writing to them, violating pas_thread_suspender.h's contract and causing intermittent heap corruption or access violations. Call GetThreadContext once with CONTEXT_INTEGER immediately after a successful SuspendThread. GetThreadContext synchronizes with the scheduler and does not return until the target is actually stopped. On failure (rare -- handle invalidation), undo the SuspendThread and surface the error code so the caller sees a clean Expected failure instead of a half-suspended thread. Add a regression test in TestWebKitAPI: a worker busy-increments a counter; while suspended, the counter must not advance. The test is cross-platform (Darwin mach_thread_suspend and Linux signal-based suspend are already synchronous) and would have flagged this bug had it existed then.
Preview Builds
|
WalkthroughThis change enhances the Windows thread suspension mechanism by adding synchronous verification. The Changes
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Source/WTF/wtf/win/ThreadingWin.cpp`:
- Around line 220-221: The check after SuspendThread in ThreadingWin.cpp
incorrectly returns the sentinel value (DWORD)-1 instead of the real error;
replace the early return that calls makeUnexpected(result) when result ==
(DWORD)-1 with code that captures the real error via GetLastError() and returns
makeUnexpected(errorCode) (i.e., call GetLastError() into a DWORD and pass that
to makeUnexpected). Update the SuspendThread failure branch so callers like
MachineStackMarker.cpp receive the actual OS error code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7cd3cf05-dcba-4204-96c7-3ff7b9ec70ac
📒 Files selected for processing (2)
Source/WTF/wtf/win/ThreadingWin.cppTools/TestWebKitAPI/Tests/WTF/Threading.cpp
| if (result == (DWORD)-1) | ||
| return makeUnexpected(result); |
There was a problem hiding this comment.
Inconsistent error handling: return actual error code, not the failure sentinel.
When SuspendThread fails, result is (DWORD)-1, which is the failure sentinel—not the actual error code. This is inconsistent with line 232 where GetLastError() is correctly called. Callers like MachineStackMarker.cpp may inspect the error value for diagnostics.
🐛 Proposed fix
DWORD result = SuspendThread(m_handle);
- if (result == (DWORD)-1)
- return makeUnexpected(result);
+ if (result == (DWORD)-1) {
+ DWORD error = GetLastError();
+ return makeUnexpected(error);
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (result == (DWORD)-1) | |
| return makeUnexpected(result); | |
| if (result == (DWORD)-1) { | |
| DWORD error = GetLastError(); | |
| return makeUnexpected(error); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Source/WTF/wtf/win/ThreadingWin.cpp` around lines 220 - 221, The check after
SuspendThread in ThreadingWin.cpp incorrectly returns the sentinel value
(DWORD)-1 instead of the real error; replace the early return that calls
makeUnexpected(result) when result == (DWORD)-1 with code that captures the real
error via GetLastError() and returns makeUnexpected(errorCode) (i.e., call
GetLastError() into a DWORD and pass that to makeUnexpected). Update the
SuspendThread failure branch so callers like MachineStackMarker.cpp receive the
actual OS error code.
On multi-core Windows systems, SuspendThread merely requests a suspension
from the kernel -- the target thread may continue to execute user-mode
instructions briefly after SuspendThread returns, until the kernel
reschedules it away. Callers that read registers (MachineStackMarker)
happen to force a sync by calling GetThreadContext right after suspend,
but callers that only want the target stopped -- notably the libpas
scavenger's pasSuspenderBeginSuspend path -- would otherwise race the
mutator: the scavenger decommits TLC pages while the mutator is still
writing to them, violating pas_thread_suspender.h's contract and
causing intermittent heap corruption or access violations.
Call GetThreadContext once with CONTEXT_INTEGER immediately after a
successful SuspendThread. GetThreadContext synchronizes with the
scheduler and does not return until the target is actually stopped.
On failure (rare -- handle invalidation), undo the SuspendThread and
surface the error code so the caller sees a clean Expected failure
instead of a half-suspended thread.
Add a regression test in TestWebKitAPI: a worker busy-increments a
counter; while suspended, the counter must not advance. The test is
cross-platform (Darwin mach_thread_suspend and Linux signal-based
suspend are already synchronous) and would have flagged this bug had
it existed then.