Skip to content

fix: avoid taskkill during Windows shutdown#137

Merged
zouyonghe merged 1 commit into
AstrBotDevs:mainfrom
zouyonghe:fix/windows-shutdown-no-taskkill
Jun 10, 2026
Merged

fix: avoid taskkill during Windows shutdown#137
zouyonghe merged 1 commit into
AstrBotDevs:mainfrom
zouyonghe:fix/windows-shutdown-no-taskkill

Conversation

@zouyonghe

@zouyonghe zouyonghe commented Jun 10, 2026

Copy link
Copy Markdown
Member

Summary / 改动概述

Fixes #135 after #136 added the Windows shutdown message hook but still left the shutdown cleanup path launching taskkill.exe.

PR #136 did not fully address the reported issue because the issue's user-visible failure is taskkill.exe - Application Error (0xc0000142) during Windows shutdown. The merged hook handles WM_QUERYENDSESSION, but it calls BackendState::stop_backend_with_timeout(), which still delegates to process_control::stop_child_process_gracefully() on Windows. That function launches taskkill /pid ... /t and may later launch taskkill /f, so the system-shutdown path can still start taskkill.exe at exactly the time Windows may fail to initialize new helper processes.

This PR keeps the normal app-exit behavior unchanged, but changes the Windows OS-shutdown path to avoid taskkill.exe entirely:

  • add a Windows-only BackendState::stop_backend_for_system_shutdown() path
  • add process_control::stop_child_process_for_system_shutdown() that snapshots the child process tree with ToolHelp APIs and terminates processes with Win32 TerminateProcess
  • keep normal exits using the existing taskkill-based graceful stop behavior
  • update the WM_QUERYENDSESSION handler to use the new native shutdown stop path
  • add a regression test documenting that Windows system shutdown must not use external stop commands

Verification / 验证方式

Local macOS verification:

$ cd src-tauri && cargo fmt --check && cargo check
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.36s
$ cd src-tauri && cargo test
running 142 tests
...
test result: ok. 142 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.02s

Windows-only API compile smoke check via a temporary minimal crate that imports src-tauri/src/process_control.rs and depends only on windows-sys, avoiding the local machine's missing full Windows cross C toolchain for ring:

$ cargo check --target x86_64-pc-windows-gnu --lib
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.30s

Full cargo check --target x86_64-pc-windows-gnu for the Tauri app still cannot complete on this macOS host because ring requires x86_64-w64-mingw32-gcc, which is not installed:

error occurred in cc-rs: failed to find tool "x86_64-w64-mingw32-gcc": No such file or directory

Checklist / 检查清单

  • This change is not a breaking change. / 此改动不是破坏性变更。
  • I have verified the change locally (and provided logs/screenshots above). / 我已在本地验证并在上方提供日志/截图。
  • No workflows are changed. / 未修改工作流。
  • If dependencies are updated, lockfiles are updated accordingly (src-tauri/Cargo.lock, lockfiles under changed package dirs). / 如果引入或升级依赖,已同步更新相关 lock 文件(如 src-tauri/Cargo.lock、对应包目录中的 lock 文件)。
  • This PR does not include malicious code. / 此 PR 不包含恶意代码。

Co-Authored-By: Warp agent@warp.dev

Summary by Sourcery

Prevent Windows system shutdown from invoking external task-kill helpers and instead terminate the backend process tree natively during OS shutdown.

Bug Fixes:

  • Avoid launching taskkill.exe during Windows shutdown to prevent 0xc0000142 errors when the OS can no longer start helper processes.

Enhancements:

  • Introduce a Windows-specific backend shutdown path that uses native Win32 APIs to snapshot and terminate the backend process tree during system shutdown while keeping normal app-exit behavior unchanged.

Build:

  • Enable the Win32_System_Diagnostics_ToolHelp feature on the windows-sys dependency to support native process snapshotting on Windows.

Tests:

  • Add regression tests for Windows process-tree collection and expected shutdown termination error handling to document and guard the new shutdown behavior.

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In snapshot_process_tree, the repeated tree.contains checks inside the nested loop make this O(n²); consider tracking discovered PIDs in a HashSet<u32> to avoid duplicate work and simplify the tree-building logic.
  • The new native shutdown path (terminate_process_by_pid / terminate_process_tree_native) logs on any OpenProcess/TerminateProcess failure, which may be expected during shutdown (e.g., exited or protected processes); consider downgrading or filtering these errors so logs don’t get noisy on normal shutdown.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `snapshot_process_tree`, the repeated `tree.contains` checks inside the nested loop make this O(n²); consider tracking discovered PIDs in a `HashSet<u32>` to avoid duplicate work and simplify the tree-building logic.
- The new native shutdown path (`terminate_process_by_pid` / `terminate_process_tree_native`) logs on any `OpenProcess`/`TerminateProcess` failure, which may be expected during shutdown (e.g., exited or protected processes); consider downgrading or filtering these errors so logs don’t get noisy on normal shutdown.

## Individual Comments

### Comment 1
<location path="src-tauri/src/process_control.rs" line_range="31" />
<code_context>
 #[cfg(target_os = "windows")]
 const WINDOWS_CREATE_NO_WINDOW: u32 = 0x0800_0000;

+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+enum ChildStopMode {
+    NormalExit,
</code_context>
<issue_to_address>
**issue (complexity):** Consider removing the unused ChildStopMode abstraction and simplifying the terminate_process_tree_native logging API to reduce indirection and type-level complexity without changing behavior.

You can simplify a bit without changing behavior by removing the unused abstraction and narrowing the internal helper’s API.

### 1. Remove `ChildStopMode` / `stop_mode_uses_external_command`

The enum and helper are only used in `debug_assert!` and unit tests and don’t drive any real branching. You can inline the invariant directly where it matters and drop the extra indirection:

```rust
// Remove this:
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum ChildStopMode {
    NormalExit,
    #[cfg_attr(not(target_os = "windows"), allow(dead_code))]
    WindowsSystemShutdown,
}

fn stop_mode_uses_external_command(mode: ChildStopMode) -> bool {
    match mode {
        ChildStopMode::NormalExit => true,
        ChildStopMode::WindowsSystemShutdown => false,
    }
}
```

Replace the `debug_assert!`s with a direct statement of intent:

```rust
#[cfg(target_os = "windows")]
pub fn stop_child_process_gracefully<F>(child: &mut Child, timeout: Duration, log: F) -> bool
where
    F: Fn(&str) + Copy,
{
    // In "graceful" mode we currently always use an external command (taskkill).
    // If this ever changes, review the shutdown behavior to keep parity.
    let pid = child.id();
    let pid_arg = pid.to_string();
    // ...
}

#[cfg(not(target_os = "windows"))]
pub fn stop_child_process_gracefully<F>(child: &mut Child, timeout: Duration, log: F) -> bool
where
    F: Fn(&str) + Copy,
{
    // In "graceful" mode we currently always use an external command (kill -TERM).
    let pid = child.id();
    let pid_arg = pid.to_string();
    // ...
}
```

And drop the unit tests that only exercise `stop_mode_uses_external_command`, since they don’t cover real process behavior.

If you later add a path that actually needs a mode (e.g. sharing a single function for normal vs shutdown stop), you can reintroduce a parameter once it drives real branching.

### 2. Simplify `terminate_process_tree_native` logging parameter

Since `terminate_process_tree_native` is an internal helper and the log callback is only used synchronously, you can avoid a generic and make the signature easier to read:

```rust
#[cfg(target_os = "windows")]
fn terminate_process_tree_native(
    root_pid: u32,
    log: &dyn Fn(&str),
) {
    let process_tree = match snapshot_process_tree(root_pid) {
        Ok(process_tree) => process_tree,
        Err(error) => {
            log(&format!(
                "native Windows process tree snapshot failed: pid={root_pid}, error={error}"
            ));
            vec![root_pid]
        }
    };

    for pid in process_tree {
        if let Err(error) = terminate_process_by_pid(pid) {
            log(&format!(
                "native Windows process termination failed: pid={pid}, error={error}"
            ));
        }
    }
}
```

Call sites can pass `&log`:

```rust
pub fn stop_child_process_for_system_shutdown<F>(
    child: &mut Child,
    timeout: Duration,
    log: F,
) -> bool
where
    F: Fn(&str) + Copy,
{
    let pid = child.id();
    terminate_process_tree_native(pid, &log);
    wait_for_child_exit(child, timeout)
}
```

This keeps all behavior intact while reducing the amount of indirection and type-level complexity around these helpers.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src-tauri/src/process_control.rs Outdated

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a native Windows process tree termination mechanism to safely stop the backend during system shutdown without spawning external helper processes like taskkill.exe, which can fail late in the shutdown sequence. The review feedback highlights two important issues: first, a compilation error on Windows due to calling is_null() on a HANDLE (which is an isize in windows-sys 0.59 and should be compared to 0 instead); second, a process termination order issue where parent processes are terminated before their children, which could be resolved by implementing a post-order traversal to prevent orphaned child processes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

#[cfg(target_os = "windows")]
fn terminate_process_by_pid(pid: u32) -> io::Result<()> {
let handle = unsafe { OpenProcess(PROCESS_TERMINATE, 0, pid) };
if handle.is_null() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

In windows-sys version 0.59, HANDLE is defined as isize rather than a pointer type. Since primitive types like isize do not have an is_null() method, calling handle.is_null() will result in a compilation error on Windows. You should check if the handle is equal to 0 instead.

Suggested change
if handle.is_null() {
if handle == 0 {

Comment thread src-tauri/src/process_control.rs Outdated
Comment on lines +172 to +182
let mut tree = Vec::new();
let mut stack = vec![root_pid];
while let Some(parent_pid) = stack.pop() {
for (pid, candidate_parent_pid) in &entries {
if *candidate_parent_pid == parent_pid && !tree.contains(pid) {
tree.push(*pid);
stack.push(*pid);
}
}
}
tree.push(root_pid);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current process tree traversal terminates parent processes before their children (e.g., C2 is terminated before GC2 because C2 is added to tree first). This can lead to orphaned child processes escaping termination or spawning new processes before they are terminated.

To ensure robust process cleanup, we should perform a post-order (bottom-up) traversal so that children are guaranteed to be terminated before their parents. This can be cleanly implemented using a recursive helper function with a HashSet to track visited nodes and prevent cycles.

    let mut tree = Vec::new();
    let mut visited = std::collections::HashSet::new();

    fn build_tree(
        pid: u32,
        entries: &[(u32, u32)],
        visited: &mut std::collections::HashSet<u32>,
        tree: &mut Vec<u32>,
    ) {
        if !visited.insert(pid) {
            return;
        }
        for &(child_pid, parent_pid) in entries {
            if parent_pid == pid {
                build_tree(child_pid, entries, visited, tree);
            }
        }
        tree.push(pid);
    }

    build_tree(root_pid, &entries, &mut visited, &mut tree);

@zouyonghe zouyonghe force-pushed the fix/windows-shutdown-no-taskkill branch from 9780eab to 0437e6b Compare June 10, 2026 01:01
@zouyonghe

Copy link
Copy Markdown
Member Author

@sourcery-ai review

1 similar comment
@zouyonghe

Copy link
Copy Markdown
Member Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="src-tauri/src/process_control.rs" line_range="194-213" />
<code_context>
+}
+
+#[cfg(target_os = "windows")]
+fn terminate_process_by_pid(pid: u32) -> io::Result<()> {
+    let handle = unsafe { OpenProcess(PROCESS_TERMINATE, 0, pid) };
+    if handle.is_null() {
+        return Err(io::Error::last_os_error());
+    }
+
+    let result = unsafe { TerminateProcess(handle, 1) };
+    let error = if result == 0 {
+        Some(io::Error::last_os_error())
+    } else {
+        None
+    };
+    close_handle(handle);
+
+    match error {
+        Some(error) => Err(error),
+        None => Ok(()),
</code_context>
<issue_to_address>
**suggestion:** Errors from `OpenProcess` are treated as unexpected, even when they might correspond to expected shutdown races.

In `terminate_process_tree_native`, `TerminateProcess` errors are filtered via `is_expected_shutdown_termination_error`, but `OpenProcess` errors are always treated as hard errors and logged. During shutdown, `OpenProcess` can legitimately return errors like `ERROR_ACCESS_DENIED` or `ERROR_INVALID_PARAMETER` for processes exiting concurrently. Consider either reusing `is_expected_shutdown_termination_error` for `OpenProcess` failures, or adding a similar filter specifically for `OpenProcess` so shutdown logs don’t get cluttered with expected races.

```suggestion
#[cfg(target_os = "windows")]
fn terminate_process_by_pid(pid: u32) -> io::Result<()> {
    let handle = unsafe { OpenProcess(PROCESS_TERMINATE, 0, pid) };
    if handle.is_null() {
        let error = io::Error::last_os_error();
        if is_expected_shutdown_termination_error(&error) {
            // During shutdown, races where the process exits between enumeration and
            // OpenProcess are expected; treat them as successful termination.
            return Ok(());
        }
        return Err(error);
    }

    let result = unsafe { TerminateProcess(handle, 1) };
    let error = if result == 0 {
        Some(io::Error::last_os_error())
    } else {
        None
    };
    close_handle(handle);

    match error {
        Some(error) => Err(error),
        None => Ok(()),
    }
}
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src-tauri/src/process_control.rs
@zouyonghe zouyonghe force-pushed the fix/windows-shutdown-no-taskkill branch from 0437e6b to e398fde Compare June 10, 2026 01:12
@zouyonghe

Copy link
Copy Markdown
Member Author

@sourcery-ai review

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="src-tauri/src/process_control.rs" line_range="182-192" />
<code_context>
+}
+
+#[cfg(target_os = "windows")]
+fn snapshot_process_tree(root_pid: u32) -> io::Result<Vec<u32>> {
+    let snapshot = unsafe { CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0) };
+    if snapshot == INVALID_HANDLE_VALUE {
+        return Err(io::Error::last_os_error());
+    }
+
+    let mut entries = Vec::new();
+    let mut entry: PROCESSENTRY32W = unsafe { std::mem::zeroed() };
+    entry.dwSize = std::mem::size_of::<PROCESSENTRY32W>() as u32;
+
+    let first_ok = unsafe { Process32FirstW(snapshot, &mut entry) } != 0;
+    if !first_ok {
+        let error = io::Error::last_os_error();
+        close_handle(snapshot);
+        return Err(error);
+    }
+
+    loop {
+        entries.push((entry.th32ProcessID, entry.th32ParentProcessID));
+        let next_ok = unsafe { Process32NextW(snapshot, &mut entry) } != 0;
+        if !next_ok {
+            break;
+        }
+    }
+    close_handle(snapshot);
+
+    Ok(collect_descendant_processes(root_pid, &entries))
+}
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Process32NextW failures are treated as end-of-enumeration; consider distinguishing real errors from ERROR_NO_MORE_FILES.

In the loop, any `Process32NextW` failure is treated as normal termination, but the call can fail for other reasons (e.g., invalid snapshot) that should be surfaced. Consider checking `io::Error::last_os_error()` when `next_ok` is false and only treating `ERROR_NO_MORE_FILES` as end-of-enumeration; for other errors, clean up the handle and return an error, as you already do for `Process32FirstW`.

```suggestion
    loop {
        entries.push((entry.th32ProcessID, entry.th32ParentProcessID));
        let next_ok = unsafe { Process32NextW(snapshot, &mut entry) } != 0;
        if !next_ok {
            let error = io::Error::last_os_error();
            // Treat ERROR_NO_MORE_FILES as normal end-of-enumeration; anything
            // else is a real error that should be surfaced.
            if let Some(code) = error.raw_os_error() {
                if code as u32 == winapi::shared::winerror::ERROR_NO_MORE_FILES {
                    break;
                }
            }
            close_handle(snapshot);
            return Err(error);
        }
    }
    close_handle(snapshot);

    Ok(collect_descendant_processes(root_pid, &entries))
}
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src-tauri/src/process_control.rs
Co-Authored-By: Warp <agent@warp.dev>
@zouyonghe zouyonghe force-pushed the fix/windows-shutdown-no-taskkill branch from e398fde to ce445f6 Compare June 10, 2026 01:21
@zouyonghe zouyonghe merged commit 697178a into AstrBotDevs:main Jun 10, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Windows 关机时弹出 taskkill.exe 错误 (0xc0000142),阻断正常关机

1 participant