fix(file_search): honor cancel_token to keep stop button responsive#2044
fix(file_search): honor cancel_token to keep stop button responsive#2044h3c-hexin wants to merge 1 commit into
Conversation
`FileSearchTool::execute` is async but `search_files()` synchronously iterates the directory tree via `ignore::WalkBuilder`. On large trees ($HOME, mounted volumes, deep node_modules/), this blocks the tokio task for tens of seconds — `Op::Cancel` and the frontend stop button are no-ops until the walk completes. `grep_files` already solved this in search.rs by polling `ToolContext.cancel_token` between iterations. This applies the same pattern to `file_search`: thread the token through `search_files()` and call `check_cancelled()` once per `WalkBuilder` entry. The shared helper is promoted to `pub(super)` so the two sibling modules can reuse it. This supersedes Hmbown#1790 (spawn_blocking + 30s timeout), which the gemini reviewer correctly flagged as the wrong shape — the CancellationToken approach leaves no orphan threads and surfaces cancellation immediately.
There was a problem hiding this comment.
Code Review
This pull request introduces cancellation support to the file search tool by integrating a CancellationToken into the search process and verifying it with a new test case. The review feedback highlights that the search operation is still synchronous and blocking, which could lead to executor starvation on large directory trees; it is recommended to offload this task using tokio::task::spawn_blocking.
| let matches = search_files( | ||
| query, | ||
| &base_path, | ||
| extensions, | ||
| exclude_patterns, | ||
| limit, | ||
| context.cancel_token.as_ref(), | ||
| )?; |
There was a problem hiding this comment.
While adding the cancel_token makes the search interruptible, the search_files function is still a synchronous, blocking operation. On large directory trees, this will block the Tokio worker thread for tens of seconds (as noted in the PR description). This can lead to latency spikes for other async tasks or even executor starvation.
Consider wrapping this call in tokio::task::spawn_blocking to offload the heavy I/O and iteration to a dedicated thread pool, while still passing the cancel_token to ensure the background thread exits promptly upon cancellation.
Includes cancellable file_search/list_dir behavior, deterministic sub-agent whale-species nicknames, and an honest /balance command scaffold for the provider-billing work. Reviewed the overlapping file-search cancellation contribution in #2044 from @h3c-hexin; that PR's Windows failure was unrelated to the cancellation code, and the contributor context is preserved here.
|
Thanks @h3c-hexin for the clear repro and the cancellation-token patch. The file_search stop/cancel fix landed through #2035 (merge commit d22da53) in the v0.8.45 control-plane slice, with cancellation checks plus blocking-worker isolation/timeout so long walks do not pin the async runtime.\n\nI also checked the #2044 Windows failure before closing this: it timed out in unrelated tools::js_execution tests after Linux/macOS/lint/npm smoke had passed, so I treated the contribution as superseded rather than rejected. Appreciate the focused writeup and test shape here. |
- Update @reidliu41 entry to add user-message transcript highlight (#1995) - Update @zlh124 entry to add SKILL.md YAML block-scalar parser (#1908) - Update @h3c-hexin entry to add file_search cancellation report (#2044) - Add @lpeng1711694086-lang for user-message TUI contrast feature request (#1672)
- Update @reidliu41 entry to add user-message transcript highlight (#1995) - Update @zlh124 entry to add SKILL.md YAML block-scalar parser (#1908) - Update @h3c-hexin entry to add file_search cancellation report (#2044) - Add @lpeng1711694086-lang for user-message TUI contrast feature request (#1672)
Problem
FileSearchTool::executeisasync fnbutsearch_files()synchronously iterates the directory tree viaignore::WalkBuilder. On large trees ($HOME, mounted network volumes, deepnode_modules/), this blocks the tokio task for tens of seconds. While the walker runs the engine cannot observe the sharedCancellationToken, soOp::Canceland frontend stop buttons are effectively no-ops until the walk completes. Observed in a wrapping GUI: the stop button was unresponsive for 90+ seconds when the model ranfile_searchwithworkspace=$HOME.Fix
grep_filesalready solved this insearch.rsby pollingToolContext.cancel_tokenbetween iterations and returning aToolErroron cancel. This PR applies the same pattern tofile_search:cancel_tokenthroughsearch_files().check_cancelled()once perWalkBuilderentry — cheap, returns fast, letsOp::Cancelactually stop the walk.search.rs::check_cancelled(promoted topub(super)) rather than duplicating the 5-line helper.test_file_search_respects_cancel_tokenparalleling the existingtest_grep_files_respects_cancel_token.Relationship to #1790
This supersedes #1790 (closed as stale post v0.8.41 rebrand). That PR wrapped the walker in
spawn_blockingwith a 30s hard timeout. The gemini reviewer on #1790 correctly flagged that approach as the wrong shape:This patch follows that guidance:
spawn_blocking— the walker still runs on the async task, but yields cancel responsiveness via the explicit polling points (same asgrep_files).check_cancelled()returns an error that propagates up cleanly.Tests
tools::file_search::tests::test_file_search_respects_cancel_token— pre-cancelled token returns an error containing "cancelled".file_searchtests + all 9grep_filestests pass locally.Diff stat