Release v0.3.48 by sammuti · Pull Request #196 · tower/tower-cli

sammuti · 2026-02-06T10:19:16Z

[TOW-1342] Timeout auto cleanup for subprocess

Summary by CodeRabbit

New Features
- Automatic cleanup mechanism: The runtime now monitors subprocess execution until completion, waits a configurable timeout period, and removes temporary working directories if cleanup wasn't explicitly triggered. This improves resource management by ensuring temporary files are cleaned up even if manual cleanup is missed.
Chores
- Version bumped to 0.3.48.

* [TOW-1342] Timeout auto cleanup for subprocess * [TOW-1342] Timeout auto cleanup for subprocess * adjust polling interval

coderabbitai · 2026-02-06T10:19:36Z

📝 Walkthrough

Walkthrough

Version bumped to 0.3.48 alongside implementation of automatic cleanup for subprocess runs. New cleanup monitor spawns as background task to poll app status, wait post-terminal state, then asynchronously remove temporary directories if explicit cleanup wasn't triggered, with integrated logging.

Changes

Cohort / File(s)	Summary
Version Updates `Cargo.toml`, `pyproject.toml`	Workspace and project version bumped from 0.3.47 to 0.3.48.
Cleanup Module Infrastructure `crates/tower-runtime/src/auto_cleanup.rs`	New module providing `spawn_cleanup_monitor` function that polls app status, waits configurable timeout post-terminal state, and performs automatic cleanup of `uv_temp_dir` and `package_tmp_dir`. Includes unit tests for auto-cleanup, explicit cleanup override, and non-terminal wait scenarios.
Library Module Exports `crates/tower-runtime/src/lib.rs`	Added public `auto_cleanup` module and convenience method `Status::is_terminal()` for detecting terminal states (Exited or Crashed).
Subprocess Lifecycle Integration `crates/tower-runtime/src/subprocess.rs`	Refactored temp directory ownership from locally scoped to `Arc<Mutex<>>` wrappers; integrated cleanup monitor spawn during subprocess creation; centralized explicit cleanup path via `ExecutionHandle` to mark cleanup flag, terminate app, and remove temp directories; replaced Drop-based cleanup with coordinated monitor-driven approach.

Sequence Diagram

sequenceDiagram
    participant SubprocessHandle
    participant CleanupMonitor as Cleanup Monitor
    participant App
    participant FileSystem

    SubprocessHandle->>CleanupMonitor: spawn_cleanup_monitor (run_id, app, temp dirs, cleanup_called flag)
    
    loop Poll until terminal
        CleanupMonitor->>App: poll status
        App-->>CleanupMonitor: Non-terminal status
    end
    
    CleanupMonitor->>App: poll status
    App-->>CleanupMonitor: Terminal status (Exited/Crashed)
    
    CleanupMonitor->>CleanupMonitor: wait cleanup_timeout
    
    alt Explicit cleanup called
        SubprocessHandle->>SubprocessHandle: cleanup() invoked
        SubprocessHandle->>App: terminate app
        SubprocessHandle->>FileSystem: remove uv_temp_dir
        SubprocessHandle->>FileSystem: remove package_tmp_dir
        SubprocessHandle->>CleanupMonitor: mark cleanup_called flag
    else Auto-cleanup (cleanup_called still false)
        CleanupMonitor->>App: verify cleanup not called
        CleanupMonitor->>FileSystem: remove uv_temp_dir
        CleanupMonitor->>FileSystem: remove package_tmp_dir
        CleanupMonitor->>CleanupMonitor: log warning (auto-cleanup performed)
    end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🐰 The cleanup hops along with grace,
No temp dirs left without a trace,
Monitors spawn with watchful eyes,
Terminal states trigger swift replies,
Five minutes pass, then cleanup flies! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main purpose of the changeset, which is a version release bump from 0.3.47 to 0.3.48 paired with the implementation of automatic cleanup functionality for subprocesses.
Docstring Coverage	✅ Passed	Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch develop

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/tower-runtime/src/subprocess.rs`:
- Around line 265-289: In cleanup(), don’t return immediately if
self.terminate().await? fails; instead call self.cleanup_called.store(true,
Ordering::Relaxed) (already done), capture the Result from
self.terminate().await into a local variable (e.g., terminate_res), then always
proceed to run the uv_temp_dir and package_tmp_dir removal logic (locking
uv_temp_dir and package_tmp_dir and remove_dir_all as currently written), and
after both cleanup attempts return the original terminate error if terminate_res
was Err or Ok(()) otherwise; this ensures terminate() errors do not skip the
temp-directory removal while preserving the original error from terminate().

🧹 Nitpick comments (3)

crates/tower-runtime/src/auto_cleanup.rs (3)

17-41: Avoid the initial polling delay before the first status check.

Line 40 sleeps before the first status check, so runs already in a terminal state wait up to 5s before the cleanup timer starts. Consider checking status first, then sleeping.

♻️ Suggested refactor

-        loop {
-            tokio::time::sleep(STATUS_POLL_INTERVAL).await;
-            let status = app.lock().await.status().await;
-            if matches!(status, Ok(s) if s.is_terminal()) {
+        loop {
+            let status = app.lock().await.status().await;
+            if matches!(status, Ok(s) if s.is_terminal()) {
                 info!(
                     "Run {} finished, starting {}s automatic cleanup timer",
                     run_id,
                     cleanup_timeout.as_secs()
                 );
                 break;
             }
+            tokio::time::sleep(STATUS_POLL_INTERVAL).await;
         }

67-73: Surface temp directory cleanup failures.

remove_dir_all errors are discarded, which makes cleanup failures hard to diagnose. Logging those errors would help ops visibility.

📝 Suggested logging

-        if let Some(temp_dir) = uv_temp_dir.lock().await.take() {
-            let _ = tokio::fs::remove_dir_all(&temp_dir).await;
-        }
+        if let Some(temp_dir) = uv_temp_dir.lock().await.take() {
+            if let Err(e) = tokio::fs::remove_dir_all(&temp_dir).await {
+                warn!("Failed to clean up uv temp directory: {:?}", e);
+            }
+        }
 
-        if let Some(tmp_dir) = package_tmp_dir.lock().await.take() {
-            let _ = tokio::fs::remove_dir_all(tmp_dir.to_path_buf()).await;
-        }
+        if let Some(tmp_dir) = package_tmp_dir.lock().await.take() {
+            if let Err(e) = tokio::fs::remove_dir_all(tmp_dir.to_path_buf()).await {
+                warn!("Failed to clean up package temp directory: {:?}", e);
+            }
+        }

139-324: Tests rely on long wall-clock sleeps; consider time control.

These tests wait real time (≥7–12s with the 5s poll interval), which can slow CI and add flakiness. If tokio’s paused time is available, #[tokio::test(start_paused = true)] plus tokio::time::advance(...) would make the tests faster and more deterministic.

coderabbitai · 2026-02-06T10:30:59Z

crates/tower-runtime/src/subprocess.rs

    async fn cleanup(&mut self) -> Result<(), Error> {
+        use tower_telemetry::{debug, info};
+
+        info!("Explicit cleanup called for run {}", self.id);
+
+        // Mark cleanup as called (prevents timer from running)
+        self.cleanup_called.store(true, Ordering::Relaxed);
+
        // Ensure the app is terminated
        self.terminate().await?;

-        // Clean up uv's temp directory if it was created
-        if let Some(ref temp_dir) = self.uv_temp_dir {
-            if let Err(e) = tokio::fs::remove_dir_all(temp_dir).await {
-                // Log but don't fail - cleanup is best-effort
-                tower_telemetry::debug!("Failed to clean up uv temp directory: {:?}", e);
+        // Clean up uv's temp directory
+        if let Some(temp_dir) = self.uv_temp_dir.lock().await.take() {
+            if let Err(e) = tokio::fs::remove_dir_all(&temp_dir).await {
+                debug!("Failed to clean up uv temp directory: {:?}", e);
            }
        }

        // Clean up package temp directory
-        if let Some(tmp_dir) = self.package_tmp_dir.take() {
+        if let Some(tmp_dir) = self.package_tmp_dir.lock().await.take() {
            let path = tmp_dir.to_path_buf();
            if let Err(e) = tokio::fs::remove_dir_all(&path).await {
-                // Log but don't fail - cleanup is best-effort
-                tower_telemetry::debug!("Failed to clean up package temp directory: {:?}", e);
+                debug!("Failed to clean up package temp directory: {:?}", e);
            }
        }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg "cleanup_called" crates/tower-runtime/src/subprocess.rs -B 3 -A 3

Repository: tower/tower-cli

Length of output: 1227

🏁 Script executed:

rg "cleanup_called" crates/tower-runtime/src/ -B 2 -A 2

Repository: tower/tower-cli

Length of output: 7841

🏁 Script executed:

fd "subprocess.rs" crates/tower-runtime/src/ -x cat -n {}

Repository: tower/tower-cli

Length of output: 12369

Always attempt directory cleanup even if terminate() fails.

If terminate() returns an error at line 274, the cleanup function returns early and skips directory removal (lines 277-289). Since cleanup_called was already set to true, the auto-cleanup monitor will also be suppressed and skip its backup cleanup. This causes temp directories to leak. Capture the terminate result and always proceed with cleanup before returning the error.

Suggested fix

- // Ensure the app is terminated - self.terminate().await?; + // Ensure the app is terminated, but still attempt temp cleanup + let terminate_result = self.terminate().await; // Clean up uv's temp directory if let Some(temp_dir) = self.uv_temp_dir.lock().await.take() { if let Err(e) = tokio::fs::remove_dir_all(&temp_dir).await { debug!("Failed to clean up uv temp directory: {:?}", e); } } // Clean up package temp directory if let Some(tmp_dir) = self.package_tmp_dir.lock().await.take() { let path = tmp_dir.to_path_buf(); if let Err(e) = tokio::fs::remove_dir_all(&path).await { debug!("Failed to clean up package temp directory: {:?}", e); } } - Ok(()) + terminate_result?; + Ok(())

🤖 Prompt for AI Agents

In `@crates/tower-runtime/src/subprocess.rs` around lines 265 - 289, In cleanup(), don’t return immediately if self.terminate().await? fails; instead call self.cleanup_called.store(true, Ordering::Relaxed) (already done), capture the Result from self.terminate().await into a local variable (e.g., terminate_res), then always proceed to run the uv_temp_dir and package_tmp_dir removal logic (locking uv_temp_dir and package_tmp_dir and remove_dir_all as currently written), and after both cleanup attempts return the original terminate error if terminate_res was Err or Ok(()) otherwise; this ensures terminate() errors do not skip the temp-directory removal while preserving the original error from terminate().

[TOW-1342] Timeout auto cleanup for subprocess (#195)

01f934a

* [TOW-1342] Timeout auto cleanup for subprocess * [TOW-1342] Timeout auto cleanup for subprocess * adjust polling interval

sammuti added 2 commits February 6, 2026 11:21

Version bump to 0.3.48

71809cb

uv lock

f9db30d

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

bradhe approved these changes Feb 6, 2026

View reviewed changes

sammuti merged commit 9a2c123 into main Feb 6, 2026
20 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release v0.3.48#196

Release v0.3.48#196
sammuti merged 3 commits intomainfrom
develop

sammuti commented Feb 6, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 6, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sammuti commented Feb 6, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sammuti commented Feb 6, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 6, 2026 •

edited

Loading