Conversation
Pull request overview
This PR introduces an ExecutionBackend trait abstraction to enable Tower to support multiple compute substrates (local processes, Kubernetes pods, etc.) through a uniform interface, while refactoring existing local execution to implement this new abstraction.
Changes:
- Added a new execution abstraction layer with `ExecutionBackend` and `ExecutionHandle` traits
- Implemented `LocalBackend` wrapping the existing `LocalApp` functionality
- Added `async-trait` and `uuid` dependencies for trait support and ID generation
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| crates/tower-runtime/src/execution.rs | Defines core execution traits, types, and abstractions for backend-agnostic execution management |
| crates/tower-runtime/src/local.rs | Implements LocalBackend and LocalHandle to adapt existing subprocess execution to new abstraction |
| crates/tower-runtime/src/lib.rs | Exports new execution module |
| crates/tower-runtime/src/errors.rs | Adds error variants for execution abstraction (AppNotStarted, NoHandle, InvalidPackage) |
| crates/tower-runtime/Cargo.toml | Adds async-trait and uuid dependencies |
| Cargo.toml | Defines workspace-level versions for async-trait and uuid |
crates/tower-runtime/src/local.rs (Outdated)
```rust
package: match spec.bundle {
    BundleRef::Local { path } => Package::from_unpacked_path(path).await,
},
```
The Package::from_unpacked_path call is not wrapped in error handling. If this operation fails, the error message will be generic. Consider adding context about which bundle path failed to load to improve debugging.
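A minimal sketch of what that context could look like, assuming `Package::from_unpacked_path` returns a `Result` (as the comment implies) and that `path` is a `PathBuf`; the logging macro is whichever one the crate already uses:

```rust
package: match spec.bundle {
    BundleRef::Local { path } => {
        // Illustrative only: log which bundle path failed before propagating the error.
        Package::from_unpacked_path(path.clone()).await.map_err(|err| {
            debug!("failed to load bundle from {}: {:?}", path.display(), err);
            err
        })
    }
},
```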
crates/tower-runtime/src/local.rs (Outdated)
```rust
typical_cold_start_ms: 1000, // ~1s for venv + sync
typical_warm_start_ms: 100,  // ~100ms with warm cache
```
These hardcoded timing estimates should be documented as approximate values that may vary based on system resources and bundle complexity. Consider adding a comment explaining these are typical values, not guarantees.
Suggested change:

```diff
-typical_cold_start_ms: 1000, // ~1s for venv + sync
-typical_warm_start_ms: 100, // ~100ms with warm cache
+// The following timing values are typical, approximate estimates and may vary
+// based on system resources, bundle complexity, and runtime conditions.
+typical_cold_start_ms: 1000, // ~1s for venv + sync on a typical development machine
+typical_warm_start_ms: 100, // ~100ms with a warm cache under typical conditions
```
crates/tower-runtime/src/local.rs (Outdated)
```rust
loop {
    let status = self.status().await?;
    match status {
        ExecutionStatus::Preparing | ExecutionStatus::Running => {
            tokio::time::sleep(Duration::from_millis(100)).await;
        }
        _ => return Ok(status),
    }
}
```
The polling interval of 100ms is hardcoded. For long-running executions, this creates unnecessary overhead. Consider making the polling interval configurable or implementing an event-based notification mechanism instead of polling.
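A hedged sketch of both options, shown as methods on the handle's impl block; the `poll_interval` parameter, the `status_rx: watch::Receiver<ExecutionStatus>` field, and a `Clone` bound on `ExecutionStatus` are assumptions, not the PR's actual API:

```rust
// Option 1: take the interval as a parameter (or config field) instead of hardcoding 100ms.
async fn wait_polling(&self, poll_interval: Duration) -> Result<ExecutionStatus, Error> {
    loop {
        let status = self.status().await?;
        match status {
            ExecutionStatus::Preparing | ExecutionStatus::Running => {
                tokio::time::sleep(poll_interval).await;
            }
            _ => return Ok(status),
        }
    }
}

// Option 2: have the backend publish status changes through a tokio watch channel,
// so this future only wakes when the status actually changes.
async fn wait_notified(&mut self) -> Result<ExecutionStatus, Error> {
    loop {
        let status = self.status_rx.borrow().clone();
        match status {
            ExecutionStatus::Preparing | ExecutionStatus::Running => {
                // Resolves as soon as the sender publishes a new status.
                self.status_rx
                    .changed()
                    .await
                    .map_err(|_| Error::AppNotStarted)?;
            }
            _ => return Ok(status),
        }
    }
}
```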
```rust
pub async fn status(&self) -> Result<ExecutionStatus, Error> {
    self.app
        .as_ref()
        .ok_or(Error::AppNotStarted)?
```
The error message 'app not started' is vague. Consider a more descriptive error such as 'cannot get status: no app is currently running' to provide better context to users.
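For example (assuming the crate defines its errors with `thiserror`, which is an assumption, not something visible in this diff):

```rust
#[derive(Debug, thiserror::Error)]
pub enum Error {
    // A more descriptive message along the lines the comment suggests.
    #[error("cannot get status: no app is currently running")]
    AppNotStarted,
    // ...remaining variants unchanged
}
```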
```rust
pub struct AppLauncher<A: App> {
    backend: Arc<A::Backend>,
    app: Option<A>,
}
```
The AppLauncher struct and its methods lack documentation comments. Since this is a public API component of the new abstraction, it should include doc comments explaining its purpose, usage patterns, and lifecycle management responsibilities.
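A sketch of the kind of doc comments being asked for; the wording is illustrative, not taken from the PR:

```rust
/// Launches apps against a particular execution backend and tracks the
/// lifecycle of the (at most one) app it currently owns.
///
/// Typical usage: construct a launcher for a backend, start an app through it,
/// then use the launcher to query status and terminate the app when done.
pub struct AppLauncher<A: App> {
    /// The backend used to start executions (shared, since backends are reusable).
    backend: Arc<A::Backend>,
    /// The currently running app, if one has been started.
    app: Option<A>,
}
```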
bradhe left a comment:
Is this something that you want to land? It seems pretty WIP-y to me, and has loads of duplicated stuff from elsewhere in the tower-runtime crate. I've left some comments for now; please let me know how you'd like to proceed.
crates/tower-runtime/src/local.rs (Outdated)
```rust
let opts = StartOptions {
    ctx: spec.telemetry_ctx,
    package: match spec.bundle {
        BundleRef::Local { path } => Package::from_unpacked_path(path).await,
```
This should be called PackageRef not BundleRef
Force-pushed from 6183c33 to 7432ed6
crates/tower-runtime/Cargo.toml (Outdated)
```toml
uuid = { workspace = true }

# K8s dependencies (optional)
k8s-openapi = { version = "0.23", features = ["v1_31"], optional = true }
```
Update deps to latest
bradhe left a comment:
Did another review here. Let's review my feedback synchronously.
crates/tower-cmd/src/run.rs (Outdated)
```diff
-/// monitor_local_status is a helper function that will monitor the status of a given app and waits for
-/// it to progress to a terminal state.
-async fn monitor_local_status(app: Arc<Mutex<LocalApp>>) -> Status {
-    debug!("Starting status monitoring for LocalApp");
+async fn monitor_cli_status(handle: Arc<Mutex<tower_runtime::backends::cli::CliHandle>>) -> Status {
```
This was renamed to "cli", but runners in third-party infrastructure (e.g. self-hosted runners) will use local processes too, not Kubernetes...
```rust
// Build container spec
// Note: In K8s, 'command' = entrypoint, 'args' = command
let container = Container {
    name: "app".to_string(),
    image: Some(spec.runtime.image.clone()),
    env: Some(env_vars),
    command: spec.runtime.entrypoint.clone(), // K8s command = entrypoint
    args: spec.runtime.command.clone(),       // K8s args = command
    volume_mounts: if volume_mounts.is_empty() {
        None
    } else {
        Some(volume_mounts)
    },
    resources: Some(resources),
    working_dir: Some("/app".to_string()),
    ..Default::default()
};

// Build pod spec
let pod_spec = PodSpec {
    containers: vec![container],
    volumes: if volumes.is_empty() {
        None
    } else {
        Some(volumes)
    },
    restart_policy: Some("Never".to_string()),
    ..Default::default()
};

Ok(Pod {
    metadata: k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta {
        name: Some(format!("tower-run-{}", spec.id)),
        namespace: Some(self.namespace.clone()),
        labels: Some(labels),
        ..Default::default()
    },
    spec: Some(pod_spec),
    ..Default::default()
})
```
I'm assuming a change for this is coming?
```rust
/// Get current execution status
async fn status(&self) -> Result<Status, Error>;
```
Does this get the status of the execution environment setup or the status of the app that's running? Or both?
Both, as we discussed; we may need to consolidate, given that the set of statuses should be expanded to match.
```rust
// Create ConfigMap with bundle contents and get path mapping
let path_mapping = self.create_bundle_configmap(&spec).await?;
```
Just calling this out for myself that I expect this will go away.
```rust
Ok(match phase.as_str() {
    "Pending" => Status::None,
    "Running" => Status::Running,
    "Succeeded" => Status::Exited,
```
If this function is meant to get the status of the app in its lifecycle, this means that once the Pod is provisioned, it'll get marked as "Exited", right?
```rust
if tokio::time::timeout(std::time::Duration::from_secs(60), condition)
    .await
    .is_ok()
```
What happens if a container takes longer than 60 seconds to log?
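One hedged way to make that case explicit instead of falling through silently; the `PodStartTimeout` variant is hypothetical, not one of the PR's error variants:

```rust
match tokio::time::timeout(std::time::Duration::from_secs(60), condition).await {
    Ok(_) => {
        // Pod became ready within the deadline; continue to log streaming.
    }
    Err(_) => {
        // Hypothetical error variant; the real code would need something similar.
        return Err(Error::PodStartTimeout);
    }
}
```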
```rust
    line,
};
if tx.send(output).is_err() {
    break;
```
```rust
async fn terminate(&mut self) -> Result<(), Error> {
    let pods: Api<Pod> = Api::namespaced(self.client.clone(), &self.namespace);

    pods.delete(&self.pod_name, &DeleteParams::default())
        .await
        .map_err(|_| Error::TerminateFailed)?;

    Ok(())
}
```
Does SubprocessHandle guarantee the process is dead by the end of terminate or is it fire and forget?
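For the Kubernetes side, one hedged sketch of making `terminate` block until the pod is actually gone rather than merely scheduled for deletion; it assumes a `kube` version with `Api::get_opt`, and the deadline and interval are arbitrary:

```rust
async fn terminate(&mut self) -> Result<(), Error> {
    let pods: Api<Pod> = Api::namespaced(self.client.clone(), &self.namespace);

    pods.delete(&self.pod_name, &DeleteParams::default())
        .await
        .map_err(|_| Error::TerminateFailed)?;

    // Wait until the API server no longer returns the pod, so callers know the
    // execution is really gone by the time terminate() returns.
    let deadline = std::time::Duration::from_secs(30);
    tokio::time::timeout(deadline, async {
        loop {
            match pods.get_opt(&self.pod_name).await {
                Ok(None) => break, // pod is gone
                Ok(Some(_)) | Err(_) => {
                    tokio::time::sleep(std::time::Duration::from_millis(500)).await;
                }
            }
        }
    })
    .await
    .map_err(|_| Error::TerminateFailed)?;

    Ok(())
}
```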
```rust
// Delete pod
self.terminate().await?;
```
Cleanup is typically called after the app is already terminated/exited.
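A minimal sketch of that ordering, assuming the `Status` enum has terminal variants along these lines (the variant names are assumptions):

```rust
// Illustrative only: make cleanup a no-op until the execution is terminal.
async fn cleanup(&mut self) -> Result<(), Error> {
    match self.status().await? {
        Status::Exited | Status::Crashed => {
            // Safe to delete the pod; nothing is still running in it.
            self.terminate().await
        }
        _ => Ok(()), // still running: leave the pod alone for now
    }
}
```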
Do we need a k8s.rs in here for a kubernetes app now?
Removed k8s stuff from here and the main abstractions are ExecutionBackend and ExecutionHandle as explained offline
Force-pushed from 52f21a5 to 1454a5d
Introduces ExecutionBackend trait abstraction to support multiple compute substrates (local subprocesses, Kubernetes
pods, etc.) through a uniform interface. Refactors execution to cleanly separate CLI local runs from tower-runner
server-side execution.
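For readers skimming this thread, a rough sketch of the shape such an abstraction could take; the method set and the `ExecutionSpec`/`ExecutionStatus` names are placeholders inferred from the discussion above, not the PR's exact API:

```rust
use async_trait::async_trait;

/// A compute substrate capable of running an app (local subprocess, Kubernetes pod, ...).
#[async_trait]
pub trait ExecutionBackend: Send + Sync {
    type Handle: ExecutionHandle;

    /// Launch the execution described by `spec` and return a handle to it.
    async fn execute(&self, spec: ExecutionSpec) -> Result<Self::Handle, Error>;
}

/// A handle to one running execution.
#[async_trait]
pub trait ExecutionHandle: Send + Sync {
    /// Status of both the environment setup and the app itself.
    async fn status(&self) -> Result<ExecutionStatus, Error>;

    /// Stop the execution and release its resources.
    async fn terminate(&mut self) -> Result<(), Error>;
}
```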
Changes
New abstraction layer (crates/tower-runtime/src/execution.rs)
Backend implementations
Refactoring
Design Rationale
The abstraction cleanly separates two distinct use cases: