refactor(core): route external commands through CommandExecutor seam …#9
Merged
Conversation
…(M2.1)
Pure refactor. No user-visible behavior change. Sets up the injection
point that future sandbox runtimes (Docker, Firecracker, WSL...) will
plug into without touching any existing tool call site.
Surface introduced:
packages/core/src/executor.ts
type CommandExecutor = { run(input): Promise<ExecutorRunResult> }
function createSpawnExecutor() // default: child_process.spawn
packages/core/src/types.ts
ToolContext.executor?: CommandExecutor // optional, falls back to
// createSpawnExecutor()
The result-Promise shape (rather than returning a ChildProcess-like
handle) is what makes mock executors trivial in tests:
`{ run: async () => ({ stdout, stderr, exitCode, timedOut }) }`.
ExecutorRunInput.outputSink covers both call sites' needs:
- `{ kind: "capture", maxBytes }` for the Bash tool (collects up to
MAX_BASH_OUTPUT_CHARS in memory, truncates mid-stream)
- `{ kind: "fd", fd }` for the task worker (pipes stdout+stderr
directly to the task output file so long-running commands don't
balloon the heap)
Migration scope (intentionally narrow):
IN:
- packages/tools/src/index.ts runSpawnedCommand (Bash tool)
- packages/core/src/task.ts runSpawnedReadOnlyCommand (task worker)
OUT (each for a specific reason):
- packages/core/src/hooks.ts spawn — hooks use shell:true and have
exit-code-2 blocking semantics that are hook-pipeline-specific;
mixing into the executor would conflate two different contracts.
- packages/cli/src/index.ts startDetachedTaskWorker — spawns
myagent's own binary detached for background task workers; that's
process control flow, not external command execution.
The CLI's runAgentTurn injects createSpawnExecutor() into the ToolContext
so the default for every real agent run is unchanged.
Tests added in packages/core/test/security/executor-seam.test.ts (8
cases): smoke for the real spawn executor (stdout capture, abort,
timeout), routing proof for the task worker via a recording mock,
pid recording via the onPid callback, and a negative assertion that
Bash builtins (pwd/ls/cat/find) deliberately bypass the seam while
spawn commands (git/rg/grep) do route through.
Removed: the inline runSpawnedCommand promise wrangling in
packages/tools/src/index.ts and the parallel block in
packages/core/src/task.ts. Both are now ~10 lines of executor.run()
each. Unused `capString` helper deleted from tools/src/index.ts (now
lives inside the default executor).
Verified locally: 168 tests, 3/3 stable runs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…(M2.1)
Pure refactor. No user-visible behavior change. Sets up the injection point that future sandbox runtimes (Docker, Firecracker, WSL...) will plug into without touching any existing tool call site.
Surface introduced:
packages/core/src/executor.ts
type CommandExecutor = { run(input): Promise }
function createSpawnExecutor() // default: child_process.spawn
packages/core/src/types.ts
ToolContext.executor?: CommandExecutor // optional, falls back to
// createSpawnExecutor()
The result-Promise shape (rather than returning a ChildProcess-like handle) is what makes mock executors trivial in tests:
{ run: async () => ({ stdout, stderr, exitCode, timedOut }) }.ExecutorRunInput.outputSink covers both call sites' needs:
{ kind: "capture", maxBytes }for the Bash tool (collects up to MAX_BASH_OUTPUT_CHARS in memory, truncates mid-stream){ kind: "fd", fd }for the task worker (pipes stdout+stderr directly to the task output file so long-running commands don't balloon the heap)Migration scope (intentionally narrow):
IN:
- packages/tools/src/index.ts runSpawnedCommand (Bash tool)
- packages/core/src/task.ts runSpawnedReadOnlyCommand (task worker)
OUT (each for a specific reason):
- packages/core/src/hooks.ts spawn — hooks use shell:true and have exit-code-2 blocking semantics that are hook-pipeline-specific; mixing into the executor would conflate two different contracts.
- packages/cli/src/index.ts startDetachedTaskWorker — spawns myagent's own binary detached for background task workers; that's process control flow, not external command execution.
The CLI's runAgentTurn injects createSpawnExecutor() into the ToolContext so the default for every real agent run is unchanged.
Tests added in packages/core/test/security/executor-seam.test.ts (8 cases): smoke for the real spawn executor (stdout capture, abort, timeout), routing proof for the task worker via a recording mock, pid recording via the onPid callback, and a negative assertion that Bash builtins (pwd/ls/cat/find) deliberately bypass the seam while spawn commands (git/rg/grep) do route through.
Removed: the inline runSpawnedCommand promise wrangling in packages/tools/src/index.ts and the parallel block in packages/core/src/task.ts. Both are now ~10 lines of executor.run() each. Unused
capStringhelper deleted from tools/src/index.ts (now lives inside the default executor).Verified locally: 168 tests, 3/3 stable runs.