feat(providers): add @langchain/wasmsh — in-process shell + Python sandbox#427
Johann-Peter Hartmann (johannhartmann) wants to merge 11 commits into
🦋 Changeset detected
Latest commit: 3409442
The changes in this PR will be included in the next version bump. This PR includes changesets to release 4 packages.
See the example at https://deepagents.data.mayflower.tech/, a demo of a deepagent, sandbox, and LLM all running in the browser. Source: https://github.com/mayflower/langchainjs-mediapipe-agent-demo.
WasmshSandbox wraps the wasmsh-pyodide npm package to provide a WASM-based sandbox backend for DeepAgents. Runs bash (88 utilities) + Python 3.13 with micropip in-process — no containers, no server. Supports Node.js (child process) and browser (Web Worker) modes. Includes unit tests (97/97 pass), integration tests, and examples.
- Add browser entry points for deepagents core (index.browser.ts)
- Add Playwright E2E tests running a DeepAgent with wasmsh in-browser
- Add Node.js agent integration tests exercising file ops, Python, shell scripting, and multi-step pipelines
cd0eba4 to cbe8fa5
Johann-Peter Hartmann (@johannhartmann) this is rad! 🤯 Let me discuss with the team and get back to you!
Brings the @langchain/wasmsh provider to parity with @langchain/quickjs by
adding an interpreter middleware on top of the existing sandbox surface.
# New exports
* `createWasmshInterpreterMiddleware(options)` — exposes the sandbox as a
single `py_eval` agent tool. Variables, imports, and defined functions
persist across calls within the same session (via the sandbox's
globals pickle, transparently). Top-level `await` works.
* `WasmshFilesystemBackend(sandbox, { namespace })` — adapts a
WasmshSandbox as a deepagents `BackendProtocolV2` memory backend with
optional namespace prefixing. Composable as a sub-backend in
`CompositeBackend`. Mirrors `WasmshFilesystemBackend` from the Python
adapter.
* `scanSkillReferences`, `loadSkill`, `installPendingSkills` — Python
skills loading. Scans user code for `import skills.<name>` and stages
the matching skill directory from a `BackendProtocol` into the sandbox
VFS under `/skills/<package_name>/`. Auto-generates `__init__.py`.
* `WasmshSandbox.runPtc(code, tools, onHostCall)` — passthrough to the
underlying npm session's runPtc method (requires
@mayflowergmbh/wasmsh-pyodide ≥ 0.6.4; surfaces a clear error against
older builds via duck-type check).
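To illustrate the skills-scanning step described above, here is a minimal sketch of collecting `import skills.<name>` references from Python source. The function name `scanSkillNames` and the regex are assumptions for illustration, not the provider's actual `scanSkillReferences`; matching the `from skills.<name> import ...` form is likewise an assumption.

```typescript
// Hypothetical sketch: collect skill names referenced by Python source.
// Not the provider's real scanner; the regex is an assumption.
function scanSkillNames(code: string): string[] {
  // Matches `import skills.foo` and `from skills.foo import bar`,
  // allowing leading indentation, one statement per line.
  const pattern = /^[ \t]*(?:import|from)[ \t]+skills\.([A-Za-z_][A-Za-z0-9_]*)/gm;
  const names = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(code)) !== null) {
    const name = match[1];
    if (name !== undefined) names.add(name);
  }
  // De-duplicated and sorted so staging order is deterministic.
  return Array.from(names).sort();
}

const pythonSource = [
  "import skills.csv_tools",
  "from skills.report import render",
  "import os",
  "    import skills.csv_tools  # duplicate, indented",
].join("\n");

console.log(scanSkillNames(pythonSource)); // [ 'csv_tools', 'report' ]
```

A line-anchored regex like this misses dynamic imports (`importlib.import_module`), so a scanner of this kind is best treated as a convenience rather than a complete dependency analysis.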
# Programmatic tool calling
Selected agent tools can be exposed inside the sandbox as
`tools.<snake_name>` awaitables. The model can fan out via
`asyncio.gather`, branch on results, and chain dependent calls — all
within one `py_eval` invocation. PTC calls bypass the regular ToolNode
path, so `interrupt_on` approval hooks are not enforced; treat the
allowlist as the permission boundary.
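As a rough illustration of how a tool name could become a `tools.<snake_name>` attribute, the sketch below converts kebab-case and camelCase names to snake_case and checks that the result is usable as a Python identifier. Both helper names, the ASCII-only identifier rule, and the decision to reject Python keywords are assumptions, not the middleware's actual code.

```typescript
// Hypothetical helpers; names and rules are assumptions for illustration.
const PYTHON_KEYWORDS = new Set([
  "False", "None", "True", "and", "as", "assert", "async", "await", "break",
  "class", "continue", "def", "del", "elif", "else", "except", "finally",
  "for", "from", "global", "if", "import", "in", "is", "lambda", "nonlocal",
  "not", "or", "pass", "raise", "return", "try", "while", "with", "yield",
]);

// Convert kebab-case / camelCase tool names to snake_case.
function toSnakeName(toolName: string): string {
  return toolName
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // fetchUrl -> fetch_Url
    .replace(/[-\s.]+/g, "_")               // web-search -> web_search
    .toLowerCase();
}

// ASCII-only check: a letter or underscore, then letters, digits, underscores.
function isPythonIdentifier(name: string): boolean {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(name) && !PYTHON_KEYWORDS.has(name);
}

console.log(toSnakeName("web-search"));   // web_search
console.log(toSnakeName("fetchUrl"));     // fetch_url
console.log(isPythonIdentifier("class")); // false
console.log(isPythonIdentifier("2fast")); // false
```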
PTC config shapes mirror the QuickJS middleware:
* `false` (default) — disabled.
* `true` — every agent tool except the default vfs helpers.
* `string[]` — explicit allowlist.
* `{ include: string[] }` / `{ exclude: string[] }` — include/exclude shapes.
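The config shapes above can be sketched as a small resolver that turns a `ptc` value into the list of tool names exposed inside the sandbox. The function name `resolvePtcTools`, the list of default vfs helper names, and the `py_eval` self-exclusion default are assumptions for illustration; the real middleware may resolve these differently.

```typescript
type PtcConfig = boolean | string[] | { include: string[] } | { exclude: string[] };

// Assumption: which tools count as "default vfs helpers" is a guess for this sketch.
const DEFAULT_VFS_TOOLS = ["ls", "read_file", "write_file", "edit_file", "glob", "grep"];

const contains = (list: string[], name: string): boolean => list.indexOf(name) !== -1;

function resolvePtcTools(config: PtcConfig, allTools: string[], selfTool = "py_eval"): string[] {
  // The interpreter tool itself is never exposed to the code it runs.
  const candidates = allTools.filter((name) => name !== selfTool);
  if (config === false) return [];
  if (config === true) return candidates.filter((name) => !contains(DEFAULT_VFS_TOOLS, name));
  if (Array.isArray(config)) return candidates.filter((name) => contains(config, name));
  if ("include" in config) return candidates.filter((name) => contains(config.include, name));
  return candidates.filter((name) => !contains(config.exclude, name));
}

const agentTools = ["py_eval", "ls", "read_file", "web_search", "fetch_url"];
console.log(resolvePtcTools(true, agentTools));                // [ 'web_search', 'fetch_url' ]
console.log(resolvePtcTools(["fetch_url"], agentTools));       // [ 'fetch_url' ]
console.log(resolvePtcTools({ exclude: ["ls"] }, agentTools)); // [ 'read_file', 'web_search', 'fetch_url' ]
```

Because PTC calls skip the approval hooks, an explicit allowlist or `include` shape keeps the permission boundary visible in configuration rather than implied by a default.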
# Tests
* 16 new unit tests in `middleware.test.ts` cover: tool registration,
custom tool name, sandbox round-trip via runPtc, error envelope
formatting, skills scanner, snake-case conversion, Python identifier
validation, envelope formatting (incl. truncation + error blocks).
* Existing sandbox tests (30) continue to pass.
# Notes
Bumps the `@mayflowergmbh/wasmsh-pyodide` dependency to `^0.6.4`,
which adds the runPtc client method needed for the PTC bridge.
Security Issues
- Path Traversal within WasmshFilesystemBackend namespace
The `WasmshFilesystemBackend`'s `#scope()` method (filesystem-backend.ts:59-65) joins the configured namespace prefix and the caller-supplied path with simple string concatenation (e.g. `${this.#namespace}${abs}`), without normalizing `..` segments or verifying that the resulting path stays beneath the namespace root. Because the downstream `WasmshSandbox` (backed by Pyodide's POSIX VFS via `@mayflowergmbh/wasmsh-pyodide`) does resolve `..` segments at the filesystem level, a traversal payload is fully effective inside the sandbox.
The path flows from user input through the LangChain tool layer with no sanitization: the `read_file`, `write_file`, and `edit_file` tools registered in libs/deepagents/src/middleware/fs.ts accept any `file_path` string (`z.string()`, no pattern constraint) and pass it verbatim to `resolvedBackend.read/write/edit()`, which then calls `this.#scope(filePath)` and forwards the result to the sandbox. If a `WasmshFilesystemBackend` is used as a namespaced sub-backend (the explicit design goal of the class), an LLM agent or any caller with tool-call access can pass `../../skills/secret.py` to escape from `/memories` into `/skills` (or any other directory in the shared VFS), enabling unauthorized cross-namespace reads and writes.
Recommendations
- Normalize and enforce containment in `#scope`: resolve the joined path with `path.posix.resolve(namespace, userPath)` and throw (or return an error) if the result does not start with the namespace prefix.
- Apply the same containment check in `#unscope` to avoid leaking non-namespaced paths back to callers.
- Consider adding a `.includes("../")` / `.includes("..")` early-exit guard as a defense-in-depth measure before the resolve step.
  const abs = path.startsWith("/") ? path : `/${path}`;
  if (abs === "/") return this.#namespace || "/";
  return `${this.#namespace}${abs}`;
}
The namespace scoping concatenates the namespace and the user-supplied path without validating for directory traversal ("..") or enforcing that the resolved path stays under the namespace. This allows escaping the intended namespace in the wasm VFS and accessing other directories within the same sandbox.
Vulnerable code:
#scope(path: string | null | undefined): string {
  if (path == null) return this.#namespace || "/";
  if (!this.#namespace) return path;
  const abs = path.startsWith("/") ? path : `/${path}`;
  if (abs === "/") return this.#namespace || "/";
  return `${this.#namespace}${abs}`;
}
The `#scope()` method at filesystem-backend.ts:59-65 performs simple string concatenation (the final `return` joins `this.#namespace` and `abs` in a template literal) without normalizing `..` segments or verifying that the resulting path stays within the namespace. The Pyodide POSIX VFS (`@mayflowergmbh/wasmsh-pyodide`) that backs `WasmshSandbox` does resolve `..` at the filesystem layer, so traversal payloads are fully effective inside the sandbox.
The attack is directly reachable: the read_file, write_file, and edit_file LangChain tools registered in libs/deepagents/src/middleware/fs.ts accept file_path as a plain z.string() with no pattern constraint and pass the value verbatim to resolvedBackend.read/write/edit(), which calls #scope(). An LLM agent (or an adversarial prompt injected into its context) can pass "../../skills/secret.py" to escape from /memories into /skills, enabling cross-namespace reads and writes.
Remediation: Normalize and enforce containment in #scope. Resolve the final path and reject it if it falls outside the namespace boundary:
import path from 'path'; // posix in browser builds
#scope(p?: string | null): string {
  const ns = this.#namespace || '/';
  if (p == null || p === '/') return ns;
  // Treat every caller path as relative to the namespace root.
  const rel = p.startsWith('/') ? p.slice(1) : p;
  const full = path.posix.resolve(ns, rel);
  // Anchor the prefix with a trailing slash so sibling names do not match.
  const boundary = ns.endsWith('/') ? ns : ns + '/';
  if (full !== ns && !full.startsWith(boundary)) {
    throw new Error('invalid_path: traversal outside namespace');
  }
  return full;
}
Also harden `#unscope` so it only strips the namespace prefix when the path actually starts with it.
Attack Path
- An LLM agent, or an adversarial prompt injection, issues a `read_file` tool call with `file_path = "../../skills/secret.py"`.
- libs/deepagents/src/middleware/fs.ts:586 passes the value verbatim to `resolvedBackend.read("../../skills/secret.py", offset, limit)` with no sanitization.
- `resolvedBackend` is a `WasmshFilesystemBackend` configured with namespace `"/memories"`. Its `read()` method (filesystem-backend.ts:91) calls `this.#scope("../../skills/secret.py")`.
- `#scope` (filesystem-backend.ts:59-65) returns `"/memories/../../skills/secret.py"`; no traversal check is performed.
- That path is forwarded to `WasmshSandbox.read("/memories/../../skills/secret.py")`, which builds a shell command with `shellQuote` (escapes shell-special characters but does not strip `..`).
- The Pyodide WASM session resolves the `..` components and returns the content of `/skills/secret.py`, escaping the `/memories` namespace.
- The same technique applies to `write_file` and `edit_file`, enabling cross-namespace writes.
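The attack path above can be traced with a small standalone sketch. `naiveScope` reproduces the vulnerable concatenation as a free function, and `normalizeAbsolute` is a hand-written stand-in for the `..` resolution that the report attributes to the Pyodide VFS; both names are hypothetical and neither is code from the PR.

```typescript
// Vulnerable concatenation, rewritten as a free function for illustration.
function naiveScope(namespace: string, path: string): string {
  const abs = path.startsWith("/") ? path : `/${path}`;
  return `${namespace}${abs}`;
}

// Minimal POSIX-style normalizer for absolute paths (assumed stand-in for
// what the filesystem layer does when it resolves `.` and `..`).
function normalizeAbsolute(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop(); // popping at the root is a no-op
    else out.push(segment);
  }
  return "/" + out.join("/");
}

const scopedPath = naiveScope("/memories", "../../skills/secret.py");
console.log(scopedPath);                    // /memories/../../skills/secret.py
console.log(normalizeAbsolute(scopedPath)); // /skills/secret.py
```

The scoped string still begins with `/memories`, which is why a prefix check applied before normalization would not catch the escape.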
* Replace `instanceof Uint8Array` with the `typeof === "string"` / fallthrough pattern used in `internal.ts::toInitialFiles`.
* Replace `instanceof Error` in the skills loader's catch with a structural check on the `message` property.
* Drop `console.warn` in favour of a single `process.stderr.write` to surface broken skills without tripping `no-console`.
* Apply oxfmt across the new files.
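A structural check on `message`, as mentioned in the second item, might look like the sketch below. The helper name `errorMessage` is hypothetical and the commit's actual code is not shown here. One general reason to prefer a structural check is that `instanceof Error` can return false for errors created in another realm, such as a Web Worker.

```typescript
// Hypothetical helper: extract a message without relying on `instanceof Error`.
function errorMessage(err: unknown): string {
  if (typeof err === "object" && err !== null && "message" in err) {
    const message = (err as { message: unknown }).message;
    if (typeof message === "string") return message;
  }
  // Fall back to the default string conversion for anything else.
  return String(err);
}

console.log(errorMessage(new Error("skill missing")));  // skill missing
console.log(errorMessage({ message: "from a worker" })); // from a worker
console.log(errorMessage("plain string"));               // plain string
```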
…runPtc passthrough
The original 16 unit tests covered middleware shape, the scanner regex, and
the formatting helpers. This adds the previously-untested surface:
* `filesystem-backend.test.ts` (12 cases) — namespace prefix application
across every protocol method, namespace normalisation (trailing slash,
no leading slash, `/` → bare root), result-path unscope for ls/glob/grep/
upload/download, and pass-through of error results.
* `skills.test.ts` (12 cases) — `loadSkill` covering synthesised vs.
author-supplied `__init__.py`, kebab→snake renaming, invalid skill
names, empty dir, missing entrypoint, download errors, no-module
metadata; `installPendingSkills` covering scanner-driven staging,
caching across calls, per-skill failure isolation, and the no-skill
short-circuit.
* `ptc.test.ts` (12 cases) — every `ptc` config shape (false / true /
array / `{include}` / `{exclude}`), self-tool exclusion, kebab→snake
exposure, identifier validation, plus the `onHostCall` dispatcher
pipeline: success path, UnknownToolError, isolated throw → error envelope.
* `sandbox-runPtc.test.ts` (3 cases) — the `WasmshSandbox.runPtc`
passthrough forwards the right shape, the duck-check surfaces a clear
error against older sessions, and a stopped sandbox rejects cleanly.
Total: 85 unit tests pass (up from 46).
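The duck-type check exercised by `sandbox-runPtc.test.ts` can be pictured roughly as follows. The function name and the error wording are assumptions; only the method name `runPtc` and the minimum package version come from the commit messages.

```typescript
// Hypothetical guard: verify the session object exposes a callable `runPtc`
// before delegating to it, so older builds fail with a readable message.
function assertSupportsRunPtc(session: object): void {
  const candidate = (session as { runPtc?: unknown }).runPtc;
  if (typeof candidate !== "function") {
    throw new Error(
      "runPtc is not available on this session; " +
        "@mayflowergmbh/wasmsh-pyodide >= 0.6.4 is required",
    );
  }
}

const olderSession = { run: () => "ok" };   // no runPtc method
const newerSession = { runPtc: () => "ok" };

assertSupportsRunPtc(newerSession); // passes silently
try {
  assertSupportsRunPtc(olderSession);
} catch (err) {
  console.log((err as Error).message);
}
```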
…mBackend

`WasmshFilesystemBackend.#scope` concatenated the configured namespace with the caller-supplied path, then handed the result to the wasmsh sandbox, whose Pyodide VFS resolves `..` segments at the filesystem layer. An LLM-controlled `file_path` like `../../skills/secret.py` would resolve to a different namespace (or root) once the sandbox saw it, defeating the very isolation the `namespace=` knob promises.

The fix:
* Normalise the joined path with `posixpath.normpath` and assert the result still sits at-or-below the namespace prefix; reject otherwise.
* Apply the matching containment check on the inbound (`_unscope`) side so an upstream bug elsewhere can't leak non-namespaced paths into the caller's view.
* Anchor the prefix match with a trailing slash so a sibling whose name shares the namespace prefix (`/memstore` vs `/mem`) is rejected.
* Surface the rejection as `WasmshNamespaceEscapeError`, a subclass of `PermissionError`, so existing error-handlers that map OS permission errors to `"permission_denied"` continue to do the right thing without an additional catch.

Adds 6 regression tests covering direct `..`, multi-segment payloads, interior `..` landing outside, sibling-prefix attacks, allowed `./` and interior `..` that stays inside, plus an upstream-leak check on the unscope path.

Caught by corridor-security on langchain-ai/deepagentsjs#427.
`WasmshFilesystemBackend.#scope` concatenated the configured namespace with the caller-supplied path, then handed the result to the wasmsh sandbox, whose Pyodide VFS resolves `..` segments at the filesystem layer. An LLM-controlled `file_path` like `../../skills/secret.py` would resolve to a different namespace (or root) once the sandbox saw it, defeating the very isolation the `namespace=` knob promises.

The fix:
* Normalise the joined path with `posix.resolve` and assert the result still sits at-or-below the namespace prefix; reject otherwise.
* Apply the matching containment check on the inbound (`#unscope`) side so an upstream bug elsewhere can't leak non-namespaced paths.
* Anchor the prefix match with a trailing slash so a sibling whose name shares the namespace prefix (`/memstore` vs `/mem`) is rejected.
* Surface the rejection as `WasmshNamespaceEscapeError`.

Adds 8 regression tests covering direct `..`, multi-segment payloads, interior `..` landing outside, sibling-prefix attacks, allowed `./` and interior `..` that stays inside, plus the upstream-leak check on the unscope path and the no-namespace passthrough.

Reported by corridor-security on PR langchain-ai#427.
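To show why the trailing-slash anchor matters, here is a minimal sketch of a containment check and an unscope step built on it. `NamespaceEscapeError`, `isContained`, and `unscope` are hypothetical stand-ins written as free functions; they are not the PR's `WasmshNamespaceEscapeError`, `#isContained`, or `#unscope`, whose exact behaviour may differ.

```typescript
// Hypothetical stand-in for the provider's WasmshNamespaceEscapeError.
class NamespaceEscapeError extends Error {
  constructor(path: string, namespace: string) {
    super(`path ${path} escapes namespace ${namespace}`);
    this.name = "NamespaceEscapeError";
  }
}

// True when `resolved` is the namespace root itself or sits below it.
// Expects an already-normalized absolute path.
function isContained(namespace: string, resolved: string): boolean {
  const root = namespace.replace(/\/+$/, ""); // "/mem/" -> "/mem", "/" -> ""
  return resolved === root || resolved.startsWith(`${root}/`);
}

// Strip the namespace from a path returned by the sandbox, rejecting leaks.
function unscope(namespace: string, resolved: string): string {
  if (!isContained(namespace, resolved)) {
    throw new NamespaceEscapeError(resolved, namespace);
  }
  const root = namespace.replace(/\/+$/, "");
  return resolved.slice(root.length) || "/";
}

console.log("/memstore/x".startsWith("/mem"));   // true: unanchored match is fooled
console.log(isContained("/mem", "/memstore/x")); // false
console.log(unscope("/mem", "/mem/notes.md"));   // /notes.md
```

Throwing on a non-contained path (rather than returning it unchanged) is one possible design; it makes an upstream leak loud instead of silently handing a foreign path back to the caller.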
Thanks for the catch — this is a real escape. Fixed in 799d461 on this branch. Fix: The rejection surfaces as Tests added (8 cases in
The same bug was in the Python adapter ( |
…ytes, cover skills seam

Addresses findings from the comprehensive PR review on langchain-ai#427.
# Silent-failure fixes (high impact)
* `installPendingSkills` now re-throws `WasmshNamespaceEscapeError` instead of demoting it to a stderr log. A malicious skill metadata `path` containing `..` segments would otherwise have the namespace guard's signal swallowed by the best-effort skill-load catch.
* `asBytes` now validates its argument with a structural ArrayBufferView check and throws on unsupported shapes. The previous `return content as Uint8Array` cast would silently propagate `undefined byteLength`, poisoning the bundle-size cap and surfacing as a cryptic TypeError at upload time.
* `process.stderr.write` in the best-effort skill-load catch is now guarded against undefined `process.stderr`, so the path works in browser environments (the provider ships a browser build).
# Comment accuracy
* `middleware.ts` claimed to inject `timeoutMs` into the system prompt and described `afterAgent` as a careful "no-op by design"; neither was true. Replaced with accurate explanations of the actual behaviour.
* `types.ts` `timeoutMs` docstring updated to reflect that the option is accepted for API parity but not yet wired into the prompt or budget.
# Coverage gaps closed
* `middleware-skills.test.ts` (new, 2 cases): wires `createWasmshInterpreterMiddleware({ skillsBackend })` with a mocked `getCurrentTaskInput` and verifies the skill is actually staged into the sandbox before the eval runs, plus the negative case (no references → no uploads). Closes the highest-priority test gap identified in the review: the middleware ↔ skills loader seam.
* `filesystem-backend.test.ts`: new test that exercises `#isContained`'s trailing-`/` anchor directly via an `#unscope` leak, not via the resolver. Closes the gap where the sibling-prefix rejection only worked because of `posix.resolve`, masking a hypothetical regression in the anchor check.
* `skills.test.ts`: two new cases: (a) `WasmshNamespaceEscapeError` thrown from inside `loadSkill` is re-thrown by `installPendingSkills`, not swallowed; (b) `asBytes` rejects non-binary backend content.

Total: 98 unit tests pass (up from 93). Typecheck, oxlint, oxfmt clean.
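The structural ArrayBufferView check described for `asBytes` could be sketched along these lines. This is an illustrative guess at the shape of the guard, not the provider's implementation; in particular, whether the real helper also accepts strings is not stated here, so this version rejects them.

```typescript
// Hypothetical sketch of an `asBytes`-style guard.
function asBytes(content: unknown): Uint8Array {
  // ArrayBuffer.isView is true for typed arrays and DataView, and does not
  // depend on `instanceof`, so it also works for views from other realms.
  if (ArrayBuffer.isView(content)) {
    return new Uint8Array(content.buffer, content.byteOffset, content.byteLength);
  }
  // Fail early instead of letting `byteLength` be undefined downstream.
  throw new TypeError(`asBytes: unsupported content of type ${typeof content}`);
}

console.log(asBytes(new Uint8Array([1, 2, 3])).byteLength); // 3
try {
  asBytes({ data: "not binary" });
} catch (err) {
  console.log((err as Error).message); // asBytes: unsupported content of type object
}
```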
Three follow-ups from the PR review:
- Add `WasmshLogger` interface and thread it through the two catch sites that previously swallowed errors into stderr: PTC dispatch (`dispatchHostCall`) and best-effort skill loading (`installPendingSkills`). When a logger is configured it becomes authoritative; the stderr fallback only fires when no logger is wired. Logger contract documents that implementations must not throw, and the middleware swallows logger exceptions so observability bugs can't break the agent loop.
- Deterministic agent integration test driven by a scripted chat model that emits a prebuilt tool-call → final-answer sequence. Pins the full LLM → middleware → sandbox.runPtc → ToolMessage → next-turn shape without the LLM round-trip.
- Adapter-layer integration test for `WasmshSandbox.runPtc` against real Pyodide, covering plain eval, host_call round-trip, error envelopes, and globals persistence across calls. Gated on built Pyodide assets.
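The logger behaviour in the first item (logger is authoritative, stderr is the fallback, logger exceptions are swallowed) can be sketched as below. The `LoggerLike` interface and `reportSwallowedError` are hypothetical; the real `WasmshLogger` contract may expose different methods.

```typescript
// Hypothetical logger shape; the real WasmshLogger interface may differ.
interface LoggerLike {
  warn(message: string, error?: unknown): void;
}

type StderrHost = { process?: { stderr?: { write?: (chunk: string) => unknown } } };

// Report a swallowed error without ever throwing back into the agent loop.
function reportSwallowedError(message: string, error: unknown, logger?: LoggerLike): void {
  try {
    if (logger) {
      logger.warn(message, error); // a configured logger is authoritative
      return;
    }
    // Fallback: write to stderr only where it exists (absent in browsers).
    const host = globalThis as unknown as StderrHost;
    host.process?.stderr?.write?.(`${message}: ${String(error)}\n`);
  } catch {
    // Observability failures must not break the caller.
  }
}

const recorded: string[] = [];
reportSwallowedError("skill load failed", new Error("boom"), {
  warn: (message) => {
    recorded.push(message);
  },
});
console.log(recorded); // [ 'skill load failed' ]
```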
Summary
Adds `@langchain/wasmsh`, a sandbox provider that runs a full Bash-compatible shell (88 utilities including grep, sed, awk, jq, curl) and Python 3.13 with pip, entirely in-process via WebAssembly. No containers, no cloud services, no API keys.
- `WasmshSandbox.createNode()`: spawns a local host process
- `WasmshSandbox.createBrowserWorker()`: runs in a Web Worker

Backed by wasmsh and Pyodide, the shell and Python share a virtual filesystem. Agents get `execute`, `read_file`, `write_file`, `edit_file`, `ls`, `grep`, `glob`: the same tools as remote sandboxes, with zero infrastructure.
What's included
- feat: expose filesystemOptions in createDeepAgent
- feat: add @langchain/wasmsh sandbox provider
- feat: add browser build and LLM agent integration tests
- fix: browser subagent state handling (`runtime.state` instead of `getCurrentTaskInput()` in browser environments)
- chore: add changeset
Why not just use containers?
Containers are the right choice for untrusted code or system-level operations.
Wasmsh is for the common case where agents need a filesystem, a shell, and
Python — and you don't want to spin up infrastructure for it. Tests run in
<1s, CI needs no secrets, and it works in the browser.
Test plan
- `pnpm format:check` clean
- `pnpm lint` clean (0 errors)
- `pnpm build` succeeds