feat: Node.js plugin isolation via workerd sandbox #426
BenjaminPrice wants to merge 13 commits into emdash-cms:main
…isolation
Proves that miniflare (wrapping workerd) supports all capabilities needed for sandboxed plugin execution on the Node deployment path:
- Plugin code loads from strings (no filesystem, bundles from DB/R2 work)
- Service bindings between workers provide capability scoping
- External service bindings route plugin calls to Node handler functions
- KV namespace bindings provide per-plugin isolated storage
- Plugins without bindings cannot access unavailable capabilities
- Dispose/recreate cycle supports plugin install/uninstall
Key finding: miniflare's serviceBindings with async Node handlers eliminates the need for a separate HTTP backing service server. The bridge calls route directly from workerd isolates to Node functions.
…p config
Implements the SandboxRunner interface for Node.js deployments using workerd as a sidecar process:
- WorkerdSandboxRunner: spawns workerd via child_process, manages lifecycle with epoch-based stale handle detection and health checks
- Backing service: authenticated HTTP server in Node handling plugin bridge calls (content, media, KV, storage, email, users, network)
- Auth: per-startup HMAC secret, per-plugin tokens encoding capabilities. Server-side capability validation on every request.
- capnp config generator: creates workerd config from plugin manifests, each plugin as a nanoservice with its own port
- Plugin wrapper: generates JS that runs inside workerd isolate, proxying ctx.* calls via HTTP fetch to the backing service
- Wall-time enforcement via Promise.race (matching Cloudflare pattern)
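The wall-time enforcement in the last bullet can be pictured with a small sketch. This is not the runner's actual code: `withWallTime` and `WallTimeExceededError` are hypothetical names, and the only thing taken from the commit message is the idea of racing the plugin call against a timer with `Promise.race`.

```typescript
// Hypothetical error type, used only for this illustration.
class WallTimeExceededError extends Error {
  constructor(wallTimeMs: number) {
    super(`Plugin exceeded wall-time limit of ${wallTimeMs}ms`);
    this.name = "WallTimeExceededError";
  }
}

// Races `work` against a timer; rejects if the timer fires first.
async function withWallTime<T>(work: Promise<T>, wallTimeMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new WallTimeExceededError(wallTimeMs)), wallTimeMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Clear the timer so a fast result does not keep the event loop alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the losing promise, so a sketch like this bounds how long the caller waits rather than how long the plugin code runs.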
Extends the SandboxRunner interface with isHealthy() for sidecar-based runners where the sandbox process can crash independently of the host.
- SandboxRunner.isHealthy(): returns false when sidecar is down
- SandboxUnavailableError: typed error for stale handles and unavailable sandbox
- NoopSandboxRunner: implements isHealthy() (always false)
- CloudflareSandboxRunner: implements isHealthy() (delegates to isAvailable)
- WorkerdSandboxRunner: exponential backoff restart on crash (1s, 2s, 4s, cap 30s, give up after 5 failures in 60s), SIGTERM forwarding to child
- SandboxNotAvailableError message updated to mention both Cloudflare and workerd sandbox runners (no longer Cloudflare-specific)
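The restart schedule described above (1s, 2s, 4s, capped at 30s, giving up after 5 failures in 60s) can be sketched as a small policy object. `RestartPolicy` and its method names are hypothetical; the numbers come from the commit message, and how the real runner tracks attempts versus the failure window is an assumption here.

```typescript
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;
const MAX_FAILURES = 5;
const FAILURE_WINDOW_MS = 60_000;

// Hypothetical policy object; the real runner may structure this differently.
class RestartPolicy {
  private attempt = 0;            // consecutive crashes since the last healthy start
  private recent: number[] = [];  // crash timestamps inside the failure window

  /** Records a crash at `nowMs`; returns the restart delay, or null to give up. */
  onCrash(nowMs: number): number | null {
    this.recent = this.recent.filter((t) => nowMs - t < FAILURE_WINDOW_MS);
    this.recent.push(nowMs);
    if (this.recent.length >= MAX_FAILURES) return null;
    const delay = Math.min(BASE_DELAY_MS * 2 ** this.attempt, MAX_DELAY_MS);
    this.attempt += 1;
    return delay;
  }

  /** Call once the sidecar passes its readiness probe. */
  onHealthy(): void {
    this.attempt = 0;
  }
}
```

Keeping the doubling counter separate from the sliding window lets the 30s cap apply to slow crash loops, while the window catches rapid ones.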
…F protection
Replaces the naive hostname-only check in the workerd backing service with core's createHttpAccess/createUnrestrictedHttpAccess. This gives the workerd sandbox runner identical behavior to in-process plugins:
- Redirect targets revalidated against allowedHosts on each hop
- Credential headers stripped on cross-origin redirects
- SSRF protection blocks private IPs, cloud metadata endpoints
- Max 5 redirects enforced
Exports createHttpAccess and createUnrestrictedHttpAccess from the emdash package so platform adapters can reuse the shared policy layer.
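To make the redirect rules above concrete, here is a minimal sketch of a per-hop decision. It is not the implementation of createHttpAccess: `planRedirect`, the exact-match host allowlist, and the list of credential headers are assumptions for illustration, and the SSRF checks on private IPs are left out.

```typescript
const MAX_REDIRECTS = 5;
// Assumed set of headers treated as credentials.
const CREDENTIAL_HEADERS = ["authorization", "cookie", "proxy-authorization"];

type HopDecision =
  | { follow: true; url: string; headers: Record<string, string> }
  | { follow: false; reason: string };

// Decides whether one redirect hop may be followed and with which headers.
function planRedirect(
  currentUrl: string,
  locationHeader: string,
  headers: Record<string, string>,
  allowedHosts: readonly string[],
  hopsSoFar: number,
): HopDecision {
  if (hopsSoFar >= MAX_REDIRECTS) return { follow: false, reason: "too many redirects" };
  const from = new URL(currentUrl);
  const to = new URL(locationHeader, from); // resolves relative Location values
  // Revalidate the target on every hop, not just the first request.
  if (!allowedHosts.includes(to.hostname)) {
    return { follow: false, reason: `host not allowed: ${to.hostname}` };
  }
  const crossOrigin = to.origin !== from.origin;
  const next: Record<string, string> = {};
  for (const [headerName, value] of Object.entries(headers)) {
    if (crossOrigin && CREDENTIAL_HEADERS.includes(headerName.toLowerCase())) continue;
    next[headerName] = value;
  }
  return { follow: true, url: to.href, headers: next };
}
```

A fetch loop would call this with `redirect: "manual"` semantics, following the returned URL only when `follow` is true.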
…y warnings
Adds debugging escape hatch and clearer messaging for sandbox availability:
- sandbox: false config option explicitly disables plugin sandboxing even when a sandboxRunner is configured, for isolating whether bugs are in plugin code or in the sandbox runtime
- Upgrades sandbox-unavailable log from console.debug to console.warn with actionable message mentioning workerd installation
- SandboxNotAvailableError message now references both @emdash-cms/cloudflare/sandbox and @emdash-cms/workerd/sandbox as options
Adds dev-mode miniflare integration and refactors bridge logic:
- MiniflareDevRunner: uses miniflare's outboundService to intercept plugin fetch() calls and route bridge calls to Node handler functions. No HTTP server, no capnp config, no child process management.
- bridge-handler.ts: extracted shared bridge dispatch logic used by both the production HTTP backing service and the dev miniflare runner. Single source of truth for capability enforcement and DB queries.
- backing-service.ts: simplified to auth token validation + delegation to the shared bridge handler. ~440 LOC removed.
- Factory function auto-detects dev mode (NODE_ENV !== production) and uses MiniflareDevRunner when miniflare is available, falling back to WorkerdSandboxRunner for production.
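The interception step can be sketched as a pure routing decision: bridge calls go to a Node handler, and, as the PR description notes for the dev runner, anything else is answered with a 403. The bridge hostname, the path-to-method mapping, and the `classifyOutbound` name are all hypothetical; the PR does not state how bridge calls are addressed.

```typescript
// Hypothetical bridge host; the real value is not given in the PR.
const BRIDGE_HOST = "emdash-bridge.internal";

type OutboundDecision =
  | { kind: "bridge"; method: string }
  | { kind: "blocked"; httpStatus: 403 };

// Decides what an intercepted plugin fetch() should turn into.
function classifyOutbound(rawUrl: string): OutboundDecision {
  const url = new URL(rawUrl);
  if (url.hostname === BRIDGE_HOST) {
    // Assumed convention: the path names the bridge method, e.g. "kv/get".
    return { kind: "bridge", method: url.pathname.replace(/^\/+/, "") };
  }
  // Direct network access would bypass capability and host checks.
  return { kind: "blocked", httpStatus: 403 };
}
```

An outbound handler built on this would dispatch `bridge` decisions to the shared bridge handler and return an error response for everything else.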
Tests the shared bridge handler that both production (workerd) and dev (miniflare) runners use. 19 tests covering:
- KV operations: set, get, delete, list, per-plugin isolation
- Capability enforcement: read:content, write:content (implies read), read:users, network:fetch, email:send
- Plugin storage: declared collections only, put/get, per-plugin isolation
- Error handling: unknown methods, missing parameters
- Logging: works without capabilities
Uses real in-memory SQLite (better-sqlite3 + Kysely), matching core's test infrastructure pattern. No mocking.
Updates plugin sandbox documentation to reflect the new workerd-based isolation on Node.js:
- Adds step-by-step setup guide for @emdash-cms/workerd/sandbox
- Documents sandbox: false debugging escape hatch
- Updates security comparison table with 3-column layout (Cloudflare, Node+workerd, Node trusted-only)
- Adds self-hosted security note about workerd vs Cloudflare hardening
- Updates recommendations for Node.js deployments
Also cleans up the workerd package:
- Moves miniflare from dependencies to devDependencies (production uses raw workerd, miniflare is only for dev mode)
- Adds workerd as a peerDependency
- Adds @types/better-sqlite3 to pnpm catalog, updates core and marketplace packages to use catalog: reference
- Renames loader-spike.test.ts to miniflare-isolation.test.ts with updated descriptions (integration tests, not spike artifacts)
- Removes test:spike script from package.json
- Adds author field
Rewrites the bridge handler to match the Cloudflare PluginBridge
behavior exactly:
- KV: uses _plugin_storage with collection='__kv' (was _emdash_options
with key prefix). Returns { key, value }[] for list, boolean for delete.
- Content: adds rowToContentItem() transform stripping system columns and
parsing JSON. Implements create (ULID, version tracking), update (version
bump, partial field updates), and delete (soft-delete via deleted_at).
Adds collection name validation to prevent SQL injection.
- Media: fixes table name to 'media' (was '_emdash_media'). Returns
{ id, filename, mimeType, size, url, createdAt } shape with url built
from storage_key. Filters by status='ready' for list. Supports mimeType
filter and cursor pagination.
- Users: fixes table name to 'users' (was '_emdash_users'). Lowercases
email in getByEmail. Adds cursor pagination to list.
- Storage: adds count, getMany, putMany, deleteMany methods. Returns
{ hasMore, cursor } pagination matching Cloudflare bridge.
Removes the TODO comment. All bridge operations now match the Cloudflare
bridge's return types and behavior, except media upload which requires
the Storage interface (documented inline).
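The collection name validation mentioned above matters because a collection name may end up in an SQL identifier, where it cannot be passed as a bound parameter. A minimal sketch, assuming a conservative allowlist pattern (the actual rule used by the bridge handler is not shown in this PR, and `assertValidCollection` is a hypothetical name):

```typescript
// Assumed rule: lowercase letter first, then lowercase letters, digits, or underscores.
const COLLECTION_NAME = /^[a-z][a-z0-9_]*$/;

// Returns the name unchanged if it is safe to use as an identifier, otherwise throws.
function assertValidCollection(collection: string): string {
  if (!COLLECTION_NAME.test(collection)) {
    throw new Error(`Invalid collection name: ${JSON.stringify(collection)}`);
  }
  return collection;
}
```

Validating against an allowlist pattern before building the query is simpler to reason about than trying to escape arbitrary input.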
Adds a "Testing in the Sandbox" section to creating-plugins.mdx covering:
- How to install and configure @emdash-cms/workerd/sandbox in a test site
- Using sandbox: false as a debugging escape hatch
- What behaves differently in sandbox vs trusted mode (capabilities, network access, Node.js builtins, env vars, resource limits)
Also adds a cross-link from sandbox.mdx to the new testing section.
…erations
Tests the bridge handler with the same operations EmDash's shipped plugins perform (modeled after the sandboxed-test plugin's routes):
- KV round-trip: set, get, delete (matching kv/test route)
- Storage round-trip: put, get, count (matching storage/test route)
- Content list with read:content (matching content/list route)
- Content lifecycle: create with ULID, read, update with version bump, soft-delete (write:content operations)
- Capability enforcement: read-only plugin cannot write, cannot email, cannot access undeclared storage collections
- Cross-plugin isolation: KV and storage data scoped per plugin
Uses real SQLite with schema matching production migrations. Adds ulidx dependency for content creation.
🦋 Changeset detected
Latest commit: a846039
The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages.
Scope check
This PR changes 3,719 lines across 29 files. Large PRs are harder to review and more likely to be closed without review. If this scope is intentional, no action needed. A maintainer will review it. If not, please consider splitting this into smaller PRs. See CONTRIBUTING.md for contribution guidelines.
Pull request overview
Adds a new Node.js sandbox runner based on workerd (plus a dev runner using Miniflare) to bring Cloudflare-like plugin isolation/capability enforcement to non-Workers deployments, along with core interface updates (SandboxRunner.isHealthy(), SandboxUnavailableError) and accompanying docs/tests.
Changes:
- Introduces new @emdash-cms/workerd package implementing the SandboxRunner contract (workerd sidecar + backing service; Miniflare dev runner).
- Extends core sandbox APIs (new isHealthy() + SandboxUnavailableError) and wires an opt-out sandbox?: boolean integration flag.
- Updates docs and adds a new conformance/integration test suite for bridge behavior and isolation.
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| pnpm-workspace.yaml | Adds @types/better-sqlite3 to the shared catalog. |
| pnpm-lock.yaml | Locks new workspace package/deps and catalog specifier updates. |
| packages/workerd/tsconfig.json | TypeScript config for the new workerd package build output. |
| packages/workerd/test/plugin-integration.test.ts | Integration tests for real plugin-like operations against the bridge handler. |
| packages/workerd/test/miniflare-isolation.test.ts | Miniflare isolation behavior tests (service bindings, KV, dynamic code, etc.). |
| packages/workerd/test/bridge-handler.test.ts | Bridge handler conformance tests (capabilities, KV/storage isolation, errors). |
| packages/workerd/src/sandbox/wrapper.ts | Generates the plugin wrapper code that proxies ctx.* via HTTP bridge calls. |
| packages/workerd/src/sandbox/runner.ts | Production workerd sidecar runner (process lifecycle, tokens, backing service). |
| packages/workerd/src/sandbox/index.ts | Public exports for the workerd sandbox surface. |
| packages/workerd/src/sandbox/dev-runner.ts | Development runner intended to use Miniflare for faster iteration. |
| packages/workerd/src/sandbox/capnp.ts | Generates workerd capnp config for per-plugin nanoservices. |
| packages/workerd/src/sandbox/bridge-handler.ts | Shared bridge dispatcher implementing capability enforcement and DB access. |
| packages/workerd/src/sandbox/backing-service.ts | Node HTTP backing service wrapper that authenticates and dispatches bridge calls. |
| packages/workerd/src/index.ts | Package entrypoint export wiring. |
| packages/workerd/package.json | Defines the new @emdash-cms/workerd package metadata and deps. |
| packages/marketplace/package.json | Switches @types/better-sqlite3 to catalog version. |
| packages/core/src/plugins/sandbox/types.ts | Adds isHealthy() to SandboxRunner and introduces SandboxUnavailableError. |
| packages/core/src/plugins/sandbox/noop.ts | Implements isHealthy() and updates the “sandbox not available” error text. |
| packages/core/src/plugins/sandbox/index.ts | Exports SandboxUnavailableError. |
| packages/core/src/plugins/index.ts | Re-exports the new sandbox error and HTTP access helpers. |
| packages/core/src/index.ts | Exposes SandboxUnavailableError and HTTP access creation helpers publicly. |
| packages/core/src/emdash-runtime.ts | Improves startup warning when sandbox runner is configured but unavailable. |
| packages/core/src/astro/integration/vite-config.ts | Adds sandbox: false escape hatch behavior to virtual module generation. |
| packages/core/src/astro/integration/runtime.ts | Adds sandbox?: boolean config option documentation/type. |
| packages/core/package.json | Switches @types/better-sqlite3 to catalog version. |
| packages/cloudflare/src/sandbox/runner.ts | Implements isHealthy() for the Cloudflare runner. |
| docs/src/content/docs/plugins/sandbox.mdx | Documents Node+workerd sandboxing and adds the escape hatch + comparison table. |
| docs/src/content/docs/plugins/creating-plugins.mdx | Adds “Testing in the Sandbox” guidance and behavior differences table. |
| .changeset/bumpy-crabs-nail.md | Changeset describing new workerd sandboxing feature and related exports. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Consolidates fixes from Codex/Copilot review rounds against the
node-plugin-isolation branch.
## Workerd runner
- Per-startup invoke token authenticates inbound hook/route HTTP calls
(constant-time comparison since workerd has no timingSafeEqual).
Prevents same-host attackers from invoking plugin hooks via the
per-plugin TCP listener on 127.0.0.1.
- Readiness probe sends the invoke token; treats 404 as ready.
- Resolve workerd binary from package bin/workerd; use execFileSync so
paths with spaces aren't shell-split.
- stdout/stderr drained to prevent pipe buffer deadlock.
- HMAC token compared via timingSafeEqual.
- stopWorkerd: fast-path on already-exited; SIGKILL fallback uses local
exited flag (proc.killed flips on signal queue, not actual exit).
- Crash exit handler restarts on signal-based termination too (OOM/kill).
- intentionalStop flag suppresses crash recovery on intentional reloads
(plugin install/uninstall) so they don't cascade into restart loops.
- Deferred startup with serialized startupPromise; needsRestart only
cleared after successful start so transient failures retry on next
invocation. scheduleRestart only sets needsRestart, not direct restart.
- Per-startup invoke token + WorkerdSandboxedPlugin sends it on every
invocation; checkEpoch replaced with ensureReady(); SandboxUnavailableError
thrown when sandbox is down.
- isHealthy() returns false when needsRestart set so external monitors
see "running" only when actually running.
- Storage configs (with indexes + uniqueIndexes) looked up by id+version
so plugin upgrades don't see stale schemas.
- terminate() calls runner.unloadPlugin() so marketplace update/uninstall
actually drops old plugins (no leaked listeners or stale entries).
- Factory only picks dev runner when NODE_ENV === "development". Unset
NODE_ENV (default for `node server.js`, `astro preview`) uses production
WorkerdSandboxRunner so production hardening isn't silently dropped.
- MiniflareDevRunner statically imported so dev path works in published
installs (not just source tree).
- capnp config: globalOutbound routes all fetch through backing service.
Comments document that direct fetch() returns 500 "Unknown bridge method"
by design (forces ctx.http.fetch + capability/host enforcement).
- Resource limits documented honestly: cpuMs/memoryMb/subrequests are
Cloudflare platform features, not standalone workerd. Only wallTimeMs
is enforced (Promise.race). Startup warning if operators set unenforced
limits. Docs updated with caution box and recommendations.
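The first bullet in this list mentions a hand-written constant-time comparison inside workerd, where Node's timingSafeEqual is not available. A minimal sketch of that technique follows; `constantTimeEqual` is a hypothetical name and the PR's actual helper may differ in detail.

```typescript
// Compares two strings without exiting early on the first mismatching byte.
function constantTimeEqual(a: string, b: string): boolean {
  const encoder = new TextEncoder();
  const x = encoder.encode(a);
  const y = encoder.encode(b);
  // Fold the length difference into the accumulator instead of returning early.
  let diff = x.length ^ y.length;
  const len = Math.max(x.length, y.length);
  for (let i = 0; i < len; i++) {
    // Out-of-range reads yield undefined, which is treated as 0 here.
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}
```

The loop always runs over the longer input and accumulates differences with XOR and OR, so the running time depends on the input lengths rather than on where the first mismatch occurs.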
## Bridge handler
- Delegates storage operations to PluginStorageRepository so where/orderBy/
cursor/count work correctly. Fixes infinite-loop pagination on shipped
plugins like forms-submissions and incorrect filtered counts.
- Strict capability enforcement: write:content does NOT imply read:content
(matches Cloudflare bridge). network:fetch:any still satisfies
network:fetch.
- ctx.http.fetch returns base64-encoded bytes preserving binary content
(atproto cover images, webhook payloads). Wrapper rebuilds Response
with proper bytes via base64 decode.
- RequestInit marshaling preserves Headers (multi-value via [name, value]
pairs), Blob/File bodies, FormData, URLSearchParams, ArrayBuffer with
byteOffset/byteLength preserved.
- Media upload writes bytes to storage via the configured Storage adapter,
sets status='ready' (not 'pending'). DB insert failure rolls back the
storage object (best-effort cleanup with warning logged on failure).
- Media delete deletes the storage object too (best-effort) so files
don't leak.
- ctx.media.upload accepts ArrayBuffer/Uint8Array/any TypedArray/DataView,
preserving the byte window via buffer+byteOffset+byteLength.
- getMany serializes as [[id, data], ...] pairs not a plain object so
special IDs like "__proto__" survive transport.
- mediaUpload, mediaDelete take optional Storage interface from
BridgeHandlerOptions.
- Error messages match Cloudflare PluginBridge format ("Missing capability:
X", "Storage collection not declared: X").
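Two of the marshaling details above are easy to get wrong, so here is a small sketch of each. The function names are hypothetical and the wire format shown (JSON) is an assumption; only the ideas come from the commit message: serialize getMany results as [id, data] pairs, and keep a typed array's byte window via buffer + byteOffset + byteLength.

```typescript
type IdDataPair = [id: string, data: unknown];

// Serializes records as an array of pairs so that no id is ever used as an object key.
function encodeMany(records: Map<string, unknown>): string {
  const pairs: IdDataPair[] = Array.from(records.entries());
  return JSON.stringify(pairs);
}

// Rebuilds the records on the receiving side.
function decodeMany(json: string): Map<string, unknown> {
  return new Map(JSON.parse(json) as IdDataPair[]);
}

// Copies exactly the bytes a view covers, not the whole underlying buffer.
function exactBytes(view: ArrayBufferView): Uint8Array {
  return new Uint8Array(view.buffer, view.byteOffset, view.byteLength).slice();
}
```

With a plain object, assigning to the key "__proto__" sets the object's prototype instead of creating an own property, so that record would silently disappear from the serialized result. Likewise, passing `view.buffer` alone would send every byte of the backing buffer, including data outside a subarray or DataView window.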
## Core / runtime
- SandboxRunner interface: isHealthy() added; SandboxUnavailableError class
added and exported; mediaStorage field added to SandboxOptions
(upload + delete methods); CloudflareSandboxRunner implements isHealthy.
- Cloudflare PluginBridge: storageQuery/storageCount delegate to
PluginStorageRepository for parity with the workerd bridge fix.
storageConfig added to PluginBridgeProps so indexes propagate.
- ContentRepository, MediaRepository, PluginStorageRepository,
UserRepository, OptionsRepository exported from emdash so platform
adapters can reuse them.
- createHttpAccess and createUnrestrictedHttpAccess exported for platform
adapters (workerd uses these for SSRF and host allowlist enforcement).
- New emdash config option: sandbox: false (debugging escape hatch).
When set, sandboxed plugin entries load in-process via adaptSandboxEntry
+ data URL import, get added to allPipelinePlugins and configuredPlugins,
and respect _plugin_state. adminPages and adminWidgets passed through.
- Marketplace plugins also load in-process under sandbox: false.
loadMarketplacePluginsBypassed runs before pipeline creation on cold
start; syncMarketplacePluginsBypassed handles runtime install/update/
uninstall (rebuilds the hook pipeline so changes take effect immediately).
- handleMarketplaceInstall/Update accept sandboxBypassed flag, skip the
SANDBOX_NOT_AVAILABLE gate when set. Routes pass emdash.isSandboxBypassed().
- mediaStorage threaded from runtime into sandbox runner via SandboxOptions
(both build-time and marketplace cold-start paths).
- sandboxBypassed flag plumbed through virtual:emdash/sandbox-runner module
via namespace import (handles missing export when not in bypass mode).
- SandboxNotAvailableError message updated to mention both
@emdash-cms/cloudflare/sandbox and @emdash-cms/workerd/sandbox.
## Tests
- bridge-handler.test.ts updated for strict capability enforcement
(write does not imply read) and matching Cloudflare error messages.
- plugin-integration.test.ts: write-only plugin tests assert read:content
and read:media are NOT implied by their write counterparts.
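The strict capability rule these tests assert can be sketched as follows. The function names and the lookup table are hypothetical; the behavior (write does not imply read, network:fetch:any satisfies network:fetch) and the error message format are taken from this commit message.

```typescript
// Hypothetical table: broader capabilities that satisfy a narrower one.
const SATISFIED_BY: Record<string, string[]> = {
  "network:fetch": ["network:fetch:any"],
};

// True only for an exact grant or an explicitly listed broader grant.
function hasCapability(granted: readonly string[], required: string): boolean {
  if (granted.includes(required)) return true;
  return (SATISFIED_BY[required] ?? []).some((alt) => granted.includes(alt));
}

// Throws with the message format described in the commit message.
function requireCapability(granted: readonly string[], required: string): void {
  if (!hasCapability(granted, required)) {
    throw new Error(`Missing capability: ${required}`);
  }
}
```

Listing the allowed implications explicitly keeps the default strict: any pairing not in the table, such as write:content and read:content, is treated as unrelated.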
@emdash-cms/admin
@emdash-cms/auth
@emdash-cms/blocks
@emdash-cms/cloudflare
emdash
create-emdash
@emdash-cms/gutenberg-to-portable-text
@emdash-cms/x402
@emdash-cms/plugin-ai-moderation
@emdash-cms/plugin-atproto
@emdash-cms/plugin-audit-log
@emdash-cms/plugin-color
@emdash-cms/plugin-embeds
@emdash-cms/plugin-forms
@emdash-cms/plugin-webhook-notifier
What does this PR do?
Adds workerd-based plugin sandboxing for Node.js deployments, closing EmDash's most significant architectural gap: plugins that run sandboxed on Cloudflare Workers now run sandboxed on Node.js too, with matching capability enforcement and isolation guarantees.
Discussion: #425
How it works
A new @emdash-cms/workerd package implements the SandboxRunner interface (same contract as @emdash-cms/cloudflare). In production it spawns workerd as a child process with a generated capnp config. In development it uses miniflare for faster startup. Plugin workers run in V8 isolates and communicate with Node via an authenticated HTTP backing service.
Architecture:
- WorkerdSandboxRunner (production): spawns workerd, generates capnp config per plugin, manages lifecycle with exponential backoff restart, SIGTERM forwarding, intentional-stop guard, and epoch-based stale handle detection
- MiniflareDevRunner (dev): uses miniflare's outboundService to route bridge calls to Node handler functions; non-bridge fetches are blocked to prevent capability bypass
- bridge-handler.ts: shared bridge dispatch logic used by both runners, with capability enforcement matching the Cloudflare PluginBridge
Security model matches Cloudflare:
- …timingSafeEqual in Node, hand-rolled XOR in workerd where Web Crypto doesn't expose it)
- createHttpAccess (redirect revalidation on each hop, credential stripping on cross-origin redirects, SSRF blocking, max 5 redirects)
- write:content does NOT imply read:content (matches Cloudflare bridge)
- fetch() from plugin code is blocked in dev (miniflare outboundService returns 403) and routed through the backing service in prod (capnp globalOutbound), forcing all network access through ctx.http.fetch and capability/host checks
Honest about resource limits: cpuMs, memoryMb, and subrequests are Cloudflare platform features, not standalone workerd. Only wallTimeMs is enforced (via Promise.race). Operators get a startup warning if they configure unenforced limits, and the docs include a caution box.
Sandbox bypass mode (sandbox: false): Debugging escape hatch that loads sandboxed plugin entries in-process via adaptSandboxEntry + data URL import(). Build-time and marketplace plugins both work under bypass; marketplace install/update routes skip the SANDBOX_NOT_AVAILABLE gate when bypassed and rebuild the hook pipeline so changes take effect immediately.
Type of change
Checklist
- pnpm typecheck passes
- pnpm --silent lint:json | jq '.diagnostics | length' returns 0 errors
- pnpm test passes (or targeted tests for my change)
- pnpm format has been run
AI-generated code disclosure
Tests
36 new tests in @emdash-cms/workerd across 3 files, plus 2163 existing core tests passing:
- miniflare-isolation.test.ts (6 tests): V8 isolate isolation, service bindings, KV namespaces, dynamic code loading, worker reconfiguration
- bridge-handler.test.ts (19 tests): KV CRUD + isolation, strict capability enforcement (read/write content do not imply each other; network:fetch:any satisfies network:fetch), storage collection validation, error message parity with Cloudflare bridge
- plugin-integration.test.ts (11 tests): Real plugin operations modeled after the sandboxed-test plugin (KV round-trip, storage round-trip, content lifecycle with ULID/versioning/soft-delete, cross-plugin isolation, write-only plugins cannot read)
All tests use real in-memory SQLite with production schema, no mocking.
Review iterations
This PR went through ~20 rounds of automated and human review. Notable hardening from review feedback:
- …timingSafeEqual + hand-rolled for workerd), miniflare outboundService blocks non-bridge fetches, strict capability enforcement (write:content ≠ read:content)
- …startupPromise, isHealthy() returns false when restart pending, stopWorkerd() fast-paths on already-exited and uses local exited flag for SIGKILL fallback
- ctx.http.fetch returns base64-encoded bytes preserving binary content, RequestInit marshaling preserves multi-value headers ([[name, value]] pairs), Blob/File/FormData/URLSearchParams bodies, ArrayBuffer with byteOffset/byteLength window
- …storageQuery/storageCount to PluginStorageRepository so where/orderBy/cursor/count work correctly. Fixes infinite-loop pagination on shipped plugins like forms-submissions.
- …Storage adapter, sets status='ready', rolls back storage object on DB failure (best-effort cleanup); delete removes the storage object too
- …bin/workerd (no npx runtime download), execFileSync for paths with spaces, stdout/stderr drained to prevent pipe deadlock
- loadMarketplacePluginsBypassed runs before pipeline creation on cold start; syncMarketplacePluginsBypassed handles runtime install/update/uninstall and rebuilds the hook pipeline
Changes
New: packages/workerd/
| File | Description |
|---|---|
| src/sandbox/runner.ts | WorkerdSandboxRunner implementing SandboxRunner, child process lifecycle, auth token generation, intentional-stop guard, epoch-based stale handle detection |
| src/sandbox/dev-runner.ts | MiniflareDevRunner for development, auto-selected when NODE_ENV === "development" |
| src/sandbox/bridge-handler.ts | …PluginStorageRepository |
| src/sandbox/backing-service.ts | |
| src/sandbox/capnp.ts | …globalOutbound routes through backing service |
| src/sandbox/wrapper.ts | |
Modified: packages/core/
| File | Description |
|---|---|
| plugins/sandbox/types.ts | …isHealthy() to SandboxRunner interface, SandboxUnavailableError class, mediaStorage in SandboxOptions |
| plugins/sandbox/noop.ts | …isHealthy(), error message mentions both CF and workerd |
| astro/integration/runtime.ts | …sandbox?: boolean config option (escape hatch) |
| astro/integration/virtual-modules.ts | …sandboxBypassed flag when sandbox: false |
| astro/middleware.ts | …sandboxBypassed via namespace import (handles missing export) |
| emdash-runtime.ts | …loadBypassedPlugins + loadMarketplacePluginsBypassed, runtime sync, isSandboxBypassed(), threads mediaStorage into sandbox runners |
| api/handlers/marketplace.ts | handleMarketplaceInstall/handleMarketplaceUpdate accept sandboxBypassed, skip SANDBOX_NOT_AVAILABLE gate when set |
| index.ts | …SandboxUnavailableError, createHttpAccess, createUnrestrictedHttpAccess, repositories used by platform adapters |
Modified: packages/cloudflare/
| File | Description |
|---|---|
| sandbox/runner.ts | …isHealthy() (delegates to isAvailable()), passes storageConfig to bridge bindings |
| sandbox/bridge.ts | …PluginStorageRepository (parity fix) |
Docs
- plugins/sandbox.mdx
- plugins/creating-plugins.mdx
Changeset
.changeset/bumpy-crabs-nail.md: emdash minor, @emdash-cms/cloudflare patch, @emdash-cms/workerd minor.