Sandcaster

Open-source runtime for general-purpose AI agents in isolated sandboxes.

CLI, API, TypeScript SDK, and chat bots (Slack, Discord, Telegram) with streaming, file uploads, and config-driven behavior.

Built on Pi-mono for multi-provider LLM orchestration and pluggable sandbox backends (E2B, Vercel, Docker, Cloudflare Containers).

Sandcaster is a TypeScript rewrite of Sandstorm for people who want real agent work, not a chat wrapper:

Research Acme's competitors, crawl their sites and recent news, and write a one-page branded briefing PDF with sources
Analyze uploaded transcripts or PDFs
Triage incoming support tickets
Run a security audit with coordinated sub-agents across multiple sandboxes
Turn docs into a draft OpenAPI spec
Deploy a chat bot that runs sandboxed agents from Slack, Discord, or Telegram

Terminal demo

$ sandcaster init research-brief
$ cd research-brief
$ sandcaster "Research Acme's competitors, crawl their sites and recent news, and write a one-page branded briefing PDF with sources."

[tool: web_fetch]
[tool: web_fetch]
[tool: file_write]
[file: briefing.pdf]
[tool: file_write]
[file: reports/sources.md]

Analyzed 5 competitor sites and recent coverage. Briefing and sources written.

✓ Completed · sonnet · 12 turns · $0.043 · 38s

The point is not that an agent can answer a question. It starts from a runnable starter, gets a fresh sandbox, can read uploads or crawl the web, writes artifacts like briefing.pdf, streams its work in a rich TUI, and tears itself down when the run is done.

60-second path

# from npm
npx @sandcaster/cli init hello-sandbox
cd hello-sandbox
npx @sandcaster/cli "What can you do in this sandbox?"

# or from source
git clone https://github.com/iamladi/sandcaster.git
cd sandcaster
bun install && bunx turbo build
cd apps/cli && bun link      # registers the `sandcaster` command globally

sandcaster init hello-sandbox
cd hello-sandbox
sandcaster "What can you do in this sandbox?"

sandcaster init <starter> scaffolds a runnable project with sandcaster.json, a README, .env.example, and any starter-specific assets. If provider env vars are detected in your shell, Sandcaster writes .env automatically.

sandcaster init --list              # list all starters
sandcaster init research-brief      # scaffold into ./research-brief
sandcaster init security-audit my-audit  # scaffold into ./my-audit

Examples

15 copy-paste-ready examples in examples/:

#	Example	What it does
01	hello-sandbox	Minimal agent in a sandbox
02	code-reviewer	Code review with structured JSON output
03	competitive-analysis	Market research with web crawling
04	structured-output	Extract TODOs with JSON schema
05	multi-agent-security-audit	Coordinated sub-agents for security review
06	speculative-branching	Multiple approaches evaluated by LLM judge
07	provider-vercel	Run on Vercel Sandbox
08	provider-cloudflare	Run on Cloudflare Containers
09	provider-docker	Run on local Docker
10	fix-ci-failure	Debug CI logs and propose a fix
11	generate-tests	Generate test suite with branching strategies
12	dependency-audit	Audit deps for CVEs, produce prioritized report
13	generate-api-docs	Generate OpenAPI spec from source code
14	onboard-to-codebase	Produce architecture overview of a codebase
15	chat-bot	Chat bot with Slack/Discord/Telegram + HTTP gateway

Why Sandcaster exists

Most agent projects break down in one of two ways:

You wire the SDK yourself and end up rebuilding sandbox lifecycle, file uploads, streaming, config loading, and starter setup.
You use an agent framework that is good at orchestration but weak at actually shipping a runnable agent product path.

Sandcaster is opinionated about the missing middle:

starter to runnable project in one command
fresh sandbox per request with teardown by default
CLI, API, and chat bots over the same runtime
config-driven behavior through sandcaster.json
multi-model support across Anthropic, OpenAI, Google, and OpenRouter
sub-agent orchestration with configurable tools and models per agent
speculative branching: run multiple approaches in parallel, pick the best with an LLM judge
composite sandboxes: agent-driven multi-sandbox orchestration across providers
session management with thread context and reconnect
skills system for reusable domain knowledge

Feature comparison

Capability	Sandcaster	Raw SDK	DIY runner
Fresh sandbox per request	Built in	Manual wiring	Manual wiring
Multi-provider sandboxes (E2B, Vercel, Docker, Cloudflare)	Built in	No	Custom work
Streaming SSE endpoint	Built in	Manual wiring	Custom work
File uploads with path safety	Built in	Manual wiring	Custom work
`sandcaster.json` config layer	Built in	No	Custom work
Multi-model with auto-detection	Built in	No	Custom work
Sub-agent orchestration	Built in	No	Custom work
Composite multi-sandbox orchestration	Built in	No	Custom work
Speculative branching with LLM judge	Built in	No	Custom work
Session management + reconnect	Built in	No	Custom work
Skills system (SKILL.md)	Built in	No	Custom work
Structured JSON output schemas	Built in	Manual wiring	Custom work
Chat bots (Slack, Discord, Telegram)	Built in	No	Custom work
Starter scaffolding with `init`	Built in	No	Custom work
Rich terminal TUI (Ink)	Built in	No	Custom work
Run history + JSONL store	Built in	No	Custom work
OpenTelemetry integration	Built in	No	Custom work

What changed from Sandstorm

Sandcaster is a ground-up TypeScript rewrite, not a port. Key differences:

	Sandstorm	Sandcaster
Language	Python	TypeScript (ESM-only)
Runtime	pip / venv	Bun
LLM SDK	Claude Agent SDK	Pi-mono (multi-provider)
Client SDK	Python (`duvo-sandstorm[client]`)	TypeScript (`@sandcaster/sdk`)
CLI	Click	Ink + React TUI with live status
API framework	FastAPI	Hono
Monorepo	flat	Turborepo with isolated packages
Models	Claude (multi-cloud + OpenRouter)	Anthropic, OpenAI, Google, OpenRouter
Sandbox providers	E2B only	E2B, Vercel, Docker, Cloudflare
Thinking levels	No	none / low / medium / high
Multi-sandbox	No	Composite sandboxes with cross-provider orchestration
Branching	No	Speculative branching with LLM judge evaluation
Chat platforms	No	Slack, Discord, Telegram
Sessions	No	Persistent sessions with thread context

Architecture

sandcaster/
├── packages/
│   ├── core               @sandcaster/core — schemas, config, sandbox orchestration, runner
│   │   ├── providers/     E2B, Vercel, Docker, Cloudflare Containers
│   │   ├── runner/        In-sandbox runner, IPC client, composite tools
│   │   ├── branching/     Speculative branching orchestrator + evaluators
│   │   └── session/       Session management, conversation threading
│   ├── sdk                @sandcaster/sdk — standalone TypeScript client (npm)
│   ├── chat               @sandcaster/chat — unified chat bot (Slack, Discord, Telegram)
│   ├── cloudflare-worker  Cloudflare Worker proxy for sandbox operations
│   └── ts-config          @sandcaster/ts-config — shared TypeScript configs
├── apps/
│   ├── api                @sandcaster/api — Hono REST server (SSE streaming)
│   ├── cli                @sandcaster/cli — Ink TUI + chat bot (npm)
│   └── web                @sandcaster/web — docs site (getsandcaster.com)
├── examples/              15 copy-paste starters
└── scripts/               E2B template + changelog sync

Execution flow

sandcaster "prompt" -f report.pdf
  │
  ├─ Load sandcaster.json + .env
  ├─ Resolve model (alias or auto-detect from API keys)
  ├─ Resolve sandbox provider (config > env auto-detect > docker > e2b)
  ├─ Create primary sandbox (E2B / Vercel / Docker / Cloudflare)
  ├─ Upload runner, config, user files, skills
  ├─ Execute: node /opt/sandcaster/runner.mjs
  ├─ Stream JSON-line events → TUI / SSE
  │   ├─ [branching] Fork N branches, evaluate with LLM judge
  │   └─ [composite] Intercept IPC requests from runner
  │       ├─ spawn_sandbox → SandboxPool creates secondary sandbox
  │       ├─ exec_in → run command in named sandbox
  │       ├─ transfer_files → copy files between sandboxes
  │       └─ kill_sandbox → tear down named sandbox
  ├─ Extract generated artifacts
  └─ Kill all sandboxes (secondaries first, primary last)

Multi-model support

Set a model alias in sandcaster.json or pass --model:

Alias	Resolves to
`sonnet`	`claude-sonnet-4-6`
`opus`	`claude-opus-4-6`
`haiku`	`claude-haiku-4-5`
`gpt5`	`gpt-5.4`
`gpt5mini`	`gpt-5-mini`
`gemini`	`gemini-3.1-pro-preview`

If no model is specified, Sandcaster auto-detects from your API keys: ANTHROPIC_API_KEY > OPENAI_API_KEY > GOOGLE_API_KEY > OPENROUTER_API_KEY

You can also pass any full model ID directly (e.g. --model claude-opus-4-6).

Sandbox providers

Sandcaster supports multiple sandbox backends. Set sandboxProvider in sandcaster.json or let Sandcaster auto-detect from your environment:

Provider	Env var	Use case
`e2b`	`E2B_API_KEY`	Cloud sandbox with custom templates, streaming, reconnect
`vercel`	`VERCEL_TOKEN`	Vercel Sandbox with snapshot support and streaming
`docker`	Local daemon or `DOCKER_HOST`	Local development, no cloud account needed
`cloudflare`	`CLOUDFLARE_API_TOKEN` + `CLOUDFLARE_SANDBOX_WORKER_URL`	Edge-based sandbox via Cloudflare Containers

Auto-detection priority: E2B_API_KEY > VERCEL_TOKEN > Cloudflare env vars > Docker socket > E2B (fallback).

{
  "sandboxProvider": "docker"
}

Speculative branching

Run the same prompt through multiple branches in parallel and let an LLM judge pick the best result. Useful for code generation, test strategies, or any task where exploring alternatives improves quality.

{
  "branching": {
    "enabled": true,
    "count": 3,
    "trigger": "always",
    "evaluator": {
      "type": "llm-judge",
      "model": "sonnet"
    }
  }
}

CLI flags: --branches 3 --branch-trigger always --evaluator llm-judge

Each branch can use a different model or sandbox provider. The evaluator scores all branches and selects a winner with reasoning.

Composite sandboxes

Agents can dynamically spawn, coordinate, and tear down multiple sandboxes during a single run. This enables workflows like running a security scanner in one sandbox while building in another, or parallelizing work across providers.

Composite mode activates automatically when the primary sandbox supports it (requires fileSystem + shellExec capabilities — E2B and Docker). The agent gets five additional tools:

Tool	Description
`spawn_sandbox`	Create a new sandbox by name and provider
`exec_in`	Run a shell command in a named sandbox
`transfer_files`	Copy files between sandboxes (supports globs)
`kill_sandbox`	Tear down a named sandbox
`list_sandboxes`	List all active sandboxes in the session

Communication between the runner (inside the primary sandbox) and the host uses a nonce-authenticated file-based IPC protocol. The host intercepts requests from the runner's stdout, executes them against the SandboxPool, and writes responses back via atomic file rename.

Guardrails

Config sets the ceiling. Per-request overrides can only tighten limits, never broaden them.

Guardrail	Default	Config key
Max concurrent sandboxes	3	`composite.maxSandboxes` (1-20)
Max total spawns per session	10	`composite.maxTotalSpawns` (1-100)
Allowed providers	all	`composite.allowedProviders`
IPC poll interval	50ms	`composite.pollIntervalMs` (10-1000)

File transfers enforce per-file (25MB) and total (50MB) size limits with path traversal protection.

Cleanup

All secondary sandboxes are killed automatically when the run completes (secondaries first, primary last). A SIGTERM handler ensures cleanup on graceful shutdown.

Chat bots

Sandcaster includes a unified chat bot package (@sandcaster/chat) supporting Slack, Discord, and Telegram. Each message spawns a sandboxed agent session, with thread-based conversation tracking and automatic session management.

Zero-code (CLI)

# Set platform tokens in your environment, then:
sandcaster chat start

Production gateway

For production deployments, use the HTTP gateway pattern with Hono webhook routes:

import { createChatBot, createChatWebhookRoutes } from "@sandcaster/chat";

const { bot, pool } = await createChatBot({
  platforms: { slack: { botToken, appToken, signingSecret } },
  sessionTimeoutMs: 600_000,
  allowedChannels: ["C01234ABCDE"],
  onNewSession: async (session) => { /* wire up agent */ },
  onSessionMessage: async (session, text) => { /* continue session */ },
});

const routes = createChatWebhookRoutes(bot);

See examples/15-chat-bot for a full working example with both CLI and gateway modes.

Features

Thread-to-session mapping with per-thread mutex for message serialization
Thread re-engagement: rebuilds context from thread history when session expires
Deduplication (10-min TTL) to prevent duplicate processing
Access control via allowedChannels and allowedUsers
Event-to-text bridge streams agent output as markdown to chat

SDK

npm install @sandcaster/sdk

import { SandcasterClient } from "@sandcaster/sdk";

const client = new SandcasterClient({
  baseUrl: "http://localhost:8000",
  apiKey: "your-token",
});

for await (const event of client.query({ prompt: "Analyze this market" })) {
  if (event.type === "assistant") console.log(event.content);
  if (event.type === "file") console.log(`artifact: ${event.path}`);
  if (event.type === "result") console.log(`cost: $${event.costUsd}`);
}

The SDK implements Symbol.asyncDispose for automatic cleanup and supports AbortSignal for cancellation.

API

Start the server:

sandcaster serve --port 8000

Method	Route	Description
`GET`	`/health`	Status and version
`POST`	`/query`	Stream agent execution as SSE
`GET`	`/runs`	List recent runs (limit: 1-200)
`POST`	`/sessions/:sessionId/message`	Send message to an existing session

Configuration

sandcaster.json at your project root:

{
  "systemPrompt": "You are a research analyst...",
  "model": "sonnet",
  "maxTurns": 20,
  "timeout": 300,
  "outputFormat": { "type": "json_schema", "schema": { "..." : "..." } },
  "skillsDir": ".claude/skills",
  "allowedTools": ["bash", "file_read", "file_write"],
  "templateSkills": true,
  "thinkingLevel": "medium",
  "agents": {
    "dependency-scanner": {
      "description": "Checks dependencies for known CVEs",
      "tools": ["Read", "Glob", "Bash"],
      "model": "haiku"
    }
  },
  "provider": "anthropic",
  "sandboxProvider": "e2b",
  "branching": {
    "enabled": true,
    "count": 3,
    "trigger": "always",
    "evaluator": { "type": "llm-judge", "model": "sonnet" }
  },
  "composite": {
    "maxSandboxes": 3,
    "maxTotalSpawns": 10,
    "allowedProviders": ["e2b", "docker"]
  },
  "chat": {
    "sessionTimeoutMs": 600000,
    "allowedChannels": ["C01234ABCDE"],
    "botName": "sandcaster"
  }
}

Skills

Drop a SKILL.md file in your skills directory:

---
name: owasp-top-10
description: OWASP Top 10 security review checklist
---

Use this checklist when auditing application code...

Skills are injected into the agent's context and can be loaded on-demand via the read_skill tool.

CLI commands

Command	Description
`sandcaster [prompt]`	Run agent with optional prompt (default command)
`sandcaster serve`	Start Hono API server
`sandcaster init [starter]`	Scaffold a starter project
`sandcaster chat start`	Start chat bot (Slack/Discord/Telegram)
`sandcaster session <id>`	Interact with an existing session
`sandcaster templates`	List or push E2B sandbox templates

Development

bun install              # Install all workspace dependencies
bunx turbo build         # Build all packages
bunx turbo test          # Run all tests
bunx turbo lint          # Lint all packages (Biome)

Published packages use Changesets for versioning. CI publishes @sandcaster/sdk and @sandcaster/cli to npm with provenance on every release.

Community

Docs and changelog at getsandcaster.com.

If Sandcaster saves you runner plumbing, please star the repo.

If you want a new starter, a provider integration, or a sharper deploy story, open an issue or start a discussion.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 224 Commits
.changeset		.changeset
.github/workflows		.github/workflows
apps		apps
examples		examples
packages		packages
plans		plans
research		research
scripts		scripts
.gitignore		.gitignore
.npmrc		.npmrc
CLAUDE.md		CLAUDE.md
Dockerfile.sandbox		Dockerfile.sandbox
LICENSE		LICENSE
README.md		README.md
biome.json		biome.json
bun.lock		bun.lock
package.json		package.json
turbo.json		turbo.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sandcaster

Terminal demo

60-second path

Examples

Why Sandcaster exists

Feature comparison

What changed from Sandstorm

Architecture

Execution flow

Multi-model support

Sandbox providers

Speculative branching

Composite sandboxes

Guardrails

Cleanup

Chat bots

Zero-code (CLI)

Production gateway

Features

SDK

API

Configuration

Skills

CLI commands

Development

Community

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Sandcaster

Terminal demo

60-second path

Examples

Why Sandcaster exists

Feature comparison

What changed from Sandstorm

Architecture

Execution flow

Multi-model support

Sandbox providers

Speculative branching

Composite sandboxes

Guardrails

Cleanup

Chat bots

Zero-code (CLI)

Production gateway

Features

SDK

API

Configuration

Skills

CLI commands

Development

Community

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages