176 changes: 176 additions & 0 deletions .plans/cloud-agent-next-dind-small-sandbox-handoff.md
@@ -0,0 +1,176 @@
# Cloud Agent Next DIND Small Sandbox Handoff

## PR

- Draft PR: https://github.com/Kilo-Org/cloud/pull/2933
- Branch: `chore/cloud-agent-small-dind-clean`
- Commit: `1b4a56dc chore(cloud-agent-next): add DIND small sandbox`

## Goal

Make `SandboxSmall` use a Docker-in-Docker (DIND) capable Cloudflare Sandbox image so it can serve as the experimental runtime for devcontainer support.

The current PR intentionally stops at the image/config layer. It does not yet run devcontainers from cloud-agent-next session code.

## What Changed

- Added `services/cloud-agent-next/Dockerfile.dind`.
- Updated `services/cloud-agent-next/wrangler.jsonc` so `SandboxSmall` uses `./Dockerfile.dind` in both production and `env.dev` container config.
- Updated the generated Wrangler type hash in `services/cloud-agent-next/worker-configuration.d.ts`.

## Dockerfile Shape

`Dockerfile.dind` follows Cloudflare's Docker-in-Docker guidance:

- Starts from `docker:dind-rootless`.
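- Copies the Sandbox server binary (`/container-server/sandbox`, installed as `/sandbox`) plus `bash` and the `libstdc++`, `libgcc_s`, and `libreadline` shared libraries from `docker.io/cloudflare/sandbox:0.8.9-musl`.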
- Runs the Cloudflare Sandbox binary as `ENTRYPOINT`.
- Starts rootless `dockerd` as the Sandbox child `CMD` with `--iptables=false --ip6tables=false`.
- Switches back to `USER rootless` for runtime.

It also installs the cloud-agent tooling needed to keep the current `SandboxSmall` behavior mostly compatible:

- `git`, `git-lfs`, `jq`, `curl`, `wget`, `openssh-client`
- GitHub CLI `gh`
- GitLab CLI `glab`
- `nodejs`, `npm`, `bun`, `pnpm`
- `@devcontainers/cli`
- `@kilocode/cli@${KILOCODE_CLI_VERSION}`
- Existing cloud-agent wrapper bundles at `/usr/local/bin/kilocode-wrapper.js` and `/usr/local/bin/kilo-restore-session.js`

It additionally prepares a future inner-container Kilo bundle:

- `/opt/kilo-agent/bin/kilo`
- `/opt/kilo-agent/cli-linux-x64/bin/kilo`
- `/opt/kilo-agent/cli-linux-x64-musl/bin/kilo`

The launcher checks `uname -m` and `ldd --version` to detect `x86_64` plus glibc or musl, then execs the matching platform binary; any other platform exits with an error. This is for later mounting/copying into user devcontainers so `kilo serve` can run there without requiring Node/npm/Bun in the user image.

## Important Context

- This sandbox environment does not have Docker installed, so the DIND image has not been built or runtime-tested here.
- The branch was pushed with `--no-verify` at the user's request because the pre-push hook timed out in this sandbox, so the hook's checks did not run for this push.
- `pnpm install --frozen-lockfile` was run to make local tooling available; it warned that this environment uses Node 20 while the repo wants Node 24.
- Earlier brainstorming lives in `.plans/cloud-agent-next-devcontainer-support.md` on the session branch, but it is not included in the clean draft PR.

## Verification Already Run

On the clean branch:

```bash
pnpm --filter cloud-agent-next exec wrangler types --check
pnpm --filter cloud-agent-next run typecheck
pnpm --filter cloud-agent-next run format:check
pnpm format
```

All passed, with the expected Node engine warning:

```text
Unsupported engine: wanted: {"node":">=24 <25"} (current: {"node":"v20.20.2","pnpm":"10.33.0"})
```

## Recommended Next Tests

Run these locally or in an environment with Docker.

### 1. Build the image

From `services/cloud-agent-next`:

```bash
docker build -f Dockerfile.dind -t cloud-agent-next-dind-small .
```

Things to watch:

- Whether `cloudflare/sandbox:0.8.9-musl` exists and has all copied paths.
- Whether `npm install -g bun pnpm @devcontainers/cli @kilocode/cli@7.1.23` works on Alpine.
- Whether the downloaded `gh` and `glab` archives match the directory layout the Dockerfile expects (`gh_<version>_linux_amd64/bin/gh` and `bin/glab`).
- Final image size; `/opt/kilo-agent` includes two large Kilo platform binaries.
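
Once the build succeeds, one quick way to confirm that the tooling listed under "Dockerfile Shape" is actually on `PATH` is a small POSIX shell helper along these lines. This is only a sketch: the function name `missing_tools` and the variable `DIND_SMALL_TOOLS` are made up for this note, and the tool list is copied from this document rather than read from the image.

```shell
# Print the name of every command that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Tool names taken from the "Dockerfile Shape" section of this handoff
# (hypothetical variable name; adjust the list if the image changes).
DIND_SMALL_TOOLS="docker git git-lfs jq curl wget ssh gh glab node npm bun pnpm devcontainer kilo"
```

Pasted into a shell inside the built image, `missing_tools $DIND_SMALL_TOOLS && echo "all tools present"` prints either the confirmation or the names of the missing commands.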

### 2. Smoke-test Docker inside the image

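Run the image in a local Docker environment that can support rootless DIND. The exact local flags may differ from Cloudflare Containers, so this is only a smoke test. Appending `sh` to `docker run` would replace the image's default `CMD`, so the boot script that starts `dockerd` would never run and `docker version` could not reach a daemon. Start the container with its default `CMD` and open a shell with `docker exec` instead. The `--privileged` flag is what the upstream `docker:dind-rootless` image documents for local use and is an assumption here, as is the container name `dind-small-smoke`; stop the container with `docker stop dind-small-smoke` when done:

```bash
docker run -d --rm --privileged --name dind-small-smoke cloud-agent-next-dind-small
docker exec -it dind-small-smoke sh
```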

Inside the container:

```bash
docker version
docker run --network=host --rm alpine:3.20 echo hello
devcontainer --version
kilo --version
/opt/kilo-agent/bin/kilo --version
```

Cloudflare docs say inner Docker networking should use `--network=host` because iptables is disabled.

### 3. Smoke-test through Cloudflare Sandbox

Deploy or run a staging Worker using the PR branch and call `SandboxSmall` with a fixed test ID.

Useful commands to run through `sandbox.exec(...)`:

```bash
docker version
docker run --network=host --rm alpine:3.20 echo hello
devcontainer --version
/opt/kilo-agent/bin/kilo --version
```

If those pass, write a minimal repo with `.devcontainer/devcontainer.json` and run:

```bash
devcontainer up --workspace-folder /workspace/repo
```
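
The minimal repo mentioned above could be created with a helper like this. It is a sketch under assumptions: the function name `write_minimal_devcontainer` is hypothetical, `alpine:3.20` is reused only because the other smoke tests already pull it, and `runArgs` with `--network=host` mirrors the inner-networking note from the Docker smoke test. Whether `devcontainer up` is happy with a plain Alpine image under rootless DIND has not been tested.

```shell
# Create a throwaway repo directory with a minimal devcontainer config.
# Usage: write_minimal_devcontainer /workspace/repo
write_minimal_devcontainer() {
  repo_dir="$1"
  mkdir -p "$repo_dir/.devcontainer"
  # The image and runArgs values are assumptions for a smoke test only.
  cat > "$repo_dir/.devcontainer/devcontainer.json" <<'JSON'
{
  "name": "dind-smoke-test",
  "image": "alpine:3.20",
  "runArgs": ["--network=host"]
}
JSON
}
```

Calling `write_minimal_devcontainer /workspace/repo` before the `devcontainer up` command gives it a config to discover at the path used in this section.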

### 4. Kilo-in-inner-container smoke test

Before integrating wrapper changes, manually run Kilo in an inner container:

```bash
docker run \
  --network=host \
  --rm \
  -v /workspace/repo:/workspaces/repo \
  -v /opt/kilo-agent:/opt/kilo-agent:ro \
  -e HOME=/tmp/kilo-home \
  -e USER=agent \
  -e KILO_SERVER_PASSWORD=secret \
  -w /workspaces/repo \
  alpine:3.20 \
  /opt/kilo-agent/bin/kilo serve --hostname 127.0.0.1 --port 43210 --print-logs

From the outer sandbox:

```bash
curl -u kilo:secret http://127.0.0.1:43210/
```

Expected behavior:

- Without auth: `401 Unauthorized`.
- With `-u kilo:secret`: the request reaches the Kilo server; `/` may return `404`, which is fine for a connectivity check.
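
Both expectations can be checked in one go by comparing HTTP status codes. The sketch below assumes the URL and credentials from the commands above; the names `KILO_URL`, `http_status`, and `check_kilo_auth` are invented for this note. It treats any status other than `401` as "reached the server" for the authenticated request, and `000` (what curl reports when it gets no HTTP response) as a failure.

```shell
# URL of the inner Kilo server; matches the --hostname/--port used above.
KILO_URL="${KILO_URL:-http://127.0.0.1:43210/}"

# Print only the HTTP status code for a request to KILO_URL.
# Any arguments are passed through to curl (for example: -u kilo:secret).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@" "$KILO_URL"
}

# Succeed only if anonymous requests get 401 and authenticated requests
# get some other real HTTP status (404 on "/" is acceptable).
check_kilo_auth() {
  unauth="$(http_status)"
  auth="$(http_status -u kilo:secret)"
  if [ "$unauth" != "401" ]; then
    echo "expected 401 without auth, got $unauth" >&2
    return 1
  fi
  if [ "$auth" = "401" ] || [ "$auth" = "000" ]; then
    echo "authenticated request failed, got $auth" >&2
    return 1
  fi
  echo "ok: unauth=$unauth auth=$auth"
}
```

Run `check_kilo_auth` from the outer sandbox while the inner container is up; a non-zero exit status means one of the two expectations did not hold.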

## Likely Follow-Up Code Work

- Add a smoke-test endpoint or script for DIND validation if local testing is painful.
- Add a `DevcontainerManager` module to discover `.devcontainer/devcontainer.json`, run `devcontainer up`, record container ID/remote workspace folder, and clean up inner Docker resources.
- Split wrapper startup from Kilo server startup: wrapper should remain in the outer sandbox and connect to an externally managed `kilo serve` URL in devcontainer mode.
- Start `kilo serve` inside the inner devcontainer with `KILO_SERVER_PASSWORD` and teach wrapper HTTP clients to use Basic auth.
- Decide whether to mount `/opt/kilo-agent` into inner containers via devcontainer config overlay or copy it with `docker cp` after startup.
- Decide where Kilo HOME lives inside the devcontainer and how auth/config/session restore are written there.
- Add cleanup for inner Docker containers/images/volumes, all labeled by agent session ID.
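
For the cleanup item, a label-based sweep might look roughly like the following. This is a sketch, not a design decision: the label key `kilo.agent-session-id` and the function name `cleanup_session` are hypothetical, and nothing in the PR applies such a label yet. It relies only on the standard `--filter label=...` option of `docker ps`, `docker volume ls`, and `docker images`.

```shell
# Hypothetical label key; nothing in the PR defines it yet.
LABEL_KEY="${LABEL_KEY:-kilo.agent-session-id}"

# Remove inner Docker containers, volumes, and images labeled with a session ID.
# Usage: cleanup_session <agent-session-id>
cleanup_session() {
  session_id="$1"
  filter="label=${LABEL_KEY}=${session_id}"

  containers="$(docker ps -aq --filter "$filter")"
  if [ -n "$containers" ]; then
    # Word splitting is intended here: one ID per argument.
    docker rm -f $containers
  fi

  volumes="$(docker volume ls -q --filter "$filter")"
  if [ -n "$volumes" ]; then
    docker volume rm $volumes
  fi

  images="$(docker images -q --filter "$filter")"
  if [ -n "$images" ]; then
    docker rmi -f $images
  fi
}
```

Containers are removed first so that volumes and images are no longer in use when their removal is attempted.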

## Known Open Questions

- Should `SandboxSmall` use DIND in production immediately, or should this be limited to `env.dev`/staging until the image is proven?
- Is Alpine-based DIND compatible with the existing Bun-built wrapper in all target regions/CPU variants?
- Do we need arm64 Kilo platform packages in `/opt/kilo-agent`, or are Cloudflare Containers always x64 for this workload?
- Should `/opt/kilo-agent` include both glibc and musl binaries, or should we start with only musl to reduce image size?
- How should devcontainer private registry credentials be passed without leaking into image build logs?
121 changes: 121 additions & 0 deletions services/cloud-agent-next/Dockerfile.dind
@@ -0,0 +1,121 @@
ARG SANDBOX_VERSION="0.8.9"

FROM docker.io/cloudflare/sandbox:${SANDBOX_VERSION}-musl AS cloudflare-sandbox

FROM docker:dind-rootless

USER root

# Build arguments (all optional with defaults). BUILD_DATE and VCS_REF are
# metadata; KILOCODE_CLI_VERSION selects the Kilo CLI packages installed below.
ARG BUILD_DATE=""
ARG VCS_REF=""
ARG KILOCODE_CLI_VERSION="7.1.23"

# Cloudflare Containers run without root privileges, so Docker must run in
# rootless mode. The Sandbox SDK server is copied into this image so the
# Durable Object can still control the container while dockerd runs as a child
# process.
COPY --from=cloudflare-sandbox /container-server/sandbox /sandbox
COPY --from=cloudflare-sandbox /usr/lib/libstdc++.so.6 /usr/lib/libstdc++.so.6
COPY --from=cloudflare-sandbox /usr/lib/libgcc_s.so.1 /usr/lib/libgcc_s.so.1
COPY --from=cloudflare-sandbox /bin/bash /bin/bash
COPY --from=cloudflare-sandbox /usr/lib/libreadline.so.8 /usr/lib/libreadline.so.8
COPY --from=cloudflare-sandbox /usr/lib/libreadline.so.8.2 /usr/lib/libreadline.so.8.2

RUN apk add --no-cache \
    bash \
    curl \
    git \
    git-lfs \
    jq \
    nodejs \
    npm \
    openssh-client \
    tar \
    wget

# Install GitHub CLI from the official release. Alpine packages can lag the
# Debian package used in Dockerfile, so pin the upstream binary archive here.
RUN GH_VERSION="2.82.1" \
    && wget -q -O /tmp/gh.tar.gz "https://github.com/cli/cli/releases/download/v${GH_VERSION}/gh_${GH_VERSION}_linux_amd64.tar.gz" \
    && tar -xzf /tmp/gh.tar.gz -C /tmp \
    && cp "/tmp/gh_${GH_VERSION}_linux_amd64/bin/gh" /usr/local/bin/gh \
    && chmod +x /usr/local/bin/gh \
    && rm -rf /tmp/gh.tar.gz "/tmp/gh_${GH_VERSION}_linux_amd64"

# Install GitLab CLI from the official Linux amd64 binary archive.
RUN GLAB_VERSION="1.80.4" \
    && wget -q -O /tmp/glab.tar.gz "https://gitlab.com/gitlab-org/cli/-/releases/v${GLAB_VERSION}/downloads/glab_${GLAB_VERSION}_linux_amd64.tar.gz" \
    && tar -xzf /tmp/glab.tar.gz -C /tmp \
    && cp /tmp/bin/glab /usr/local/bin/glab \
    && chmod +x /usr/local/bin/glab \
    && rm -rf /tmp/glab.tar.gz /tmp/bin

# Tools used by the outer sandbox. Kilo itself is still installed globally for
# the existing wrapper path; the platform package bundle under /opt/kilo-agent
# is intended for mounting or copying into inner dev containers.
RUN npm install -g bun pnpm @devcontainers/cli @kilocode/cli@${KILOCODE_CLI_VERSION}

RUN mkdir -p /opt/kilo-agent/bin \
        /opt/kilo-agent/cli-linux-x64 \
        /opt/kilo-agent/cli-linux-x64-musl \
    && npm pack \
        "@kilocode/cli-linux-x64@${KILOCODE_CLI_VERSION}" \
        "@kilocode/cli-linux-x64-musl@${KILOCODE_CLI_VERSION}" \
        --pack-destination /tmp \
    && tar -xzf "/tmp/kilocode-cli-linux-x64-${KILOCODE_CLI_VERSION}.tgz" \
        -C /opt/kilo-agent/cli-linux-x64 --strip-components=1 \
    && tar -xzf "/tmp/kilocode-cli-linux-x64-musl-${KILOCODE_CLI_VERSION}.tgz" \
        -C /opt/kilo-agent/cli-linux-x64-musl --strip-components=1 \
    && rm -f /tmp/kilocode-cli-linux-x64-*.tgz \
    && rm -f /tmp/kilocode-cli-linux-x64-musl-*.tgz \
    && rm -f /opt/kilo-agent/cli-linux-x64/bin/*.map \
    && rm -f /opt/kilo-agent/cli-linux-x64-musl/bin/*.map \
    && chmod +x /opt/kilo-agent/cli-linux-x64/bin/kilo \
    && chmod +x /opt/kilo-agent/cli-linux-x64-musl/bin/kilo

RUN cat > /opt/kilo-agent/bin/kilo <<'EOF' \
&& chmod +x /opt/kilo-agent/bin/kilo
#!/bin/sh
set -eu

root="${KILO_AGENT_ROOT:-$(CDPATH= cd -- "$(dirname -- "$0")/.." && pwd)}"
arch="$(uname -m)"

if ldd --version 2>&1 | grep -qi musl; then
  libc="musl"
else
  libc="glibc"
fi

case "$arch:$libc" in
  x86_64:glibc) exec "$root/cli-linux-x64/bin/kilo" "$@" ;;
  x86_64:musl) exec "$root/cli-linux-x64-musl/bin/kilo" "$@" ;;
  *) echo "Unsupported devcontainer platform: $arch/$libc" >&2; exit 1 ;;
esac
EOF

# === Build wrapper bundle inside container ===
# This mirrors Dockerfile but builds on Alpine, matching the DIND base image.
COPY wrapper /tmp/wrapper-build/wrapper
COPY src/shared /tmp/wrapper-build/src/shared

RUN cd /tmp/wrapper-build/wrapper \
    && bun install --production \
    && bun build src/main.ts --outfile=/usr/local/bin/kilocode-wrapper.js --target=bun --minify \
    && bun build src/restore-session.ts --outfile=/usr/local/bin/kilo-restore-session.js --target=bun --minify \
    && rm -rf /tmp/wrapper-build

RUN printf '#!/bin/sh\n\
set -eu\n\
dockerd-entrypoint.sh dockerd --iptables=false --ip6tables=false &\n\
until docker version >/dev/null 2>&1; do sleep 0.2; done\n\
echo "Docker is ready"\n\
wait\n' > /home/rootless/boot-docker-for-dind.sh \
&& chmod +x /home/rootless/boot-docker-for-dind.sh \
&& chown rootless:rootless /home/rootless/boot-docker-for-dind.sh

USER rootless

ENTRYPOINT ["/sandbox"]
CMD ["/home/rootless/boot-docker-for-dind.sh"]
2 changes: 1 addition & 1 deletion services/cloud-agent-next/worker-configuration.d.ts
@@ -1,5 +1,5 @@
/* eslint-disable */
-// Generated by Wrangler by running `wrangler types` (hash: edb7ebd5d1ef6ff409168c7d77a89113)
+// Generated by Wrangler by running `wrangler types` (hash: fd604ad645220614b978b84ed33d728e)
// Runtime types generated with workerd@1.20260312.1 2025-09-15 nodejs_compat
declare namespace Cloudflare {
interface GlobalProps {
4 changes: 2 additions & 2 deletions services/cloud-agent-next/wrangler.jsonc
@@ -147,7 +147,7 @@
},
{
"class_name": "SandboxSmall",
-"image": "./Dockerfile",
+"image": "./Dockerfile.dind",
"instance_type": "standard-2",
"image_vars": {
"KILOCODE_CLI_VERSION": "7.1.23",
@@ -287,7 +287,7 @@
},
{
"class_name": "SandboxSmall",
-"image": "./Dockerfile.dev",
+"image": "./Dockerfile.dind",
"instance_type": "standard-2",
"image_vars": {
"KILOCODE_CLI_VERSION": "7.1.23",