diff --git a/.plans/cloud-agent-next-dind-small-sandbox-handoff.md b/.plans/cloud-agent-next-dind-small-sandbox-handoff.md
new file mode 100644
index 0000000000..8a823e62b8
--- /dev/null
+++ b/.plans/cloud-agent-next-dind-small-sandbox-handoff.md
@@ -0,0 +1,176 @@
+# Cloud Agent Next DIND Small Sandbox Handoff
+
+## PR
+
+- Draft PR: https://github.com/Kilo-Org/cloud/pull/2933
+- Branch: `chore/cloud-agent-small-dind-clean`
+- Commit: `1b4a56dc chore(cloud-agent-next): add DIND small sandbox`
+
+## Goal
+
+Make `SandboxSmall` use a Docker-in-Docker-capable Cloudflare Sandbox image so it can be used as the experimental runtime for devcontainer support.
+
+The current PR intentionally stops at the image/config layer. It does not yet run devcontainers from cloud-agent-next session code.
+
+## What Changed
+
+- Added `services/cloud-agent-next/Dockerfile.dind`.
+- Updated `services/cloud-agent-next/wrangler.jsonc` so `SandboxSmall` uses `./Dockerfile.dind` in both production and `env.dev` container config.
+- Updated the generated Wrangler type hash in `services/cloud-agent-next/worker-configuration.d.ts`.
+
+## Dockerfile Shape
+
+`Dockerfile.dind` follows Cloudflare's Docker-in-Docker guidance:
+
+- Starts from `docker:dind-rootless`.
+- Copies `/sandbox` and required musl runtime libraries from `docker.io/cloudflare/sandbox:0.8.9-musl`.
+- Runs the Cloudflare Sandbox binary as `ENTRYPOINT`.
+- Starts rootless `dockerd` as the Sandbox child `CMD` with `--iptables=false --ip6tables=false`.
+- Switches back to `USER rootless` for runtime.
+
+It also installs the cloud-agent tooling needed to keep the current `SandboxSmall` behavior mostly compatible:
+
+- `git`, `git-lfs`, `jq`, `curl`, `wget`, `openssh-client`
+- GitHub CLI `gh`
+- GitLab CLI `glab`
+- `nodejs`, `npm`, `bun`, `pnpm`
+- `@devcontainers/cli`
+- `@kilocode/cli@${KILOCODE_CLI_VERSION}`
+- Existing cloud-agent wrapper bundles at `/usr/local/bin/kilocode-wrapper.js` and `/usr/local/bin/kilo-restore-session.js`
+
+It additionally prepares a future inner-container Kilo bundle:
+
+- `/opt/kilo-agent/bin/kilo`
+- `/opt/kilo-agent/cli-linux-x64/bin/kilo`
+- `/opt/kilo-agent/cli-linux-x64-musl/bin/kilo`
+
+The launcher detects `x86_64` plus glibc/musl and execs the matching platform binary. This is for later mounting/copying into user devcontainers so `kilo serve` can run there without requiring Node/npm/Bun in the user image.
+
+## Important Context
+
+- This sandbox environment does not have Docker installed, so the DIND image has not been built or runtime-tested here.
+- Push was done with `--no-verify` at user request because the pre-push hook timed out in this sandbox.
+- `pnpm install --frozen-lockfile` was run to make local tooling available; it warned that this environment uses Node 20 while the repo wants Node 24.
+- Earlier brainstorming lives in `.plans/cloud-agent-next-devcontainer-support.md` on the session branch, but it is not included in the clean draft PR.
+
+## Verification Already Run
+
+On the clean branch:
+
+```bash
+pnpm --filter cloud-agent-next exec wrangler types --check
+pnpm --filter cloud-agent-next run typecheck
+pnpm --filter cloud-agent-next run format:check
+pnpm format
+```
+
+All passed, with the expected Node engine warning:
+
+```text
+Unsupported engine: wanted: {"node":">=24 <25"} (current: {"node":"v20.20.2","pnpm":"10.33.0"})
+```
+
+## Recommended Next Tests
+
+Run these locally or in an environment with Docker.
+
+### 1. Build the image
+
+From `services/cloud-agent-next`:
+
+```bash
+docker build -f Dockerfile.dind -t cloud-agent-next-dind-small .
+```
+
+Things to watch:
+
+- Whether `cloudflare/sandbox:0.8.9-musl` exists and has all copied paths.
+- Whether `npm install -g bun pnpm @devcontainers/cli @kilocode/cli@7.1.23` works on Alpine.
+- Whether `gh` and `glab` downloaded archives match expected directory layout.
+- Final image size; `/opt/kilo-agent` includes two large Kilo platform binaries.
+
+### 2. Smoke-test Docker inside the image
+
+Run the image in a local Docker environment that can support rootless DIND. The exact local flags may differ from Cloudflare Containers, so this is only a smoke test:
+
+```bash
+docker run --rm -it cloud-agent-next-dind-small sh
+```
+
+Inside the container:
+
+```bash
+docker version
+docker run --network=host --rm alpine:3.20 echo hello
+devcontainer --version
+kilo --version
+/opt/kilo-agent/bin/kilo --version
+```
+
+Cloudflare docs say inner Docker networking should use `--network=host` because iptables is disabled.
+
+### 3. Smoke-test through Cloudflare Sandbox
+
+Deploy or run a staging Worker using the PR branch and call `SandboxSmall` with a fixed test ID.
+
+Useful commands to run through `sandbox.exec(...)`:
+
+```bash
+docker version
+docker run --network=host --rm alpine:3.20 echo hello
+devcontainer --version
+/opt/kilo-agent/bin/kilo --version
+```
+
+If those pass, write a minimal repo with `.devcontainer/devcontainer.json` and run:
+
+```bash
+devcontainer up --workspace-folder /workspace/repo
+```
+
+### 4. Kilo-in-inner-container smoke test
+
+Before integrating wrapper changes, manually run Kilo in an inner container:
+
+```bash
+docker run \
+  --network=host \
+  --rm \
+  -v /workspace/repo:/workspaces/repo \
+  -v /opt/kilo-agent:/opt/kilo-agent:ro \
+  -e HOME=/tmp/kilo-home \
+  -e USER=agent \
+  -e KILO_SERVER_PASSWORD=secret \
+  -w /workspaces/repo \
+  alpine:3.20 \
+  /opt/kilo-agent/bin/kilo serve --hostname 127.0.0.1 --port 43210 --print-logs
+```
+
+From the outer sandbox:
+
+```bash
+curl -u kilo:secret http://127.0.0.1:43210/
+```
+
+Expected behavior:
+
+- Without auth: `401 Unauthorized`.
+- With `-u kilo:secret`: reaches Kilo server; `/` may return `404`, which is fine for connectivity.
+
+## Likely Follow-Up Code Work
+
+- Add a smoke-test endpoint or script for DIND validation if local testing is painful.
+- Add a `DevcontainerManager` module to discover `.devcontainer/devcontainer.json`, run `devcontainer up`, record container ID/remote workspace folder, and clean up inner Docker resources.
+- Split wrapper startup from Kilo server startup: wrapper should remain in the outer sandbox and connect to an externally managed `kilo serve` URL in devcontainer mode.
+- Start `kilo serve` inside the inner devcontainer with `KILO_SERVER_PASSWORD` and teach wrapper HTTP clients to use Basic auth.
+- Decide whether to mount `/opt/kilo-agent` into inner containers via devcontainer config overlay or copy it with `docker cp` after startup.
+- Decide where Kilo HOME lives inside the devcontainer and how auth/config/session restore are written there.
+- Add cleanup for inner Docker containers/images/volumes, all labeled by agent session ID.
+
+## Known Open Questions
+
+- Should `SandboxSmall` use DIND in production immediately, or should this be limited to `env.dev`/staging until the image is proven?
+- Is Alpine-based DIND compatible with the existing Bun-built wrapper in all target regions/CPU variants?
+- Do we need arm64 Kilo platform packages in `/opt/kilo-agent`, or are Cloudflare Containers always x64 for this workload?
+- Should `/opt/kilo-agent` include both glibc and musl binaries, or should we start with only musl to reduce image size?
+- How should devcontainer private registry credentials be passed without leaking into image build logs?
diff --git a/services/cloud-agent-next/Dockerfile.dind b/services/cloud-agent-next/Dockerfile.dind
new file mode 100644
index 0000000000..f9533e75d7
--- /dev/null
+++ b/services/cloud-agent-next/Dockerfile.dind
@@ -0,0 +1,121 @@
+ARG SANDBOX_VERSION="0.8.9"
+
+FROM docker.io/cloudflare/sandbox:${SANDBOX_VERSION}-musl AS cloudflare-sandbox
+
+FROM docker:dind-rootless
+
+USER root
+
+# Build arguments for metadata (all optional with defaults)
+ARG BUILD_DATE=""
+ARG VCS_REF=""
+ARG KILOCODE_CLI_VERSION="7.1.23"
+
+# Cloudflare Containers run without root privileges, so Docker must run in
+# rootless mode. The Sandbox SDK server is copied into this image so the
+# Durable Object can still control the container while dockerd runs as a child
+# process.
+COPY --from=cloudflare-sandbox /container-server/sandbox /sandbox
+COPY --from=cloudflare-sandbox /usr/lib/libstdc++.so.6 /usr/lib/libstdc++.so.6
+COPY --from=cloudflare-sandbox /usr/lib/libgcc_s.so.1 /usr/lib/libgcc_s.so.1
+COPY --from=cloudflare-sandbox /bin/bash /bin/bash
+COPY --from=cloudflare-sandbox /usr/lib/libreadline.so.8 /usr/lib/libreadline.so.8
+COPY --from=cloudflare-sandbox /usr/lib/libreadline.so.8.2 /usr/lib/libreadline.so.8.2
+
+RUN apk add --no-cache \
+    bash \
+    curl \
+    git \
+    git-lfs \
+    jq \
+    nodejs \
+    npm \
+    openssh-client \
+    tar \
+    wget
+
+# Install GitHub CLI from the official release. Alpine packages can lag the
+# Debian package used in Dockerfile, so pin the upstream binary archive here.
+RUN GH_VERSION="2.82.1" \
+    && wget -q -O /tmp/gh.tar.gz "https://github.com/cli/cli/releases/download/v${GH_VERSION}/gh_${GH_VERSION}_linux_amd64.tar.gz" \
+    && tar -xzf /tmp/gh.tar.gz -C /tmp \
+    && cp "/tmp/gh_${GH_VERSION}_linux_amd64/bin/gh" /usr/local/bin/gh \
+    && chmod +x /usr/local/bin/gh \
+    && rm -rf /tmp/gh.tar.gz "/tmp/gh_${GH_VERSION}_linux_amd64"
+
+# Install GitLab CLI from the official Linux amd64 binary archive.
+RUN GLAB_VERSION="1.80.4" \
+    && wget -q -O /tmp/glab.tar.gz "https://gitlab.com/gitlab-org/cli/-/releases/v${GLAB_VERSION}/downloads/glab_${GLAB_VERSION}_linux_amd64.tar.gz" \
+    && tar -xzf /tmp/glab.tar.gz -C /tmp \
+    && cp /tmp/bin/glab /usr/local/bin/glab \
+    && chmod +x /usr/local/bin/glab \
+    && rm -rf /tmp/glab.tar.gz /tmp/bin
+
+# Tools used by the outer sandbox. Kilo itself is still installed globally for
+# the existing wrapper path; the platform package bundle under /opt/kilo-agent
+# is intended for mounting or copying into inner dev containers.
+RUN npm install -g bun pnpm @devcontainers/cli @kilocode/cli@${KILOCODE_CLI_VERSION}
+
+RUN mkdir -p /opt/kilo-agent/bin \
+        /opt/kilo-agent/cli-linux-x64 \
+        /opt/kilo-agent/cli-linux-x64-musl \
+    && npm pack \
+        "@kilocode/cli-linux-x64@${KILOCODE_CLI_VERSION}" \
+        "@kilocode/cli-linux-x64-musl@${KILOCODE_CLI_VERSION}" \
+        --pack-destination /tmp \
+    && tar -xzf "/tmp/kilocode-cli-linux-x64-${KILOCODE_CLI_VERSION}.tgz" \
+        -C /opt/kilo-agent/cli-linux-x64 --strip-components=1 \
+    && tar -xzf "/tmp/kilocode-cli-linux-x64-musl-${KILOCODE_CLI_VERSION}.tgz" \
+        -C /opt/kilo-agent/cli-linux-x64-musl --strip-components=1 \
+    && rm -f /tmp/kilocode-cli-linux-x64-*.tgz \
+    && rm -f /tmp/kilocode-cli-linux-x64-musl-*.tgz \
+    && rm -f /opt/kilo-agent/cli-linux-x64/bin/*.map \
+    && rm -f /opt/kilo-agent/cli-linux-x64-musl/bin/*.map \
+    && chmod +x /opt/kilo-agent/cli-linux-x64/bin/kilo \
+    && chmod +x /opt/kilo-agent/cli-linux-x64-musl/bin/kilo
+
+RUN cat > /opt/kilo-agent/bin/kilo <<'EOF' \
+    && chmod +x /opt/kilo-agent/bin/kilo
+#!/bin/sh
+set -eu
+
+root="${KILO_AGENT_ROOT:-$(CDPATH= cd -- "$(dirname -- "$0")/.." && pwd)}"
+arch="$(uname -m)"
+
+if ldd --version 2>&1 | grep -qi musl; then
+  libc="musl"
+else
+  libc="glibc"
+fi
+
+case "$arch:$libc" in
+  x86_64:glibc) exec "$root/cli-linux-x64/bin/kilo" "$@" ;;
+  x86_64:musl) exec "$root/cli-linux-x64-musl/bin/kilo" "$@" ;;
+  *) echo "Unsupported devcontainer platform: $arch/$libc" >&2; exit 1 ;;
+esac
+EOF
+
+# === Build wrapper bundle inside container ===
+# This mirrors Dockerfile but builds on Alpine, matching the DIND base image.
+COPY wrapper /tmp/wrapper-build/wrapper
+COPY src/shared /tmp/wrapper-build/src/shared
+
+RUN cd /tmp/wrapper-build/wrapper \
+    && bun install --production \
+    && bun build src/main.ts --outfile=/usr/local/bin/kilocode-wrapper.js --target=bun --minify \
+    && bun build src/restore-session.ts --outfile=/usr/local/bin/kilo-restore-session.js --target=bun --minify \
+    && rm -rf /tmp/wrapper-build
+
+RUN printf '#!/bin/sh\n\
+    set -eu\n\
+    dockerd-entrypoint.sh dockerd --iptables=false --ip6tables=false &\n\
+    until docker version >/dev/null 2>&1; do sleep 0.2; done\n\
+    echo "Docker is ready"\n\
+    wait\n' > /home/rootless/boot-docker-for-dind.sh \
+    && chmod +x /home/rootless/boot-docker-for-dind.sh \
+    && chown rootless:rootless /home/rootless/boot-docker-for-dind.sh
+
+USER rootless
+
+ENTRYPOINT ["/sandbox"]
+CMD ["/home/rootless/boot-docker-for-dind.sh"]
diff --git a/services/cloud-agent-next/worker-configuration.d.ts b/services/cloud-agent-next/worker-configuration.d.ts
index 9cceb19433..3093a07ee5 100644
--- a/services/cloud-agent-next/worker-configuration.d.ts
+++ b/services/cloud-agent-next/worker-configuration.d.ts
@@ -1,5 +1,5 @@
 /* eslint-disable */
-// Generated by Wrangler by running `wrangler types` (hash: edb7ebd5d1ef6ff409168c7d77a89113)
+// Generated by Wrangler by running `wrangler types` (hash: fd604ad645220614b978b84ed33d728e)
 // Runtime types generated with workerd@1.20260312.1 2025-09-15 nodejs_compat
 declare namespace Cloudflare {
   interface GlobalProps {
diff --git a/services/cloud-agent-next/wrangler.jsonc b/services/cloud-agent-next/wrangler.jsonc
index 6ca6a91568..aefaa95166 100644
--- a/services/cloud-agent-next/wrangler.jsonc
+++ b/services/cloud-agent-next/wrangler.jsonc
@@ -147,7 +147,7 @@
     },
     {
       "class_name": "SandboxSmall",
-      "image": "./Dockerfile",
+      "image": "./Dockerfile.dind",
       "instance_type": "standard-2",
       "image_vars": {
         "KILOCODE_CLI_VERSION": "7.1.23",
@@ -287,7 +287,7 @@
     },
     {
       "class_name": "SandboxSmall",
-      "image": "./Dockerfile.dev",
+      "image": "./Dockerfile.dind",
       "instance_type": "standard-2",
       "image_vars": {
         "KILOCODE_CLI_VERSION": "7.1.23",
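
Review note: the handoff document states that the DIND image has not been built or runtime-tested, so the platform dispatch in `/opt/kilo-agent/bin/kilo` is so far unexercised. It could be checked without Docker by restating the launcher's `arch:libc` mapping as plain functions. The following is a minimal sketch under that assumption and is not part of this PR: the function names `libc_from_ldd_output` and `kilo_binary_for` are hypothetical, and the functions print the selected path instead of calling `exec`.

```shell
#!/bin/sh
# Hypothetical helpers (not in the PR) that mirror the launcher's dispatch logic.

# Classify the libc from the text printed by `ldd --version 2>&1`.
# Like the launcher, any output that does not mention musl is treated as glibc.
libc_from_ldd_output() {
  if printf '%s\n' "$1" | grep -qi musl; then
    echo musl
  else
    echo glibc
  fi
}

# Print the platform binary path for an arch/libc pair under a bundle root.
# The root defaults to /opt/kilo-agent, the location used by Dockerfile.dind.
# Returns non-zero for platforms the launcher does not support.
kilo_binary_for() {
  arch="$1"
  libc="$2"
  root="${3:-/opt/kilo-agent}"
  case "$arch:$libc" in
    x86_64:glibc) echo "$root/cli-linux-x64/bin/kilo" ;;
    x86_64:musl) echo "$root/cli-linux-x64-musl/bin/kilo" ;;
    *) echo "Unsupported devcontainer platform: $arch/$libc" >&2; return 1 ;;
  esac
}
```

On a given host, `kilo_binary_for "$(uname -m)" "$(libc_from_ldd_output "$(ldd --version 2>&1)")"` would print the binary the launcher is expected to exec there. Separating detection from selection also makes the unsupported case (for example `aarch64`, raised in the open questions) easy to assert on before arm64 packages are considered.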