Merged
2 changes: 1 addition & 1 deletion docs/architecture/cri-passthrough.md
@@ -78,7 +78,7 @@ Windows named-pipe URIs use forward slashes after the scheme -- the path is `//.

- **Linux**: CRI is fully supported by containerd v2. All crictl commands behave as they would against a standalone containerd install.
- **Windows**: containerd v2 ships a native Windows CRI implementation (Hyper-V isolated containers). A handful of CRI features that assume Linux semantics (cgroups, mount propagation flags) are no-ops or return errors -- this mirrors upstream containerd behavior.
- **WSL-to-Windows**: when the Windows host routes Linux jobs to the WSL worker, `ephemerd crictl` on the host only sees the Windows containerd CRI. To inspect WSL-side Linux containers, use `wsl -- ephemerd crictl ...` inside the distro.
- **In-VM containerd on Windows**: when the Windows host routes Linux jobs to the Hyper-V Linux VM, `ephemerd crictl` on the host only sees the Windows containerd CRI. To inspect Linux containers inside the VM, exec into the VM (via `ephemerd debugexec` or the VM's console) and run `ephemerd crictl ...` against the VM's local containerd socket.

## Typical Debugging Workflow

4 changes: 2 additions & 2 deletions docs/architecture/embedded-containerd.md
@@ -27,7 +27,7 @@ On startup, `server.New()`:
5. Calls `ctdserver.New(ctx, cfg)` to create the in-process server.
6. Creates a gRPC listener on the platform-appropriate socket and serves in a background goroutine.
7. Also creates a tTRPC listener for task/event APIs.
8. Optionally creates a TCP listener for remote access (used by the Windows host to connect to WSL containerd).
8. Optionally creates a TCP listener for remote access (used by the Windows or macOS host to connect to the in-VM containerd).
9. Connects an in-process containerd client and waits for it to become ready (up to 15 seconds).

The server, gRPC listeners, and client all run in the same process. On shutdown, `Server.Stop()` closes the client, stops the server, cancels the context, and waits for the background goroutines to finish.
@@ -48,7 +48,7 @@ The `SocketPath()` function in `pkg/containerd/server.go` returns the correct pa

When `TCPPort` is set in the config (e.g., `--containerd-tcp-port 10000`), the server also listens on TCP. This is used for:

- **Windows host to WSL**: the Windows scheduler connects to WSL's containerd via TCP since named pipes do not cross the WSL boundary.
- **Windows host to Hyper-V Linux VM**: the Windows scheduler connects to the in-VM containerd via TCP since named pipes do not cross the VM boundary.
- **macOS host to Linux VM**: the macOS host connects to containerd inside the Virtualization.framework Linux VM via TCP over NAT.

The TCP bind address defaults to `127.0.0.1` but can be configured to `0.0.0.0` for VM environments where the host is on a different network interface.
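The bind-address default described above can be sketched in Go as follows. This is a minimal illustration, not the project's code: `tcpListenAddr` and its parameters are hypothetical stand-ins for whatever the real config fields are called.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// tcpListenAddr builds the address for the optional TCP listener.
// bindAddr and port are hypothetical names; an empty bindAddr falls
// back to loopback, matching the documented default.
func tcpListenAddr(bindAddr string, port int) string {
	if bindAddr == "" {
		bindAddr = "127.0.0.1"
	}
	return net.JoinHostPort(bindAddr, strconv.Itoa(port))
}

func main() {
	fmt.Println(tcpListenAddr("", 10000))        // loopback default
	fmt.Println(tcpListenAddr("0.0.0.0", 10000)) // VM setups where the host is on another interface
}
```

Binding to `0.0.0.0` exposes the containerd API on every interface of the guest, so the loopback default is the safer choice wherever the host and containerd share a network namespace.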
6 changes: 3 additions & 3 deletions docs/architecture/forgejo-gitea.md
@@ -36,7 +36,7 @@ ephemerd exploits this by mounting its [fake Docker socket]({{< relref "fake-doc
flowchart TB
F["Forge Instance<br/>(Forgejo or Gitea)"]

subgraph H ["ephemerd host (Linux, Windows via WSL2, or macOS via Vz)"]
subgraph H ["ephemerd host (Linux, Windows via Hyper-V Linux VM, or macOS via Vz)"]
E[ephemerd]
CTD["containerd"]
DSock["Fake Docker Socket<br/>pkg/dind<br/>/var/run/docker.sock"]
@@ -88,7 +88,7 @@ flowchart TB
### Lifecycle

1. ephemerd creates the runner container from the upstream runner image, with the fake Docker socket bind-mounted at `/var/run/docker.sock`.
2. containerd starts the runner -- on Linux directly, inside WSL2 on Windows, inside the Vz Linux VM on macOS.
2. containerd starts the runner -- on Linux directly, inside the Hyper-V Linux VM on Windows, inside the Vz Linux VM on macOS.
3. Runner registers with the forge as an ephemeral runner and long-polls `FetchTask`.
4. Forge returns a task -- workflow YAML bytes, context, secrets, vars.
5. act parses the workflow and determines the job image from `runs-on:` label mapping.
@@ -185,7 +185,7 @@ Forgejo/Gitea Actions is a Linux-jobs-only ecosystem today. On all three host OS
| Host OS | How Linux containers run |
|---------|-------------------------|
| Linux | Direct containerd |
| Windows | containerd inside WSL2 |
| Windows | containerd inside Hyper-V Linux VM |
| macOS | containerd inside Vz Linux VM |

## Configuration
2 changes: 1 addition & 1 deletion docs/architecture/macos-vms.md
@@ -46,7 +46,7 @@ The rootfs tarball and Linux binary are embedded in the macOS binary via `go:emb
- **virtio-fs**: the host's data directory is shared into the VM at `/mnt/ephemerd`. The ephemerd Linux binary lives here -- no need to copy it into the disk image. It loads into memory on exec and runs at native speed.
- **TCP over NAT**: containerd inside the VM listens on a TCP port. The host connects a gRPC containerd client to `127.0.0.1:<port>`.

Unlike Windows WSL dispatch, macOS does not need a separate dispatch layer. The containerd gRPC client is platform-agnostic -- the macOS host binary can create Linux containers directly via the TCP connection. Only the container runtime code (OCI spec, snapshotter, networking) runs inside the VM.
Unlike the Windows Hyper-V dispatch, macOS does not need a separate dispatch layer. The containerd gRPC client is platform-agnostic -- the macOS host binary can create Linux containers directly via the TCP connection. Only the container runtime code (OCI spec, snapshotter, networking) runs inside the VM.

### Two Boot Modes

2 changes: 1 addition & 1 deletion docs/architecture/multi-forge-providers.md
@@ -157,7 +157,7 @@ Only one provider should be configured at a time. Precedence when multiple secti
The entire container infrastructure is provider-agnostic:

- Container runtime (`pkg/runtime`)
- WSL dispatch (Linux jobs on Windows)
- Hyper-V Linux VM dispatch (Linux jobs on Windows)
- Networking (CNI on Linux, HCN on Windows)
- Embedded containerd
- gRPC control plane (status, jobs, drain)
6 changes: 3 additions & 3 deletions docs/architecture/overview.md
@@ -82,7 +82,7 @@ Standard OCI containers via embedded containerd, running directly on the host ke

containerd runs natively on Windows and supports Hyper-V isolation. Each container gets its own kernel in a lightweight VM -- real isolation, malicious code cannot escape to the host. Same OCI images, same containerd APIs, just compiled for Windows. Startup ~5-10s. Networking via HCN (Host Compute Network) with NAT and per-endpoint ACL policies.

Linux jobs on a Windows host are dispatched to a WSL2 worker via gRPC. See [Windows WSL dispatch]({{< relref "windows-wsl-dispatch" >}}).
Linux jobs on a Windows host are dispatched via gRPC to a Hyper-V Linux VM that ephemerd boots and manages directly. See [Windows Hyper-V dispatch]({{< relref "windows-wsl-dispatch" >}}).

### macOS: Virtualization.framework

@@ -98,7 +98,7 @@ Because Windows can run Hyper-V Linux VMs and macOS can run Virtualization.frame
|------|-----------|----------------|
| Linux x86_64 | containerd (direct) | -- |
| Linux arm64 | containerd (direct) | -- |
| Windows x86_64 | containerd in WSL2 Linux VM | Hyper-V Windows containers |
| Windows x86_64 | containerd in Hyper-V Linux VM | Hyper-V Windows containers |
| macOS arm64 | containerd in Virtualization.framework Linux VM | Ephemeral macOS VMs (clone-on-write) |

A Windows box and a Mac Mini together cover every combination: linux/amd64, linux/arm64, windows/amd64.
@@ -111,7 +111,7 @@ Each OS/arch combination produces one self-contained binary with containerd comp
|--------|--------|----------------------|
| linux/amd64 | `ephemerd` | containerd direct |
| linux/arm64 | `ephemerd` | containerd direct |
| windows/amd64 | `ephemerd.exe` | containerd + Hyper-V (Windows jobs) / WSL2 (Linux jobs) |
| windows/amd64 | `ephemerd.exe` | containerd + Hyper-V (Windows jobs) / Hyper-V Linux VM (Linux jobs) |
| darwin/arm64 | `ephemerd` | Virtualization.framework Linux VM + containerd inside |

No runtime dependencies beyond the OS kernel, Hyper-V (Windows), or Virtualization.framework (macOS).
10 changes: 5 additions & 5 deletions docs/architecture/pre-baked-rootfs.md
@@ -3,20 +3,20 @@ title: Pre-Baked Rootfs
weight: 7
---

The WSL and macOS Linux VM rootfs is an Alpine minirootfs with gcompat and iptables baked in at compile time. This eliminates network-dependent package installation during boot.
The Linux VM rootfs (Hyper-V on Windows, Vz on macOS) and the temporary WSL distro that `ephemerd run` uses are all Alpine minirootfs with gcompat and iptables baked in at compile time. This eliminates network-dependent package installation during boot.

## Context

Every WSL distro boot (and Vz Linux VM boot) needs two packages that are not in the stock Alpine minirootfs:
Every Linux VM boot (Hyper-V on Windows, Vz on macOS) and every `ephemerd run` WSL distro import needs two packages that are not in the stock Alpine minirootfs:

- **gcompat** -- glibc compatibility shim required by `containerd-shim-runc-v2`, which is built against glibc.
- **iptables** -- required by CNI plugins for container network NAT rules.

Previously these were installed at runtime via `apk add --no-cache gcompat iptables` after each distro import. This had several problems:
Previously these were installed at runtime via `apk add --no-cache gcompat iptables` after each boot/import. This had several problems:

- 10-30s of boot time spent downloading and installing packages over the network.
- DNS flakes -- WSL networking is not always ready immediately after distro import, requiring a retry loop with backoffs.
- The only network-dependent step in the entire distro boot sequence.
- DNS flakes -- guest networking is not always ready immediately after the VM/distro starts, requiring a retry loop with backoffs.
- The only network-dependent step in the entire boot sequence.
- Multiplied cost -- `ephemerd run` creates a fresh distro per invocation, paying this penalty every time.

## How It Works
78 changes: 47 additions & 31 deletions docs/architecture/windows-wsl-dispatch.md
@@ -1,25 +1,32 @@
---
title: Windows WSL Dispatch
title: Windows Hyper-V Dispatch
weight: 3
aliases:
- /architecture/windows-wsl-dispatch/
---

On Windows, ephemerd runs a single scheduler that handles both Windows and Linux jobs. Windows jobs run natively in Hyper-V containers. Linux jobs are dispatched to a WSL2 worker via gRPC.
On Windows, ephemerd runs a single scheduler that handles both Windows and Linux jobs. Windows jobs run natively as Hyper-V isolated containers. Linux jobs are dispatched via gRPC to a Hyper-V Linux VM that ephemerd boots and manages directly.

## Why a Hyper-V VM (not WSL2)

An earlier revision dispatched Linux jobs to a WSL2 distro. That works when ephemerd runs as a user process, but Windows Services execute as `LocalSystem`, and WSL2 has no `LocalSystem` support — calling `wsl --import` or `wsl --exec` from `LocalSystem` fails with `0x80370102` / `WSL_E_USER_NOT_REGISTERED`. The Hyper-V Compute Service (HCS) has no such restriction, so ephemerd creates the Linux VM by calling `vmcompute.dll` directly. The same code path works for an interactive user *and* for the installed Windows service.

## Architecture

One poller on Windows dispatches Linux jobs to WSL via gRPC. WSL runs containerd-only plus a dispatch worker -- no scheduler, no GitHub credentials.
One poller on Windows dispatches Linux jobs to the Hyper-V VM via gRPC. The VM runs `ephemerd serve --containerd-only` plus a dispatch worker -- no scheduler, no GitHub credentials.

```
Windows Host (ephemerd.exe serve):
+-- Containerd (Windows, named pipe)
+-- Containerd (Windows, named pipe + 127.0.0.1 TCP)
+-- Scheduler (single poller for ALL jobs)
| +-- Windows job -> local Runtime.Create() on Windows containerd
| +-- Linux job -> gRPC DispatchClient -> WSL dispatch server
+-- WSL VM boot (containerd-only + dispatch worker)
| +-- Linux job -> gRPC DispatchClient -> Hyper-V VM dispatch server
+-- Hyper-V Linux VM boot (HCS / vmcompute.dll)

WSL (ephemerd serve --containerd-only):
Hyper-V Linux VM (ephemerd serve --containerd-only):
+-- Containerd (Linux, TCP :10000)
+-- Runner extracted, CNI extracted, networking initialized
+-- Persistent VHDX rootfs (data dir / containerd state)
+-- Embedded Linux ephemerd binary, runner, CNI, gcompat, iptables
+-- Dispatch gRPC server (TCP :10001)
+-- CreateJob(id, image, jitConfig) -> local Runtime.Create()
+-- WaitJob(id) -> local Runtime.Wait()
@@ -36,7 +43,7 @@ A Windows-compiled `Runtime.Create()` cannot create Linux containers. The runtim
- Container I/O (`cio.NullIO` on Windows, log file on Linux)
- Runner mount paths (`C:\actions-runner` vs `/actions-runner`)

The Linux-specific code must run inside WSL. The gRPC dispatch layer bridges the gap: the Windows scheduler sends job requests to the WSL worker, which creates Linux containers using its own Linux-compiled runtime.
The Linux-specific code must run inside the Linux VM. The gRPC dispatch layer bridges the gap: the Windows scheduler sends job requests to the in-VM worker, which creates Linux containers using its own Linux-compiled runtime.

## Protobuf Dispatch Service

@@ -65,7 +72,7 @@ message DestroyJobResponse {}

## Key Components

### Dispatch Server (WSL side)
### Dispatch Server (Linux VM side)

Implemented in `pkg/scheduler/dispatch.go`. The `dispatchServer` struct wraps a `*runtime.Runtime` and a map of active `RunnerEnv` objects:

@@ -81,10 +88,10 @@ Also in `pkg/scheduler/dispatch.go`. The `DispatchClient` struct holds a gRPC co

### Containerd-Only Mode

When WSL boots ephemerd with `--containerd-only`:
When the in-VM ephemerd boots with `--containerd-only`:

1. Starts embedded containerd with a TCP listener.
2. Extracts the runner binary and CNI plugins.
1. Starts embedded containerd with a TCP listener on `0.0.0.0:10000`.
2. Extracts the runner binary and CNI plugins from its embedded payload.
3. Initializes networking (CNI bridge, stale bridge cleanup).
4. Creates a local `runtime.Runtime`.
5. Starts the dispatch gRPC server on `containerdPort + 1` (default port 10001).
@@ -104,38 +111,47 @@ Windows-labeled jobs go through the normal local `Runtime.Create()` path.
## End-to-End Flow

1. Windows host starts: native containerd + single scheduler.
2. WSL VM boots in background: containerd-only + dispatch worker.
2. Hyper-V Linux VM boots in background: containerd-only + dispatch worker.
3. GitHub job queued with `runs-on: [self-hosted, linux, x64]`.
4. Windows scheduler sees it, detects `"linux"` label and `LinuxDispatcher != nil`.
5. Registers JIT runner with `["self-hosted", "linux", "x64"]` labels.
6. Calls `dispatcher.Create(name, image, jitConfig)` -- gRPC to WSL.
7. WSL dispatch server creates a Linux container using its local Runtime.
6. Calls `dispatcher.Create(name, image, jitConfig)` -- gRPC to the VM IP.
7. Dispatch server in VM creates a Linux container using its local Runtime.
8. Windows scheduler calls `dispatcher.Wait(name)` -- blocks until job completes.
9. Windows scheduler calls `dispatcher.Destroy(name)` -- cleans up container + networking in WSL.
9. Windows scheduler calls `dispatcher.Destroy(name)` -- cleans up container + networking in the VM.
10. Windows jobs follow the normal local Runtime flow.

## WSL VM Lifecycle
## Hyper-V VM Lifecycle

The WSL VM is managed by `pkg/vm/linuxvm_windows.go`:
The Linux VM is managed by `pkg/vm/linuxvm_windows.go` via the HCS (Host Compute Service) API:

- On startup, imports a WSL distro from the embedded pre-built rootfs.
- Runs the Linux ephemerd binary from `/mnt/c/` (Windows disk mount, avoids slow 9P copy into the distro).
- Launches with `--containerd-only` -- no GitHub credentials are needed in WSL.
- After containerd is ready, connects a dispatch gRPC client to port `containerdPort + 1`.
- On shutdown, the distro is unregistered via `wsl --unregister`.
- On startup, the embedded Linux kernel (`vmlinuz`) and initrd (containing a pre-baked Alpine rootfs + the cross-compiled Linux `ephemerd` binary) are written into `<DataDir>/vm/linux/`.
- A persistent VHDX root disk is created on first boot at `<DataDir>/containerd/linux-root/root.vhdx` (default 100 GB). Image content and containerd metadata live here, so a host restart doesn't re-pull every image.
- ephemerd builds an HCS compute system document for a KernelDirect (LCOW) boot and calls `vmcompute.dll` directly. We don't use hcsshim's `uvm.CreateLCOW` because it assumes a Microsoft GCS is running inside the VM (vsock-based), and we run a normal Linux userspace instead.
- An HCN endpoint on the Default Switch is attached to the VM. ephemerd watches WMI events to discover the assigned IP, then connects:
- `<vm-ip>:10000` -- containerd gRPC (only used by buildkit and per-job runtime calls; jobs themselves see a unix socket inside the VM).
- `<vm-ip>:10001` -- dispatch gRPC (CreateJob / WaitJob / DestroyJob).
- The Linux ephemerd binary launches with `--containerd-only`. No PEM file, no config.toml, no GitHub credentials inside the VM.
- On shutdown, ephemerd asks HCS to terminate the compute system and releases the HCN endpoint. The VHDX persists for the next boot.

The WSL VM boots asynchronously in a background goroutine. Windows jobs can run immediately while the WSL worker starts up. Linux jobs queue until the dispatch client is connected.
The VM boots asynchronously in a background goroutine. Windows jobs can run immediately while the Linux VM starts up. Linux jobs queue until the dispatch client is connected to the VM.

## Pre-Baked Rootfs

The WSL rootfs is an Alpine minirootfs with gcompat and iptables baked in at compile time. This eliminates network-dependent `apk add` calls during boot. See [Pre-baked rootfs]({{< relref "pre-baked-rootfs" >}}).
The rootfs inside the initrd is an Alpine minirootfs with gcompat and iptables baked in at compile time. This eliminates network-dependent `apk add` calls during boot. See [Pre-baked rootfs]({{< relref "pre-baked-rootfs" >}}).

## What This Architecture Removes

Compared to the earlier dual-scheduler approach:

- No PEM file copy into WSL.
- No config.toml rewriting for WSL.
- No duplicate GitHub polling from WSL.
- No GitHub App token refresh in WSL.
- WSL has no GitHub credentials at all.
- No PEM file copy into the worker.
- No config.toml rewriting for the worker.
- No duplicate GitHub polling from the worker.
- No GitHub App token refresh in the worker.
- The Linux worker has no GitHub credentials at all.

Compared to the earlier WSL2-based worker:

- Works under `LocalSystem`, so the installed Windows service can manage Linux jobs.
- No dependency on the `wsl.exe` toolchain or any WSL distro registration.
- Boot is deterministic — same kernel, same initrd, same VHDX root every time.