25 changes: 11 additions & 14 deletions README.md
@@ -12,7 +12,7 @@ vk-cocoon is the host-side bridge between the Kubernetes API and the cocoon runt
| Provider | `provider/cocoon/` | `Provider` struct with lifecycle methods (CreatePod / DeletePod / UpdatePod / GetPodStatus), startup reconcile, orphan policy, VM event watcher, pod eviction |
| Provider iface | `provider/` | Shared provider interface and node-capacity helpers |
| Cocoon CLI | `vm/` | `Runtime` interface + the default `CocoonCLI` implementation that shells out to `cocoon` (including `WatchEvents` via `cocoon vm status --event --format json`) |
-| Snapshot SDK | `snapshots/` | Wraps the [epoch](https://github.com/cocoonstack/epoch) SDK as a `RegistryClient` interface, plus `Puller` and `Pusher` that stream snapshots and cloud images via `epoch/snapshot` and `epoch/cloudimg` |
+| Snapshot SDK | `snapshots/` | `Puller` and `Pusher` stream snapshots and cloud images to any OCI registry through `cocoon-common`'s `oci.Registry` backend (`cocoon-common/snapshot` + `cocoon-common/cloudimg`) |
| Network | `network/` | cocoon-net JSON lease parser used to resolve a freshly cloned VM's IP, plus the ICMPv4 `Pinger` the probe loop uses to check guest reachability |
| Guest exec | `guest/` | RDP help-text shim (Windows) and SAC dialer (Windows static IP). Linux guest exec / logs go through `cocoon vm exec` and `cocoon vm logs` — see `vm/`. |
| Probes | `probes/` | Per-pod probe agents that run a caller-supplied health check on a ticker, update the in-memory readiness map, and invoke an onUpdate callback so the async provider can push fresh status through v-k's notify hook |
@@ -27,8 +27,8 @@ vk-cocoon is the host-side bridge between the Kubernetes API and the cocoon runt
2. If a VM with `spec.VMName` already exists locally, adopt it (idempotent on restart). Adoption hinges on `StartupReconcile` having populated `vmsByName`; before reconcile completes, CreatePod treats the pod as new and may collide on VM name.
3. Otherwise branch on `spec.Managed` first, then `spec.Mode`:
- **`Managed=false`** (static / externally-managed VMs, e.g. Windows toolboxes on an external QEMU host): skip the runtime entirely and adopt the pre-assigned `VMID` / `IP` / `VNCPort` the operator pre-wrote into the `VMRuntime` annotations. `Managed` is the single source of truth for "vk-cocoon owns this VM's lifecycle".
-- **Mode `clone`** (default, `Managed=true`): look up the snapshot locally using a **tag-aware name** (`repo:tag`, or bare `repo` when the tag is `latest` for backward compatibility). If the local snapshot does not exist, pull it from epoch via `Puller.PullSnapshot`. Before cloning, `assertSnapshotBackend` validates the snapshot's recorded hypervisor matches `spec.Backend` — a CH snapshot cannot be cloned onto a FC target and vice-versa. When the snapshot carries a base image, `Pull: true` is passed to `CloneOptions`, which translates to `cocoon vm clone --pull`; cocoon constructs a digest reference (`repo@sha256:xxx`) from the snapshot metadata and pulls the exact image version recorded at snapshot time. Then `Runtime.Clone(from=<local>, to=spec.VMName)`. Pod-side CPU/memory/storage are not plumbed into clone — cocoon clone inherits all guest resources from the snapshot. Only the `vm run` path translates pod resources into VM resources.
-- **Mode `run`** (`Managed=true`): `ensureRunImage` makes the image available locally before launching the VM. It peeks the OCI manifest via `Puller.Registry`: cocoonstack cloud-image artifacts (artifactType=`application/vnd.cocoonstack.os-image.v1+json`) take the qcow2 streaming path through `Puller.EnsureCloudImage` → `cocoon image import`, snapshot artifacts are rejected with a "use mode=clone" error, and everything else (HTTP(S) URLs, container images, refs that don't resolve against epoch) falls through to `Runtime.EnsureImage` → `cocoon image pull`. `--force` when `spec.ForcePull` is true. Then `Runtime.Run(image=spec.Image, name=spec.VMName)`. When `spec.Backend` is `firecracker`, `--fc` is passed to select the FC backend; when `spec.OS` is `windows`, `--windows` is passed. When `spec.NoDirectIO` is true, `--no-direct-io` disables O_DIRECT on writable disks (CH only, useful for dev/test).
+- **Mode `clone`** (default, `Managed=true`): look up the snapshot locally using a **tag-aware name** (`repo:tag`, or bare `repo` when the tag is `latest` for backward compatibility). If the local snapshot does not exist, pull it from the registry via `Puller.PullSnapshot`. Before cloning, `assertSnapshotBackend` validates the snapshot's recorded hypervisor matches `spec.Backend` — a CH snapshot cannot be cloned onto a FC target and vice-versa. When the snapshot carries a base image, `Pull: true` is passed to `CloneOptions`, which translates to `cocoon vm clone --pull`; cocoon constructs a digest reference (`repo@sha256:xxx`) from the snapshot metadata and pulls the exact image version recorded at snapshot time. Then `Runtime.Clone(from=<local>, to=spec.VMName)`. Pod-side CPU/memory/storage are not plumbed into clone — cocoon clone inherits all guest resources from the snapshot. Only the `vm run` path translates pod resources into VM resources.
+- **Mode `run`** (`Managed=true`): `ensureRunImage` makes the image available locally before launching the VM. It peeks the OCI manifest via `Puller.Registry`: cocoonstack cloud-image artifacts (artifactType=`application/vnd.cocoonstack.os-image.v1+json`) take the qcow2 streaming path through `Puller.EnsureCloudImageFromRaw` → `cocoon image import`, snapshot artifacts are rejected with a "use mode=clone" error, and everything else (HTTP(S) URLs, container images, refs that don't resolve against the registry) falls through to `Runtime.EnsureImage` → `cocoon image pull`. `--force` when `spec.ForcePull` is true. Then `Runtime.Run(image=spec.Image, name=spec.VMName)`. When `spec.Backend` is `firecracker`, `--fc` is passed to select the FC backend; when `spec.OS` is `windows`, `--windows` is passed. When `spec.NoDirectIO` is true, `--no-direct-io` disables O_DIRECT on writable disks (CH only, useful for dev/test).
- **`vm.cocoonstack.io/clone-from-dir` override** (managed-only, takes precedence over mode/fork-from): clone via `cocoon vm clone --from-dir <abs-path> --pull`, bypassing the local snapshot DB. Pairs with `cocoon snapshot export --to-dir` for cross-node staging. Conflicts with `mode=run` or `fork-from` fast-fail.
4. For clone/fork/wake paths, check whether the VM needs manual network setup (see [Post-clone hints](#post-clone-hints) below). If so, write the required commands as a base64-encoded annotation (`vm.cocoonstack.io/post-clone-hint`) and log a warning. The pod stays Running but Not Ready until the user executes the commands via `cocoon vm console` and the probe detects network connectivity.
5. Resolve the IP from the cocoon-net JSON lease file by MAC.
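
Reviewer note: the tag-aware lookup name described in the clone path can be sketched as below. This is a minimal illustration only; `localSnapshotName` is a hypothetical helper name and is not taken from the vk-cocoon source.

```go
package main

import "fmt"

// localSnapshotName is a hypothetical helper illustrating the rule in the
// README: "repo:tag", or the bare repo when the tag is "latest".
func localSnapshotName(repo, tag string) string {
	if tag == "latest" {
		// Bare name keeps snapshots saved before tags existed resolvable.
		return repo
	}
	return repo + ":" + tag
}

func main() {
	fmt.Println(localSnapshotName("agent-base", "latest"))
	fmt.Println(localSnapshotName("agent-base", "v2"))
}
```

How an empty tag is handled is not stated in the README, so the sketch leaves that case untouched rather than guessing.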
@@ -39,7 +39,7 @@ vk-cocoon is the host-side bridge between the Kubernetes API and the cocoon runt

1. Decode `meta.VMSpec`.
2. `meta.ShouldSnapshotVM(spec)` — the shared cocoon-common decoder — decides whether to snapshot before destroy:
-- `always`: `Runtime.SnapshotSave` then `Pusher.PushSnapshot(tag=meta.DefaultSnapshotTag)` to epoch.
+- `always`: `Runtime.SnapshotSave` then `Pusher.PushSnapshot(tag=meta.DefaultSnapshotTag)` to the registry.
- `main-only`: same, but only when the VM name ends in `-0` (slot 0 = main agent).
- `never`: skip snapshots entirely.
3. `Runtime.Remove(vmID)` to destroy the VM.
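
Reviewer note: the three snapshot policies listed above reduce to a small decision function. The sketch below is an assumption-laden stand-in, not the real `meta.ShouldSnapshotVM` from cocoon-common; the function name, the string-typed policy, and the treatment of unrecognized values are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldSnapshot is a hypothetical stand-in for the policy decision the
// README attributes to meta.ShouldSnapshotVM.
func shouldSnapshot(policy, vmName string) bool {
	switch policy {
	case "always":
		return true
	case "main-only":
		// Slot 0 is the main agent; its VM name ends in "-0".
		return strings.HasSuffix(vmName, "-0")
	default:
		// "never", and (as an assumption) any unrecognized value.
		return false
	}
}

func main() {
	for _, name := range []string{"demo-0", "demo-1"} {
		fmt.Println(name, shouldSnapshot("main-only", name))
	}
}
```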
@@ -52,9 +52,9 @@ The only update vk-cocoon honors is a `HibernateState` transition. Anything else
| Transition | Behavior |
|---|---|
| `false → true` | NetResize (CH+Windows) → SnapshotSave → Push → clear VMID before Remove → Remove (rollback on failure). Pod stays alive (`PodRunning`) so K8s controllers do not recreate it. VMID/IP annotations clear between Push and Remove so the operator's manifest+VMID race window collapses to one patch RTT. **Compensating rollback**: if `Runtime.Remove` fails after a successful push, vk-cocoon best-effort `Registry.DeleteManifest` the hibernate tag and re-applies VMID/IP so the pod stays recoverable. Push and Save are idempotent, so a compensated retry re-publishes the tag cleanly on the next attempt. |
-| `true → false` (with no live VM) | `Puller.PullSnapshot(tag=meta.HibernateSnapshotTag)` → `Runtime.Clone` → drop the hibernation tag from epoch. |
+| `true → false` (with no live VM) | `Puller.PullSnapshot(tag=meta.HibernateSnapshotTag)` → `Runtime.Clone` → drop the hibernation tag from the registry. |

-The operator's `CocoonHibernation` reconciler tracks the transition by polling `epoch.GetManifest(vmName, "hibernate")`.
+The operator's `CocoonHibernation` reconciler tracks the transition by polling the registry for the `hibernate` manifest.

### Node resources

@@ -87,8 +87,8 @@ vk-cocoon exposes three metrics surfaces:
| `vk_cocoon_node_storage_available_bytes` / `total_bytes` | Gauge | Cocoon root filesystem |
| `vk_cocoon_vm_boot_duration_seconds{mode,backend}` | Histogram | VM creation time (run or clone) |
| `vk_cocoon_snapshot_save_duration_seconds` | Histogram | Snapshot save time |
-| `vk_cocoon_snapshot_push_duration_seconds` | Histogram | Epoch push time |
-| `vk_cocoon_snapshot_pull_duration_seconds` | Histogram | Epoch pull time |
+| `vk_cocoon_snapshot_push_duration_seconds` | Histogram | Registry push time |
+| `vk_cocoon_snapshot_pull_duration_seconds` | Histogram | Registry pull time |
| `vk_cocoon_probe_duration_seconds` | Histogram | Per-probe health check time (ICMP or TCP) |
| `vk_cocoon_pod_lifecycle_total{op,result}` | Counter | Pod lifecycle operations |
| `vk_cocoon_snapshot_pull_total{result}` / `push_total` | Counter | Snapshot pull/push counts |
@@ -178,9 +178,7 @@ If the ICMP raw socket cannot be opened — typically because the binary is runn
| `KUBECONFIG` | unset | Path to kubeconfig (in-cluster used otherwise). |
| `VK_NODE_NAME` | `cocoon-pool` | Virtual node name registered with the K8s API. |
| `VK_LOG_LEVEL` | `info` | `projecteru2/core/log` level. |
-| `EPOCH_URL` | `http://epoch.cocoon-system.svc:8080` | Epoch base URL. |
-| `EPOCH_TOKEN` | unset | Bearer token (only needed for `/v2/` pushes; `/dl/` is anonymous). |
-| `EPOCH_CA_CERT` | unset | Path to PEM-encoded CA certificate for TLS verification against epoch. |
+| `OCI_REGISTRY` | **required** | OCI registry base for snapshots and cloud images (e.g. an Artifact Registry repo). Auth resolves GCP ADC then docker config. |
| `VK_LEASES_PATH` | `/var/lib/cocoon/net/leases.json` | cocoon-net JSON lease file. |
| `VK_COCOON_BIN` | `/usr/local/bin/cocoon` | Path to the cocoon CLI binary. |
| `VK_ORPHAN_POLICY` | `destroy` | `destroy` (auto-clean), `alert`, or `keep`. |
@@ -225,17 +223,16 @@ make fmt # gofumpt + goimports
make help # show all targets
```

-The Makefile detects Go workspace mode (`go env GOWORK`) and skips `go mod tidy` when active so cross-module references resolve through `go.work` without forcing a release of cocoon-common or epoch.
+The Makefile detects Go workspace mode (`go env GOWORK`) and skips `go mod tidy` when active so cross-module references resolve through `go.work` without forcing a release of cocoon-common.

## Related projects

| Project | Role |
|---|---|
| [cocoon](https://github.com/cocoonstack/cocoon) | The MicroVM runtime vk-cocoon shells out to. |
-| [cocoon-common](https://github.com/cocoonstack/cocoon-common) | CRD types, annotation contract, shared helpers. |
+| [cocoon-common](https://github.com/cocoonstack/cocoon-common) | CRD types, annotation contract, shared helpers, and the OCI registry + snapshot/cloud-image packages. |
| [cocoon-operator](https://github.com/cocoonstack/cocoon-operator) | CocoonSet and CocoonHibernation reconcilers. |
| [cocoon-webhook](https://github.com/cocoonstack/cocoon-webhook) | Admission webhook for sticky scheduling and CocoonSet validation. |
-| [epoch](https://github.com/cocoonstack/epoch) | Snapshot registry; vk-cocoon pulls and pushes via `epoch/snapshot` + `epoch/cloudimg`. |
| [cocoon-net](https://github.com/cocoonstack/cocoon-net) | Per-host networking with embedded DHCP server and iptables setup; vk-cocoon reads its JSON lease file. |

## License
35 changes: 21 additions & 14 deletions go.mod
@@ -3,14 +3,14 @@ module github.com/cocoonstack/vk-cocoon
go 1.25.6

require (
-github.com/cocoonstack/cocoon-common v0.2.2
-github.com/cocoonstack/epoch v0.2.4
+github.com/cocoonstack/cocoon-common v0.2.3-0.20260701064759-3dcdfdd23a16
+github.com/google/go-containerregistry v0.21.7
github.com/projecteru2/core v0.0.0-20241016125006-ff909eefe04c
github.com/prometheus/client_golang v1.23.2
github.com/prometheus/client_model v0.6.2
github.com/virtual-kubelet/virtual-kubelet v1.12.0
-golang.org/x/net v0.50.0
-golang.org/x/sync v0.19.0
+golang.org/x/net v0.56.0
+golang.org/x/sync v0.21.0
google.golang.org/protobuf v1.36.10
k8s.io/api v0.35.3
k8s.io/apimachinery v0.35.3
@@ -21,6 +21,7 @@ require (

require (
cel.dev/expr v0.24.0 // indirect
+cloud.google.com/go/compute/metadata v0.7.0 // indirect
github.com/NYTimes/gziphandler v1.1.1 // indirect
github.com/alphadose/haxmap v1.2.0 // indirect
github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
@@ -34,6 +35,8 @@ require (
github.com/coreos/go-semver v0.3.1 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
+github.com/docker/cli v29.5.3+incompatible // indirect
+github.com/docker/docker-credential-helpers v0.9.3 // indirect
github.com/emicklei/go-restful/v3 v3.12.2 // indirect
github.com/evanphx/json-patch/v5 v5.9.11 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
@@ -60,7 +63,7 @@ require (
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
-github.com/klauspost/compress v1.18.4 // indirect
+github.com/klauspost/compress v1.18.6 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
@@ -73,12 +76,15 @@ require (
github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect
+github.com/opencontainers/go-digest v1.0.0 // indirect
+github.com/opencontainers/image-spec v1.1.1 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/prometheus/common v0.67.4 // indirect
github.com/prometheus/procfs v0.16.1 // indirect
github.com/rogpeppe/go-internal v1.14.1 // indirect
github.com/rs/zerolog v1.29.1 // indirect
+github.com/sirupsen/logrus v1.9.4 // indirect
github.com/spf13/cobra v1.10.2 // indirect
github.com/spf13/pflag v1.0.10 // indirect
github.com/stoewer/go-strcase v1.3.0 // indirect
@@ -90,25 +96,24 @@ require (
go.opentelemetry.io/auto/sdk v1.2.1 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.60.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 // indirect
-go.opentelemetry.io/otel v1.39.0 // indirect
+go.opentelemetry.io/otel v1.41.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.34.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.34.0 // indirect
-go.opentelemetry.io/otel/metric v1.39.0 // indirect
+go.opentelemetry.io/otel/metric v1.41.0 // indirect
go.opentelemetry.io/otel/sdk v1.38.0 // indirect
-go.opentelemetry.io/otel/trace v1.39.0 // indirect
+go.opentelemetry.io/otel/trace v1.41.0 // indirect
go.opentelemetry.io/proto/otlp v1.5.0 // indirect
go.uber.org/multierr v1.11.0 // indirect
go.uber.org/zap v1.27.0 // indirect
go.yaml.in/yaml/v2 v2.4.3 // indirect
go.yaml.in/yaml/v3 v3.0.4 // indirect
-golang.org/x/crypto v0.48.0 // indirect
+golang.org/x/crypto v0.53.0 // indirect
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect
-golang.org/x/oauth2 v0.35.0 // indirect
-golang.org/x/sys v0.41.0 // indirect
-golang.org/x/term v0.40.0 // indirect
-golang.org/x/text v0.34.0 // indirect
+golang.org/x/oauth2 v0.36.0 // indirect
+golang.org/x/sys v0.46.0 // indirect
+golang.org/x/term v0.44.0 // indirect
+golang.org/x/text v0.38.0 // indirect
golang.org/x/time v0.14.0 // indirect
-golang.org/x/tools v0.42.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250303144028-a0af3efb3deb // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20250528174236-200df99c418a // indirect
google.golang.org/grpc v1.72.2 // indirect
@@ -128,3 +133,5 @@ require (
sigs.k8s.io/structured-merge-diff/v6 v6.3.2-0.20260122202528-d9cc6641c482 // indirect
sigs.k8s.io/yaml v1.6.0 // indirect
)

+exclude cloud.google.com/go v0.26.0