fix(image): streamline the Incus base image (#57, #45) #125
Merged
Write br_stage breadcrumbs straight to /dev/hvc0 (the VZ-captured virtio console, present in late userspace regardless of the kernel console= cmdline) with a /dev/console fallback, and drop the forced first-boot reboot from bootcmd. The reboot existed only to activate a console=hvc0 grub drop-in before runcmd, but it never fired reliably (cloud-init's pid-1 reaper) and doubled cold-boot time. Writing to hvc0 directly makes first-boot progress visible with no reboot. The grub drop-in stays so the KERNEL's own console routes to hvc0 on subsequent natural boots. Fixes #57.
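A minimal sketch of the breadcrumb write described in this commit, written as a plain shell helper. The real snippet is generated by internal/provision/cloudinit.go and may differ. The BR_CONSOLE and BR_FALLBACK variables and the message format are hypothetical, added only so the fallback can be exercised off-guest.

```sh
# Hypothetical overrides; on a guest these default to the real devices.
BR_CONSOLE="${BR_CONSOLE:-/dev/hvc0}"
BR_FALLBACK="${BR_FALLBACK:-/dev/console}"

# Write one breadcrumb line to the virtio console, falling back to
# /dev/console when hvc0 is missing or not writable.
br_stage() {
    if [ -w "$BR_CONSOLE" ]; then
        printf 'br_stage: %s\n' "$1" >"$BR_CONSOLE"
    else
        # Never let a missing console abort the bootstrap itself.
        { printf 'br_stage: %s\n' "$1" >"$BR_FALLBACK"; } 2>/dev/null || true
    fi
}
```

Because the write goes to the device node rather than through the kernel console, it should not depend on which console= value the kernel was booted with.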
Every build-guest-image run failed on 'passt exited with status 1': libguestfs 1.52 on the GitHub-hosted runners cannot bring up its appliance network, so virt-customize --install (which needs apt) aborts. Add a --method auto|guestfish|nbd selector and force 'nbd' in CI. The qemu-nbd + chroot path runs apt over the host network namespace and never boots a libguestfs appliance, sidestepping passt entirely. virt-sparsify (no network) still handles the compress step. Also probe the ext4 root partition by filesystem instead of hardcoding p1. Progresses #45 (pipeline had never produced a publishable image).
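The selector could be sketched roughly as below. The function names are hypothetical, and the behaviour of auto (prefer guestfish when virt-customize is on PATH, otherwise nbd) is an assumption; the commit message does not say how auto resolves.

```sh
# Resolve the requested build method to a concrete one.
# Assumption: "auto" prefers guestfish when virt-customize is available.
resolve_method() {
    case "$1" in
        guestfish|nbd) echo "$1" ;;
        auto)
            if command -v virt-customize >/dev/null 2>&1; then
                echo guestfish
            else
                echo nbd
            fi ;;
        *) echo "unknown --method: $1" >&2; return 2 ;;
    esac
}

# Pull "--method VALUE" or "--method=VALUE" out of an argument list.
# Defaults to auto when the flag is absent.
parse_method() {
    method=auto
    while [ $# -gt 0 ]; do
        case "$1" in
            --method)
                [ $# -ge 2 ] || { echo "--method needs a value" >&2; return 2; }
                method="$2"; shift ;;
            --method=*) method="${1#--method=}" ;;
        esac
        shift
    done
    resolve_method "$method"
}
```

In CI the workflow would pass `--method nbd` explicitly, so the auto heuristic never runs on the hosted runners.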
The nbd path mounts fine and probes the ext4 root, but apt failed with 'Temporary failure resolving deb.debian.org': the Debian cloud image's /etc/resolv.conf is a systemd-resolved symlink that dangles inside the chroot. Copy the host resolver in (the chroot shares the host net namespace) before apt, and restore the original symlink afterwards so the baked image is unchanged. Progresses #45.
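The swap-and-restore step might look like the sketch below. The function names and the .br-orig suffix are hypothetical; the point is that mv renames the symlink itself, so the original (dangling) target survives untouched and can be put back before the image is sealed.

```sh
# Replace the image's resolv.conf with a copy of the host resolver.
# $1 = mounted image root, $2 = host resolver file (default /etc/resolv.conf)
resolver_swap_in() {
    root="$1"
    host="${2:-/etc/resolv.conf}"
    # -L catches a dangling symlink, which -e alone reports as missing.
    if [ -e "$root/etc/resolv.conf" ] || [ -L "$root/etc/resolv.conf" ]; then
        mv "$root/etc/resolv.conf" "$root/etc/resolv.conf.br-orig"
    fi
    # cat follows symlinks on the host side and writes a regular file.
    cat "$host" >"$root/etc/resolv.conf"
}

# Put the original resolv.conf (usually the systemd-resolved symlink) back.
resolver_restore() {
    root="$1"
    rm -f "$root/etc/resolv.conf"
    if [ -e "$root/etc/resolv.conf.br-orig" ] || [ -L "$root/etc/resolv.conf.br-orig" ]; then
        mv "$root/etc/resolv.conf.br-orig" "$root/etc/resolv.conf"
    fi
}
```

Calling resolver_restore from an exit trap would keep the baked image unchanged even when apt fails partway through.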
Third build failure: 'Unable to locate package incus-ui-canonical'. That package is not in Debian trixie main (it bundles minified JS -> contrib/ non-free), and apt-installing the Zabbly build would swap Debian's incus to satisfy its Depends. Mirror the proven cloud-init path: install 'incus incus-client' (+ socat jq openssh-server chrony) from main as the required set, then bake the web UI best-effort by downloading the Zabbly .deb and extracting it to /opt/incus/ui (never installing it), with an INCUS_UI drop-in. The whole UI step is non-fatal, so a missing Zabbly suite can't fail the build. Progresses #45.
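One way to keep the UI step non-fatal is a small wrapper that swallows failures, sketched below. Both function names are hypothetical. dpkg-deb -x unpacks a package's file tree without installing it; where the UI files end up under the staging directory depends on the package layout, which this commit message does not spell out.

```sh
# Run an optional step; log a warning and carry on if it fails.
best_effort() {
    if "$@"; then
        return 0
    fi
    echo "warning: optional step failed, continuing: $*" >&2
    return 0
}

# Unpack a downloaded .deb into a staging directory without installing it.
# $1 = path to the .deb, $2 = staging directory
unpack_ui_deb() {
    mkdir -p "$2" && dpkg-deb -x "$1" "$2"
}
```

A call such as `best_effort unpack_ui_deb "$deb" "$staging"` returns success whether or not the archive exists, so a missing Zabbly suite only costs the web UI, not the build.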
Fourth failure was the final step: virt-sparsify crashed with 'guestfs_launch failed' — the libguestfs appliance can't launch at all on the GitHub runners (the same root cause as the passt error). Everything before it now works: incus + client from main, the Zabbly UI extracted to /opt/incus/ui, initramfs regenerated. Skip virt-sparsify on the nbd path and compress with qemu-img convert -c instead. Zero free space first (virt-sparsify's block-discard can't run) so compression stays effective, apt-get clean in the chroot, and detach qemu-nbd before converting so the compress reads a flushed image. Progresses #45.
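The ordering described here (clean, zero-fill, unmount, detach, then compress) can be sketched as follows. The DRY_RUN switch, the function names, the zero.fill file name, and the qcow2 output format are assumptions for illustration; the actual script may differ.

```sh
# Print commands when DRY_RUN is set (hypothetical switch); otherwise run them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = mounted root, $2 = nbd device, $3 = working image, $4 = output image
seal_and_compress() {
    run chroot "$1" apt-get clean
    # dd is expected to stop with "no space left on device"; zeroed blocks
    # are what let qemu-img's compression shrink the free space.
    run dd if=/dev/zero of="$1/zero.fill" bs=1M || true
    run rm -f "$1/zero.fill"
    run umount "$1"
    # Detach before converting so qemu-img reads a flushed image.
    run qemu-nbd --disconnect "$2"
    run qemu-img convert -c -O qcow2 "$3" "$4"
}
```

Running it with DRY_RUN=1 prints the six commands in order without touching any device, which makes the sequence easy to review.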
arm64 built and published the guest image cleanly; amd64 failed on a transient 'Connection reset by peer' fetching a single .deb from the mirror. Same code, different luck. Set Acquire::Retries=5 in the chroot so apt retries transient CDN resets, and remove the config before sealing the image. Closes the last gap in the build (#45).
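A sketch of the temporary retry setting, assuming it is delivered as an apt.conf.d drop-in inside the chroot. The file name 99-build-retries and the function names are hypothetical; Acquire::Retries itself is a standard apt option.

```sh
# Hypothetical drop-in path, relative to the image root.
APT_RETRY_CONF="etc/apt/apt.conf.d/99-build-retries"

# $1 = mounted image root, $2 = retry count (defaults to 5)
apt_retries_enable() {
    mkdir -p "$1/etc/apt/apt.conf.d"
    printf 'Acquire::Retries "%s";\n' "${2:-5}" >"$1/$APT_RETRY_CONF"
}

# Remove the drop-in so the sealed image carries no build-time apt config.
apt_retries_remove() {
    rm -f "$1/$APT_RETRY_CONF"
}
```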
#45 build pipeline validated ✅Dispatched
Final run 27981889504: Build (amd64) ✅ · Build (arm64) ✅ · Publish Release ✅ Published the first-ever guest images:
The image bakes incus + incus-client (Debian main), the Incus web UI at |
Summary
Addresses the two open issues against bladerunner's existing Incus base image (Debian Trixie genericcloud + cloud-init), as the first-step cleanup before adding new guest base-image types (#119/#120).
#57: first-boot console invisible / reboot never fires

The bootstrap emitted br_stage breadcrumbs to /dev/console, but under VZ the captured device is the virtio console (/dev/hvc0), while Debian's default cmdline routes /dev/console to a non-existent ttyS0, so first-boot progress vanished. To compensate, bootcmd forced a first-boot reboot to activate a console=hvc0 grub drop-in. That reboot is exactly what #57 reports never fires (cloud-init's pid-1 reaper aborts it), and it doubled cold-boot time.

The issue's own "Option A" (power_state reboot) is actually broken: power_state fires after runcmd, and cloud-init runcmd is per-instance, so it won't re-run post-reboot. The bootstrap would run once, invisibly, then reboot into a boot with no runcmd.

Fix: write breadcrumbs straight to /dev/hvc0 (present in late userspace regardless of the console= cmdline) with a /dev/console fallback, and delete the forced reboot. First-boot progress is visible immediately, with no double boot. The grub drop-in stays so the kernel's own console routes to hvc0 on subsequent natural boots.

internal/provision/cloudinit.go: hvc0 breadcrumbs; bootcmd keeps update-grub, drops the .boot1-rebooted reboot.
TestBuildCloudInit_NoFirstBootReboot (regression guard that the reboot stays gone), breadcrumb assertion now checks >/dev/hvc0, comments updated.

#45: pre-baked guest image build never succeeded
The build-guest-image pipeline scaffolding landed long ago (#46/#50/#51) but has never produced a publishable image: every run dies on 'passt exited with status 1'. libguestfs 1.52 on the GitHub-hosted runners can't bring up its appliance network, so virt-customize --install (which needs apt) aborts. The hosted-image opt-in (UseHostedGuestImage) therefore points at a guest-image-latest release that doesn't exist.

Fix: add a --method auto|guestfish|nbd selector to scripts/build-guest-image.sh and force nbd in CI. The qemu-nbd + chroot path runs apt over the host network namespace and never boots a libguestfs appliance, sidestepping passt. virt-sparsify (no network) was initially kept for the compress step; a later commit on this PR skips it on the nbd path and compresses with qemu-img convert -c instead, because the appliance cannot launch on the runners at all. The root partition is now probed by filesystem instead of hardcoding p1.

scripts/build-guest-image.sh, .github/workflows/build-guest-image.yml.

Testing
go test ./internal/provision/... pass; golangci-lint 0 issues; gofmt clean.
bash -n + shellcheck clean on the build script.
build-guest-image.yml on this branch to confirm the nbd path builds + publishes (passt fix can only be verified on the runner). Will report the result on this PR.

Notes / follow-ups
UseHostedGuestImage stays opt-in until the published image is boot-verified end-to-end on a Mac. Flipping it (the <60s cold-start payoff in #45) is a deliberate follow-up.
os/kind discriminator yet; that's their dependency, tracked separately.