Harden desktop switching and native QEMU input #1
Merged
Replace hand-transcribed virtio-MMIO register offsets, virtio-blk request types, status codes, feature bits, ring descriptor flags, and device IDs with the upstream virtio-bindings crate. Values are identical to the prior literals, so device behavior and the 238-test suite are unchanged; the constants now derive from a single spec-tracked source instead of magic numbers maintained by hand.
Add the virtio-bindings 0.2.7 entry to THIRD_PARTY_NOTICES with its BSD-3-Clause text, and list it in the VMM foundation component table as the linked source of virtio register/blk/ring/id constants.
Live kernel-layout trace: reached PANE_BLOCK_MODULE_LOAD_OK and PANE_DISPLAY_CONTRACT_DISCOVERED. The guest attempted /dev/vda1, then emitted PANE_VIRTIO_ROOT_DEVICE_WAIT_TIMEOUT with no virtio interrupt request or acknowledgement. The pane-block fallback serviced one base-image read, but PANE_ROOT_MOUNT_OK and PANE_INIT_EXEC were not reached before the root-mount budget expired. pane-block remains intact as the required diagnostic fallback.
The native boot reached initramfs userspace but the guest never touched the virtio-MMIO aperture, so /dev/vda never appeared and the boot fell back to the diagnostic pane-block path. Root cause: the stock Arch kernel builds the virtio-MMIO bus as a module (CONFIG_VIRTIO_MMIO=m) with the cmdline-device path enabled (CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y) and virtio-blk built in, but the discovery initramfs neither bundled nor loaded virtio_mmio.ko, leaving virtio_mmio.device=4K@0xdfc0000:5 with no bus driver to act on. Add `runtime --register-virtio-mmio-module` to copy a SHA-verified, kernel-matched virtio_mmio.ko into the discovery initramfs as /lib/modules/virtio_mmio.ko, pack it into the discovery cpio, and load it from the generated /init (finit_module) before the virtio-root device wait. This lets the cmdline directive register the device so the built-in virtio-blk driver can expose /dev/vda. Documented in the README and VMM foundation notes; pane-block remains the labeled fallback.
The previous change loaded virtio_mmio.ko but passed empty module parameters, so the bus driver registered no device and /dev/vda never appeared: the kernel does not replay the boot cmdline virtio_mmio.device= value to a module loaded from userspace. Parse virtio_mmio.device from the cmdline in the discovery /init and pass it explicitly as the module's device= parameter (the form the kernel's own virtio-mmio documentation specifies for the modular configuration). Live WHP boot result: the guest now loads the bus module with the device spec, enumerates /dev/vda1, mounts the virtio root (no wait timeout, no pane-block fallback), and drives the split-virtqueue with delivered and acknowledged interrupts. The ext4 root mount is in progress; completing it within the root-mount phase budget is the next step.
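The cmdline-to-module-parameter translation described above can be sketched as follows. This is a minimal illustration, not the code in the discovery /init: the function name `virtio_mmio_module_params` is hypothetical, and it only handles the first `virtio_mmio.device=` token it finds.

```rust
/// Translate the boot cmdline directive `virtio_mmio.device=<spec>` into the
/// `device=<spec>` parameter string handed to the module at load time.
/// Hypothetical helper: only the first directive on the cmdline is used.
/// Returns None when the directive is absent or has an empty spec.
fn virtio_mmio_module_params(cmdline: &str) -> Option<String> {
    cmdline
        .split_whitespace()
        .find_map(|tok| tok.strip_prefix("virtio_mmio.device="))
        .filter(|spec| !spec.is_empty())
        .map(|spec| format!("device={}", spec))
}
```

With the cmdline from this boot, the helper would yield `device=4K@0xdfc0000:5`, which is then passed as the parameter string of the module load instead of an empty string.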
Document the June 21 live boot: the virtio-mmio device parameter cleared the zero-MMIO blocker, the guest enumerated /dev/vda1 and mounted the virtio root with acknowledged interrupts, and the open issue is level-triggered virtio completion interrupt delivery so the ext4 mount finishes inside the phase budget.
Introduce a narrow, unit-tested I/O APIC emulation (the device WHP does not provide) so a level-triggered device IRQ can be delivered and resampled through the guest's local APIC. service_irq injects and arms remote IRR; end_of_interrupt clears it and re-injects while the device still holds the line asserted, which is the level-triggered resample the virtio-MMIO block path needs. The model is shaped by crosvm's irqchip/ioapic.rs semantics but is Pane-owned and pure: it returns the vector for the WHP exit loop to inject rather than using eventfds. Wiring into the WHP memory map, exit loop, MP table, and virtio IRQ routing follows in subsequent phases.
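A reduced model of the level-triggered resample behaviour, assuming a 24-pin redirection table. The type and field names (`IoApicSketch`, `Pin`, `set_line`) are illustrative and do not mirror Pane's actual I/O APIC module; only the service/EOI semantics follow the description above.

```rust
/// One redirection-table pin, reduced to the fields the resample logic needs.
#[derive(Default, Clone, Copy)]
struct Pin {
    vector: u8,
    masked: bool,
    level_triggered: bool,
    remote_irr: bool, // set when a level interrupt is delivered, cleared by EOI
    asserted: bool,   // current state of the device's interrupt line
}

struct IoApicSketch {
    pins: [Pin; 24],
}

impl IoApicSketch {
    fn new() -> Self {
        // Start with every pin masked until the guest programs it.
        let mut pins = [Pin::default(); 24];
        for p in pins.iter_mut() {
            p.masked = true;
        }
        Self { pins }
    }

    fn set_line(&mut self, pin: usize, asserted: bool) {
        self.pins[pin].asserted = asserted;
    }

    /// Returns the vector the exit loop should inject, if any.
    fn service_irq(&mut self, pin: usize) -> Option<u8> {
        let p = &mut self.pins[pin];
        if !p.asserted || p.masked {
            return None;
        }
        if p.level_triggered {
            if p.remote_irr {
                return None; // already delivered; wait for the guest's EOI
            }
            p.remote_irr = true;
        }
        Some(p.vector)
    }

    /// Clears remote IRR for the acknowledged vector and re-injects while the
    /// device still holds the line asserted (the level-triggered resample).
    fn end_of_interrupt(&mut self, vector: u8) -> Option<u8> {
        let mut reinject = None;
        for pin in 0..self.pins.len() {
            let p = &mut self.pins[pin];
            if p.level_triggered && p.remote_irr && p.vector == vector {
                p.remote_irr = false;
                if let Some(v) = self.service_irq(pin) {
                    reinject = Some(v);
                }
            }
        }
        reinject
    }
}
```

Returning the vector rather than signalling an eventfd keeps the model pure, which is what makes it straightforward to unit-test in isolation.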
Wire the I/O APIC window at 0xFEC00000 into the WHP kernel-layout exit loop: guest accesses there now decode through the live WinHvEmulation instruction emulator into the Pane I/O APIC mmio_read/write, instead of being treated as an unmapped-memory blocker that stops the probe. Instantiate one I/O APIC per run. This is the device boundary for the next phases: an MP table so the guest discovers the I/O APIC, then routing the virtio-MMIO IRQ through it with EOI resampling so level-triggered completion interrupts are delivered reliably.
With acpi=off the guest finds the local APIC and I/O APIC by scanning low memory for an MP floating pointer. Emit a minimal MP table (one CPU, ISA bus, the I/O APIC at 0xFEC00000, identity ISA->pin interrupt routing with the virtio IRQ marked level-triggered, and the LINT sources) and embed it in the mapped BIOS ROM region inside the 0xF0000 scan window. This lets Linux route the virtio IRQ through the I/O APIC instead of the legacy 8259 PIC. Routing the virtio device's line through the I/O APIC and resampling on local-APIC EOI is the next phase.
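For illustration, the 16-byte MP floating pointer structure that the guest scans for can be built and located as below. The layout (signature `_MP_`, table address, length in 16-byte units, spec revision, checksum, feature bytes) follows the Intel MultiProcessor Specification 1.4; the function names and the example address in the usage note are assumptions, not Pane's code.

```rust
/// Build the MP floating pointer structure pointing at an MP configuration
/// table located at physical address `table_addr`.
fn mp_floating_pointer(table_addr: u32) -> [u8; 16] {
    let mut fp = [0u8; 16];
    fp[0..4].copy_from_slice(b"_MP_");
    fp[4..8].copy_from_slice(&table_addr.to_le_bytes());
    fp[8] = 1; // structure length in 16-byte paragraphs
    fp[9] = 4; // spec revision 1.4
    // fp[10] is the checksum. fp[11..16] are the feature bytes, left zero so
    // the guest reads the configuration table instead of a default config.
    let sum = fp.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    fp[10] = 0u8.wrapping_sub(sum); // all 16 bytes must sum to 0 mod 256
    fp
}

/// Scan a byte slice on 16-byte boundaries for a valid floating pointer and
/// return its offset within the slice.
fn find_mp_floating_pointer(rom: &[u8]) -> Option<usize> {
    rom.chunks_exact(16)
        .position(|c| {
            c.starts_with(b"_MP_") && c.iter().fold(0u8, |a, b| a.wrapping_add(*b)) == 0
        })
        .map(|i| i * 16)
}
```

A VMM would copy the result of, say, `mp_floating_pointer(0x000F_0040)` to a 16-byte-aligned offset of the ROM image inside the scan window and place the configuration table at the address it names.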
Wire the virtio-MMIO block device's interrupt line into the I/O APIC instead of injecting a legacy-PIC vector directly. On a completion the device line is asserted into the I/O APIC, which injects the guest-programmed vector as a level-triggered interrupt; WHP then raises an APIC EOI exit when the guest acknowledges it, and Pane resamples the line through the I/O APIC, re-injecting while the device still has work pending. This replaces the edge injection that gummed up the local APIC in-service state (the guest EOIs the PIC, never the LAPIC) and caused virtio completion interrupts to stall after the first delivery. The APIC EOI exit now carries the acknowledged vector. Legacy PIC vector helpers become test-only.
With the I/O APIC enabled, Linux routes its boot timer to IRQ0/pin0 (vector 0x30) and froze with jiffies stuck at 0.000000 because Pane has no 8254 PIT to tick it (crosvm/QEMU emulate one). Inject a periodic edge interrupt on I/O APIC pin 0 on a ~1kHz wall-clock cadence, re-arming the line each tick; service_irq is a no-op until the guest unmasks pin 0, so it self-gates until the kernel programs its timer. With ticks flowing the guest now boots through full kernel init, unpacks the initramfs, runs /init, loads virtio_mmio, and virtio_blk enumerates /dev/vda1 from the I/O APIC-routed device. Also resample the virtio block line on the same cadence while a completion is pending so a coalesced delivery cannot stall I/O. Remaining: the ext4 mount over virtio still needs reliable block-completion interrupt delivery.
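One way to pace the wall-clock tick is a small deadline tracker that the exit loop polls. This is a sketch under assumptions: `TickPacer` is a hypothetical name, and dropping missed ticks when the loop falls behind (rather than bursting them) is a design choice of this example, not something the change above specifies.

```rust
use std::time::{Duration, Instant};

/// Decides when the exit loop should pulse the timer pin.
struct TickPacer {
    period: Duration,
    next_due: Instant,
}

impl TickPacer {
    fn new(now: Instant, period: Duration) -> Self {
        Self { period, next_due: now + period }
    }

    /// Returns true at most once per period. If the caller fell behind, the
    /// missed ticks are dropped and the next deadline is measured from `now`.
    fn due(&mut self, now: Instant) -> bool {
        if now < self.next_due {
            return false;
        }
        self.next_due = now + self.period;
        true
    }
}
```

In an exit loop this would be polled once per iteration with `Instant::now()` and a 1 ms period; whenever it returns true, the loop asserts pin 0 and calls into the I/O APIC, which stays a no-op until the guest unmasks the pin.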
crosvm wires virtio-MMIO interrupts through an edge IRQ event
(Transport::Mmio { irq_evt_edge }); only virtio-PCI uses a level line with a
resample thread. Mark the virtio IRQ edge-triggered in the MP table (bus default)
instead of level, and gate the periodic IOAPIC resample to level pins only so the
edge virtio line is not spuriously re-asserted.
Full native boot via the I/O APIC is unchanged (timer ticks, kernel init, virtio_blk
enumerates /dev/vda1). The ext4 mount over virtio still stalls after the enumeration
reads, which is a separate virtio block-completion issue under investigation.
The virtio-blk device exposed only the root partition's byte length as the whole
disk capacity, so vda1 (which starts at the 1MB partition offset inside vda)
extended past end-of-disk and Linux truncated it ("p1 ... extends beyond EOD"),
breaking the ext4 geometry. Expose the full base image as vda; the root partition
lives at its offset within it. The guest now sees the correct 8388608-sector disk
and an untruncated vda1.
Also enrich the queue-notify trace with per-request type/sector/head/used-len and
the queue indices + interrupt status, for diagnosing the remaining virtio block
completion stall.
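The geometry problem reduces to simple arithmetic: 8388608 sectors of 512 bytes is a 4 GiB disk, and a partition that starts 1 MiB in cannot fit inside a disk whose advertised capacity equals the partition's own length. A sketch with hypothetical helper names; the partition length used in the usage note is an assumed value for illustration.

```rust
const SECTOR_SIZE: u64 = 512;

/// Capacity in 512-byte sectors, the unit virtio-blk reports to the guest.
fn capacity_sectors(image_bytes: u64) -> u64 {
    image_bytes / SECTOR_SIZE
}

/// True when a partition of `len_bytes` starting at `start_bytes` ends at or
/// before the end of a disk of `disk_bytes`.
fn partition_fits(disk_bytes: u64, start_bytes: u64, len_bytes: u64) -> bool {
    start_bytes
        .checked_add(len_bytes)
        .map_or(false, |end| end <= disk_bytes)
}
```

Taking a 4 GiB image and assuming the root partition spans everything after the 1 MiB offset, `partition_fits(part_len, 1 << 20, part_len)` is false (the old behaviour, which triggers the "extends beyond EOD" truncation), while `partition_fits(4 << 30, 1 << 20, part_len)` is true once the full image is exposed as vda.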
Pane's from-scratch WHP run loop boots Linux but cannot drive the guest
timer fast enough on this host: the exit loop caps near 15/sec, so jiffies
starve and the ext4 root mount stalls (~10 block reads in 600s). After
exhausting the owned-VMM path, pivot the boot path to QEMU with the WHPX
accelerator, which boots the same Arch image end to end (virtio root,
switch_root, systemd, login) in ~10s on the same hypervisor substrate.
QEMU engine (src/qemu.rs):
- Locate qemu-system-x86_64; build the WHPX machine: virtio root from the
base image, persistent qcow2 user disk mounted at /home, user-mode
networking. Run as a headless milestone probe, interactive (serial or a
gtk/sdl window), or detached with pid tracking.
- fstab=0 so the image's stale fstab swap entry cannot stall boot;
per-drive snapshot keeps the SHA-pinned base image immutable while the
user disk and optional root overlay persist.
Self-contained initramfs (src/ext4.rs):
- Minimal read-only ext4 extractor pulls the distro initramfs out of the
base image's root partition with no WSL or external tools; byte-for-byte
identical to a debugfs extraction.
Wiring:
- pane launch --runtime qemu-whpx|auto [--display gtk] [--persist-root]
[--detach]; pane stop terminates a detached VM.
- RuntimeMode::{QemuWhpx,Auto} and DisplayMode in src/model.rs; flags in
src/cli.rs; native-boot-spike --qemu-whpx for the probe.
Owned-VMM work retained as the engine reference/fallback: src/lapic.rs
userspace xAPIC, native.rs run-loop hardening, IOAPIC edge routing, MP
table and virtio queue diagnostics.
265 tests pass.
systemd colorizes status lines and embeds VT highlight codes inside the message text (e.g. "Mounted \e[..m/sysroot\e[0m."), so plain substring matching missed real milestones: mounted_sysroot and the /home mount read false on a fully successful boot. read_serial now strips CSI/OSC/DCS escape sequences first, so every milestone is detected. Adds unit tests.
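A minimal escape-sequence stripper of the kind described, for illustration only. The function name is hypothetical and this is not the implementation inside read_serial; it handles CSI, OSC and DCS, and drops the character following any other ESC.

```rust
/// Remove terminal escape sequences so substring matching sees plain text.
fn strip_escape_sequences(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\u{1b}' {
            out.push(c);
            continue;
        }
        match chars.next() {
            // CSI: ESC [ followed by parameters, ended by a byte in 0x40..=0x7E.
            Some('[') => {
                for n in chars.by_ref() {
                    if ('\u{40}'..='\u{7e}').contains(&n) {
                        break;
                    }
                }
            }
            // OSC (ESC ]) and DCS (ESC P): this sketch accepts either BEL or
            // ST (ESC \) as the terminator for both.
            Some(']') | Some('P') => {
                while let Some(n) = chars.next() {
                    if n == '\u{07}' {
                        break;
                    }
                    if n == '\u{1b}' && chars.peek() == Some(&'\\') {
                        chars.next();
                        break;
                    }
                }
            }
            // Any other escape: the ESC and the character after it are dropped.
            _ => {}
        }
    }
    out
}
```

Milestone checks then run against the stripped text, for example `strip_escape_sequences(line).contains("Mounted /sysroot")`.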
Change the --runtime default from wsl-bridge to auto: pane launch with no flags now selects QEMU+WHPX when QEMU and the native runtime artifacts are present, and falls back to the WSL bridge otherwise. Makes the QEMU route the default boot path without breaking environments that only have WSL.
Reduce a fresh-machine boot to a single `pane launch`:
- Derive the kernel from the base image: resolve_distro_kernel pulls /boot/vmlinuz-linux via the ext4 reader (cached), mirroring the initramfs derivation, so only the base image needs to exist. Shared extract_from_base_image + base_image_partition_offset helpers.
- Auto-install QEMU: ensure_qemu_available installs it via winget on first use when absent, then resolves the path.
- Acquire the base image: ensure_base_image downloads it on first run (curl, SHA-verified, registered via register_base_os_image) when a hosting URL is configured, else gives an actionable import instruction.
- launch_qemu_whpx_runtime runs these as a preflight; auto runtime selection now only needs QEMU + the base image (kernel is derived) and stays side-effect free.
Adds an ext4 bzImage extraction test. 268 tests pass.
(a) Configurable base-image source: PANE_BASE_IMAGE_URL / PANE_BASE_IMAGE_SHA256 env vars override the built-in defaults via base_image_download_source(), so hosting can be pointed at runtime without a rebuild.
(b) Graceful VM lifecycle for detached QEMU-WHPX:
- Detached boots expose a QMP control channel (tcp 127.0.0.1:44510).
- Record pid + QMP port in state/qemu-whpx.json (QemuVmState).
- pane stop requests a clean ACPI shutdown (QMP system_powerdown), waits up to 15s for the guest to power off, and hard-kills only as a fallback.
- pane status reports whether the VM is running; process_alive via tasklist.
Verified: detach -> status running -> stop shuts down cleanly in ~3s. 268 tests pass.
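The QMP exchange behind a clean shutdown is short: read the greeting, negotiate with `qmp_capabilities`, then send `system_powerdown`. The sketch below is generic over a reader and writer so it can be exercised without a socket. The function names are hypothetical, and replies are matched by substring rather than parsed as JSON, which is a simplification.

```rust
use std::io::{self, BufRead, Write};

/// Read lines until a command reply arrives, skipping asynchronous events.
fn read_reply<R: BufRead>(reader: &mut R) -> io::Result<String> {
    loop {
        let mut line = String::new();
        if reader.read_line(&mut line)? == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "QMP closed"));
        }
        // A reply carries "return" or "error"; events carry "event".
        if line.contains("\"return\"") || line.contains("\"error\"") {
            return Ok(line);
        }
    }
}

/// Negotiate capabilities, then request an ACPI power-button shutdown.
fn qmp_powerdown<R: BufRead, W: Write>(mut reader: R, mut writer: W) -> io::Result<()> {
    let mut greeting = String::new();
    reader.read_line(&mut greeting)?; // the {"QMP": ...} banner
    for cmd in ["qmp_capabilities", "system_powerdown"] {
        writeln!(writer, "{{\"execute\": \"{}\"}}", cmd)?;
        writer.flush()?;
        let reply = read_reply(&mut reader)?;
        if reply.contains("\"error\"") {
            return Err(io::Error::new(io::ErrorKind::Other, reply));
        }
    }
    Ok(())
}
```

Against a live VM the reader and writer would be two handles to one `TcpStream` connected to 127.0.0.1:44510 (one obtained with `try_clone` and wrapped in a `BufReader`). The caller still has to wait for the process to exit and fall back to a hard kill after the timeout.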
The base image locks root (shadow '*') and ships only serial autologin, so the graphical console had no usable login. `pane provision` sets a root password and optionally creates a first sudo user, persisted to the root overlay, without editing the base image:
- Drives the serial autologin root shell over a TCP serial socket (qemu.rs provision_via_serial): waits for autologin, runs chpasswd / useradd / sudoers.d, then powers off. TCP avoids the Windows named-pipe open() blocking trap.
- pane provision [--root-password] [--username --password]; generates strong defaults and prints the credentials; grants the user wheel+sudo.
Verified by flattening the overlay and reading /etc/shadow: root and the new user have real yescrypt hashes (root was previously '*'), the user exists in /etc/passwd, and /etc/sudoers.d/wheel is set. 268 tests pass.
Install a graphical desktop into the guest image and make the QEMU window usable end to end.
- pane install-desktop (app.rs install_desktop): drives the serial root shell to configure a mirror + DHCP networking, initialize the pacman keyring (pacman-key --init/--populate; the base image ships it uninitialized), then pacman xorg-server + lightdm + lightdm-gtk-greeter + xfce4, and enable lightdm. Persisted to the root overlay.
- provision_via_serial: generalized with a completion-marker wait so long installs are awaited (not just fixed sleeps), and a periodic transcript flush so progress is watchable; QEMU stderr is captured for diagnostics.
- Machine: add virtio-rng (entropy for pacman-key --init), and render the graphical window with -vga std + gtk gl=off (virtio-gpu + GL crashed QEMU under WHPX when Xorg started).
Verified: install reaches 258/258 packages; the overlay contains lightdm, xfce4-session and xfce.desktop; an interactive graphical launch boots to the LightDM greeter (log in to reach XFCE). 268 tests pass.
Known follow-up: detached (--detach) graphical launch closes the window in the spawn context; interactive launch is stable.
Remove a temp transcript accidentally committed in the previous change and ignore *.out captures.
The VM ran 1 vCPU, fixed 2 GB, a generic CPU and uncached disks. Scale to the host and use fast I/O:
- host_resources(): vCPUs = logical cores clamped [2,8]; RAM = half of physical clamped [2048,8192] MB (GlobalMemoryStatusEx). Applied via QemuBootConfig.vcpus + memory_mb (build_qemu_engine_config).
- -cpu Skylake-Client: modern features (AVX2/SSE4) and WHPX-compatible. WHPX rejects -cpu host/max (APX/MPX feature conflict kills the guest before boot, verified), so a feature-rich named model is the sweet spot.
- Disks: aio=threads everywhere; ephemeral drives cache=unsafe; persistent drives cache=writeback + discard=unmap + detect-zeroes=unmap so deleted guest files reclaim host space via TRIM.
Verified: probe boots to login with 8 vCPUs + scaled RAM + fast disk.
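The scaling rule is a pair of clamps. A sketch with hypothetical function names; querying the host (logical core count, GlobalMemoryStatusEx) is left out so the arithmetic can be checked in isolation.

```rust
/// vCPUs: the host's logical core count, clamped to the range [2, 8].
fn scaled_vcpus(logical_cores: u32) -> u32 {
    logical_cores.clamp(2, 8)
}

/// Guest RAM in MB: half of physical memory, clamped to [2048, 8192].
fn scaled_memory_mb(physical_mb: u64) -> u64 {
    (physical_mb / 2).clamp(2048, 8192)
}
```

For example, a 6-core host with 16 GB of RAM would get 6 vCPUs and 8192 MB, while a 2-core host with 4 GB would get 2 vCPUs and 2048 MB.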
- pane install-desktop --de xfce|gnome|kde: per-desktop package sets + display manager (lightdm/gdm/sddm), each including Firefox and NetworkManager. model::DesktopChoice.
- Grow the root disk to fit heavier desktops: qemu-img resize the overlay (qemu::resize_qcow2) + in-guest sfdisk extend, partx -u, resize2fs (online). Default 8 GiB (XFCE) / 24 GiB (GNOME, KDE), --disk-gib to set.
- Enable fstrim.timer so deleted guest files reclaim host space (pairs with the disks' discard=unmap).
- Auto timeout by desktop (30 min XFCE, 90 min GNOME/KDE) unless set.
Verified on the XFCE path: root grew 4 -> 8 GiB (resize2fs reports 2096896 4k blocks), Firefox installed, run completed successfully. 268 tests pass.
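The per-desktop defaults listed above map naturally onto an enum. This sketch reuses the name DesktopChoice from the commit message, but the methods and `effective_disk_gib` are hypothetical, and the real type in src/model.rs may look different.

```rust
use std::time::Duration;

#[derive(Clone, Copy, Debug, PartialEq)]
enum DesktopChoice {
    Xfce,
    Gnome,
    Kde,
}

impl DesktopChoice {
    /// Display manager paired with each desktop.
    fn display_manager(self) -> &'static str {
        match self {
            DesktopChoice::Xfce => "lightdm",
            DesktopChoice::Gnome => "gdm",
            DesktopChoice::Kde => "sddm",
        }
    }

    /// Default root-disk size in GiB.
    fn default_disk_gib(self) -> u64 {
        match self {
            DesktopChoice::Xfce => 8,
            DesktopChoice::Gnome | DesktopChoice::Kde => 24,
        }
    }

    /// Default install timeout.
    fn default_timeout(self) -> Duration {
        match self {
            DesktopChoice::Xfce => Duration::from_secs(30 * 60),
            DesktopChoice::Gnome | DesktopChoice::Kde => Duration::from_secs(90 * 60),
        }
    }
}

/// An explicit --disk-gib value wins over the per-desktop default.
fn effective_disk_gib(de: DesktopChoice, requested: Option<u64>) -> u64 {
    requested.unwrap_or_else(|| de.default_disk_gib())
}
```

Keeping the defaults on the enum means a new desktop only needs one extra arm in each `match`.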
Pass -name "Pane" to QEMU so the guest window/taskbar title reads "Pane" instead of "QEMU". First step of the single-app rebrand; the embedded-display Tauri shell will remove the standalone window entirely.
Make Pane one app: running `pane` with no subcommand opens a Tauri window (title "Pane") with a Control Center UI; any subcommand still runs the CLI. Single binary, GUI by default.
- src/gui.rs: Tauri app + engine_run command that self-execs the pane CLI, so the UI and CLI share one source of truth. main.rs routes bare launch to the GUI.
- ui/: vanilla frontend styled like the homepage (cream/black): status pill, Launch/Stop, desktop choice (XFCE/GNOME/KDE) + workspace mode (persistent/perishable), credentials, Doctor, and an activity log.
- tauri.conf.json (v2, frontendDist=ui, withGlobalTauri) + capabilities/.
- build.rs runs tauri_build::build(); drop our icon/manifest embedding (tauri-build owns them) to avoid a duplicate-resource link error; keep version info via manifest_optional.
Verified: bare pane.exe opens the "Pane" window; pane status still runs the CLI; 268 tests pass.
Embedded guest display (noVNC) + folding more actions are the next steps.
Render the guest inside the Pane app instead of a separate QEMU window:
- DisplayMode::Vnc -> QEMU headless `-vnc 127.0.0.1:0,websocket=5700` (-display none, -vga std). qemu::display_args_for handles vnc vs gtk/sdl; vnc_websocket_port() exposes the port.
- UI: vendored noVNC 1.7.0 (ui/novnc); a full-window screen view with an RFB client that connects to ws://127.0.0.1:5700 (scaled, retries while the guest boots). Launch now uses --display vnc --detach, shows the screen, and connects; Stop disconnects + powers off.
Verified: launch --display vnc --detach serves VNC 5900 + websocket 5700; release builds; 268 tests pass. Users see only the Pane window.
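How the display mode might translate into QEMU arguments, as a rough sketch. `display_args_sketch` is a hypothetical stand-in and not the real qemu::display_args_for; the VNC and gtk flag strings are the ones quoted in these commit messages, while the `-display sdl` form for the SDL mode is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum DisplayMode {
    Vnc,
    Gtk,
    Sdl,
}

const VNC_DISPLAY: u16 = 0;
const VNC_WEBSOCKET_PORT: u16 = 5700;

/// VNC display N listens on TCP port 5900 + N.
fn vnc_tcp_port(display: u16) -> u16 {
    5900 + display
}

/// Build the display-related QEMU arguments for a mode.
fn display_args_sketch(mode: DisplayMode) -> Vec<String> {
    let args: Vec<String> = match mode {
        // Headless: no local window, VNC plus a websocket listener for noVNC.
        DisplayMode::Vnc => vec![
            "-display".into(),
            "none".into(),
            "-vga".into(),
            "std".into(),
            "-vnc".into(),
            format!("127.0.0.1:{},websocket={}", VNC_DISPLAY, VNC_WEBSOCKET_PORT),
        ],
        // Native window without GL, as used for the gtk path.
        DisplayMode::Gtk => vec![
            "-vga".into(),
            "std".into(),
            "-display".into(),
            "gtk,gl=off".into(),
        ],
        // Assumed form for the SDL window.
        DisplayMode::Sdl => vec!["-vga".into(), "std".into(), "-display".into(), "sdl".into()],
    };
    args
}
```

Display `:0` corresponds to TCP port 5900, which matches the "VNC 5900 + websocket 5700" verification noted above.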
- pane workspace --reset (discard the root overlay -> fresh from base), --purge (also drop the user disk), --compact (qemu-img compaction to reclaim space freed by TRIM/deletes). The UI Reset button now calls it.
- locate_qemu() prefers a bundled pane-engine.exe next to pane.exe (offline, version-pinned, shows as Pane in Task Manager), then PATH, then the winget install path. Shipping the renamed QEMU tree is a packaging step.
Verified: workspace --reset removes the overlay; --compact rewrites the disks; 268 tests pass.
Two bugs made the window show a black screen with non-working buttons:
- #screen-view used `display:flex`, overriding the `hidden` attribute, so the VNC view covered the Control Center at startup. Gate it on a `.show` class (base `display:none`).
- main.js statically imported noVNC at the top; if that module failed to load, the whole script errored and no click handlers attached. Import noVNC lazily inside connectDisplay so the UI always works; the display is the only thing affected if noVNC fails.
Also logs "Pane ready." on load to confirm the script ran.
engine_run was a synchronous Tauri command, which runs on the main UI thread; a long action (e.g. install-desktop) blocked it and the window stopped responding. Make it async so Tauri runs it off the main thread — the UI stays responsive while the work runs.
- VNC display now uses virtio-gpu (was bochs std): gives the guest a real KMS framebuffer so desktops render properly (software/llvmpipe, no host GL). Verified stable under WHPX+VNC (the earlier crash was gtk+GL only).
- GUI attaches the display on startup if a VM is already running, so relaunching Pane shows the live desktop instead of doing nothing.
…ess) Switch the VNC display from software virtio-gpu to hardware-accelerated GL: -display egl-headless gives a host GL context and -device virtio-gpu-gl-pci translates guest OpenGL to the host GPU (VirGL), served over VNC. This makes GL-heavy desktops (GNOME, KDE) render properly instead of black, comparable to VMware/VirtualBox 3D. Verified under WHPX: a full GNOME boot reaches the graphical target with QEMU stable (gnome-shell exercises GL without crashing) and VNC serving.
The desktop dropped to a text terminal because the display manager was never wired, and the embedded noVNC path was laggy with broken mouse input.
- install-desktop now disables conflicting display managers, enables the chosen one, sets the default to graphical.target, installs mesa, and reports the DM state. Verified: display-manager -> lightdm, default graphical (previously only a getty started).
- GUI launches the desktop in a native QEMU gtk window (smooth, real input) via a fire-and-forget interactive child (launch_vm), parented to the long-lived Pane app so the window survives without blocking the UI. Detached gtk windows die regardless of process flags, so this is the reliable approach. stop_vm stops it.
- Drop the laggy noVNC path from the default launch.
Summary
This PR hardens Pane's QEMU-WHPX desktop path after testing exposed unreliable desktop switching, misleading install success, fixed C: runtime storage, and native-window input problems.
Changes
- `PANE_HOME` so Pane can run from a user-chosen drive instead of always using `%LOCALAPPDATA%`.
- archlinux-keyring.
Validation
- cargo build --offline
- cargo test --offline --quiet (272 passed)
- cargo clippy --offline -- -D warnings
- cargo fmt --check still reports pre-existing formatting differences in src/ext4.rs and src/lapic.rs; those files are not part of this PR.