Harden desktop switching and native QEMU input#1

Merged
NAME0x0 merged 40 commits into main from load-virtio-mmio-module on Jun 30, 2026

NAME0x0 commented Jun 29, 2026

Summary

This PR hardens Pane's QEMU-WHPX desktop path after testing exposed unreliable desktop switching, installs that reported success after failing, runtime storage pinned to the C: drive, and input problems in the native window.

Changes

  • Make the GUI pass a selectable storage root through PANE_HOME so Pane can run from a user-chosen drive instead of always using %LOCALAPPDATA%.
  • Make desktop install/switch operations fail fast and surface CLI errors in the GUI instead of silently continuing after pacman/keyring failures.
  • Clear corrupt pacman sync DB files before reinstalling desktop packages and force a fresh sync for archlinux-keyring.
  • Install the full GNOME package group and allow GDM to use its normal Wayland path instead of forcing a partial Xorg-only GNOME setup.
  • Add QEMU USB tablet/keyboard devices and expose a GUI window-backend selector, defaulting to SDL native windows for more reliable mouse input.
  • Ensure launch validates the selected desktop before opening the persistent graphical VM.
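
The storage-root change in the first bullet can be pictured as a small lookup. This is a minimal sketch, not the code in this PR: `resolve_storage_root` is a hypothetical helper, and its two arguments stand in for the values of the `PANE_HOME` and `LOCALAPPDATA` environment variables so the logic can be exercised without touching the real environment.

```rust
use std::path::PathBuf;

/// Resolve the directory Pane keeps its runtime data under.
/// `pane_home` and `local_app_data` are the (optional) values of the
/// PANE_HOME and LOCALAPPDATA environment variables. Hypothetical helper.
fn resolve_storage_root(
    pane_home: Option<&str>,
    local_app_data: Option<&str>,
) -> Option<PathBuf> {
    // A non-empty PANE_HOME wins, so the user can pick another drive.
    if let Some(home) = pane_home.map(str::trim).filter(|h| !h.is_empty()) {
        return Some(PathBuf::from(home));
    }
    // Otherwise fall back to the previous fixed location.
    local_app_data
        .map(str::trim)
        .filter(|d| !d.is_empty())
        .map(PathBuf::from)
}
```

A caller would typically pass `std::env::var("PANE_HOME").ok().as_deref()` for the first argument; treating a blank value as unset is an assumption of this sketch.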

Validation

  • cargo build --offline
  • cargo test --offline --quiet (272 passed)
  • cargo clippy --offline -- -D warnings

cargo fmt --check still reports pre-existing formatting differences in src/ext4.rs and src/lapic.rs; those files are not part of this PR.

NAME0x0 added 30 commits June 20, 2026 14:50
Replace hand-transcribed virtio-MMIO register offsets, virtio-blk request
types, status codes, feature bits, ring descriptor flags, and device IDs
with the upstream virtio-bindings crate. Values are identical to the prior
literals, so device behavior and the 238-test suite are unchanged; the
constants now derive from a single spec-tracked source instead of magic
numbers maintained by hand.
Add the virtio-bindings 0.2.7 entry to THIRD_PARTY_NOTICES with its
BSD-3-Clause text, and list it in the VMM foundation component table as the
linked source of virtio register/blk/ring/id constants.
Live kernel-layout trace: reached PANE_BLOCK_MODULE_LOAD_OK and PANE_DISPLAY_CONTRACT_DISCOVERED. The guest attempted /dev/vda1, then emitted PANE_VIRTIO_ROOT_DEVICE_WAIT_TIMEOUT with no virtio interrupt request or acknowledgement. The pane-block fallback serviced one base-image read, but PANE_ROOT_MOUNT_OK and PANE_INIT_EXEC were not reached before the root-mount budget expired. pane-block remains intact as the required diagnostic fallback.
The native boot reached initramfs userspace but the guest never touched the
virtio-MMIO aperture, so /dev/vda never appeared and the boot fell back to the
diagnostic pane-block path. Root cause: the stock Arch kernel builds the
virtio-MMIO bus as a module (CONFIG_VIRTIO_MMIO=m) with the cmdline-device path
enabled (CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y) and virtio-blk built in, but the
discovery initramfs neither bundled nor loaded virtio_mmio.ko, leaving
virtio_mmio.device=4K@0xdfc0000:5 with no bus driver to act on.

Add `runtime --register-virtio-mmio-module` to copy a SHA-verified, kernel-matched
virtio_mmio.ko into the discovery initramfs as /lib/modules/virtio_mmio.ko, pack it
into the discovery cpio, and load it from the generated /init (finit_module) before
the virtio-root device wait. This lets the cmdline directive register the device so
the built-in virtio-blk driver can expose /dev/vda. Documented in the README and
VMM foundation notes; pane-block remains the labeled fallback.
The previous change loaded virtio_mmio.ko but passed empty module parameters,
so the bus driver registered no device and /dev/vda never appeared: the kernel
does not replay the boot cmdline virtio_mmio.device= value to a module loaded
from userspace. Parse virtio_mmio.device from the cmdline in the discovery /init
and pass it explicitly as the module's device= parameter (the form the kernel's
own virtio-mmio documentation specifies for the modular configuration).

Live WHP boot result: the guest now loads the bus module with the device spec,
enumerates /dev/vda1, mounts the virtio root (no wait timeout, no pane-block
fallback), and drives the split-virtqueue with delivered and acknowledged
interrupts. The ext4 root mount is in progress; completing it within the
root-mount phase budget is the next step.
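
The cmdline-to-module-parameter step described above can be sketched as a pure string transform. The function name is hypothetical and the real /init is generated rather than written like this; the sketch only shows the mapping from `virtio_mmio.device=<spec>` on the kernel command line to the `device=<spec>` string handed to the module load.

```rust
/// Extract the `virtio_mmio.device=` value from a kernel command line and
/// turn it into the parameter string for a userspace module load.
/// Hypothetical helper; only the first matching token is used.
fn virtio_mmio_module_params(cmdline: &str) -> Option<String> {
    const KEY: &str = "virtio_mmio.device=";
    cmdline
        .split_whitespace()
        // Keep the text after the key for the first token that carries it.
        .find_map(|token| token.strip_prefix(KEY))
        // An empty spec would register nothing, so treat it as absent.
        .filter(|spec| !spec.is_empty())
        .map(|spec| format!("device={spec}"))
}
```

With the cmdline from this boot, `virtio_mmio.device=4K@0xdfc0000:5` becomes `device=4K@0xdfc0000:5`. A cmdline that declares several devices would need each occurrence forwarded; this sketch does not handle that case.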
Document the June 21 live boot: the virtio-mmio device parameter cleared the
zero-MMIO blocker, the guest enumerated /dev/vda1 and mounted the virtio root
with acknowledged interrupts, and the open issue is level-triggered virtio
completion interrupt delivery so the ext4 mount finishes inside the phase budget.
Introduce a narrow, unit-tested I/O APIC emulation (the device WHP does not
provide) so a level-triggered device IRQ can be delivered and resampled through
the guest's local APIC. service_irq injects and arms remote IRR; end_of_interrupt
clears it and re-injects while the device still holds the line asserted, which is
the level-triggered resample the virtio-MMIO block path needs.

The model is shaped by crosvm's irqchip/ioapic.rs semantics but is Pane-owned and
pure: it returns the vector for the WHP exit loop to inject rather than using
eventfds. Wiring into the WHP memory map, exit loop, MP table, and virtio IRQ
routing follows in subsequent phases.
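
The level-triggered resample described above can be illustrated with a single pin. This is a simplified sketch under stated assumptions, not Pane's I/O APIC: the `LevelPin` type and its fields are hypothetical, and only `service_irq` and `end_of_interrupt` borrow their names from the commit message.

```rust
/// One I/O APIC redirection pin, reduced to the state needed for
/// level-triggered delivery. Hypothetical model for illustration.
#[derive(Default)]
struct LevelPin {
    vector: u8,
    masked: bool,
    line_asserted: bool, // the device is currently holding the line high
    remote_irr: bool,    // delivered to the local APIC, EOI not yet seen
}

impl LevelPin {
    /// The device raises or lowers its interrupt line.
    fn set_line(&mut self, asserted: bool) {
        self.line_asserted = asserted;
    }

    /// Returns the vector to inject if delivery is allowed right now,
    /// and arms remote IRR so the same level is not delivered twice.
    fn service_irq(&mut self) -> Option<u8> {
        if self.masked || !self.line_asserted || self.remote_irr {
            return None;
        }
        self.remote_irr = true;
        Some(self.vector)
    }

    /// The guest acknowledged `vector`: clear remote IRR, then resample.
    /// If the device still asserts the line, the vector is returned again.
    fn end_of_interrupt(&mut self, vector: u8) -> Option<u8> {
        if vector != self.vector || !self.remote_irr {
            return None;
        }
        self.remote_irr = false;
        self.service_irq()
    }
}
```

Returning the vector instead of signalling an eventfd mirrors the "pure" shape the commit describes: the caller's exit loop decides when to inject.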
Wire the I/O APIC window at 0xFEC00000 into the WHP kernel-layout exit loop:
guest accesses there now decode through the live WinHvEmulation instruction
emulator into the Pane I/O APIC mmio_read/write, instead of being treated as an
unmapped-memory blocker that stops the probe. Instantiate one I/O APIC per run.

This is the device boundary for the next phases: an MP table so the guest
discovers the I/O APIC, then routing the virtio-MMIO IRQ through it with EOI
resampling so level-triggered completion interrupts are delivered reliably.
With acpi=off the guest finds the local APIC and I/O APIC by scanning low memory
for an MP floating pointer. Emit a minimal MP table (one CPU, ISA bus, the I/O
APIC at 0xFEC00000, identity ISA->pin interrupt routing with the virtio IRQ
marked level-triggered, and the LINT sources) and embed it in the mapped BIOS
ROM region inside the 0xF0000 scan window. This lets Linux route the virtio IRQ
through the I/O APIC instead of the legacy 8259 PIC.

Routing the virtio device's line through the I/O APIC and resampling on local-APIC
EOI is the next phase.
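
For readers unfamiliar with the MP floating pointer the guest scans for, here is a minimal sketch of that 16-byte structure. The function name is hypothetical, spec revision 1.4 is assumed, and the configuration table it points to (CPU, bus, I/O APIC, and interrupt entries) is omitted entirely.

```rust
/// Build the 16-byte MP floating pointer structure that points at an MP
/// configuration table located at `config_table_addr`.
/// Hypothetical helper; the configuration table itself is not built here.
fn mp_floating_pointer(config_table_addr: u32) -> [u8; 16] {
    let mut fp = [0u8; 16];
    fp[0..4].copy_from_slice(b"_MP_"); // signature the guest scans for
    fp[4..8].copy_from_slice(&config_table_addr.to_le_bytes());
    fp[8] = 1; // structure length in 16-byte paragraphs
    fp[9] = 4; // spec revision 1.4 (assumed)
    // fp[11..16] are the feature bytes; zero in byte 11 means a
    // configuration table is present rather than a default configuration.
    // The checksum byte makes all 16 bytes sum to zero modulo 256.
    let sum = fp.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    fp[10] = 0u8.wrapping_sub(sum);
    fp
}
```

The structure must sit on a 16-byte boundary inside one of the regions the guest searches, which is why the commit places it in the 0xF0000 BIOS ROM window.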
Wire the virtio-MMIO block device's interrupt line into the I/O APIC instead of
injecting a legacy-PIC vector directly. On a completion the device line is
asserted into the I/O APIC, which injects the guest-programmed vector as a
level-triggered interrupt; WHP then raises an APIC EOI exit when the guest
acknowledges it, and Pane resamples the line through the I/O APIC, re-injecting
while the device still has work pending.

This replaces the edge injection that gummed up the local APIC in-service state
(the guest EOIs the PIC, never the LAPIC) and caused virtio completion
interrupts to stall after the first delivery. The APIC EOI exit now carries the
acknowledged vector. Legacy PIC vector helpers become test-only.
With the I/O APIC enabled, Linux routes its boot timer to IRQ0/pin0 (vector 0x30)
and froze with jiffies stuck at 0.000000 because Pane has no 8254 PIT to tick it
(crosvm/QEMU emulate one). Inject a periodic edge interrupt on I/O APIC pin 0 on a
~1kHz wall-clock cadence, re-arming the line each tick; service_irq is a no-op
until the guest unmasks pin 0, so it self-gates until the kernel programs its timer.

With ticks flowing the guest now boots through full kernel init, unpacks the
initramfs, runs /init, loads virtio_mmio, and virtio_blk enumerates /dev/vda1 from
the I/O APIC-routed device. Also resample the virtio block line on the same cadence
while a completion is pending so a coalesced delivery cannot stall I/O. Remaining:
the ext4 mount over virtio still needs reliable block-completion interrupt delivery.
crosvm wires virtio-MMIO interrupts through an edge IRQ event
(Transport::Mmio { irq_evt_edge }); only virtio-PCI uses a level line with a
resample thread. Mark the virtio IRQ edge-triggered in the MP table (bus default)
instead of level, and gate the periodic IOAPIC resample to level pins only so the
edge virtio line is not spuriously re-asserted.

Full native boot via the I/O APIC is unchanged (timer ticks, kernel init, virtio_blk
enumerates /dev/vda1). The ext4 mount over virtio still stalls after the enumeration
reads, which is a separate virtio block-completion issue under investigation.
The virtio-blk device exposed only the root partition's byte length as the whole
disk capacity, so vda1 (which starts at the 1MB partition offset inside vda)
extended past end-of-disk and Linux truncated it ("p1 ... extends beyond EOD"),
breaking the ext4 geometry. Expose the full base image as vda; the root partition
lives at its offset within it. The guest now sees the correct 8388608-sector disk
and an untruncated vda1.

Also enrich the queue-notify trace with per-request type/sector/head/used-len and
the queue indices + interrupt status, for diagnosing the remaining virtio block
completion stall.
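
The capacity arithmetic behind this fix is small enough to check by hand. The sketch below assumes a 4 GiB base image (which is what 8388608 sectors of 512 bytes works out to) and, purely for illustration, a root partition that fills everything after the 1 MB offset; both helper names are hypothetical.

```rust
const SECTOR_SIZE: u64 = 512;

/// Capacity, in 512-byte sectors, for a disk backed by `image_len_bytes`.
fn capacity_sectors(image_len_bytes: u64) -> u64 {
    image_len_bytes / SECTOR_SIZE
}

/// True when a partition starting at `start_byte` and spanning `len_bytes`
/// ends at or before the end of a disk of `disk_sectors` sectors.
fn partition_fits(start_byte: u64, len_bytes: u64, disk_sectors: u64) -> bool {
    match start_byte.checked_add(len_bytes) {
        Some(end) => end <= disk_sectors * SECTOR_SIZE,
        None => false, // overflow cannot describe a valid partition
    }
}
```

Reporting only the partition's length as the disk size leaves the partition 1 MB (2048 sectors) past end-of-disk, which matches the "extends beyond EOD" truncation described above.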
Pane's from-scratch WHP run loop boots Linux but cannot drive the guest
timer fast enough on this host: the exit loop caps near 15/sec, so jiffies
starve and the ext4 root mount stalls (~10 block reads in 600s). After
exhausting the owned-VMM path, pivot the boot path to QEMU with the WHPX
accelerator, which boots the same Arch image end to end (virtio root,
switch_root, systemd, login) in ~10s on the same hypervisor substrate.

QEMU engine (src/qemu.rs):
- Locate qemu-system-x86_64; build the WHPX machine: virtio root from the
  base image, persistent qcow2 user disk mounted at /home, user-mode
  networking. Run as a headless milestone probe, interactive (serial or a
  gtk/sdl window), or detached with pid tracking.
- fstab=0 so the image's stale fstab swap entry cannot stall boot;
  per-drive snapshot keeps the SHA-pinned base image immutable while the
  user disk and optional root overlay persist.

Self-contained initramfs (src/ext4.rs):
- Minimal read-only ext4 extractor pulls the distro initramfs out of the
  base image's root partition with no WSL or external tools; byte-for-byte
  identical to a debugfs extraction.

Wiring:
- pane launch --runtime qemu-whpx|auto [--display gtk] [--persist-root]
  [--detach]; pane stop terminates a detached VM.
- RuntimeMode::{QemuWhpx,Auto} and DisplayMode in src/model.rs; flags in
  src/cli.rs; native-boot-spike --qemu-whpx for the probe.

Owned-VMM work retained as the engine reference/fallback: src/lapic.rs
userspace xAPIC, native.rs run-loop hardening, IOAPIC edge routing, MP
table and virtio queue diagnostics.

265 tests pass.
systemd colorizes status lines and embeds VT highlight codes inside the
message text (e.g. "Mounted \e[..m/sysroot\e[0m."), so plain substring
matching missed real milestones: mounted_sysroot and the /home mount read
false on a fully successful boot. read_serial now strips CSI/OSC/DCS
escape sequences first, so every milestone is detected. Adds unit tests.
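
A stripping pass of the kind described can be sketched as follows. This is not the `read_serial` implementation; the function name is hypothetical, and for simplicity the sketch accepts either BEL or ESC `\` as the terminator for both OSC and DCS strings.

```rust
/// Remove CSI, OSC, and DCS escape sequences so plain substring matching
/// sees the underlying text. Hypothetical helper for illustration.
fn strip_escape_sequences(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\u{1b}' {
            out.push(c);
            continue;
        }
        match chars.next() {
            // CSI: ESC [ then parameters, ended by a byte in 0x40..=0x7E.
            Some('[') => {
                for p in chars.by_ref() {
                    if ('\u{40}'..='\u{7e}').contains(&p) {
                        break;
                    }
                }
            }
            // OSC (ESC ]) and DCS (ESC P): skip until BEL or ESC \.
            Some(']') | Some('P') => {
                while let Some(p) = chars.next() {
                    if p == '\u{07}' {
                        break;
                    }
                    if p == '\u{1b}' && chars.peek() == Some(&'\\') {
                        chars.next();
                        break;
                    }
                }
            }
            // Any other two-character escape is dropped whole.
            _ => {}
        }
    }
    out
}
```

After this pass a line such as `Mounted \e[0;1;39m/sysroot\e[0m.` reads `Mounted /sysroot.`, so a milestone check can stay a simple `contains`.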
Change the --runtime default from wsl-bridge to auto: pane launch with no
flags now selects QEMU+WHPX when QEMU and the native runtime artifacts are
present, and falls back to the WSL bridge otherwise. Makes the QEMU route
the default boot path without breaking environments that only have WSL.
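
The `auto` rule stated here reduces to one conditional. The sketch below uses a hypothetical `Runtime` enum and boolean inputs; how Pane actually probes for QEMU and the runtime artifacts is not shown.

```rust
/// Which boot path `auto` resolves to. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum Runtime {
    QemuWhpx,
    WslBridge,
}

/// Pick QEMU+WHPX only when both prerequisites are present; otherwise keep
/// the WSL bridge so WSL-only environments continue to work.
fn select_runtime(qemu_found: bool, native_artifacts_present: bool) -> Runtime {
    if qemu_found && native_artifacts_present {
        Runtime::QemuWhpx
    } else {
        Runtime::WslBridge
    }
}
```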
Reduce a fresh-machine boot to a single `pane launch`:

- Derive the kernel from the base image: resolve_distro_kernel pulls
  /boot/vmlinuz-linux via the ext4 reader (cached), mirroring the
  initramfs derivation, so only the base image needs to exist. Shared
  extract_from_base_image + base_image_partition_offset helpers.
- Auto-install QEMU: ensure_qemu_available installs it via winget on
  first use when absent, then resolves the path.
- Acquire the base image: ensure_base_image downloads it on first run
  (curl, SHA-verified, registered via register_base_os_image) when a
  hosting URL is configured, else gives an actionable import instruction.
- launch_qemu_whpx_runtime runs these as a preflight; auto runtime
  selection now only needs QEMU + the base image (kernel is derived) and
  stays side-effect free.

Adds an ext4 bzImage extraction test. 268 tests pass.
(a) Configurable base-image source: PANE_BASE_IMAGE_URL /
PANE_BASE_IMAGE_SHA256 env vars override the built-in defaults via
base_image_download_source(), so hosting can be pointed at runtime without
a rebuild.

(b) Graceful VM lifecycle for detached QEMU-WHPX:
- Detached boots expose a QMP control channel (tcp 127.0.0.1:44510).
- Record pid + QMP port in state/qemu-whpx.json (QemuVmState).
- pane stop requests a clean ACPI shutdown (QMP system_powerdown), waits
  up to 15s for the guest to power off, and hard-kills only as a fallback.
- pane status reports whether the VM is running; process_alive via
  tasklist.

Verified: detach -> status running -> stop shuts down cleanly in ~3s.
268 tests pass.
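
The QMP exchange behind a clean shutdown is short: read the greeting, negotiate capabilities, then send `system_powerdown`. The sketch below is generic over a reader and a writer so it can be tried without a running VM; the function name is hypothetical, and it assumes no asynchronous QMP events arrive between the greeting and the capabilities reply.

```rust
use std::io::{self, BufRead, Write};

/// Ask QEMU for an ACPI power-button press over an already-connected QMP
/// channel. Hypothetical helper; timeouts and the hard-kill fallback are
/// left to the caller.
fn request_powerdown<R: BufRead, W: Write>(
    reader: &mut R,
    writer: &mut W,
) -> io::Result<()> {
    let mut line = String::new();
    // QMP sends a greeting first and rejects commands until capabilities
    // negotiation has completed.
    reader.read_line(&mut line)?;
    if !line.contains("\"QMP\"") {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing QMP greeting"));
    }
    writer.write_all(b"{\"execute\":\"qmp_capabilities\"}\n")?;
    writer.flush()?;

    line.clear();
    reader.read_line(&mut line)?; // expected reply: {"return": {}}
    if !line.contains("\"return\"") {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "capabilities negotiation failed"));
    }
    writer.write_all(b"{\"execute\":\"system_powerdown\"}\n")?;
    writer.flush()
}
```

With a real VM, the reader and writer would be a `BufReader` over a `TcpStream` connected to the recorded QMP port and a clone of that stream. `system_powerdown` only requests shutdown, so the caller still has to wait for the process to exit, as `pane stop` does with its 15s window.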
The base image locks root (shadow '*') and ships only serial autologin, so
the graphical console had no usable login. `pane provision` sets a root
password and optionally creates a first sudo user, persisted to the root
overlay, without editing the base image:

- Drives the serial autologin root shell over a TCP serial socket
  (qemu.rs provision_via_serial): waits for autologin, runs chpasswd /
  useradd / sudoers.d, then powers off. TCP avoids the Windows named-pipe
  open() blocking trap.
- pane provision [--root-password] [--username --password]; generates
  strong defaults and prints the credentials; grants the user wheel+sudo.

Verified by flattening the overlay and reading /etc/shadow: root and the
new user have real yescrypt hashes (root was previously '*'), the user
exists in /etc/passwd, and /etc/sudoers.d/wheel is set. 268 tests pass.
Install a graphical desktop into the guest image and make the QEMU window
usable end to end.

- pane install-desktop (app.rs install_desktop): drives the serial root
  shell to configure a mirror + DHCP networking, initialize the pacman
  keyring (pacman-key --init/--populate; the base image ships it
  uninitialized), then install xorg-server + lightdm + lightdm-gtk-greeter
  + xfce4 with pacman and enable lightdm. Persisted to the root overlay.
- provision_via_serial: generalized with a completion-marker wait so long
  installs are awaited (not just fixed sleeps), and a periodic transcript
  flush so progress is watchable; QEMU stderr is captured for diagnostics.
- Machine: add virtio-rng (entropy for pacman-key --init), and render the
  graphical window with -vga std + gtk gl=off (virtio-gpu + GL crashed
  QEMU under WHPX when Xorg started).

Verified: install reaches 258/258 packages; the overlay contains lightdm,
xfce4-session and xfce.desktop; an interactive graphical launch boots to
the LightDM greeter (log in to reach XFCE). 268 tests pass.

Known follow-up: detached (--detach) graphical launch closes the window
in the spawn context; interactive launch is stable.
Remove a temp transcript accidentally committed in the previous change and
ignore *.out captures.
The VM ran 1 vCPU, fixed 2 GB, a generic CPU and uncached disks. Scale to
the host and use fast I/O:

- host_resources(): vCPUs = logical cores clamped [2,8]; RAM = half of
  physical clamped [2048,8192] MB (GlobalMemoryStatusEx). Applied via
  QemuBootConfig.vcpus + memory_mb (build_qemu_engine_config).
- -cpu Skylake-Client: modern features (AVX2/SSE4) and WHPX-compatible.
  WHPX rejects -cpu host/max (APX/MPX feature conflict kills the guest
  before boot, verified), so a feature-rich named model is the sweet spot.
- Disks: aio=threads everywhere; ephemeral drives cache=unsafe; persistent
  drives cache=writeback + discard=unmap + detect-zeroes=unmap so deleted
  guest files reclaim host space via TRIM.

Verified: probe boots to login with 8 vCPUs + scaled RAM + fast disk.
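
The scaling rules in the first bullet are two clamps. The sketch below takes the host figures as arguments rather than querying them, so the function names and signatures are hypothetical; only the bounds come from the commit message.

```rust
/// vCPU count: the host's logical cores, clamped to [2, 8].
fn scaled_vcpus(logical_cores: u32) -> u32 {
    logical_cores.clamp(2, 8)
}

/// Guest RAM in MB: half of physical memory, clamped to [2048, 8192].
fn scaled_memory_mb(physical_mb: u64) -> u64 {
    (physical_mb / 2).clamp(2048, 8192)
}
```

For example, a 16-core host with 32 GB of RAM would get 8 vCPUs and 8192 MB, while a 4-core host with 8 GB would get 4 vCPUs and 4096 MB.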
- pane install-desktop --de xfce|gnome|kde: per-desktop package sets +
  display manager (lightdm/gdm/sddm), each including Firefox and
  NetworkManager. model::DesktopChoice.
- Grow the root disk to fit heavier desktops: qemu-img resize the overlay
  (qemu::resize_qcow2) + in-guest sfdisk extend, partx -u, resize2fs
  (online). Default 8 GiB (XFCE) / 24 GiB (GNOME, KDE), --disk-gib to set.
- Enable fstrim.timer so deleted guest files reclaim host space (pairs
  with the disks' discard=unmap).
- Auto timeout by desktop (30 min XFCE, 90 min GNOME/KDE) unless set.

Verified on the XFCE path: root grew 4 -> 8 GiB (resize2fs reports 2096896
4k blocks), Firefox installed, run completed successfully. 268 tests pass.
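
The per-desktop defaults listed above fit naturally in one lookup table. This sketch uses a hypothetical `Desktop` enum (the PR's own type is `model::DesktopChoice`, whose shape is not shown here) and covers only the display manager, default disk size, and default timeout.

```rust
/// Desktop environments selectable with --de. Hypothetical stand-in for
/// the PR's DesktopChoice type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Desktop {
    Xfce,
    Gnome,
    Kde,
}

impl Desktop {
    /// Display manager paired with each desktop.
    fn display_manager(self) -> &'static str {
        match self {
            Desktop::Xfce => "lightdm",
            Desktop::Gnome => "gdm",
            Desktop::Kde => "sddm",
        }
    }

    /// Default root disk size in GiB.
    fn default_disk_gib(self) -> u32 {
        match self {
            Desktop::Xfce => 8,
            Desktop::Gnome | Desktop::Kde => 24,
        }
    }

    /// Default install timeout in minutes.
    fn default_timeout_minutes(self) -> u32 {
        match self {
            Desktop::Xfce => 30,
            Desktop::Gnome | Desktop::Kde => 90,
        }
    }
}

/// An explicit --disk-gib value overrides the per-desktop default.
fn effective_disk_gib(desktop: Desktop, disk_gib_flag: Option<u32>) -> u32 {
    disk_gib_flag.unwrap_or(desktop.default_disk_gib())
}
```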
Pass -name "Pane" to QEMU so the guest window/taskbar title reads "Pane"
instead of "QEMU". First step of the single-app rebrand; the embedded-
display Tauri shell will remove the standalone window entirely.
Make Pane one app: running `pane` with no subcommand opens a Tauri window
(title "Pane") with a Control Center UI; any subcommand still runs the CLI.
Single binary, GUI by default.

- src/gui.rs: Tauri app + engine_run command that self-execs the pane CLI,
  so the UI and CLI share one source of truth. main.rs routes bare launch
  to the GUI.
- ui/: vanilla frontend styled like the homepage (cream/black) — status
  pill, Launch/Stop, desktop choice (XFCE/GNOME/KDE) + workspace mode
  (persistent/perishable), credentials, Doctor, and an activity log.
- tauri.conf.json (v2, frontendDist=ui, withGlobalTauri) + capabilities/.
- build.rs runs tauri_build::build(); drop our icon/manifest embedding
  (tauri-build owns them) to avoid a duplicate-resource link error; keep
  version info via manifest_optional.

Verified: bare pane.exe opens the "Pane" window; pane status still runs the
CLI; 268 tests pass. Embedded guest display (noVNC) + folding more actions
are the next steps.
Render the guest inside the Pane app instead of a separate QEMU window:

- DisplayMode::Vnc -> QEMU headless `-vnc 127.0.0.1:0,websocket=5700`
  (-display none, -vga std). qemu::display_args_for handles vnc vs gtk/sdl;
  vnc_websocket_port() exposes the port.
- UI: vendored noVNC 1.7.0 (ui/novnc); a full-window screen view with an
  RFB client that connects to ws://127.0.0.1:5700 (scaled, retries while
  the guest boots). Launch now uses --display vnc --detach, shows the
  screen, and connects; Stop disconnects + powers off.

Verified: launch --display vnc --detach serves VNC 5900 + websocket 5700;
release builds; 268 tests pass. Users see only the Pane window.
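
The display argument sets named in this commit and the earlier gtk one can be sketched as a single match. The `Display` enum and function names are hypothetical stand-ins for `DisplayMode` and `qemu::display_args_for`; the argument strings are the ones quoted in the commit messages, apart from the sdl arm, which is an assumption.

```rust
/// Where the guest display is rendered. Hypothetical stand-in type.
#[derive(Clone, Copy)]
enum Display {
    Vnc { websocket_port: u16 },
    Gtk,
    Sdl,
}

/// QEMU arguments for the chosen display.
fn display_args(display: Display) -> Vec<String> {
    match display {
        // Headless: no local window, VNC display 0 on loopback plus a
        // websocket listener for the embedded noVNC client.
        Display::Vnc { websocket_port } => vec![
            "-display".into(),
            "none".into(),
            "-vga".into(),
            "std".into(),
            "-vnc".into(),
            format!("127.0.0.1:0,websocket={websocket_port}"),
        ],
        // gl=off per the earlier commit (virtio-gpu + GL crashed under WHPX).
        Display::Gtk => vec!["-display".into(), "gtk,gl=off".into(), "-vga".into(), "std".into()],
        // Assumed by analogy with the gtk arm; not quoted in the source.
        Display::Sdl => vec!["-display".into(), "sdl".into(), "-vga".into(), "std".into()],
    }
}

/// VNC display N listens on TCP port 5900 + N.
fn vnc_tcp_port(display_number: u16) -> u16 {
    5900 + display_number
}
```

Display `:0` is why the verification line reports VNC on 5900 alongside the websocket on 5700.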
- pane workspace --reset (discard the root overlay -> fresh from base),
  --purge (also drop the user disk), --compact (qemu-img compaction to
  reclaim space freed by TRIM/deletes). The UI Reset button now calls it.
- locate_qemu() prefers a bundled pane-engine.exe next to pane.exe (offline,
  version-pinned, shows as Pane in Task Manager), then PATH, then the winget
  install path. Shipping the renamed QEMU tree is a packaging step.

Verified: workspace --reset removes the overlay; --compact rewrites the
disks; 268 tests pass.
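
The lookup order for the QEMU binary can be sketched as an ordered candidate list plus a first-match search. Both functions are hypothetical; the winget install location is passed in rather than guessed, and the existence check is injected so the order can be exercised without a real filesystem.

```rust
use std::path::{Path, PathBuf};

/// Candidate QEMU binaries in preference order: the bundled engine next to
/// pane.exe, then each PATH directory, then the winget install location.
fn qemu_candidates(pane_exe_dir: &Path, path_dirs: &[PathBuf], winget_qemu: &Path) -> Vec<PathBuf> {
    let mut out = vec![pane_exe_dir.join("pane-engine.exe")];
    out.extend(path_dirs.iter().map(|d| d.join("qemu-system-x86_64.exe")));
    out.push(winget_qemu.to_path_buf());
    out
}

/// Return the first candidate for which `exists` reports true.
fn pick_qemu(candidates: &[PathBuf], exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    candidates.iter().find(|p| exists(p.as_path())).cloned()
}
```

In real use `exists` would be `|p| p.is_file()`.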
Two bugs made the window show a black screen with non-working buttons:
- #screen-view used `display:flex`, overriding the `hidden` attribute, so
  the VNC view covered the Control Center at startup. Gate it on a `.show`
  class (base `display:none`).
- main.js statically imported noVNC at the top; if that module failed to
  load, the whole script errored and no click handlers attached. Import
  noVNC lazily inside connectDisplay so the UI always works; the display
  is the only thing affected if noVNC fails.

Also logs "Pane ready." on load to confirm the script ran.
engine_run was a synchronous Tauri command, which runs on the main UI
thread; a long action (e.g. install-desktop) blocked it and the window
stopped responding. Make it async so Tauri runs it off the main thread —
the UI stays responsive while the work runs.
NAME0x0 added 10 commits June 29, 2026 12:09
- VNC display now uses virtio-gpu (was bochs std): gives the guest a real
  KMS framebuffer so desktops render properly (software/llvmpipe, no host
  GL). Verified stable under WHPX+VNC (the earlier crash was gtk+GL only).
- GUI attaches the display on startup if a VM is already running, so
  relaunching Pane shows the live desktop instead of doing nothing.
…ess)

Switch the VNC display from software virtio-gpu to hardware-accelerated GL:
-display egl-headless gives a host GL context and -device virtio-gpu-gl-pci
translates guest OpenGL to the host GPU (VirGL), served over VNC. This makes
GL-heavy desktops (GNOME, KDE) render properly instead of black, comparable
to VMware/VirtualBox 3D.

Verified under WHPX: a full GNOME boot reaches the graphical target with
QEMU stable (gnome-shell exercises GL without crashing) and VNC serving.
The desktop dropped to a text terminal because the display manager was
never wired, and the embedded noVNC path was laggy with broken mouse input.

- install-desktop now disables conflicting display managers, enables the
  chosen one, sets the default to graphical.target, installs mesa, and
  reports the DM state. Verified: display-manager -> lightdm, default
  graphical (previously only a getty started).
- GUI launches the desktop in a native QEMU gtk window (smooth, real
  input) via a fire-and-forget interactive child (launch_vm), parented to
  the long-lived Pane app so the window survives without blocking the UI.
  Detached gtk windows die regardless of process flags, so this is the
  reliable approach. stop_vm stops it.
- Drop the laggy noVNC path from the default launch.
@NAME0x0 NAME0x0 merged commit 0310a4c into main Jun 30, 2026
11 checks passed
@NAME0x0 NAME0x0 deleted the load-virtio-mmio-module branch June 30, 2026 07:14