
fix(storage): increase data disk to 8 TiB and add btrfs auto-resize #273

Open
AprilNEA wants to merge 1 commit into master from fix/data-disk-size-v2

Conversation

@AprilNEA (Member)

Summary

The default docker.img sparse file is 64 GiB, which users hit too easily once they keep a working set of images and volumes. OrbStack ships 8 TiB. Since the host file is sparse, the virtual size costs nothing until blocks are actually written.
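To make the sparse-semantics claim concrete, here is a minimal sketch of the host-side idea (the helper name and signature are illustrative; the PR's actual code lives in ensure_sparse_block_image, which is not shown on this page):

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

// Extending a file with set_len leaves the new range as a hole on ext4/APFS,
// so an 8 TiB virtual size consumes no real disk space until blocks are written.
fn ensure_sparse_image(path: &Path, size_bytes: u64) -> io::Result<()> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    if file.metadata()?.len() < size_bytes {
        file.set_len(size_bytes)?; // grow sparsely; never shrink an existing image
    }
    Ok(())
}
```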

Changes:

  • app/arcbox-core/src/vm_lifecycle/mod.rs — bump DOCKER_DATA_IMAGE_SIZE_BYTES from 64 GiB to 8 TiB, with a docstring explaining sparse semantics.
  • guest/arcbox-agent/src/agent/mod.rs:
    • Wait up to 5 s (50 × 100 ms) for /dev/vdb to appear. The VirtIO block device registration races the agent on cold boot.
    • After the first mount, call BTRFS_IOC_RESIZE with "max" so an existing 64 GiB filesystem transparently grows to fill the (possibly enlarged) block device on the next boot. No-op when the FS already fills the device. Both guest-side steps are sketched below.
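A condensed sketch of those two steps, assuming the btrfs UAPI constants (BTRFS_IOC_RESIZE is _IOW(0x94, 3, struct btrfs_ioctl_vol_args), a 4096-byte struct); the function names mirror the PR's, but the bodies here are illustrative:

```rust
use std::fs::File;
use std::os::unix::io::AsRawFd;
use std::{io, thread, time::Duration};

// _IOW(0x94, 3, btrfs_ioctl_vol_args): 8-byte devid field + 4088-byte name.
const BTRFS_IOC_RESIZE: u64 = 0x5000_9403;

// Poll for the VirtIO block device; its registration can race the agent on
// cold boot, so allow up to 50 x 100 ms = 5 s before giving up.
fn wait_for_device(path: &str) -> bool {
    for _ in 0..50 {
        if std::path::Path::new(path).exists() {
            return true;
        }
        thread::sleep(Duration::from_millis(100));
    }
    false
}

// Grow a mounted btrfs filesystem to fill its block device. Passing "max"
// makes this a no-op when the FS already spans the device.
fn btrfs_resize_max(mount_point: &str) -> io::Result<()> {
    #[repr(C)]
    struct BtrfsIoctlVolArgs {
        fd: i64,          // devid = 1, per the PR's encoding
        name: [u8; 4088], // NUL-padded argument, here "max"
    }
    let dir = File::open(mount_point)?;
    let mut args = BtrfsIoctlVolArgs { fd: 1, name: [0u8; 4088] };
    args.name[..3].copy_from_slice(b"max");
    let rc = unsafe { libc::ioctl(dir.as_raw_fd(), BTRFS_IOC_RESIZE as _, &mut args) };
    if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
}
```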

Rebased from an old fix/data-disk-size branch (single commit from 2026-03-05). Paths updated for the vm_lifecycle.rs → vm_lifecycle/mod.rs and agent.rs → agent/mod.rs splits; the // ===== section divider the original introduced is intentionally dropped to match the current convention.

Test plan

  • cargo check -p arcbox-core passes cleanly
  • Cross-compile arcbox-agent for aarch64-unknown-linux-musl (rely on CI)
  • E2E: upgrade a machine that had the 64 GiB image and verify the mount auto-grows on next boot (df -h /mnt/data)
  • Fresh install: verify docker.img virtual size reports 8 TiB and actual disk use stays low until the workload writes; see the stat snippet below
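For that last check, a stat-based probe of virtual vs. actual size (the path is illustrative; st_blocks counts 512-byte units):

```rust
use std::os::unix::fs::MetadataExt;

fn main() -> std::io::Result<()> {
    let meta = std::fs::metadata("docker.img")?; // point at the real image path
    println!("virtual size: {} bytes", meta.len());          // expect ~8 TiB
    println!("on-disk use:  {} bytes", meta.blocks() * 512); // should stay small
    Ok(())
}
```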

Notes

The 5 s timeout for /dev/vdb is a heuristic — if anyone sees it expire in practice, the underlying kernel probe is stuck and increasing the timeout would only mask the symptom.

Copilot AI review requested due to automatic review settings April 24, 2026 14:54
greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR bumps the Docker data disk virtual size from 64 GiB to 8 TiB (sparse, so no actual disk cost until blocks are written), adds a 5 s polling loop for /dev/vdb to handle VirtIO registration races on cold boot, and introduces an inline BTRFS_IOC_RESIZE "max" call after each mount so an existing 64 GiB filesystem automatically grows to fill the enlarged block device on first boot after upgrade. The host-side ensure_sparse_block_image already calls set_len when the file is smaller than the target, so both sides of the upgrade path are covered.

Confidence Score: 5/5

Safe to merge; the single remaining finding is a P2 observability improvement, not a blocking bug.

All functional pieces are correct: host sparse-file extension via set_len, btrfs ioctl struct layout (devid=1, name="max\0"), the mount/resize ordering, and the non-target-os stub. The only finding is that btrfs_resize_max swallows errors with a warn log rather than surfacing them more visibly, which is a P2 observability concern and does not block correctness.

guest/arcbox-agent/src/agent/mod.rs — btrfs_resize_max error handling worth revisiting in a follow-up.

Important Files Changed

  • app/arcbox-core/src/vm_lifecycle/mod.rs — Bumps DOCKER_DATA_IMAGE_SIZE_BYTES from 64 GiB to 8 TiB with a clear doc comment; ensure_sparse_block_image already extends existing files via set_len, so the upgrade path is handled correctly on the host side.
  • guest/arcbox-agent/src/agent/mod.rs — Adds the /dev/vdb polling loop (50 × 100 ms = 5 s) and btrfs_resize_max via the BTRFS_IOC_RESIZE ioctl after mount; the ioctl struct layout and devid/name encoding are correct, but resize failures are silently swallowed (warn-only).

Sequence Diagram

```mermaid
sequenceDiagram
    participant H as Host (arcbox-core)
    participant VM as VM Boot
    participant A as arcbox-agent (guest)
    participant FS as Btrfs /dev/vdb

    H->>H: ensure_sparse_block_image()<br/>64 GiB → 8 TiB (set_len, sparse)
    H->>VM: Start VM with 8 TiB block device
    VM->>A: Agent starts
    A->>A: Poll /dev/vdb (up to 50 × 100 ms)
    A->>FS: ensure_btrfs_format() — Step 1
    A->>FS: mount -t btrfs (BTRFS_TEMP_MOUNT) — Step 2
    A->>FS: BTRFS_IOC_RESIZE max — Step 2.5, grows FS 64 GiB → 8 TiB
    A->>FS: Create subvolumes (@docker, @containerd, …) — Step 3
    A->>FS: Bind-mount subvolumes to final paths — Step 4
```


// Step 2.5: Grow the Btrfs filesystem to fill the (possibly resized)
// block device. The host sparse image may have grown since the last
// boot (e.g. 64 GiB → 8 TiB upgrade). `BTRFS_IOC_RESIZE` with "max"
// is a no-op when the FS already fills the device, so this is safe to
// run unconditionally.
btrfs_resize_max(BTRFS_TEMP_MOUNT);

P2 Resize failures are silently swallowed

btrfs_resize_max returns () and only emits a warn log on failure. If the ioctl fails for a non-trivial reason (e.g., filesystem metadata corruption, kernel OOM, or the fd not pointing at the mount root), the agent proceeds normally and users will see their disk size unchanged without any obvious indication of why. Since the intent is to grow the data disk on upgrade, a failed resize is more than a cosmetic issue — it leaves the user on the old 64 GiB capacity silently.

Consider elevating the warn! to error! level, or returning a Result and propagating a non-fatal error message in the handler so the caller can surface a status note. The no-op case (fs already at max) succeeds, so this only affects genuine failures.
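One possible shape for that, reusing the Result-returning btrfs_resize_max from the sketch in the PR description; the notes vector and the BTRFS_TEMP_MOUNT value are assumptions about the agent's internals, not its actual API:

```rust
const BTRFS_TEMP_MOUNT: &str = "/mnt/.arcbox-data"; // illustrative value

fn resize_step(notes: &mut Vec<String>) {
    if let Err(e) = btrfs_resize_max(BTRFS_TEMP_MOUNT) {
        // Non-fatal: keep the mount usable, but make the failure visible
        // rather than a warn-only log that users never see.
        eprintln!("error: btrfs resize-to-max failed: {e}");
        notes.push(format!(
            "data disk resize failed; capacity may still be 64 GiB: {e}"
        ));
    }
}
```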

Copilot AI (Contributor) left a comment


Pull request overview

This PR increases the default persistent Docker data disk virtual size and adds guest-side logic to automatically grow an existing Btrfs filesystem to use the expanded capacity, improving usability for image/volume-heavy workloads.

Changes:

  • Increase docker.img virtual size from 64 GiB to 8 TiB (sparse image).
  • In the guest agent, wait briefly for the VirtIO data device to appear on boot.
  • After mounting the Btrfs data volume, issue BTRFS_IOC_RESIZE with "max" to auto-grow the filesystem.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

  • app/arcbox-core/src/vm_lifecycle/mod.rs — Raises the default docker data image size constant and documents sparse semantics.
  • guest/arcbox-agent/src/agent/mod.rs — Adds the /dev/vdb appearance wait and a Btrfs ioctl-based resize-to-max step after the first mount.

Comment on lines +72 to +76
/// Persistent guest dockerd data image size (8 TiB sparse file).
///
/// This is the virtual size of the block device. The host file is sparse and
/// only consumes actual disk space for written blocks. 8 TiB matches OrbStack
/// and prevents users from hitting artificial limits.
Copilot AI Apr 24, 2026
The new 8 TiB default interacts badly with ensure_sparse_block_image on macOS: when growing/creating the image it calls F_PREALLOCATE with F_ALLOCATEALL for the full size_bytes, which can attempt to allocate the entire 8 TiB on disk (or spend time failing) and contradicts the “sparse costs nothing” doc here. Consider skipping/capping APFS preallocation for very large images (or preallocating only a bounded initial chunk) so increasing the virtual size doesn’t risk massive host disk allocation/timeouts.

Suggested change
- /// Persistent guest dockerd data image size (8 TiB sparse file).
- ///
- /// This is the virtual size of the block device. The host file is sparse and
- /// only consumes actual disk space for written blocks. 8 TiB matches OrbStack
- /// and prevents users from hitting artificial limits.
+ /// Persistent guest dockerd data image size on macOS.
+ ///
+ /// The image is still sparse, but the current macOS creation/growth path may
+ /// ask APFS to preallocate the full requested size. Keeping the default
+ /// bounded avoids very large allocation attempts and long timeouts when the VM
+ /// data image is first created or expanded.
+ #[cfg(target_os = "macos")]
+ const DOCKER_DATA_IMAGE_SIZE_BYTES: u64 = 64 * 1024 * 1024 * 1024;
+ /// Persistent guest dockerd data image size on non-macOS platforms (8 TiB
+ /// sparse file).
+ ///
+ /// This is the virtual size of the block device. The host file is sparse and
+ /// only consumes actual disk space for written blocks. 8 TiB matches OrbStack
+ /// and prevents users from hitting artificial limits.
+ #[cfg(not(target_os = "macos"))]
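If the cfg split above were adopted, the macOS growth path could instead cap preallocation along these lines (a sketch only: preallocate_bounded and PREALLOC_CAP are invented names, and the real ensure_sparse_block_image code is not shown on this page):

```rust
#[cfg(target_os = "macos")]
fn preallocate_bounded(file: &std::fs::File, requested: i64) -> std::io::Result<()> {
    use std::os::unix::io::AsRawFd;

    // Only ask APFS to materialize a bounded initial chunk; the rest of the
    // virtual size stays sparse and is allocated lazily as blocks are written.
    const PREALLOC_CAP: i64 = 1 << 30; // 1 GiB, an arbitrary illustrative cap
    let mut store = libc::fstore_t {
        fst_flags: libc::F_ALLOCATEALL,
        fst_posmode: libc::F_PEOFPOSMODE,
        fst_offset: 0,
        fst_length: requested.min(PREALLOC_CAP),
        fst_bytesalloc: 0,
    };
    let rc = unsafe { libc::fcntl(file.as_raw_fd(), libc::F_PREALLOCATE, &mut store) };
    if rc == -1 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}
```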

// struct btrfs_ioctl_vol_args: 8 bytes fd (devid, 1 = default) + 4088 bytes name.
// For resize, fd=1 (device id), name="max\0".
let mut args = [0u8; 4096];
args[0] = 1; // devid = 1 (little-endian i64)
Copilot AI Apr 24, 2026

btrfs_resize_max encodes devid by setting only args[0] = 1, which relies on little-endian layout and is hard to read. It would be safer/clearer to write the full 8-byte integer explicitly (e.g., 1i64.to_le_bytes() into args[0..8]) and keep the layout handling consistent with the struct comment.

Suggested change
- args[0] = 1; // devid = 1 (little-endian i64)
+ args[0..8].copy_from_slice(&1i64.to_le_bytes()); // devid = 1

Comment on lines +282 to +287
// Step 2.5: Grow the Btrfs filesystem to fill the (possibly resized)
// block device. The host sparse image may have grown since the last
// boot (e.g. 64 GiB → 8 TiB upgrade). `BTRFS_IOC_RESIZE` with "max"
// is a no-op when the FS already fills the device, so this is safe to
// run unconditionally.
btrfs_resize_max(BTRFS_TEMP_MOUNT);
Copilot AI Apr 24, 2026

btrfs_resize_max failures are only logged, but ensure_data_mount() still returns Ok(...) without surfacing that resize didn’t happen. Since this step is part of the upgrade story (64 GiB → 8 TiB), consider returning a note/result from btrfs_resize_max and appending it to the notes returned to the host (or failing the mount if resize errors are considered actionable).

Comment on lines +282 to +287 (same snippet as quoted above)
Copilot AI Apr 24, 2026

This change introduces new boot-time behavior (waiting for /dev/vdb and issuing BTRFS_IOC_RESIZE), but there’s no automated coverage for it. Consider factoring out the ioctl-argument construction into a pure helper and unit-testing that it produces the expected 4096-byte layout (and/or adding an integration test behind a Linux-only flag) so regressions in the resize path are caught in CI.
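A sketch of that factoring (build_resize_args is an invented helper name; the layout constants come from the struct comment quoted above):

```rust
// Pure construction of the 4096-byte btrfs_ioctl_vol_args buffer:
// 8-byte little-endian devid followed by a 4088-byte NUL-padded name.
fn build_resize_args(devid: i64, name: &str) -> [u8; 4096] {
    let mut args = [0u8; 4096];
    args[0..8].copy_from_slice(&devid.to_le_bytes());
    let bytes = name.as_bytes();
    assert!(bytes.len() < 4088, "name must leave room for its NUL terminator");
    args[8..8 + bytes.len()].copy_from_slice(bytes);
    args
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn resize_max_layout() {
        let args = build_resize_args(1, "max");
        assert_eq!(&args[0..8], &1i64.to_le_bytes());
        assert_eq!(&args[8..12], b"max\0");
        assert!(args[12..].iter().all(|&b| b == 0));
    }
}
```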

