Cross-platform packager: OCI symlink resolution, rewrite-include, rtld_audit removal#741
Conversation
Add skipped_addrs output parameter to hook_syscalls_in_elf for reporting unpatchable syscall locations. Add no_std/alloc support with BTreeSet replacing HashSet for deterministic iteration. Add Bun executable detection (UnsupportedBunExecutable error), ET_REL object file rejection, phdr alignment fixup, fork-to-vfork patching, UD2 replacement for unpatchable syscalls, found_any tracking to move NoSyscallInstructionsFound to inner function, and println! removal. Add patch_code_segment public API for in-place code patching. Add unit tests for new functionality and update snapshot tests with address normalization. Update all callers (runner, optee, packager) to 3-arg API.
Reject unsupported Bun executables during packaging, keep rewritten trailer semantics compatible with the loader, and turn malformed rewrite failures into explicit errors.
Match CI rustfmt output for the x86 trampoline comments in the rewritten ELF path.
35d845d to
434eb38
Compare
7f33b78 to
aa7e539
Compare
| std::path::Component::ParentDir => { | ||
| result.pop(); | ||
| } | ||
| std::path::Component::CurDir | std::path::Component::RootDir => {} |
There was a problem hiding this comment.
We might need std::path::Component::Prefix(_) => {} for Windows hosts.
jaybosamiya-ms
left a comment
There was a problem hiding this comment.
I haven't done a deep review of this PR, since it does not seem to require an in-depth review afaict. It is also an overlong PR. Nonetheless, I've added a few inline comments.
Also top-level comment: please simplify the PR description to make sure we don't have overlong descriptions showing up in the squashed history.
Also, maybe a better title is just "Cross-platform packager"?
…riants, use ICEBP;HLT trap - Change hook_syscalls_in_elf and patch_code_segment to return (Vec<u8>, Vec<u64>) instead of taking &mut Vec<u64> for skipped addresses - Rename UnsupportedBunExecutable to UnsupportedExecutable(String) and add context to UnsupportedObjectFile(String) - Add PatchError variant for address arithmetic failures (previously ParseError) - Replace UD2 (0F 0B) with ICEBP;HLT (F1 F4) for poisoned syscalls to avoid confusion with unreachable!() paths - Make anyhow/clap features use dep: syntax in Cargo.toml - Update all callers in packager, runner, and tests
…encoding New trampoline format changes for the syscall rewriter: Rewriter (litebox_syscall_rewriter): - Add redzone reservation (LEA RSP,[RSP-0x80]) before syscall callback entry on x86-64, allowing the callback to use the 128-byte red zone - Add R11 restart address (LEA R11,[RIP+disp32]) pointing back to the call-site JMP, enabling SA_RESTART signal re-execution - Re-encode RIP-relative memory operands in pre-syscall instructions when they are copied to the trampoline, using iced_x86::Encoder at the trampoline IP so displacements remain correct - Guard post-syscall instructions with RIP-relative operands by delegating to hook_syscall_before_and_after instead of raw-copying - Append header-only marker (trampoline_size=0) when no syscall instructions are found, so the loader can distinguish checked binaries from unpatched ones - Add 5 inline unit tests for Bun detection and RIP-relative encoding Loader (litebox_common_linux): - Handle trampoline_size==0 as a valid no-op (checked, no syscalls) - Add UnpatchedBinary error variant for binaries missing the magic - Add has_trampoline() accessor Platform/shim (litebox_platform_linux_userland): - Add saved_r11 TLS slot and save R11 on syscall callback entry - Add syscall_callback_redzone entry point that undoes red zone reservation before saving registers - Return syscall_callback_redzone from get_syscall_entry_point() Shim loader (litebox_shim_linux): - Treat UnpatchedBinary as non-fatal in parse_trampoline calls, allowing unpatched binaries to load without a trampoline
- Gate syscall_callback_redzone behind #[cfg(target_arch = "x86_64")] on Linux since the asm symbol only exists in the x86_64 asm block, fixing the i686 linker error. - Add syscall_callback_redzone entry point to the Windows platform so the new trampoline format (with redzone reservation) works correctly on the Windows emulator. Uses mov+add to SCRATCH to avoid clobbering rax. - Fix rustfmt import ordering in litebox_shim_linux/src/loader/elf.rs.
Add runtime syscall patching in the shim's mmap hook: when an ELF segment with PROT_EXEC is mapped, patch syscall instructions in-place and set up a trampoline region. The loader also patches the main binary at load time when it lacks a trampoline. Remove rtld_audit entirely: gut build.rs, remove the audit .so injection from the runner, and remove the REQUIRE_RTLD_AUDIT global. Supporting changes: - Add ReadAt impl for &[u8] in litebox_common_linux - Hook finalize_elf_patch into sys_close to mprotect trampolines RX - Add elf_patch_cache on GlobalState and suppress_elf_runtime_patch on Task - Update ratchet test (runner has zero globals now)
…with new trampoline format)
…UserPointer The new trampoline format loads a restart address into R11 (for SA_RESTART) before jumping to the callback. On Windows, the TLS index lookup clobbers R11, so we temporarily stash R11 in the per-thread TEB.ArbitraryUserPointer slot (gs:[0x28]) for the ~20 instructions of inline asm between callback entry and pt_regs save. Also removes the dead syscall_callback entry point (only syscall_callback_redzone is used since get_syscall_entry_point always returns the redzone variant).
… discriminate rewriter errors - Remove litebox_rtld_audit/ directory entirely (Makefile, rtld_audit.c, .gitignore) - Replace litebox_packager/build.rs with no-op (was building rtld_audit.so) - Remove rtld_audit tar entry from litebox_packager/src/lib.rs - Remove fixup_env and set_load_filter from both Linux and LoW runners - Fix RFLAGS clobber on Windows: use lea+mov instead of mov+add - Simplify is_at_syscall_callback: x86 checks syscall_callback, x86_64 checks syscall_callback_redzone - Discriminate trampoline parse errors: only UnpatchedBinary triggers runtime patching - Discriminate rewriter errors: expected non-fatal vs unexpected with logging - Restore fork-vfork patch error path from PR 1c - Simplify suppress_elf_runtime_patch logic - Clean up rtld_audit references in comments across codebase
…ck, add x86_64 comment, add post-syscall RIP-relative comment, fix formatting
…d guards, remove unused ElfPatchState fields
Deleting litebox_runner_linux_userland/build.rs (rtld_audit removal) also
removed Cargo's OUT_DIR env var from integration tests. Replace the three
call sites with env!("CARGO_TARGET_TMPDIR"), a compile-time macro
available since Rust 1.68 that requires no build.rs.
- Use EINVAL instead of ENODATA for trampoline parse failures (loader.rs) - Handle UnpatchedBinary as non-fatal in OptEE ELF loader (optee/elf.rs) - Document R11 restart-address contract in rewriter (lib.rs) - Replace unchecked arithmetic with checked_add_u64 in rewriter (lib.rs) - Rename saved_r11 to saved_restart_addr in Linux userland TLS (lib.rs) - Store RFLAGS from stack ([rsp+88]) instead of TLS in Linux/Windows userland pt_regs->r11 (lib.rs) - Save R11 restart address to TlsState on Windows userland (lib.rs) - Add cleanup-leak TODO comment in PatchedMapper::map_file (elf.rs) - Restore trampoline RX on mprotect failure path (mm.rs) - Make check_trampoline_magic pointer-width aware (mm.rs) - Validate e_phentsize before parsing program headers (mm.rs) - Clarify elf_patch_cache lock scope comment (mm.rs) - Finalize ELF patch for implicitly-closed fd in dup2/dup3 (file.rs)
ecf9130 to
15eb87e
Compare
- Add fork_to_vfork_patch computation to metadata extraction block - Rename replace_with_ud2 -> replace_with_trap (pr1c rename) - Adapt hook_syscalls_in_elf callers to (Vec<u8>, Vec<u64>) return type - Adapt patch_code_segment callers to 4-arg signature - Update error variant names (UnsupportedExecutable, UnsupportedObjectFile) - Remove dead has_bun_footer_marker (pr1c uses ends_with directly) - Run cargo fmt
15eb87e to
8ea3756
Compare
…olution, rewrite-include flag Make litebox_packager compile and work on non-Linux hosts (primarily Windows) by: - Remove #![cfg(target_os = "linux")] crate-level gate and the dual-main pattern; gate only the host-mode code path behind cfg(target_os) - Add file_mode() helper with unix/non-unix variants to replace MetadataExt::mode() calls - Extract run_host_mode() behind #[cfg(target_os = "linux")] - Track OCI layer symlinks in-memory instead of creating OS symlinks (Windows requires special privileges for symlinks); materialize them after all layers are extracted via resolve_symlink_in_rootfs() - Add is_unix_absolute(), strip_unix_root(), normalize_path() helpers for cross-platform path handling - Force linux/amd64 platform when pulling OCI images - Normalize path separators to Unix-style in tar entries - Add --rewrite-include CLI flag for dlopen'd libraries - Change Bun executable detection from warning to hard error - Switch tar headers from GNU to UStar format
…tion
Two bugs found during review:
1. Opaque whiteouts (.wh..wh..opq) and regular whiteouts (.wh.<name>)
removed files from disk but did not prune corresponding entries from
the in-memory symlinks vec. This caused materialize_symlinks() to
resurrect deleted symlinks that a later layer intended to remove.
2. resolve_symlink_in_rootfs() could return Some(rootfs) when a
degenerate symlink target with excess .. segments normalized to an
empty path via normalize_path(). rootfs.join("") == rootfs, which
exists as a directory, causing the entire rootfs to be treated as
a resolution target. Guard against empty rel_path at function entry.
- Store Unix permission modes from tar headers in a HashMap during extraction, so permission bits are accurate on non-Unix hosts (Windows) instead of relying on the file_mode() heuristic which returns wrong answers (0o755 for most files). - Build symlink_map once in pull_and_extract and pass through to materialize_symlinks and scan_rootfs (was duplicated in both). - Add lookup_mode() helper that prefers tar header permissions, falls back to file_mode(), defaults to 0o644. - Add existence check for --rewrite-include in finalize_tar (was missing). - Remove redundant --include/--rewrite-include parsing from run_host_mode. - Replace Bun test with rewrite_elf_skips_non_elf_files test. - Add 22 unit tests for normalize_path, is_unix_absolute, strip_unix_root, resolve_symlink_in_rootfs, and lookup_mode. - Add litebox_packager to build_and_test_windows CI job.
… and Windows colon parsing
- Normalize all tar entry paths with normalize_path() to prevent path
traversal via absolute paths, ../ escape, and ./prefix inconsistency
- Normalize hard link source paths for the same protection
- Fix root-level opaque whiteout where Path::starts_with("") matches
all paths, wiping all in-memory symlinks and permissions
- Handle Windows drive letter colons (C:\path) in parse_include so
--include/--rewrite-include work correctly on Windows
- Merge --rewrite-include into --include (auto-detect ELF via magic bytes) - Make --include host-mode-only (conflicts_with oci_image) - Remove file_mode() helper; inline MetadataExt::mode() in Linux-only paths - Simplify lookup_mode() to use OCI permissions map only (no host FS fallback) - Add Component::Prefix(_) to normalize_path for Windows path completeness - Remove redundant path.is_absolute() fallback from is_unix_absolute - Consolidate and prune tests in oci.rs - Fix stale UnsupportedBunExecutable match arm from rebase
aa7e539 to
74b27eb
Compare
|
|
||
| for entry in entries { | ||
| let mut header = Header::new_gnu(); | ||
| let mut header = Header::new_ustar(); |
There was a problem hiding this comment.
Do we need to use USTAR? It does support up to 255-byte path lengths by default.
| // A later layer may override this symlink, so remove any stale | ||
| // entry with the same rel_path. | ||
| symlinks.retain(|s| s.rel_path != path); | ||
| symlinks.push(DeferredSymlink { | ||
| rel_path: path.clone(), | ||
| link_target, | ||
| }); |
There was a problem hiding this comment.
A symlink can be overridden by not only a symlink but also any regular file/directory.
| .map(|m| m.mode()) | ||
| .unwrap_or(0o755) | ||
| }; | ||
| let rewritten = rewrite_elf(&data, &inc.host_path, args.verbose)?; |
There was a problem hiding this comment.
This ignores --no-rewrite.
| use std::os::unix::fs::MetadataExt as _; | ||
| std::fs::metadata(&inc.host_path) | ||
| .map(|m| m.mode()) | ||
| .unwrap_or(0o755) |
There was a problem hiding this comment.
lookup_mode in oci.rs uses 0o644 as a default value.
| .into_owned(); | ||
| // A later layer may override this symlink, so remove any stale | ||
| // entry with the same rel_path. | ||
| symlinks.retain(|s| s.rel_path != path); |
There was a problem hiding this comment.
nits. this linearly scans all symlinks.
jaybosamiya-ms
left a comment
There was a problem hiding this comment.
A couple of minor comments.
Also, reminder for earlier top-level comment #741 (review)
| // Always pull linux/amd64 images regardless of host platform. | ||
| platform_resolver: Some(Box::new(|entries| { | ||
| entries | ||
| .iter() | ||
| .find(|entry| { | ||
| entry.platform.as_ref().is_some_and(|p| { | ||
| p.os == "linux".into() && p.architecture == "amd64".into() | ||
| }) | ||
| }) | ||
| .map(|e| e.digest.clone()) | ||
| })), |
There was a problem hiding this comment.
Shouldn't you pull linux image, but the arch should match the actual host you are running on? Otherwise, running this on an M-series MacBook might be quite surprising, no?
| pub fn scan_rootfs(rootfs: &Path, verbose: bool) -> anyhow::Result<RootfsFileMap> { | ||
| /// `permissions` provides Unix permission modes captured from tar headers | ||
| /// during extraction, so permission bits are accurate on non-Unix hosts. | ||
| #[allow(clippy::implicit_hasher)] |
There was a problem hiding this comment.
Nit: prefer expect over allow
b12f153 to
eab82b4
Compare
|
This PR is replaced by PR #815 |
Summary
Make
litebox_packagercompile and work on non-Linux hosts (primarily Windows) for the OCI image packaging path.#![cfg(target_os = "linux")]gate; gate only ldd-based host mode behind#[cfg(target_os = "linux")]file_mode()helper (MetadataExt::mode()on Unix, permission-based heuristic elsewhere)DeferredSymlink) instead of creating OS symlinks (which require special privileges on Windows). Resolve symlink chains through an in-memory map and materialize as file copies or directory placeholders--rewrite-includeCLI flag for dlopen'd libraries (e.g., NSS modules) that aren't discovered by the automatic dependency scanlinux/amd64images regardless of host platform\to/in tar entry paths for Windows compatibilitybuild.rs(rtld_audit fully removed from packager)Bug fixes (not in feature branch)
.wh..wh..opq) and regular whiteouts (.wh.<name>) removed files from disk but didn't prune corresponding entries from the in-memorysymlinksvec, causingmaterialize_symlinks()to resurrect deleted symlinksnormalize_path()could return an emptyPathBuffor targets with excess..segments, causingresolve_symlink_in_rootfs()to match the rootfs directory itselfStack
Test results