Harden bundled Copilot CLI extraction against corrupt/partial installs by jmoseley · Pull Request #1635 · github/copilot-sdk

jmoseley · 2026-06-11T21:00:18Z

Summary

On a Windows user's laptop, the GitHub Copilot desktop app's bundled gh.exe failed to launch with ERROR_BAD_EXE_FORMAT ("The specified executable is not a valid application for this OS platform"). Investigation traced the class of bug to non-atomic binary extraction that returns an unverified file. This PR applies the equivalent fix for the Copilot CLI binary that copilot-sdk bundles/extracts (the bundled-cli path in the Rust crate).

This is the "CLI in copilot-sdk" half of Tim Clem's ask. The "gh in the app" half is a separate PR in github/github-app. The safe-publish pattern here mirrors src-tauri/src/embedded_git.rs (stage in same-dir temp → write integrity marker → atomic fs::rename, with explicit Windows non-atomic-overwrite handling).

Root cause

rust/src/embeddedcli.rs extracted the embedded archive and wrote the binary with a non-atomic open → truncate → write directly on the live final path, then returned without re-verifying the written bytes. The "already installed?" fast path trusted a bare is_file() check. So an interrupted write, a multi-process race, or Windows Defender / ASR quarantining the freshly-written executable could leave a truncated or corrupt image that gets handed back as "good" — and fails to launch.

Fix

Atomic write — extract to a unique temp file (pid + counter + nanos) in the same target directory, write_all → flush → sync_all, set 0o755 on unix, then fs::rename onto the final path. rename overwrites atomically on POSIX; on Windows (which won't atomically overwrite) we remove-then-rename, with the race window guarded by re-verification and by accepting a peer's identical install.
Verify after publish — the bytes extracted from the trusted, signed embedded archive are the known-good reference. We verify the staged temp file and re-verify the published file byte-for-byte (size checked first), catching truncation or AV tampering between staging and publishing.
Trustworthy fast path — an existing install is reused only after a cheap re-check: an integrity marker recording the expected size plus a platform executable-image header check (PE MZ / Mach-O / ELF). A truncated, empty, or quarantined binary is re-extracted instead of trusted.
Retry + actionable fallback — re-extract/re-publish up to 3 times to ride out transient AV interference; if it still fails, surface a clear error ("bundled CLI appears blocked or corrupt … possibly quarantined by antivirus") instead of returning a broken path.

The change is scoped to the CLI extraction/install path. Public behavior is unchanged for the happy path; the installer is just no longer able to return a broken executable.

Tests

New unit tests in embeddedcli.rs cover:

temp + atomic-rename publish writes the exact bytes, records the marker, and leaves no temp files;
publish overwrites a stale existing binary;
corrupt / truncated / empty / unmarked / invalid-image installs are detected and rejected by the fast-path re-check;
post-write verification rejects both size and content mismatches.

The existing install_bundled_cli_* integration tests (which exercise the full extract → publish → verify → marker flow against the real downloaded archive and assert idempotency) continue to pass.

Validation

cargo +nightly fmt --check ✅
cargo clippy --all-targets --features test-support -- -D warnings -D clippy::unwrap_used … ✅ (both the skip-download and real-bundled paths)
cargo doc --no-deps --all-features with -D warnings ✅
cargo build (bundled, downloads real CLI) ✅
cargo test --lib embeddedcli + install_bundled_cli integration tests ✅

The embedded-CLI installer opened the final binary path with open→truncate→write and returned without re-verifying the written bytes. An interrupted write, a multi-process race, or Windows Defender/ASR quarantining the freshly-written executable could leave an invalid image that was handed back as "good" — surfacing downstream as ERROR_BAD_EXE_FORMAT ("not a valid application for this OS platform"). Replace the in-place write with a safe publish pattern: - Atomic write: extract to a unique temp file (pid + counter + nanos) in the *same* target directory, flush + fsync, set 0o755 on unix, then fs::rename onto the final path. rename overwrites atomically on POSIX; on Windows (no atomic overwrite) we remove-then-rename, with the race guarded by re-verification and peer-install acceptance. - Verify after publish: the bytes extracted from the trusted embedded archive are the known-good reference. Verify the staged temp file and again re-verify the published file byte-for-byte (size first), catching truncation or AV tampering between staging and publishing. - Trustworthy fast path: an existing install is reused only after a cheap re-check (integrity-marker size + executable-image header); a truncated, empty, or quarantined binary is re-extracted instead of trusted. - Retry + fallback: re-extract/re-publish up to 3 times to ride out transient AV interference, then surface a clear, actionable error ("appears blocked or corrupt ... possibly quarantined by antivirus") rather than returning a broken path. Adds unit tests covering the temp+rename publish, marker recording, detection of corrupt/truncated/unmarked/invalid-image installs, and post-write size/content verification. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR hardens the Rust SDK’s bundled Copilot CLI extraction path (bundled-cli) so it does not return a truncated/corrupt executable after partial installs, races, or antivirus interference by staging to a temp file, verifying bytes, publishing via rename, and recording an integrity marker for a safer fast-path.

Changes:

Adds an integrity marker + executable-header check to validate an existing installed bundled CLI before reusing it.
Implements a staged write + fsync + atomic publish + post-publish verification flow with retries and clearer failure reporting.
Adds unit tests covering publish/overwrite behavior, fast-path rejection of corrupt installs, and verification failures.

Show a summary per file

File	Description
rust/src/lib.rs	Updates public docs for `install_bundled_cli()` to reflect integrity re-check behavior.
rust/src/embeddedcli.rs	Reworks bundled CLI install to stage+verify+publish with marker-based fast path and unit tests.

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 1

+fn publish(tmp: &Path, final_path: &Path) -> Result<(), EmbeddedCliError> {
+    match fs::rename(tmp, final_path) {
+        Ok(()) => Ok(()),
+        Err(_) if final_path.exists() => {
+            let _ = fs::remove_file(final_path);
+            fs::rename(tmp, final_path)
+                .map_err(|e| EmbeddedCliError::new(EmbeddedCliErrorKind::Publish, e))
+        }
+        Err(e) => Err(EmbeddedCliError::new(EmbeddedCliErrorKind::Publish, e)),
+    }
+}


SteveSandersonMS · 2026-06-12T12:01:02Z

+//! The embedded bytes are part of the consumer's signed binary and therefore
+//! trusted *as the source of truth* — but the bytes that land on disk are not.
+//! A non-atomic write, a multi-process race, or antivirus quarantining the
+//! freshly-written executable can leave a truncated or corrupt image that, if


I might be wrong but I thought AV quarantining was generally done by setting some other flag in filesystem metadata, not modifying the bytes of the executable. Maybe some AV software differs.

Doesn't change the validity of the PR though

Update: looks like Windows Defender will actually move the file into a different dir if it quarantines it.

SteveSandersonMS · 2026-06-12T12:04:30Z

+/// image header for the current platform (PE on Windows, Mach-O on macOS,
+/// ELF elsewhere). Returns `false` on any I/O error or unrecognized header.
+#[cfg(any(has_bundled_cli, test))]
+fn looks_like_valid_image(path: &Path) -> bool {


If we wanted a more end-to-end problem detector we could consider something like detecting whether the spawn fails (that is, we never get to the point of having a valid handshake) and if so track a flag so that next time the app is launched, we do a full re-extraction. Then we wouldn't need so many other checks.

jmoseley requested a review from a team as a code owner June 11, 2026 21:00

Copilot AI review requested due to automatic review settings June 11, 2026 21:00

Copilot started reviewing on behalf of jmoseley June 11, 2026 21:00 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

SteveSandersonMS reviewed Jun 12, 2026

View reviewed changes

SteveSandersonMS approved these changes Jun 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden bundled Copilot CLI extraction against corrupt/partial installs#1635

Harden bundled Copilot CLI extraction against corrupt/partial installs#1635
jmoseley wants to merge 1 commit into
mainfrom
jmoseley/harden-cli-extraction

jmoseley commented Jun 11, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

SteveSandersonMS Jun 12, 2026

Uh oh!

SteveSandersonMS Jun 12, 2026

Uh oh!

SteveSandersonMS Jun 12, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jmoseley commented Jun 11, 2026

Summary

Root cause

Fix

Tests

Validation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

SteveSandersonMS Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

SteveSandersonMS Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

SteveSandersonMS Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SteveSandersonMS Jun 12, 2026 •

edited

Loading