0e2d20 test: fix memfd-disabled.test.ts E2BIG on Linux (#29501)
Summary
The Blob-stdin test inlined a 64 KiB payload twice into the -e
script via JSON.stringify, yielding a 131,394-byte argv entry — over
Linux's MAX_ARG_STRLEN (32 × PAGE_SIZE = 128 KiB) — so posix_spawn
failed with E2BIG. Now the payload is generated inside the child; canUseMemfd has no size gate for in-memory Blobs so the same code path
is exercised.
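The size arithmetic can be checked with a short sketch. It assumes a 4096-byte PAGE_SIZE (common on x86-64 Linux) and an ASCII payload. The real test script's remaining bytes are not reproduced, so only the two inlined payloads are counted:

```typescript
// Assumption: PAGE_SIZE is 4096 bytes, the common value on x86-64 Linux.
const PAGE_SIZE = 4096;
const MAX_ARG_STRLEN = 32 * PAGE_SIZE; // 131072 bytes = 128 KiB

// A 64 KiB ASCII payload, similar in size to the one the test inlined.
const payload = "x".repeat(64 * 1024);

// JSON.stringify wraps an ASCII string in two quote characters.
const inlinedOnce = JSON.stringify(payload).length; // 65536 + 2
const inlinedTwice = 2 * inlinedOnce;

// The two payloads alone already exceed the per-argument limit,
// before counting the rest of the -e script.
const exceedsLimit = inlinedTwice > MAX_ARG_STRLEN;
console.log({ MAX_ARG_STRLEN, inlinedTwice, exceedsLimit });
```

Generating the payload inside the child keeps the argv entry small regardless of payload size.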
Both tests asserted stderr === "", which fails on ASAN debug builds
because JSC prints WARNING: ASAN interferes with JSC signal handlers….
Added a stripAsanWarning filter (same approach as broadcast-channel-worker-gc.test.ts, fetch-abort-queued.test.ts,
etc).
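A filter of this kind might look like the sketch below. This is a hypothetical implementation, not the helper from the Bun test suite. It matches only the prefix of the warning quoted above, since the rest of the message is elided in the source:

```typescript
// Hypothetical helper: the real stripAsanWarning in the test suite may differ.
// Only the quoted prefix of the warning is matched; the full text is not
// given in the PR description.
const ASAN_WARNING_PREFIX = "WARNING: ASAN interferes with JSC signal handlers";

function stripAsanWarning(stderr: string): string {
  return stderr
    .split("\n")
    // Drop only lines that start with the ASAN warning; keep everything else.
    .filter((line) => !line.startsWith(ASAN_WARNING_PREFIX))
    .join("\n");
}

// On an ASAN build, stderr holding just the warning collapses to "".
console.log(JSON.stringify(stripAsanWarning(ASAN_WARNING_PREFIX + " (details)\n")));
```

With such a filter, an assertion like stripAsanWarning(stderr) === "" holds on both ASAN and non-ASAN builds while still failing on any other stderr output.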
Follow-up to #29465.
Test plan
Reproduced original failure on Linux: E2BIG: argument list too long, posix_spawn
Fixed test passes on Linux ASAN debug build (2 pass / 0 fail)
7e83f1 ci: don't compare binary size against release builds (#29500)
What does this PR do?
The binary-size step walks recent main commits looking for a build
that uploaded binary-sizes.json to use as a baseline. Release builds
upload that artifact too, so when the most recent main build is a
release build, every canary PR compares against release-mode sizes.
Windows release and canary binaries differ by several MB, so PRs
spuriously fail the 0.5 MB threshold.
Fix: pass --release to scripts/binary-size.ts when !options.canary
(same release-detection check used for Windows signing), record release: <bool> in the uploaded binary-sizes.json, and skip any
baseline whose release flag doesn't match the current build. Canary
PRs now only compare against canary baselines.
Release builds will generally show "no release comparison" since prior
releases are well outside the 15-commit lookback window — this is
intentional and preferable to the previous behavior of showing a
misleading several-MB delta vs canary. Release builds are --no-fail
regardless.
Old artifacts without the release field are treated as canary (the
common case), so existing baselines remain usable. The currently-stuck
release build on main self-resolves once the next canary build lands and
is found first in the newest-first commit walk.
How did you verify your code works?
Syntax/type checked both files, truth-tabled the (record.release ?? false) !== isRelease filter for all 6 release/canary/missing-field
combinations, and confirmed !options.canary matches the existing
release-detection pattern in ci.mjs. The script talks to Buildkite so
it can't be exercised end-to-end locally.
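The truth table mentioned above can be reproduced in isolation. The sketch below uses a hypothetical BaselineRecord type and helper name. Only the (record.release ?? false) !== isRelease expression comes from the PR description:

```typescript
// Hypothetical shape of one entry in binary-sizes.json; only the optional
// `release` flag matters here.
interface BaselineRecord {
  release?: boolean;
}

// A baseline is usable when its release flag matches the current build.
// Old artifacts without the field are treated as canary (release = false).
function isUsableBaseline(record: BaselineRecord, isRelease: boolean): boolean {
  const skip = (record.release ?? false) !== isRelease;
  return !skip;
}

// All six combinations: {release, canary, missing field} x {release, canary build}.
const records: BaselineRecord[] = [{ release: true }, { release: false }, {}];
for (const record of records) {
  for (const isRelease of [true, false]) {
    console.log(JSON.stringify(record), isRelease, isUsableBaseline(record, isRelease));
  }
}
```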
d398bd build: add $BUN_ZIG_PATH to override the vendored zig compiler (#29492)
What does this PR do?
Mirrors the existing $BUN_WEBKIT_PATH env override: when set, points at
an existing zig install (containing zig + lib/) and the zig_fetch ninja
edge is skipped. Path resolution handles ~ expansion and anchors
relative paths to the repo root so ninja's regen rule resolves the same
path as the initial configure.
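As an illustration of the resolution rules described above, here is a minimal sketch. The function name and parameters are hypothetical, it handles POSIX-style paths only, and it does not normalize the result. The actual logic in the build scripts may differ:

```typescript
// Hypothetical helper illustrating the two rules from the description:
// "~" expands to the home directory, and relative paths are anchored
// to the repo root rather than the current working directory.
function resolveZigPath(raw: string, home: string, repoRoot: string): string {
  if (raw === "~") return home;
  if (raw.startsWith("~/")) return home + raw.slice(1);
  if (raw.startsWith("/")) return raw; // already absolute
  return repoRoot + "/" + raw;
}

console.log(resolveZigPath("~/zig", "/home/dev", "/src/bun"));
console.log(resolveZigPath("vendor/zig", "/home/dev", "/src/bun"));
```

Anchoring to the repo root matters because a regen step may run from a different working directory than the initial configure.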
I'm primarily interested in this for Arch Linux's bun package [1], which
does this today on 1.3.12 via a patch [2]. However, one can imagine
other use cases:
Worktree sharing (one compiler build across N worktrees, same reason
$BUN_WEBKIT_PATH exists).
Testing zig compiler forks/patches without cutting a release or
touching ZIG_COMMIT.
Air-gapped / restricted-network dev environments where the compiler is
pre-staged.
Configure-time validation checks that $BUN_ZIG_PATH/zig and
$BUN_ZIG_PATH/lib/ both exist, and emits a hint pointing at the likely
fix when they don't. Commit mismatch (user's zig differs from
ZIG_COMMIT) is the user's problem — build.zig will error loudly if the
compiler is too old for the options it receives.
Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
How did you verify your code works?
Verified by regenerating build.ninja via bun scripts/build.ts --configure-only in three scenarios on Linux x64:
BUN_ZIG_PATH unset (default) — build.ninja still contains the zig_fetch edge (build ../../vendor/zig/.zig-commit | ../../vendor/zig/zig: zig_fetch …). Regression check — default behavior
is unchanged.
BUN_ZIG_PATH=/tmp/nonexistent-zig — configure errors with BUN_ZIG_PATH='/tmp/nonexistent-zig' but no zig executable at /tmp/nonexistent-zig/zig plus the hint pointing at the likely fix.
BUN_ZIG_PATH=vendor/zig (after pre-fetching the compiler via ninja -C build/debug zig-compiler) — no zig_fetch edge in build.ninja,
but bun-zig.o still lists ../../vendor/zig/zig as an implicit input,
and ninja treats the existing file as a source.
Re-ran scenario 1 after scenario 3 to confirm the fetch edge is
re-emitted once the env var is unset — no sticky state.
Also typechecked with bunx tsc --noEmit -p scripts/build/tsconfig.json: no new errors introduced in zig.ts.
Did not run a full bun bd end-to-end with BUN_ZIG_PATH set; the zig_build ninja edge is unchanged by this patch (it only references zigExecutable, which both code paths produce), so the risk surface is
configure-time only.
7a7905 build: bump parallel zig to 65b29282, enable on Linux (#29491)
This reverts commit 55b62eff1cf78b19b8dc0271e5d76a415b18cae3.
f53ef3 build: lower minimum glibc requirement from 2.26 to 2.17 (#29461)
What
Lowers the Linux glibc floor from 2.26 → 2.17 (RHEL/CentOS 7, Amazon
Linux 1, aarch64 baseline).
Only three symbols in the current release binaries required > 2.17. All
three are handled with the same pattern: --wrap (or strong def) → dlsym glibc's real implementation at runtime when present, with a
well-defined fallback for older glibc.
| Symbol | glibc | Handling |
|---|---|---|
| getrandom | 2.25 | --wrap → dlsym glibc's (preserves vDSO fast path on ≥ 2.41); fall back to syscall(SYS_getrandom) on < 2.25. All |
memfd_create requires kernel ≥ 3.17. A binary-level syscall audit (in
#29461) found that every Bun caller already falls back on error — Blob →
heap, spawn stdio → pipe, process IPC → socketpair — so kernel 3.10
(RHEL 7) works today, but every call retries the failing syscall.
This PR:
Caches ENOSYS in bun.sys.memfd_create so subsequent calls return
immediately
Adds BUN_FEATURE_FLAG_DISABLE_MEMFD to force the fallback (seccomp
environments, testing)
Tests that Blob and spawn-stdin work with the flag set
Fixes docs/installation.mdx: "minimum kernel 5.1" was never true
(the io_uring check it referenced has zero callers). Actual floor is
~3.10 with degraded atomicity.
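The ENOSYS caching idea can be modeled as follows. This is a TypeScript sketch of the pattern, not the Zig code in bun.sys.memfd_create. The result type and wrapper name are hypothetical:

```typescript
// Hypothetical result type standing in for a syscall result.
type MemfdResult =
  | { ok: true; fd: number }
  | { ok: false; errno: "ENOSYS" | "EMFILE" };

// Wraps a raw syscall so that once the kernel reports ENOSYS,
// later calls fail immediately without re-entering the kernel.
function makeCachedMemfdCreate(rawSyscall: () => MemfdResult): () => MemfdResult {
  let unsupported = false;
  return () => {
    if (unsupported) return { ok: false, errno: "ENOSYS" };
    const result = rawSyscall();
    // Only ENOSYS is permanent; other errors may succeed on retry.
    if (!result.ok && result.errno === "ENOSYS") unsupported = true;
    return result;
  };
}

let syscallCount = 0;
const memfdCreate = makeCachedMemfdCreate(() => {
  syscallCount++;
  return { ok: false, errno: "ENOSYS" };
});
memfdCreate();
memfdCreate();
memfdCreate();
console.log(syscallCount); // the raw syscall ran once
```

Callers keep their existing fallbacks (heap, pipe, socketpair). The cache only removes the repeated failing syscall.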
Complements #29461 (glibc 2.17).
Test plan
CI: linux test suite passes with new memfd-disabled.test.ts
setOnCloseFromJS stored the callback in a jsc.Strong, which forms a
rooted cycle: source-wrapper → close_jsvalue Strong → bound #onClose
→ NativeReadableStreamSource (ReadableStreamInternals.ts:1972) → $stream private prop (:1959) → source-wrapper. Because a Strong is a
global GC root, the source survives even after every JS reference
(including the outer ReadableStream) is dropped. It only becomes
collectable when EOF/close runs the JS-side callClose (which clears $stream) or at VM shutdown.
The codegen already declares onCloseCallback in streams.classes.ts values; onDrain already uses its cached slot. Switch onClose to
the same WriteBarrier-backed storage and delete the Strong field. The
cycle becomes an ordinary intra-heap cycle that mark-sweep collects.
2. Windows non-lazy FileReader across-read ref
FileReader.onStart holds an incrementCount() until onReaderDone
only on the lazy path (always) or the POSIX non-lazy path. The Windows
non-lazy path — fromPipe, reached via Bun.spawn().stdout/.stderr —
did not. With the cycle fix above, the source is now collectable while a uv_read_start IOCP read is pending, and WindowsBufferedReader.deinit
would run with a live .pipe source whose data ptr is then
dereferenced by the queued onStreamRead. Add a Windows arm matching
the POSIX one.
3. Release the across-read ref in onReaderError too
onReaderDone checks waiting_for_onReaderDone and decrements; onReaderError did not, so a read that ends in error (rather than EOF)
leaked the ref taken in onStart. Pre-existing on the lazy and POSIX
paths; commit 2 adds a Windows arm that would inherit the same gap.
Mirror the release after pending.run().
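The ref accounting in sections 2 and 3 can be modeled with a small sketch. The class below is a hypothetical stand-in, not Bun's FileReader. It only shows why both completion callbacks must release the ref taken in onStart:

```typescript
// Hypothetical model of the across-read ref: onStart takes one extra ref,
// and whichever completion callback fires first must give it back.
class ReaderModel {
  refCount = 1; // the owner's own reference
  waitingForOnReaderDone = false;

  onStart(): void {
    this.refCount++;
    this.waitingForOnReaderDone = true;
  }

  private releaseAcrossReadRef(): void {
    // Guarded by the flag so a second callback cannot release twice.
    if (this.waitingForOnReaderDone) {
      this.waitingForOnReaderDone = false;
      this.refCount--;
    }
  }

  onReaderDone(): void {
    this.releaseAcrossReadRef();
  }

  // Before the fix described above, the error path skipped this release.
  onReaderError(): void {
    this.releaseAcrossReadRef();
  }
}

const reader = new ReaderModel();
reader.onStart();
reader.onReaderError();
console.log(reader.refCount); // back to the owner's single ref
```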
Verification
On Windows, with a child that spawns a detached grandchild inheriting
stdout (so the pipe stays open after the direct child exits), repeatedly
accessing proc.stdout, dropping it, and forcing GC:
| | *ReadableStreamSource heap count after 30 iters | WindowsBufferedReader.deinit reached with live .pipe source |
|---|---|---|
| baseline | 31 (linear growth; 61 at 60 iters) | no — leak masks it |
| commit 1 only | ~14 (plateaus at live-grandchild count) | yes — FileReader.deinit sees src=pipe, closed=false |
| commits 1–3 | ~15 (plateaus; freed as pipes EOF; flat through 80 iters) | no |
Relation to #29440
Found while verifying the review comment on #29440 about WindowsBufferedReader.deinit ordering. That comment correctly
identified the buffer-free-before-detach as theoretical; this PR
explains why (the Strong cycle pinned the source) and fixes the
underlying leak plus the UAF that fixing the leak would have exposed.
f8d425 Migrate TCPSocket/TLSSocket from hasPendingActivity to jsc.JSRef (#29451)
97d9da ci(binary-size): drop release comparison column (#29468)
What
Removes the "release" comparison column from the binary-size CI
annotation. The table now compares only against canary (latest main).
Why
Tagged release builds are configured differently from canary/PR builds
(less debug code baked in), so they come out ~1–2 MB smaller. That makes
the release Δ column read as a constant "+1.x MB" on every PR regardless
of what the PR actually changed — it's noise that looks like signal. The
canary delta is what answers "did this PR grow the binary."
Details
Dropped releaseFallback hardcoded size table
Dropped fetchReleaseBaseline() (git ls-remote tag lookup)
Dropped release field from Row and the corresponding HTML column /
console output
Threshold check was already canary-only — unchanged
[skip size check] escape hatch — unchanged
Net: −42 lines in scripts/binary-size.ts.
40ffda deps(mimalloc): set MI_OVERRIDE=OFF on Windows (#29467)
Summary
Windows debug builds fail to link since the dev3 mimalloc bump (#29420 /
#29435):
lld-link: error: duplicate symbol: _expand
>>> defined at mimalloc-debug.lib(alloc.c.obj)
>>> defined at libucrtd.lib(expand.obj)
Root cause: mimalloc.ts never set MI_OVERRIDE for Windows; there was only
a comment claiming the upstream default was "no override". The actual
default is ON. Pre-dev3 this was harmless because alloc-override.c's _MSC_VER block was an empty comment ("cannot override malloc unless
using a dll"). Upstream microsoft/mimalloc#1259
/ #1263 filled it with real CRT symbol definitions (_expand, _msize, _msize_base, _free_base, free), so the static lib now exports them
and collides with the debug CRT.
Fix: explicit MI_OVERRIDE=OFF on Windows. Bun links the static CRT
and calls mi_* directly; nothing routes through CRT malloc, so
override has no benefit there. This restores the effective pre-dev3
behavior.
Not stale cache: dep_configure already uses cmake --fresh, so
the cache was correctly regenerated — it got ON because that's the
real default.
Why CI didn't catch it: all ci-* profiles use buildType: "Release" (/MT → libucrt.lib). The duplicate only fires under /MTd because libucrtd.lib's expand.obj is pulled in for its
debug-heap symbols. CI never builds Windows debug.
Test plan
bunx tsc --noEmit -p scripts/build/tsconfig.json
bun scripts/build.ts --configure-only
Windows debug: bun bd --version → 1.3.13-debug (previously
failed at link)
Verified build/debug/deps/mimalloc/CMakeCache.txt shows MI_OVERRIDE:BOOL=OFF after reconfigure
CI (release Windows — also flips ON→OFF; expected no-op since
pre-dev3 override did nothing on Windows static)
983ee6 debugger: block on a condvar instead of spinning while paused (#29438)
What does this PR do?
Fixes #21654 — Bun pegs one CPU core at 100% while paused at a
breakpoint (or debugger; statement) in VSCode / Cursor / debug.bun.sh.
Repro
// index.js
debugger;
bun --inspect-wait=localhost:6499/ index.js
# attach any inspector client, let it stop at `debugger;`
# → ~100% of one core for as long as you're paused
Root cause
When JSC pauses execution, it calls BunInspectorConnection::runWhilePaused on the JS thread, which looped:
while (!isDoneProcessingEvents) {
connection->receiveMessagesOnInspectorThread(...); // non-blocking, usually empty
}
receiveMessagesOnInspectorThread just swaps an almost-always-empty Vector under a lock and returns, so the loop spins at full speed.
There was already a jsWaitForMessageFromInspectorLock in the file
intended for this, but the waiting side was commented out and the lock
was only ever unlocked, never acquired.
Fix
Replace the spin with a WTF::Lock + WTF::Condition wait:
runWhilePaused drains pending messages from each connection, then
waits on the condition (with a 1-second safety-net timeout) until either
a new message arrives or a connection disconnects.
sendMessageToInspectorFromDebuggerThread, connect() and disconnect() notify the condition after updating state so the paused
thread wakes immediately — round-trips for Runtime.evaluate while
paused stay in the low-ms range.
anyConnectionHasPendingWork() re-checks each connection's queue
under pausedWaitLock before sleeping so wakeups can't be missed.
The single- and multi-connection branches are merged into one loop;
when every connection is gone we continueProgram() and exit instead of
looping forever.
The unused jsWaitForMessageFromInspectorLock member and its isLocked() / unlockFairly() dance are removed.
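The PR uses WTF::Lock and WTF::Condition in C++. As a rough analogue only, the difference between spinning and a blocking wait with a timeout can be sketched with Atomics.wait, which blocks the calling thread in runtimes that allow it (for example Node.js or Bun, but not a browser main thread). The shared cell and function name are assumptions made for illustration:

```typescript
// One shared 32-bit cell: 0 = no pending work, 1 = work pending.
const pending = new Int32Array(new SharedArrayBuffer(4));

function waitForWork(timeoutMs: number): "ok" | "not-equal" | "timed-out" {
  // Atomics.wait compares the cell with 0 and sleeps only if it is still 0,
  // so a notification that landed just before the call is not missed.
  // The timeout plays the role of the safety net described above.
  return Atomics.wait(pending, 0, 0, timeoutMs);
}

// Nothing pending: the thread sleeps instead of spinning, then times out.
const idle = waitForWork(20);

// A producer marks work as pending and wakes any waiter.
Atomics.store(pending, 0, 1);
Atomics.notify(pending, 0);

// Work is already pending: the wait returns immediately without sleeping.
const busy = waitForWork(20);
console.log(idle, busy);
```

The compare-then-sleep step is the part that corresponds to re-checking the queues under the lock before waiting.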
Verification
test/regression/issue/21654/21654.test.ts spawns a child with --inspect-wait, attaches over WebSocket, enables the debugger, hits a debugger; statement, sleeps 2 s while paused, then resumes. The child
reports its own process.cpuUsage() delta across the pause.
| | CPU while paused (2 s) | Runtime.evaluate round-trip while paused |
| --- | --- | --- |
| before | ~100% | ~1 ms |
| after | <15% (debug+ASAN; ~0% release) | ~25 ms (debug+ASAN) |
The test asserts <50% CPU and <500 ms round-trip. Also manually verified
that closing the WebSocket while paused resumes the program.
e2fd5f Fix BroadcastChannel channelToContextIdentifier locking and dispatchMessage lifetime (#29441)
What does this PR do?
Fixes two data races in BroadcastChannel.cpp that surface as ASAN
heap-use-after-free in test/js/web/broadcastchannel/broadcast-channel-worker-gc.test.ts.
Bug A — channelToContextIdentifier HashMap one-sided locking
The prior ThreadSafeWeakPtr fix only covered allBroadcastChannels().
The second global, channelToContextIdentifier(), has its own lock —
but it was taken at only 1 of 4 call sites:
| Site | Thread | Lock? |
|---|---|---|
| registerChannel.add() | main | ❌ |
| unregisterChannel.remove() | main | ❌ |
| dispatchMessageTo.get() | main | ❌ |
| contextIdForBroadcastChannelId.get() | worker (via ensureOnContextThread → dispatchMessage) | ✅ |
When main rehashes the HashMap (add/remove during worker
spawn/terminate) while a worker reads it, the worker walks a freed
bucket array → ASAN heap-UAF inside WTF::HashTable. The accessor was
also missing WTF_REQUIRES_LOCK, so -Wthread-safety never flagged
this.
Fix: add Locker locker { channelToContextIdentifierLock }; at the
three unlocked sites and annotate the accessor with WTF_REQUIRES_LOCK(channelToContextIdentifierLock) to match allBroadcastChannels().
Bug B — dispatchMessage captures raw this in async task
dispatchMessage posts a task with [this, message = ...] — raw this, no Ref { *this }. The caller (dispatchMessageTo's inner
lambda) holds a strong RefPtr from the ThreadSafeWeakPtr lookup, but
that ref is dropped when the outer lambda returns. During worker
terminate the JS wrapper is destroyed → refcount 0 → ~BroadcastChannel
→ the queued task reads freed this->m_isClosed and calls this->dispatchEvent().
Fix: capture protectedThis = Ref { *this } in the postTaskTo
lambda, matching the pattern in MessagePort.cpp, Performance.cpp,
and WebSocket.cpp.
How did you verify your code works?
bun bd test test/js/web/broadcastchannel/broadcast-channel-worker-gc.test.ts — 3/3
pass, verified stable across 3 consecutive runs under debug+ASAN
bun bd test test/js/web/broadcastchannel/broadcast-channel.test.ts —
10/11 pass; the one failure (broadcast channel worker wait) is
pre-existing on main under debug+ASAN (it uses Bun.sleepSync(500)
which isn't enough for an ASAN worker to start) and is unrelated to this
change
Test changes
Added a stress test that churns channel registrations (forcing HashMap
rehashes) while workers cross-post (reaching the worker-side map read),
then terminates workers mid-dispatch (leaving queued tasks whose this
would otherwise dangle).
Filtered the unconditional ASAN startup warning from child-process stderr so expect(stderr).toBe("") holds on ASAN builds — same
pattern as fetch-abort-queued.test.ts / string-decoder.test.js.
Scaled timeouts for isDebug || isASAN — worker spawn under
debug+ASAN is ~5–10× slower; the existing tests were borderline at the
5s default.
Note: both races are highly timing-dependent (a HashMap rehash must land
mid-get()); 20 local ASAN runs on macOS did not repro before the fix.
The new stress test maximises contention but is not guaranteed to fail
without the fix on every platform.
a96270 cron: don't report TerminationException as uncaught on worker terminate (#29457)
aa16dd sys(windows): don't panic on unnamed NTSTATUS in openDirAtWindowsNtPath (#29443)
3845ee Fix segfault in Bun.pathToFileURL when URL construction fails on Windows (#29448)
5dfa63 module_loader: remove undefined backing for bun:main source (#29450)
7e4774 css/small_list: fix tryGrow over-allocating by @sizeOf(T) on heap realloc (#29452)
8f2519 deps: replace cloudflare/zlib with zlib-ng 2.3.3 (#29433)
What does this PR do?
Replaces the cloudflare/zlib fork (last commit Oct 2023) with zlib-ng 2.3.3 in ZLIB_COMPAT
mode. zlib-ng is actively maintained, ships in Node 24+ and Chromium,
and provides runtime-dispatched SIMD across
AVX-512/AVX2/SSE2/NEON/SVE/RVV for CRC32, adler32, longest-match, and
chunk-copy.
Supersedes #16100, #8529.
Benchmarks
Xeon Platinum 8375C (Ice Lake, AVX-512), linux-x64 release build vs
system bun 1.3.13. Run with bench/snippets/zlib-comprehensive.mjs and bench/snippets/zlib.mjs (both included).
| Operation | cloudflare | zlib-ng | Speedup |
|---|---|---|---|
| gzipSync html-128K L1 | 275 µs | 107 µs | 2.59x |
| gzipSync html-1M L1 | 2.23 ms | 892 µs | 2.50x |
| gzipSync json-128K L6 | 897 µs | 483 µs | 1.86x |
| deflate 123K L6 (async) | 373 µs | 68 µs | 5.48x |
| gunzipSync html-1M | 561 µs | 522 µs | 1.07x |
| gunzipSync binary-128K | 31.6 µs | 26.7 µs | 1.18x |
| createGzip stream L1 1M | 3.76 ms | 2.68 ms | 1.40x |
| createGunzip stream 1M | 1.24 ms | 1.18 ms | 1.05x |
| fetch() 11KB gzip decode | 42.9 µs | 41.6 µs | parity |
| gzipSync 13B (init overhead) | 5.04 µs | 7.12 µs | 0.71x |
The streaming-inflate regression that blocked #16100 (Jan 2025, zlib-ng
pre-2.2) does not reproduce on 2.3.3. The only downside is ~2µs
higher per-stream init cost from larger state structs, amortized away on
payloads ≥4KB.
Compression ratio at level=6 is +0.4% vs cloudflare (different
match-finding heuristics). Negligible.
Security hardening
Built with -DWITH_INFLATE_STRICT=ON. zlib-ng commit 340f2f6e moved inflateBack()'s distance-too-far-back check behind a default-off #ifdef; upstream zlib has it unconditional. Bun doesn't call inflateBack(), but this hardens against heap OOB reads on malicious
raw-deflate with windowBits<15 for anything else linking the same lib,
at zero cost to inflate() proper.
Why pin to 2.3.3 (not develop)
Two regressions landed on zlib-ng develop after 2.3.3 that are not
present at this commit (documented in zlib.ts):
e5129cfe — deflateBound() hits __builtin_unreachable() after Z_FINISH
Re-audit before bumping past 2.3.3.
Build system changes
zlib-ng generates zlib.h at cmake-configure time into the build
dir (it doesn't exist in source). This required:
provides.includes → depBuildDir(cfg, "zlib") instead of source dir
libarchive's -I → build dir
fetchDeps now resolves to the cross-dep's build outputs (lib
files) instead of just the source .ref stamp, so libarchive's
configure waits for zlib's configure to have run. resolveDep() takes a
map of previously-resolved deps.
Drops 4 cloudflare-specific vendor patches.
How did you verify your code works?
linux-x64 release build: bun run build:release clean → smoke
test passes
bun bd test test/js/node/zlib/: deflate/gzip/inflate tests pass
(1 unrelated brotli timeout in debug — createBrotliCompress slowness,
untouched by this PR)
Build-graph ordering verified: build.ninja shows libarchive
configure has deps/zlib/libz.a as order-only input
Co-authored-by: root <root@ip-10-0-2-234.us-west-2.compute.internal>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
52f68d deps: bump mimalloc to 57029fb1 (upstream dev3 a3fb9498) (#29435)
Merges upstream dev3 (22 commits) into bun-dev3-v2. With our config
(MI_NO_OPT_ARCH=ON, MI_OSX_ZONE=OFF, MI_NO_PROCESS_DETACH=ON,
MI_OVERRIDE=OFF on macOS/Win), only the bitmap-purge restore fixes
(65d70e3c, d5861285) reach compiled code: when mi_arenas_try_purge
early-exits, freed slices not yet visited are now put back on the
purge bitmap instead of being lost, so the next scavenge cycle can
return them to the OS.
Also picks up oven-sh/mimalloc@809f7f32 which extends
MI_NO_PROCESS_DETACH (already set in #29420) to gate
_mi_auto_process_done itself, covering Windows mi_win_main /
.CRT$XPU paths in addition to the POSIX destructor.
11ffb7 blob: clamp stat.size to max_size to avoid @intCast panic in ReadFile (#29355)
What does this PR do?
Fixes a Zig safety-check panic in ReadFile.resolveSizeAndLastModified
(called from runAsyncWithFD) when fstat reports a size larger than maxInt(u52).
SizeType is u52. The outer @truncate is dead code: the inner @intCast to u52 runs first and panics with integerOutOfBounds
whenever the (non-negative) stat size exceeds maxInt(u52). Verified
with objdump that resolveSizeAndLastModified contained a single call
to debug.FullPanic.integerOutOfBounds mapping to this line.
The fuzzer hit this via an fd-based ReadFile task scheduled on the
thread pool in a prior REPRL iteration, which then ran after the fd
context changed. In a standalone run of the minimized script, doReadFile is never invoked — the crash depends on cross-iteration
thread-pool state, which is why it's extremely flaky and not directly
reproducible outside the REPRL harness.
Fix
Clamp stat.size to [0, Blob.max_size] before casting, so the @intCast is always in range. Applied to both the POSIX (ReadFile)
and Windows (ReadFileUV) paths.
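The clamp can be illustrated numerically. The sketch below assumes Blob.max_size equals maxInt(u52), that is 2^52 - 1, which the description implies but does not state. The function name is hypothetical, and the real fix is in Zig:

```typescript
// Assumption: Blob.max_size is maxInt(u52) = 2^52 - 1 (about 4.5 PB).
const MAX_SIZE = 2 ** 52 - 1;

// Clamp a reported stat size into [0, MAX_SIZE] so a later narrowing
// cast to a 52-bit unsigned integer can never be out of range.
function clampStatSize(statSize: number): number {
  return Math.min(Math.max(statSize, 0), MAX_SIZE);
}

console.log(clampStatSize(1024)); // ordinary sizes pass through unchanged
console.log(clampStatSize(2 ** 53)); // oversized values are capped at MAX_SIZE
console.log(clampStatSize(-1)); // negative values are raised to 0
```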
While here: set system_error when initCapacity fails in the POSIX
path so OOM propagates to JS as an error instead of being silently
treated as an empty read. This matches what ReadFileUV already does.
How did you verify your code works?
objdump confirms integerOutOfBounds is no longer emitted in resolveSizeAndLastModified (1 → 0 call sites).
bun bd test test/js/bun/util/bun-stdin-slice.test.ts passes (covers
fd-based ReadFile path).
bun bd test test/js/bun/util/bun-file-read.test.ts passes.
bun bd test test/js/bun/util/bun-file.test.ts passes.
Manual check: Bun.file(fd).text() on a regular file fd still works.
No new regression test is added because triggering the original panic
requires fstat to report a size > 4.5 PB, which is not achievable in
the test environment; the fix is verified structurally and the affected
code path is already covered by the existing stdin-slice tests.
ee51bb libarchive: keep upstream damaged-block retry semantics on the buffered path (#29430)
Follow-up to #29404.
Problem
nonblocking-read.patch routed upstream libarchive's pre-existing
damaged-block ARCHIVE_RETRY (bad header checksum → skip this block and
try the next one) through the same bun_retry label as a non-blocking
yield. That left both tar->header_in_progress and a->read_header_in_progress set, so the next archive_read_next_header() call skipped archive_entry_clear, archive_clear_error, ++file_count and tar_reset_header_state — a
behaviour change on the ordinary buffered extract path even when the
reader never returns ARCHIVE_RETRY.
Repro
A tarball shaped [pax 'g' global header][block with bad checksum][pax 'g' global header][regular file], installed via file:./pkg.tgz
(always buffered — PackageManagerTask.readAndExtract → Archiver.extractToDir):
upstream libarchive / this PR: the damaged block is consumed,
state is fully reset, the second g header is accepted, the file is
extracted.
main (64951540d5): seen_headers = seen_g_header leaks across the
retry → the second g header trips "Redundant 'g' header" → ARCHIVE_FATAL → error: Fail extracting tarball.
Fix
Make the format reader the authority on whether a header read is
mid-flight:
tar_read_header's bun_retry: label now sets a->read_header_in_progress = 1 explicitly.
The damaged-block branch clears it and exits via TAR_HEADER_RETURN(ARCHIVE_RETRY) (which also clears tar->header_in_progress), so the next call runs the full upstream
reset.
archive_read_next_header2 now only clears the flag on terminal
results instead of setting it on every ARCHIVE_RETRY.
archive_read_format_tar_read_header only takes the early return ARCHIVE_RETRY when tar->header_in_progress is still set; a
damaged-block retry falls through to the original post-read handling
(sparse-list add etc.), matching upstream.
The streaming path is unaffected — the existing drip-feed tests in bun-install-streaming-extract.test.ts still pass.
On the consume_header change (point 2 in the report)
Zeroing next_in/avail_in before inflateInit2(-15) is intentionally
left as-is: zlib's inflateInit2_ never reads either field (verified
against vendor/zlib/inflate.c), and gzip_filter_read re-primes them
from __archive_read_filter_ahead before every inflate(). Removing
the extra read-ahead is what lets consume_header avoid an ARCHIVE_RETRY after the header has already been consumed; it's a no-op
on the buffered path.
Verification
Fix is in patches/, so the usual git stash -- src/ gate doesn't
cover it. Verified manually:
# main's nonblocking-read.patch
bun bd test test/cli/install/bun-install-streaming-extract.test.ts -t damaged-block
(fail) buffered extract: damaged-block retry resets header state (upstream semantics)
error: expect(received).not.toContain(expected)
Expected to not contain: "Fail extracting tarball"
Received: "...error: Fail extracting tarball from damaged-pkg..."
# this PR's patch
bun bd test test/cli/install/bun-install-streaming-extract.test.ts -t damaged-block
(pass) buffered extract: damaged-block retry resets header state (upstream semantics)
Full bun-install-streaming-extract.test.ts (5 tests, both streaming
and buffered paths) and test/js/bun/archive.test.ts (99 tests) pass.
50be3c zig: hoist try out of tagged-union literals to avoid partial writes (#29422)
What
Fixes four sites where a tagged-union assignment of the form lhs = .{ .tag = <expr> } has early-exit control flow (try, catch return)
inside <expr>. Zig writes the union tag to the result location before evaluating the payload expression, so if the early-exit fires
the union is left with the new tag and the old/garbage payload bytes.
The fix everywhere is to hoist the fallible expression into a temporary
before assigning the union literal.
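The hazard is specific to how Zig writes through result locations, but the ordering problem can be modeled in TypeScript. In this sketch the Slot type and both assign functions are hypothetical. The first writes the tag before running the fallible initializer, and the second hoists the initializer into a temporary first:

```typescript
// Hypothetical stand-in for a tagged union stored in place.
type Slot = { tag: "none" | "zlib"; reader: string | null };

function failingInit(): string {
  throw new Error("init failed");
}

// Models the problematic order: tag first, fallible payload second.
function assignInPlace(slot: Slot, init: () => string): void {
  slot.tag = "zlib";
  slot.reader = init(); // if this throws, the tag has already flipped
}

// Models the fix: evaluate the fallible expression into a temporary,
// then write tag and payload together.
function assignHoisted(slot: Slot, init: () => string): void {
  const reader = init();
  slot.tag = "zlib";
  slot.reader = reader;
}

function tryAssign(assign: (slot: Slot, init: () => string) => void): Slot {
  const slot: Slot = { tag: "none", reader: null };
  try {
    assign(slot, failingInit);
  } catch {
    // In real code the error propagates; here we only inspect the slot.
  }
  return slot;
}

console.log(tryAssign(assignInPlace)); // tag flipped, payload never written
console.log(tryAssign(assignHoisted)); // slot untouched
```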
Sites fixed
src/http/Decompressor.zig — this.* = .{ .zlib = try Zlib.init(...) } (and brotli/zstd). On init failure the tag flips with
a garbage *Reader pointer; InternalState.reset() later calls decompressor.deinit() which dereferences it.
src/bun.js/webcore/Body.zig — this.* = .{ .Locked = .{ .readable = ...(try ReadableStream.fromJS(...)).?, ... } }. If fromJS
throws, Body.Value (heap state on Request/Response) is left as .Locked with garbage Strong/*JSGlobalObject; later body access or
GC finalize reads it.
src/bun.js/api/bun/socket/Listener.zig (Windows) — this.listener = .{ .namedPipe = listen(...) catch return throw(...) }
with errdefer this.deinit() registered. On listen failure errdefer
runs deinit(), which hits bun.assert(this.listener == .none) — but
the tag was already flipped to .namedPipe.
src/bun.js/api/JSBundler.zig — resolve.value = .{ .err = Msg.fromJS(...) catch { ...; return; } }. On JS exception the heap *Resolve/*Load is left with .err tag and garbage Msg.
Test plan
bun bd builds
bun run zig:check-all passes (Listener.zig change is
Windows-only)
2a3278 install: fix bunx @anthropic-ai/claude-code + add bunx claude alias (#29428)
What
bunx @anthropic-ai/claude-code (2.1.113+) exits silently with code 1
instead of running the CLI.
Also adds bunx claude as a shorthand for bunx @anthropic-ai/claude-code, matching the existing bunx tsc → typescript alias.
Why
Bun's native-binlink optimization (added for esbuild and @anthropic-ai/claude-code in postinstall_optimizer.zig) skips the
package's postinstall and instead symlinks .bin/<name> directly into
the matching platform-specific optional dependency. It reused the parent
package's bin target path when looking inside the platform
package, which only works if both lay the binary out the same way.
claude-code 2.1.113+: parent bin: {claude: "bin/claude.exe"} (a
no-shebang placeholder the postinstall normally replaces), but @anthropic-ai/claude-code-linux-x64 ships the real binary at the package root as claude and has no bin field of its own.
So:
shouldIgnoreLifecycleScripts saw a matching platform
optionalDependency and skipped postinstall.
Fell back to linking the parent's placeholder stub.
bunx execve'd a shebang-less text file → ENOEXEC → silent exit 1.
Fix
src/install/bin.zig: when Bin.Linker is redirected into a
platform package (native binlink) and the root package's bin path
doesn't exist there, also try the root package's bin name at the
platform package root before abandoning the redirect. Both candidates
come straight from the root package's bin entry (value and key
respectively). If neither exists it still falls through to the existing
retry-without-redirect path.
src/cli/bunx_command.zig: bunx claude → @anthropic-ai/claude-code (the npm package named claude is an
unrelated squatter with no bin). Also sets initial_bin_name = "claude" for the full package name so the fast-path lookup works.
Skipped when --package is explicitly given.
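The candidate order described for Bin.Linker can be sketched as below. The function name, parameters, and the injected exists predicate are hypothetical. The real logic lives in src/install/bin.zig:

```typescript
// Hypothetical sketch of the lookup order inside a platform package:
// first the root package's bin path (value), then its bin name (key)
// at the platform package root. `exists` is injected for testability.
function resolveNativeBin(
  platformPkgDir: string,
  binName: string, // key of the root package's bin entry, e.g. "claude"
  binPath: string, // value of the root package's bin entry, e.g. "bin/claude.exe"
  exists: (path: string) => boolean,
): string | null {
  const candidates = [`${platformPkgDir}/${binPath}`, `${platformPkgDir}/${binName}`];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  // Neither exists: the caller falls back to retrying without the redirect.
  return null;
}

const files = new Set<string>(["node_modules/@anthropic-ai/claude-code-linux-x64/claude"]);
const resolved = resolveNativeBin(
  "node_modules/@anthropic-ai/claude-code-linux-x64",
  "claude",
  "bin/claude.exe",
  (path) => files.has(path),
);
console.log(resolved);
```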
Verification
$ bunx claude --version
2.1.114 (Claude Code)
$ readlink node_modules/.bin/claude # after bun add
../@anthropic-ai/claude-code-linux-x64/claude # was: ../@anthropic-ai/claude-code/bin/claude.exe
Hermetic tests:
bun-install-native-binlink.test.ts: new fixture packages mirror the
claude-code layout for both hoisted and isolated linkers; existing
esbuild-style and pure-fallback tests still pass.
bunx.test.ts: mock-registry test confirms bunx claude requests @anthropic-ai/claude-code, not the claude squatter.
649515 install: stream tarball extraction from HTTP into libarchive (#29404)
What
bun install now extracts package tarballs while they are still
downloading, instead of buffering the full .tgz and then the full
decompressed .tar in memory before handing both to libarchive.
How
Zig side (src/install/TarballStream.zig, NetworkTask.zig, runTasks.zig):
NetworkTask.forTarball enables the HTTP client's response_body_streaming signal (same mechanism fetch() uses).
NetworkTask.notify now runs once per body chunk on the HTTP thread.
On the first 2xx chunk it commits to streaming: each chunk is pushed
into a heap-held TarballStream and a drain task is scheduled on manager.thread_pool. Non-2xx / transport errors before the first chunk
fall back to the existing buffered path so retry and error reporting are
unchanged.
TarballStream owns the struct archive *, the open output bun.FD,
and a want_header/want_data phase. The drain task calls archive_read_next_header / archive_read_data_block until libarchive
reports ARCHIVE_RETRY (out of input), then returns — the worker is
released. The next chunk reschedules the drain task; because all
libarchive state lives on its own heap, the next call resumes exactly
where it stopped. No condvar, no extra thread pool.
Integrity is hashed incrementally (Integrity.Streaming) over the
compressed bytes and verified before the temp tree is promoted into the
cache.
extract_tarball.zig's rename-into-cache / package.json bookkeeping
was factored into moveToCacheDirectory so the streaming and buffered
extractors share it.
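The drain-and-resume behaviour described above can be illustrated with a toy model. This is only a sketch, not the TarballStream implementation: ToyStream, Drain, and the one-byte length-prefixed record format are all hypothetical, and the real code is Zig driving libarchive with the necessary cross-thread synchronisation, which this single-threaded example omits. The point it shows is that when all parse state lives in the stream object, a drain call can return on "out of input" and the next call picks up exactly where it stopped.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Retry means "out of input for now"; Done means the stream ended cleanly.
enum class Drain { Retry, Done };

// Hypothetical toy stream. Each record is one length byte followed by that
// many payload bytes (a stand-in for tar header + data blocks).
class ToyStream {
public:
    // Producer side: append a chunk; `last` marks end of the response body.
    void push(const std::string& chunk, bool last = false) {
        pending_.insert(pending_.end(), chunk.begin(), chunk.end());
        eof_ = eof_ || last;
    }

    // Consumer side: consume everything buffered, then return. Progress is
    // stored in member fields, so the next call resumes mid-record.
    Drain drain() {
        while (!pending_.empty()) {
            const unsigned char byte = static_cast<unsigned char>(pending_.front());
            pending_.pop_front();
            if (phase_ == Phase::WantHeader) {
                remaining_ = byte;
                current_.clear();
                if (remaining_ == 0) {
                    entries_.push_back(current_);  // empty record
                } else {
                    phase_ = Phase::WantData;
                }
            } else {
                current_.push_back(static_cast<char>(byte));
                if (--remaining_ == 0) {
                    entries_.push_back(current_);
                    phase_ = Phase::WantHeader;
                }
            }
        }
        const bool clean_end = eof_ && phase_ == Phase::WantHeader;
        return clean_end ? Drain::Done : Drain::Retry;
    }

    const std::vector<std::string>& entries() const { return entries_; }

private:
    enum class Phase { WantHeader, WantData };
    std::deque<char> pending_;
    std::vector<std::string> entries_;
    std::string current_;
    std::size_t remaining_ = 0;
    Phase phase_ = Phase::WantHeader;
    bool eof_ = false;
};
```

Feeding the stream one byte at a time and calling drain() after each push yields the same entries as pushing everything at once, which is the property the streaming test checks against the buffered path.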
Upstream libarchive has no way for the client read callback to say "no
data yet" — any negative return sets filter->fatal = 1 and 0 sets filter->end_of_file = 1, both terminal. The patch teaches the read
path to propagate ARCHIVE_RETRY without poisoning state:
__archive_read_filter_ahead / advance_file_pointer: when the
reader returns ARCHIVE_RETRY, keep whatever is already in filter->buffer and surface ARCHIVE_RETRY via *avail instead of
setting fatal.
gzip filter: peek_at_header / consume_header / consume_trailer / gzip_filter_read propagate retry; a trailer_pending flag makes consume_trailer re-entry-safe.
tar reader: read_data and skip propagate retry. tar_read_header
pre-buffers extension-header payloads before consuming the block, hoists seen_headers/eof_fatal/err into struct tar behind header_in_progress, and _archive_read_next_header2 skips archive_entry_clear while a header read is in progress, so a retry
between a pax x/GNU L header and the real ustar header resumes
cleanly.
Gated behind BUN_FEATURE_FLAG_DISABLE_STREAMING_INSTALL (streaming on
by default).
Memory
Before: compressed_size (HTTP buffer) + decompressed_size
(zlib/libdeflate output) + libarchive internals per tarball.
After: only the in-flight HTTP chunk(s) plus libarchive's fixed
per-archive buffers. The full .tgz/.tar are never materialised.
Tests
test/cli/install/bun-install-streaming-extract.test.ts — drip-feeds a
~80 KB tarball (40 incompressible files + a >100-byte path that forces a
pax x header) in 1 KB chunks:
streaming path: every entry extracted byte-identically to the buffered
path, --verbose output confirms Streamed … tarball was taken.
buffered path with BUN_FEATURE_FLAG_DISABLE_STREAMING_INSTALL=1:
same output, no Streamed … line.
mismatched integrity: install fails before promoting the temp dir.
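The "hash incrementally, verify before promoting" ordering can be sketched as follows. This is an illustration only: StreamingHash, hash_all, and should_promote are hypothetical names, and FNV-1a is swapped in as a tiny stand-in for the real integrity digest so the example stays self-contained. It shows that updating a hash chunk by chunk gives the same result as hashing the whole payload, and that promotion is decided only after the final comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in incremental hash (64-bit FNV-1a), NOT the digest bun uses.
class StreamingHash {
public:
    void update(const std::string& chunk) {
        for (unsigned char c : chunk) {
            state_ ^= c;
            state_ *= 1099511628211ULL;  // FNV-1a 64-bit prime
        }
    }
    std::uint64_t finish() const { return state_; }

private:
    std::uint64_t state_ = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
};

// One-shot helper, used here only to compute the expected value.
std::uint64_t hash_all(const std::string& bytes) {
    StreamingHash h;
    h.update(bytes);
    return h.finish();
}

// Hashes the compressed chunks as they arrive and reports whether the temp
// tree may be promoted. Extraction would run alongside update() in practice.
bool should_promote(const std::vector<std::string>& chunks, std::uint64_t expected) {
    StreamingHash h;
    for (const std::string& chunk : chunks) {
        h.update(chunk);
    }
    return h.finish() == expected;
}
```

A caller would delete the temp directory when should_promote returns false, matching the "install fails before promoting the temp dir" case above.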
19635e fix(tls): race in root certificate initialization causing segfault (#29426)
What does this PR do?
Fixes a data race in us_internal_init_root_certs() that could segfault
or return truncated CA certificate lists when multiple threads (e.g.
Workers) hit the initialization path concurrently.
How did you verify your code works?
New test test/js/node/tls/node-tls-root-certs-concurrent-init.test.ts
— 16 Workers concurrently call tls.getCACertificates() while NODE_EXTRA_CA_CERTS points at a ~435-cert bundle.
Before — segfaults on every run:
panic: Segmentation fault at address 0x29A3A2C
and when it didn't segfault, workers observed wildly different counts (0
/ 83 / 145 / 303 / …).
After — all 16 Workers see the exact same, fully-populated list.
Root cause
if (std::atomic_load(&root_cert_instances_initialized) == 1)
  return;  // (3) reader skips here
while (atomic_flag_test_and_set_explicit(&lock, …)) ;
if (!atomic_exchange(&root_cert_instances_initialized, 1)) {  // (1) flag set TRUE here
  // (2) …but all the parsing / sk_X509_push / realloc happens AFTER
  for (…) root_cert_instances[i] = parse(root_certs[i]);
  root_extra_cert_instances = load_from_file(NODE_EXTRA_CA_CERTS);
  us_load_system_certificates_*(&root_system_cert_instances);
}
Thread A sets initialized = 1 at (1), then starts the slow work at
(2). Thread B checks (3), sees initialized == 1, returns immediately,
and reads the STACK_OF(X509)* while thread A is still pushing to it. sk_X509_push reallocs the backing array as it grows — thread B reads
through a freed pointer, or gets a torn num/data pair, and hands
garbage to PEM_write_bio_X509 → deep BoringSSL X509/EC codepaths →
segfault at a near-null address.
The race also meant tls.getCACertificates("extra" | "system" | "default") could return a truncated snapshot that then got cached
forever at the JS level.
Fix
Replace the hand-rolled spinlock + premature flag with std::call_once,
which is exactly the primitive for one-time init: the first caller runs
the body, every concurrent caller blocks until it completes, and there
is a proper happens-before edge on return.
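A minimal sketch of the std::call_once pattern, under stated assumptions: RootCerts, g_root_certs, load_root_certs, and observe_from_threads are hypothetical names, and a std::vector<int> stands in for the STACK_OF(X509)* lists. It is not the actual us_internal_init_root_certs() code. It shows the shape of the fix: the slow population happens inside the once-body, and every caller returns only after that body has completed.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the parsed certificate lists.
struct RootCerts {
    std::vector<int> bundled;
};

static std::once_flag g_root_certs_once;  // hypothetical name
static RootCerts g_root_certs;            // hypothetical name

// Slow one-time work; call_once guarantees it runs on exactly one thread.
static void load_root_certs() {
    for (int i = 0; i < 435; ++i) {
        g_root_certs.bundled.push_back(i);  // may reallocate while growing
    }
}

// Every reader goes through call_once. Concurrent callers block until the
// first caller finishes, so nobody observes a half-built vector.
const RootCerts& get_root_certs() {
    std::call_once(g_root_certs_once, load_root_certs);
    return g_root_certs;
}

// Mirrors the shape of the regression test: many threads race to initialize
// and each records how many entries it saw.
std::vector<std::size_t> observe_from_threads(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::size_t> seen(static_cast<std::size_t>(thread_count), 0);
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&seen, t] {
            seen[static_cast<std::size_t>(t)] = get_root_certs().bundled.size();
        });
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return seen;
}
```

With the old "set flag first, populate afterwards" ordering, the counts recorded by the threads could differ; with call_once, each thread should record the full count.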
375680 fs: fix index-out-of-bounds in Windows readdir iterator (#29425)
What does this PR do?
Fixes a Windows-only panic in fs.readdir:
panic: index out of bounds: index 524288, len 257
dir_iterator.zig:327 fn next
dir_iterator.zig:428 fn next
node_fs.zig:4491 fn readdirWithEntries
node_fs.zig:4961 fn readdirInner
node_fs.zig:4430 fn readdir
Root cause
The Windows directory iterator copies each entry's name into a
fixed-size name_data buffer ([257]u16 for the UTF-16 path, [513]u8
for the UTF-8 path) using FileNameLength from the FILE_DIRECTORY_INFORMATION record as the slice bound:
const length = dir_info.FileNameLength / 2;
@memcpy(self.name_data[0..length], ...); // <-- panic: index 524288, len 257
FileNameLength comes from the filesystem driver. In the crash report
it was 0x100000 (1 MiB) — far beyond the 255-WCHAR NTFS component
limit and well past the 8 KiB result buffer — which is only possible if
a third-party filesystem / filter driver (network redirector, virtual
FS, AV minifilter, etc.) returned a malformed entry. We trusted it
unconditionally and sliced past name_data.
Fix
Clamp FileNameLength / 2 to what fits in name_data (256
WCHARs) before slicing. NTFS caps a path component at 255 WCHARs so
legitimate names are unaffected; a malformed entry now yields a
truncated name instead of a panic.
Check rc before reading io.Information. The I/O manager only
fills the IO_STATUS_BLOCK on IRP completion — on an NT_ERROR status
the block is left untouched (see the matching comment in bun_shim_impl.zig: "IO_STATUS_BLOCK is filled only if
!NT_ERROR(status)"). Previously io was undefined and io.Information was read/assigned into self.end_index before the
status check, so a failed call could poison iterator state with stack
garbage or silently swallow an error as end-of-directory. The block is
now zero-initialized and the status checks run first.
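Both parts of the fix can be sketched in a few lines. This is a simplified illustration, not the dir_iterator.zig code: DirEntry, NameBuffer, IoStatusBlock, copy_name, and read_information are hypothetical names, and the structs carry only the fields needed here. The first function clamps the driver-reported length to the buffer capacity before copying; the second looks at the status code before trusting the information field.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified directory entry. The length is reported by the
// filesystem driver and must be treated as untrusted.
struct DirEntry {
    std::uint32_t file_name_length_bytes;
    const char16_t* file_name;
};

struct NameBuffer {
    std::array<char16_t, 257> data{};  // 256 code units + null terminator
    std::size_t len = 0;
};

// Copies the name, clamping to what fits so a malformed length truncates
// instead of writing out of bounds.
void copy_name(const DirEntry& entry, NameBuffer& out) {
    const std::size_t reported = entry.file_name_length_bytes / 2;
    const std::size_t n = std::min(reported, out.data.size() - 1);
    std::memcpy(out.data.data(), entry.file_name, n * sizeof(char16_t));
    out.data[n] = u'\0';
    out.len = n;
}

// Same test as the NT_ERROR macro: the top two bits of the status are 11.
constexpr bool nt_error(std::uint32_t status) { return (status >> 30) == 3; }

// Zero-initialized so a failed call never exposes stack garbage.
struct IoStatusBlock {
    std::uint32_t status = 0;
    std::size_t information = 0;
};

// Checks the return code first; only on success is `information` read.
bool read_information(std::uint32_t rc, const IoStatusBlock& io, std::size_t& out) {
    if (nt_error(rc)) {
        return false;  // the block was never filled in
    }
    out = io.information;
    return true;
}
```

Returning a separate success flag keeps a failed query distinguishable from a legitimate zero-byte result, which is the "silently swallow an error as end-of-directory" case described above.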
How did you verify your code works?
This code path is Windows-only and requires a misbehaving filesystem
driver to trigger, so it cannot be reproduced in CI. Verified by code
tracing: the @min bound guarantees name_len_u16 <= name_data.len - 1, which makes the self.name_data[0..name_len_u16] slice and
subsequent null-terminator write provably in-bounds. zig fmt --check
passes; Windows CI will confirm compilation.
fd41db [WebExtensions] Deprecate Native Client info in runtime API (#29491)
Add notes to runtime.PlatformInfo.nacl_arch and runtime.PlatformNaclArch
describing their deprecation. Google Chrome plans to run a deprecation
experiment removing runtime.PlatformInfo.nacl_arch on all platforms.
Shortly after the conclusion of the runtime.PlatformInfo.nacl_arch removal experiment,
Chrome plans to remove the enum runtime.PlatformNaclArch without any experiment.
Updated Packages