Skip to content

feat(compile+runtime): binary-size work — workspace override, nativeLibrary feature forwarding, panic=abort prebuilt, native-module vtable#5041

Merged
proggeramlug merged 4 commits into
mainfrom
feat/size-optimize-npm
Jun 12, 2026
Merged

feat(compile+runtime): binary-size work — workspace override, nativeLibrary feature forwarding, panic=abort prebuilt, native-module vtable#5041
proggeramlug merged 4 commits into
mainfrom
feat/size-optimize-npm

Conversation

@proggeramlug

Copy link
Copy Markdown
Contributor

Context

A 2D game (Bloom Jump) was shipping a 26MB macOS binary. Attribution showed two independent causes: (1) npm-installed perry never finds the workspace source, so auto-optimize silently links the prebuilt full stdlib — sqlite, aws-lc (incl. ML-DSA), tokio/hyper/rustls, ethers — into every binary (~8MB; even hello-world is 7.5MB); (2) no way to pass cargo features to perry.nativeLibrary crates, so engines can't offer lean build profiles. This PR fixes both and lays the substrate for making the runtime's dispatch machinery linker-strippable.

Commit 1 — compile driver + packaging

  • PERRY_WORKSPACE_ROOT env override in find_perry_workspace_root (resolve.rs). npm/homebrew installs place the binary outside the workspace; neither the exe walk nor the cwd walk can ever find the source tree. Validated, with a warning when it doesn't look like a workspace.
  • Fallback is no longer silent (optimized_libs.rs): linking the prebuilt full stdlib now prints a one-line note explaining the size impact and the fix.
  • [native-library."<pkg>"] feature forwarding (link/native_features.rs, new + hookup in link/mod.rs): perry.toml can now declare features / default-features per nativeLibrary package, mapped onto cargo --features / --no-default-features. Longest-key match for module specs, 3 unit tests. First consumer: feat(en-014): feature-gate models3d + image-extras for pure-2D games Bloom-Engine/engine#64 (2D games drop Jolt/glTF/non-PNG codecs).
  • Prebuilt panic=abort runtime variant (release-packages.yml, stage-npm.sh, library_search.rs): out-of-tree installs can't rebuild the runtime, so ship the abort profile and select it in the no-workspace fallback for runtime-only apps with no catch_unwind callers. Unix-only (Windows always links the unwind stdlib). Verified in a simulated npm layout: hello-world 8.9MB → 7.7MB (13.5%, in the documented 12-18% band).

Commit 2 — native-module vtable (runtime)

dispatch_native_module_method is a 643-arm match over (module, method) strings — one statically-reachable symbol that pins every module implementation, -dead_strip notwithstanding. Same for the own-field/Object.keys/has-in tables reachable from generic object paths.

This commit routes all of it through a NativeModuleVtable installed by the only paths that can mint native-module-backed values: js_create_native_module_namespace, the NativeModuleRef fast path, bound_native_callable_export_value, and the node_v8/perf_hooks NATIVE_MODULE_CLASS_ID allocators. Null vtable ⇒ no such value exists ⇒ generic paths fall through to default behavior. Two field_get_set.rs branch bodies relocated verbatim into native_module.rs impls. Cost: one relaxed atomic load + indirect call on paths that already string-match.

Known limitation (by design, documented in the commit): no size win lands yet, because populate_global_this_builtins eagerly creates console/process namespaces in every program (global_this.rs:4772), keeping the installer reachable. The follow-up that monetizes this substrate is splitting the dispatch match (and constant/keys/callable-export tables) per module, with a name→fn map referenced only by js_create_native_module_namespace and direct per-module constructors for the globalThis console/process creation — then no-stdlib programs pin only the console+process arms (est. 2-4MB strippable).

Testing

  • native_features unit tests (3) pass; runtime suite at parity with origin/main (the native_module_stream combined-filter failures pre-exist upstream: 11 passed/2 failed on pristine main vs 12/1 on this branch; both pass in isolation).
  • node:fs namespace dispatch round-trip (write/read/exists) through the vtable.
  • Simulated npm install (binary + archives, no workspace): runtime-only link, abort variant auto-selected, binary runs.
  • Bloom Jump end-to-end: 26.1MB → 17.2MB (with the engine-side feature gates), renders + audio verified.

Measured results

binary before after
hello-world (npm fallback) 8.9MB unwind / full stdlib at 0.5.1125 7.7MB abort, runtime-only
Bloom Jump (macOS) 26.1MB 17.2MB

Ralph Küpper added 4 commits June 12, 2026 15:13
…rary feature forwarding

- PERRY_WORKSPACE_ROOT env override for workspace discovery: npm/homebrew
  installs place the binary outside the workspace, so auto-optimize
  silently fell back to the prebuilt full stdlib (sqlite/crypto/tokio —
  ~8MB of undeadstrippable code via the dynamic dispatch table). The
  fallback note is no longer verbose-gated.
- [native-library."<pkg>"] perry.toml table forwards cargo features /
  --no-default-features to perry.nativeLibrary crate builds (e.g. a
  2D-only build of a 2D+3D engine). New link/native_features.rs + tests.
- Ship a panic=abort prebuilt runtime variant (libperry_runtime_abort.a,
  release-packages.yml + stage-npm.sh) and select it in the no-workspace
  fallback for runtime-only apps with no catch_unwind callers: hello-world
  8.9MB -> 7.7MB out of the box.
- Step 1 of dispatch devirtualization: route NATIVE_MODULE_CLASS_ID method
  calls through a fn-pointer hook installed by
  js_create_native_module_namespace instead of statically referencing
  dispatch_native_module_method from the generic call path. Correctness
  verified (node:fs namespace dispatch works); the size win lands when the
  remaining native-module surface (method-value/get-field tables) gets the
  same vtable treatment.

Measured on Bloom Jump (with engine-side feature gates): 26.1MB -> 17.3MB.
…c object paths (step 2 of dispatch devirtualization)

Replaces the step-1 single dispatch hook with a NativeModuleVtable
covering every native-module behavior reachable from always-linked
generic paths: method dispatch, own-field reads (relocated
vt_get_own_field), Object.keys (relocated vt_own_keys_array), and
has/in checks. Installed by js_create_native_module_namespace, the
NativeModuleRef fast path (js_native_module_property_by_name),
bound_native_callable_export_value, and the node_v8/perf_hooks
NATIVE_MODULE_CLASS_ID allocators — null vtable means no namespace
value can exist, and generic paths fall through to default behavior.

Correctness verified: node:fs namespace dispatch round-trip, runtime
test suite at parity with origin/main (the native_module_stream
combined-filter failures pre-exist on main: 11/2 there vs 12/1 here).

NOTE: no binary-size win lands yet. populate_global_this_builtins
eagerly creates console/process namespaces in every program
(global_this.rs:4772), keeping the installer — and the (module,
method) tables — statically reachable. The follow-up that monetizes
this substrate is splitting dispatch_native_module_method (and the
constant/keys/callable-export tables) per module with a name→fn map
referenced only by js_create_native_module_namespace, and giving the
globalThis console/process creation direct per-module constructors.
Then a no-stdlib-import program pins only console+process arms.
Move the feature-forwarding cargo-args logic into native_features.rs
(apply_native_library_override) and extract the inline
optional_framework_dir_tests mod to a sibling file — the forwarding
block had pushed link/mod.rs from 1996 to 2024 lines.
@proggeramlug proggeramlug merged commit 6298e88 into main Jun 12, 2026
13 checks passed
@proggeramlug proggeramlug deleted the feat/size-optimize-npm branch June 12, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant