Scale initial heap slots by eightbitraptor · Pull Request #3 · eightbitraptor/ruby

eightbitraptor · 2026-02-26T20:26:11Z

I ran the lobste.rs benchmark from Ruby bench and snapshotted the heap. Predictably, the object population is bimodal, peaking around the 40-byte and 160-byte size pools, so I've altered the heap initialisation code to distribute initial pages in the same pattern whilst staying within the RSS usage of the original 10k slots per heap.

This eliminates 3 GCs on interpreter startup on my machine (and exposed a bug in ObjectSpace that relied on internal objects being GC'd before the test runs).

Replace uniform GC_HEAP_INIT_SLOTS=10000 per pool with proportional allocation from a bimodal page budget. Total RSS budget unchanged at ~12 MiB (195 pages at 64 KiB). Pools 0 and 2 get the majority of pages, matching observed IMEMO and class/hash populations.
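As a rough check of the budget arithmetic above, the sketch below compares the 195-page budget with the footprint of the old uniform default. It assumes five pools whose slot sizes double from 40 bytes (the description only names the 40 and 160 byte pools), and the `sketch_` names are illustrative rather than identifiers from the patch.

```c
#include <stddef.h>

enum { HEAP_COUNT = 5 };                 /* assumption: five size pools */
enum { SKETCH_PAGE_BYTES = 64 * 1024 };  /* 64 KiB pages, as stated in the commit */

/* Bytes covered by a page budget if every page were actually allocated. */
size_t sketch_page_budget_bytes(size_t total_pages)
{
    return total_pages * (size_t)SKETCH_PAGE_BYTES;
}

/* Bytes of slot storage when every pool gets the same number of slots.
 * Assumption: pool i has slots of 40 << i bytes (40, 80, 160, 320, 640). */
size_t sketch_uniform_slot_bytes(size_t slots_per_pool)
{
    size_t total = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        total += slots_per_pool * ((size_t)40 << i);
    }
    return total;
}
```

Under these assumptions, 195 pages cover 12,779,520 bytes (about 12.19 MiB), while 10,000 slots in each of the five pools need 12,400,000 bytes (about 11.83 MiB) before page overhead, so the two budgets are roughly comparable.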

heap_prepare previously force-allocated pages outside the GC budget when total_slots < init_slots. With bimodal init_slots giving pool 0 ~139k slots, this meant up to 85 pages were allocated invisibly to the GC, breaking the invariant that free_slots + allocatable_slots predicts when GC triggers.

Instead, seed objspace->heap_pages.allocatable_slots with the sum of all init_slots at startup. Pages are now allocated through normal budget accounting. The init_slots floor is still enforced by gc_sweep_finish_heap and gc_marks_finish for shrinkage prevention.
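The accounting change described above can be modelled with a toy budget. This is a simplified sketch, not the code in the patch: the struct layout and `sketch_` names are hypothetical, and it assumes each page allocation is charged against a single shared `allocatable_slots` counter that saturates at zero.

```c
#include <stddef.h>

enum { HEAP_COUNT = 5 };  /* assumption: five size pools */

struct sketch_heap {
    size_t init_slots;      /* startup target for this pool */
    size_t total_slots;     /* slots in pages allocated so far */
    size_t slots_per_page;  /* slots gained per allocated page */
};

struct sketch_objspace {
    size_t allocatable_slots;  /* shared budget the GC can see */
    struct sketch_heap heaps[HEAP_COUNT];
};

/* Seed the shared budget with the sum of all per-pool init_slots. */
void sketch_seed_budget(struct sketch_objspace *os)
{
    size_t sum = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        sum += os->heaps[i].init_slots;
    }
    os->allocatable_slots = sum;
}

/* Allocate one page through the budget. Returns 1 on success, or 0 when
 * the budget is exhausted (the point where a real GC would run instead). */
int sketch_try_allocate_page(struct sketch_objspace *os, int pool)
{
    struct sketch_heap *heap = &os->heaps[pool];
    if (os->allocatable_slots == 0) {
        return 0;
    }
    if (os->allocatable_slots > heap->slots_per_page) {
        os->allocatable_slots -= heap->slots_per_page;
    } else {
        os->allocatable_slots = 0;  /* saturate rather than underflow */
    }
    heap->total_slots += heap->slots_per_page;
    return 1;
}
```

Because every page is charged against the same counter, no allocation happens behind the GC's back in this model, which is the property the commit message says the old force-allocation path violated.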

ObjectSpace.each_object already skips hidden objects directly, but it could still yield visible container objects that hold hidden internals. In that case, calling methods like Hash#inspect can raise NotImplementedError for a hidden T_ARRAY value.

Add a lightweight direct-reference check for Array and Hash entries and skip containers that contain hidden/internal objects. This keeps hidden internals from leaking through enumeration and fixes iteration patterns that call inspect while traversing object space.

This was exposed because the heap init changes alter the number of GCs that run on Ruby startup, leaving internal objects created during interpreter boot in the heap until the first GC runs.
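The direct-reference check can be illustrated with a toy object model. Everything here is hypothetical: the real patch works on Ruby VALUEs, whereas this sketch uses a plain `hidden` flag and a flat array of direct references (array elements, or hash keys and values).

```c
#include <stddef.h>

enum sketch_kind { SKETCH_OTHER, SKETCH_ARRAY, SKETCH_HASH };

struct sketch_obj {
    enum sketch_kind kind;
    int hidden;                      /* 1 for internal objects */
    const struct sketch_obj **refs;  /* direct references only */
    size_t nrefs;
};

/* Returns 1 when an Array or Hash directly references a hidden object.
 * The check is shallow: nested containers are not traversed. */
int sketch_contains_hidden(const struct sketch_obj *obj)
{
    if (obj->kind != SKETCH_ARRAY && obj->kind != SKETCH_HASH) {
        return 0;
    }
    for (size_t i = 0; i < obj->nrefs; i++) {
        if (obj->refs[i]->hidden) {
            return 1;
        }
    }
    return 0;
}

/* Enumeration yields an object only if it is visible and is not a
 * container that directly holds hidden internals. */
int sketch_should_yield(const struct sketch_obj *obj)
{
    return !obj->hidden && !sketch_contains_hidden(obj);
}
```

Keeping the check shallow is what makes it lightweight: it costs one pass over a container's own entries and never walks the object graph.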

eightbitraptor added 11 commits February 26, 2026 11:12

Add bimodal heap init distribution weights table

581b8a8

Integer weights table encoding the bimodal object population shape observed in the lobsters benchmark. Two Gaussian modes: IMEMO peak at pool 0 and class/hash peak at pool 2.
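For illustration, a weights table of this kind might look like the sketch below. The numbers are invented for the example (the PR's actual table is not shown here); they only reproduce the described shape, with peaks at pool 0 and pool 2.

```c
#include <stddef.h>

enum { HEAP_COUNT = 5 };  /* assumption: five size pools */

/* Hypothetical integer weights, one per pool. Integers keep the later
 * page split in exact integer arithmetic with no floating point. */
const unsigned sketch_weights[HEAP_COUNT] = { 45, 10, 30, 10, 5 };

/* Sum of all weights, used as the denominator when splitting pages. */
unsigned sketch_weight_sum(void)
{
    unsigned sum = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        sum += sketch_weights[i];
    }
    return sum;
}

/* Returns 1 when the pool is strictly heavier than each neighbour it has,
 * i.e. when it is one of the modes of the distribution. */
int sketch_is_peak(int pool)
{
    unsigned w = sketch_weights[pool];
    if (pool > 0 && sketch_weights[pool - 1] >= w) {
        return 0;
    }
    if (pool < HEAP_COUNT - 1 && sketch_weights[pool + 1] >= w) {
        return 0;
    }
    return 1;
}
```

With these example values, only pools 0 and 2 are peaks, which is the bimodal shape the commit describes.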

Fix GC_HEAP_INIT_TOTAL_PAGES comment: 64 KiB pages give ~12 MiB not 3…

4b7e4fa

… MiB

Add gc_heap_compute_init_slots for bimodal distribution

569738f

Converts total page budget and floor pages into per-pool slot counts using the bimodal weights table. No behavioral change yet.

Guard gc_heap_compute_init_slots against underflow

861aed9

Clamp floor_total so misconfigured env vars (floor*HEAP_COUNT > total_pages) degrade to floor_pages per pool rather than wrapping to near SIZE_MAX. Add comment on intentional slot-count overcount.
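A minimal sketch of how such a conversion and its underflow guard could work is shown below. It is not the function from the patch: the weights are invented, the slot sizes (40 bytes doubling per pool) are an assumption, and page headers are ignored, so slot counts are slightly overstated.

```c
#include <stddef.h>

enum { HEAP_COUNT = 5 };  /* assumption: five size pools */

/* Hypothetical weights with peaks at pool 0 and pool 2. */
static const size_t sketch_weights[HEAP_COUNT] = { 45, 10, 30, 10, 5 };

/* Assumption: 64 KiB pages and slot sizes of 40 << pool bytes. */
static size_t sketch_slots_per_page(int pool)
{
    return (size_t)(64 * 1024) / ((size_t)40 << pool);
}

/* Give every pool floor_pages, then split the remaining pages by weight. */
void sketch_compute_init_slots(size_t total_pages, size_t floor_pages,
                               size_t init_slots[HEAP_COUNT])
{
    size_t floor_total = floor_pages * HEAP_COUNT;

    /* Guard: without this clamp, total_pages - floor_total would wrap
     * around to a huge unsigned value when the floor exceeds the total. */
    if (floor_total > total_pages) {
        floor_total = total_pages;
    }
    size_t distributable = total_pages - floor_total;

    size_t weight_sum = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        weight_sum += sketch_weights[i];
    }

    for (int i = 0; i < HEAP_COUNT; i++) {
        size_t pages = floor_pages + distributable * sketch_weights[i] / weight_sum;
        init_slots[i] = pages * sketch_slots_per_page(i);
    }
}
```

With a total of 100 pages and a floor of 2, pool 0 receives 42 pages in this sketch. With a total of 3 and a floor of 2, the clamp leaves nothing to distribute and every pool falls back to its 2 floor pages instead of underflowing.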

Remove unused GC_HEAP_INIT_SLOTS macro

31537fc

The uniform default is replaced by gc_heap_compute_init_slots. The static initializer now uses { 0 } since objspace_init overwrites all entries via the bimodal distribution.

Add RUBY_GC_HEAP_INIT_TOTAL_PAGES/FLOOR_PAGES env vars

0debd0c

Users can scale the bimodal distribution up or down without changing the shape. Per-pool RUBY_GC_HEAP_N_INIT_SLOTS still overrides individual pools.
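The override precedence can be sketched as follows. The parsing rules are an assumption made for the example (non-numeric, negative, or zero values fall back to the default), and the helper names are hypothetical; only the `RUBY_GC_HEAP_N_INIT_SLOTS` naming pattern comes from the commit message.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a positive decimal count, returning fallback for unset,
 * empty, non-numeric, negative, zero, or out-of-range input. */
size_t sketch_parse_count(const char *text, size_t fallback)
{
    if (text == NULL || text[0] < '0' || text[0] > '9') {
        return fallback;
    }
    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0) {
        return fallback;
    }
    return (size_t)value;
}

/* A per-pool variable, when set and valid, wins over the slot count
 * derived from the bimodal distribution. */
size_t sketch_pool_init_slots(int pool, size_t bimodal_default)
{
    char name[64];
    snprintf(name, sizeof name, "RUBY_GC_HEAP_%d_INIT_SLOTS", pool);
    return sketch_parse_count(getenv(name), bimodal_default);
}
```

In this model the total-pages and floor-pages settings would be read with the same parser and fed into the distribution first, with the per-pool lookup applied last so that it overrides the computed value for that pool only.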

Add tests for bimodal heap init distribution

9ff3146

Three tests:
- Verify bimodal shape (pool 0 > pool 4, pool 2 > pool 4)
- Verify RUBY_GC_HEAP_INIT_TOTAL_PAGES scaling
- Verify per-pool env vars override bimodal defaults

Remove misleading MiB comment from GC_HEAP_INIT_TOTAL_PAGES

35a8883

The page count is a budget for lazy allocation, not a fixed RSS cost.

eightbitraptor wants to merge 11 commits into master from mvh-scale-heap-initial-slots
