Integer weights table encoding the bimodal object population shape observed in the lobsters benchmark. Two Gaussian modes: IMEMO peak at pool 0 and class/hash peak at pool 2.
Converts total page budget and floor pages into per-pool slot counts using the bimodal weights table. No behavioral change yet.
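A minimal sketch of how such a conversion could look. The weights, the 40-byte base slot size doubling per pool, the pool count of 5, and the helper names are assumptions for illustration, not the values or code from the patch.

```c
#include <stddef.h>

#define HEAP_COUNT 5               /* assumed number of size pools */
#define PAGE_BYTES (64 * 1024)     /* 64 KiB heap pages */

/* Hypothetical weights with peaks at pools 0 and 2 (placeholders only). */
static const unsigned int bimodal_weights[HEAP_COUNT] = { 8, 2, 6, 1, 1 };

/* Assumed slot sizes: 40 bytes for pool 0, doubling for each later pool. */
static size_t
slot_size(int pool)
{
    return (size_t)40 << pool;
}

/* Give every pool floor_pages, then share the remaining pages by weight. */
static void
compute_init_slots(size_t total_pages, size_t floor_pages, size_t out[HEAP_COUNT])
{
    size_t floor_total = floor_pages * HEAP_COUNT;
    /* Guard the unsigned subtraction so a too-small budget leaves no spare. */
    size_t spare = total_pages > floor_total ? total_pages - floor_total : 0;
    unsigned int weight_sum = 0;

    for (int i = 0; i < HEAP_COUNT; i++) weight_sum += bimodal_weights[i];

    for (int i = 0; i < HEAP_COUNT; i++) {
        size_t pages = floor_pages + spare * bimodal_weights[i] / weight_sum;
        /* Ignores the page header, so this slightly overcounts slots. */
        out[i] = pages * (PAGE_BYTES / slot_size(i));
    }
}
```

With these placeholder inputs, `compute_init_slots(195, 1, out)` assigns pool 0 85 pages (139230 slots) and pool 2 64 pages (26176 slots). The real weights and floor may differ.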
Clamp floor_total so misconfigured env vars (floor*HEAP_COUNT > total_pages) degrade to floor_pages per pool rather than wrapping to near SIZE_MAX. Add comment on intentional slot-count overcount.
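The wrap-around arises because the subtraction is unsigned. A hedged sketch of one way to guard it follows; `spare_pages` and its parameters are hypothetical names, not the patch's code.

```c
#include <stddef.h>

/* Pages left to share out once every pool has received its floor.
 * size_t subtraction wraps on underflow, so compare before subtracting. */
static size_t
spare_pages(size_t total_pages, size_t floor_pages, size_t heap_count)
{
    /* No pools means nothing to distribute (and avoids dividing by zero). */
    if (heap_count == 0) return 0;

    /* Division form of floor_pages * heap_count > total_pages; it also
     * avoids overflow in the multiplication itself. */
    if (floor_pages > total_pages / heap_count) return 0;

    return total_pages - floor_pages * heap_count;
}
```

When the floors alone exceed the budget the function returns 0, so each pool ends up with exactly `floor_pages` instead of a share of a value near `SIZE_MAX`.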
Replace uniform GC_HEAP_INIT_SLOTS=10000 per pool with proportional allocation from a bimodal page budget. Total RSS budget unchanged at ~12 MiB (195 pages at 64 KiB). Pools 0 and 2 get the majority of pages, matching observed IMEMO and class/hash populations.
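As a rough check of the "unchanged budget" claim, the two figures can be compared directly. This sketch assumes five pools with slot sizes of 40, 80, 160, 320 and 640 bytes; the function names are illustrative.

```c
#include <stddef.h>

#define HEAP_COUNT 5   /* assumed number of size pools */

/* Bytes covered by the old uniform default: 10000 slots in each pool. */
static size_t
uniform_budget_bytes(void)
{
    size_t total = 0;
    for (int pool = 0; pool < HEAP_COUNT; pool++) {
        /* Assumed slot size: 40 bytes, doubling per pool. */
        total += (size_t)10000 * ((size_t)40 << pool);
    }
    return total;
}

/* Bytes covered by the page budget: 195 pages of 64 KiB. */
static size_t
page_budget_bytes(void)
{
    return (size_t)195 * 64 * 1024;
}
```

Under those assumptions the uniform default covers 12,400,000 bytes (about 11.8 MiB) and the page budget covers 12,779,520 bytes (about 12.2 MiB), so both round to roughly 12 MiB.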
The uniform default is replaced by gc_heap_compute_init_slots. The static initializer now uses { 0 } since objspace_init overwrites all entries via the bimodal distribution.
Users can scale the bimodal distribution up or down with RUBY_GC_HEAP_INIT_TOTAL_PAGES without changing the shape. Per-pool RUBY_GC_HEAP_N_INIT_SLOTS (where N is the pool index) still overrides individual pools.
Three tests:
- Verify bimodal shape (pool 0 > pool 4, pool 2 > pool 4)
- Verify RUBY_GC_HEAP_INIT_TOTAL_PAGES scaling
- Verify per-pool env vars override bimodal defaults
The page count is a budget for lazy allocation, not a fixed RSS cost.
heap_prepare previously force-allocated pages outside the GC budget when total_slots < init_slots. With bimodal init_slots giving pool 0 ~139k slots, this meant up to 85 pages allocated invisibly to the GC, breaking the invariant that free_slots + allocatable_slots predicts when GC triggers. Instead, seed objspace->heap_pages.allocatable_slots with the sum of all init_slots at startup. Pages are now allocated through normal budget accounting. The init_slots floor is still enforced by gc_sweep_finish_heap and gc_marks_finish for shrinkage prevention.
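A simplified model of the budget accounting described above. The struct, field and function names are stand-ins for illustration rather than the real GC internals.

```c
#include <stddef.h>

#define HEAP_COUNT 5   /* assumed number of size pools */

/* Toy stand-in for the GC state. */
struct toy_objspace {
    size_t init_slots[HEAP_COUNT];
    size_t allocatable_slots;  /* budget the GC may still turn into pages */
    size_t total_slots;        /* slots already backed by pages */
};

/* Seed the budget with the sum of all per-pool init_slots at startup. */
static void
toy_seed_budget(struct toy_objspace *os)
{
    size_t sum = 0;
    for (int i = 0; i < HEAP_COUNT; i++) sum += os->init_slots[i];
    os->allocatable_slots = sum;
}

/* Allocate one page through the budget. Returns 1 on success, or 0 when
 * the budget is exhausted, which is where a GC would be triggered. */
static int
toy_allocate_page(struct toy_objspace *os, size_t slots_per_page)
{
    if (os->allocatable_slots == 0) return 0;

    /* Charge the page against the budget, clamping at zero. */
    if (os->allocatable_slots > slots_per_page) {
        os->allocatable_slots -= slots_per_page;
    }
    else {
        os->allocatable_slots = 0;
    }
    os->total_slots += slots_per_page;
    return 1;
}
```

Because every page is charged against `allocatable_slots`, no page is created outside the budget, and the remaining budget stays a usable predictor of when the next GC will trigger.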
ObjectSpace.each_object already skips hidden objects directly, but it could still yield visible container objects that hold hidden internals. In that case, calling methods like Hash#inspect can raise NotImplementedError for a hidden T_ARRAY value. Add a lightweight direct-reference check for Array and Hash entries and skip containers that contain hidden/internal objects. This keeps hidden internals from leaking through enumeration and fixes iteration patterns that call inspect while traversing object space. This was exposed because the heap init changes alter the number of GCs that run during Ruby startup, so internal objects created during interpreter boot remain in the heap until the first GC runs.
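The shape of a shallow, direct-reference check might look like the sketch below. It uses a toy object model; the types and `toy_skip_p` are hypothetical and do not reflect the interpreter's real object layout or API.

```c
#include <stddef.h>

enum toy_type { TOY_OTHER, TOY_ARRAY, TOY_HASH };

/* Toy object: containers list their direct references in refs
 * (array elements, or hash keys and values flattened together). */
struct toy_obj {
    enum toy_type type;
    int hidden;               /* 1 for internal objects */
    struct toy_obj **refs;
    size_t nrefs;
};

/* Returns 1 if enumeration should skip obj: either obj itself is hidden,
 * or it is an Array/Hash that directly references a hidden object.
 * The check is deliberately one level deep. */
static int
toy_skip_p(const struct toy_obj *obj)
{
    if (obj->hidden) return 1;
    if (obj->type != TOY_ARRAY && obj->type != TOY_HASH) return 0;

    for (size_t i = 0; i < obj->nrefs; i++) {
        if (obj->refs[i] != NULL && obj->refs[i]->hidden) return 1;
    }
    return 0;
}
```

Keeping the check one level deep bounds its cost by the container's own size. A container nested inside another is judged when the enumeration reaches it.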
I ran the lobste.rs benchmark from Ruby bench and snapshotted the heap. Predictably, it looks like a bimodal distribution around the 40 and 160 byte size pools, so I've altered the heap initialisation code to scale initial pages by the same distribution pattern whilst staying within the RSS usage of the original 10k slots per heap.
This eliminates 3 GCs on interpreter startup on my machine (and exposed a bug in ObjectSpace that relied on internal objects being GC'd before the test runs).