fix(engine): release managed threadvars on worker-thread exit#891
fix(engine): release managed threadvars on worker-thread exit#891frostney wants to merge 6 commits into
Conversation
FPC does not auto-finalize managed threadvars at thread exit, so every builtin and value type's cached member-definition arrays (FStaticMembers / FPrototypeMembers) and the #805/#806 input memos leaked on each worker-thread exit. Add Goccia.ThreadCleanupRegistry: each leaking unit registers a small ClearThreadvarMembers proc once in its initialization; ShutdownThreadRuntime drains the registry on every worker thread and the registry's finalization drains it on the main thread, so each managed threadvar is released on both teardown paths. The two memos keep their finalization and are cleared on the worker path via explicit ShutdownThreadRuntime calls. Also close two pre-existing main-thread finalization gaps (ImportMeta had none; TimeZone's did not call ClearTimeZoneCache). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughIntroduces ChangesThread-local cleanup registry
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Comment |
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Suite TimingTest Runner (interpreted: 10,962 passed; bytecode: 10,962 passed)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Test runner worker shutdown frees thread-local heaps in bulk; that shutdown reclamation is not counted as GC collections or collected objects.
Benchmarks (interpreted: 436; bytecode: 436)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Benchmark runner performs explicit between-file collections, so collection and collected-object counts can be much higher than the test runner.
Measured on ubuntu-latest x64. |
Benchmark Results436 benchmarks · PR vs same-runner Interpreted: 🟢 31 improved · 🔴 43 regressed · 362 unchanged · avg +0.9% Typical per-run noise (median variance): interpreted ±2.7%, bytecode ±2.5%. Deltas within noise overlap and read as unchanged. arraybuffer.js — Interp: 14 unch. · avg +0.2% · Bytecode: 14 unch. · avg -0.1%
arrays.js — Interp: 🟢 7, 12 unch. · avg +2.1% · Bytecode: 🟢 1, 🔴 2, 16 unch. · avg +0.2%
async-await.js — Interp: 🔴 1, 5 unch. · avg -0.1% · Bytecode: 🟢 2, 4 unch. · avg +4.5%
async-generators.js — Interp: 2 unch. · avg +0.1% · Bytecode: 2 unch. · avg +1.8%
atomics.js — Interp: 🔴 1, 5 unch. · avg -0.3% · Bytecode: 6 unch. · avg +2.4%
base64.js — Interp: 🔴 2, 8 unch. · avg -0.9% · Bytecode: 🟢 2, 🔴 1, 7 unch. · avg +1.1%
classes.js — Interp: 🔴 4, 27 unch. · avg -0.3% · Bytecode: 🟢 2, 🔴 7, 22 unch. · avg -0.6%
closures.js — Interp: 11 unch. · avg -1.3% · Bytecode: 11 unch. · avg -1.4%
collections.js — Interp: 🟢 1, 11 unch. · avg +3.4% · Bytecode: 🟢 2, 10 unch. · avg -0.5%
csv.js — Interp: 🔴 2, 11 unch. · avg -0.8% · Bytecode: 🟢 1, 🔴 2, 10 unch. · avg -1.8%
destructuring.js — Interp: 🟢 2, 🔴 1, 19 unch. · avg -0.6% · Bytecode: 🟢 4, 18 unch. · avg +1.0%
fibonacci.js — Interp: 8 unch. · avg -0.4% · Bytecode: 8 unch. · avg +0.0%
float16array.js — Interp: 🟢 1, 🔴 9, 22 unch. · avg -1.3% · Bytecode: 🟢 3, 🔴 3, 26 unch. · avg +1.5%
for-in/for-in.js — Interp: 🔴 2, 1 unch. · avg -5.9% · Bytecode: 3 unch. · avg +2.1%
for-of.js — Interp: 7 unch. · avg -2.7% · Bytecode: 🟢 1, 6 unch. · avg +0.1%
generators.js — Interp: 4 unch. · avg +5.8% · Bytecode: 🔴 2, 2 unch. · avg -1.3%
intl.js — Interp: 6 unch. · avg +1.5% · Bytecode: 🔴 2, 4 unch. · avg -4.0%
iterators.js — Interp: 🟢 1, 🔴 4, 37 unch. · avg -2.4% · Bytecode: 🟢 7, 🔴 2, 33 unch. · avg +1.8%
json.js — Interp: 🔴 2, 21 unch. · avg -0.7% · Bytecode: 🟢 4, 🔴 4, 15 unch. · avg -0.2%
jsx.jsx — Interp: 🟢 1, 🔴 1, 19 unch. · avg -1.5% · Bytecode: 🔴 7, 14 unch. · avg -6.7%
modules.js — Interp: 9 unch. · avg +2.1% · Bytecode: 🔴 1, 8 unch. · avg -0.1%
numbers.js — Interp: 🔴 1, 10 unch. · avg -3.3% · Bytecode: 11 unch. · avg -0.2%
objects.js — Interp: 🟢 1, 6 unch. · avg +3.5% · Bytecode: 🔴 1, 6 unch. · avg -0.9%
promises.js — Interp: 12 unch. · avg +1.9% · Bytecode: 🔴 1, 11 unch. · avg -1.3%
property-access.js — Interp: 🔴 1, 4 unch. · avg -4.5% · Bytecode: 5 unch. · avg +0.4%
regexp.js — Interp: 🟢 2, 🔴 1, 8 unch. · avg +2.0% · Bytecode: 🔴 1, 10 unch. · avg +0.6%
strings.js — Interp: 🔴 2, 17 unch. · avg -1.7% · Bytecode: 🟢 1, 🔴 1, 17 unch. · avg +1.5%
temporal.js — Interp: 🔴 1, 5 unch. · avg -0.4% · Bytecode: 6 unch. · avg -2.2%
tsv.js — Interp: 🟢 3, 6 unch. · avg +0.7% · Bytecode: 9 unch. · avg +4.7%
typed-arrays.js — Interp: 🟢 7, 🔴 3, 12 unch. · avg +16.2% · Bytecode: 🔴 10, 12 unch. · avg -18.1%
uint8array-encoding.js — Interp: 🟢 2, 🔴 3, 13 unch. · avg +13.3% · Bytecode: 🟢 5, 🔴 1, 12 unch. · avg +18.1%
weak-collections.js — Interp: 🟢 3, 🔴 2, 10 unch. · avg -1.8% · Bytecode: 🟢 2, 🔴 5, 8 unch. · avg +1.1%
Deterministic profile diffDeterministic profile diff: no significant changes. Measured on ubuntu-latest x64. Each PR run also builds the |
test262 Conformance
Areas closest to 100%
Per-test deltas (+1 / -0)Newly passing (1):
Steady-state failures are non-blocking; regressions vs the cached main baseline (lower total pass count, or any PASS → non-PASS transition) fail the conformance gate. Measured on ubuntu-latest x64, bytecode mode. Areas grouped by the first two test262 path components; minimum 25 attempted tests, areas already at 100% excluded. Δ vs main compares against the most recent cached |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/adr/0077-thread-local-cleanup-registry.md`:
- Line 14: Update the ADR’s test description so it matches the implemented
coverage: `Goccia.ThreadCleanupLeak.Test` only measures a one-shot heap drop
around `RunThreadvarCleanups`, while `Goccia.Threading.Test` covers per-worker
draining separately. Replace the current “repeated worker cycles”/“per-cycle
non-growth” wording with language that reflects these two tests and their actual
assertions, without changing the broader ADR narrative.
In `@source/units/Goccia.Threading.Test.pas`:
- Around line 417-452: The threading tests are registering cleanup callbacks
inside the test bodies, which bypasses the intended startup-only contract of
RegisterThreadvarCleanup in Goccia.ThreadCleanupRegistry. Move the sentinel
registrations for SentinelCleanup, SentinelCleanupSecond, and
SentinelWorkerCleanup into a test-only unit initialization section so they are
set up once at startup, then keep
TTestThreading.TestThreadCleanupRegistryRunsRegistered and
TTestThreading.TestShutdownThreadRuntimeDrainsRegistryPerWorker focused on
resetting counters, draining with RunThreadvarCleanups, and asserting the
expected counts.
In `@source/units/Goccia.Values.ArrayValue.pas`:
- Around line 158-161: ClearThreadvarMembers currently only resets
FPrototypeMembers, but InitializePrototype also pins FPrototypeMethodHost, so
the teardown path still leaves the prototype host alive after thread exit.
Update ClearThreadvarMembers to also unpin and nil out FPrototypeMethodHost, or
refactor the prototype host to a true process-wide singleton; apply the same fix
pattern in InitializePrototype/ClearThreadvarMembers for
Goccia.Values.BigIntValue and Goccia.Values.BooleanObjectValue as well.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f9d3c615-c896-4289-8e08-44042d70db67
📒 Files selected for processing (74)
docs/adr/0077-thread-local-cleanup-registry.mddocs/adr/README.mdsource/shared/TextSemantics.passource/units/Goccia.Builtins.CSV.passource/units/Goccia.Builtins.Console.passource/units/Goccia.Builtins.GlobalArray.passource/units/Goccia.Builtins.GlobalArrayBuffer.passource/units/Goccia.Builtins.GlobalBigInt.passource/units/Goccia.Builtins.GlobalFFI.passource/units/Goccia.Builtins.GlobalNumber.passource/units/Goccia.Builtins.GlobalObject.passource/units/Goccia.Builtins.GlobalPromise.passource/units/Goccia.Builtins.GlobalReflect.passource/units/Goccia.Builtins.GlobalRegExp.passource/units/Goccia.Builtins.GlobalString.passource/units/Goccia.Builtins.GlobalSymbol.passource/units/Goccia.Builtins.GlobalURL.passource/units/Goccia.Builtins.JSON.passource/units/Goccia.Builtins.JSON5.passource/units/Goccia.Builtins.JSONL.passource/units/Goccia.Builtins.Math.passource/units/Goccia.Builtins.TOML.passource/units/Goccia.Builtins.TSV.passource/units/Goccia.Builtins.YAML.passource/units/Goccia.ImportMeta.passource/units/Goccia.RegExp.VM.passource/units/Goccia.Temporal.TimeZone.passource/units/Goccia.ThreadCleanupLeak.Test.passource/units/Goccia.ThreadCleanupRegistry.passource/units/Goccia.Threading.Test.passource/units/Goccia.Threading.passource/units/Goccia.Values.ArrayBufferValue.passource/units/Goccia.Values.ArrayValue.passource/units/Goccia.Values.BigIntValue.passource/units/Goccia.Values.BooleanObjectValue.passource/units/Goccia.Values.FFILibrary.passource/units/Goccia.Values.FFIPointer.passource/units/Goccia.Values.FinalizationRegistryValue.passource/units/Goccia.Values.HeadersValue.passource/units/Goccia.Values.IntlCollator.passource/units/Goccia.Values.IntlDateTimeFormat.passource/units/Goccia.Values.IntlDisplayNames.passource/units/Goccia.Values.IntlDurationFormat.passource/units/Goccia.Values.IntlListFormat.passource/units/Goccia.Values.IntlLocale.passource/units/Goccia.Values.IntlNumberFormat.passource/units/Goccia.Values.IntlPluralRules.passource/units/Goccia.Values.IntlRelativeTimeFormat.passource/units/Goccia.Values.IntlSegmenter.passource/units/Goccia.Values.IteratorValue.passource/units/Goccia.Values.MapValue.passource/units/Goccia.Values.NumberObjectValue.passource/units/Goccia.Values.ObjectValue.passource/units/Goccia.Values.PromiseValue.passource/units/Goccia.Values.ResponseValue.passource/units/Goccia.Values.SetValue.passource/units/Goccia.Values.SharedArrayBufferValue.passource/units/Goccia.Values.StringObjectValue.passource/units/Goccia.Values.SymbolValue.passource/units/Goccia.Values.TemporalDuration.passource/units/Goccia.Values.TemporalInstant.passource/units/Goccia.Values.TemporalPlainDate.passource/units/Goccia.Values.TemporalPlainDateTime.passource/units/Goccia.Values.TemporalPlainMonthDay.passource/units/Goccia.Values.TemporalPlainTime.passource/units/Goccia.Values.TemporalPlainYearMonth.passource/units/Goccia.Values.TemporalZonedDateTime.passource/units/Goccia.Values.TextDecoderValue.passource/units/Goccia.Values.TextEncoderValue.passource/units/Goccia.Values.TypedArrayValue.passource/units/Goccia.Values.URLValue.passource/units/Goccia.Values.WeakMapValue.passource/units/Goccia.Values.WeakRefValue.passource/units/Goccia.Values.WeakSetValue.pas
Address PR review: RegisterThreadvarCleanup is a write-once-at-init API, but the registry tests registered their sentinels inside the test bodies. Move the sentinel registrations (and the nil-guard) into the test program's startup block, before any worker thread is spawned, and keep the tests to resetting counters, draining, and asserting. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Address PR review: the ADR described the leak gate as spawning "repeated worker cycles" with per-cycle non-growth, which the implemented tests do not do. Describe the two actual gates instead: Goccia.ThreadCleanupLeak.Test (one-shot heap-drop on drain) and Goccia.Threading.Test (per-worker drain wiring). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
threadvars at thread exit. Every builtin and value type caches its member-definition arrays (FStaticMembers/FPrototypeMembers : TArray<TGocciaMemberDefinition>, whose records hold strings) in a threadvar, and two units hold input memos (RegExpdecode memo perf(regexp): memoize the decoded subject across regex VM calls #805, is-ASCII memo perf(string): ASCII byte-op fast paths for the UTF-16 string helpers #806). These leaked on each worker-thread exit (and, for the member arrays, were never released on the main thread either).Goccia.ThreadCleanupRegistry: each leaking unit registers a smallClearThreadvarMembersproc once in itsinitialization;ShutdownThreadRuntimedrains the registry on every worker thread, and the registry'sfinalizationdrains it on the main thread — so each managed threadvar is released on both teardown paths, matching the issue's expected behaviour and the existingDisposableStackpattern.FStaticMembersplus ~40Goccia.Values.*FPrototypeMembers, e.g. Map/Set/Promise/allIntl*/allTemporal*/TypedArray/Iterator), not just the ~20 enumerated. All 64 are fixed (confirmed via #885 grilling).ShutdownThreadRuntimecalls their exportedClearRegExpInputMemo/ClearAsciiMemoexplicitly (same shape as the 5 caches already routed there).Goccia.ImportMetahad nofinalization;Goccia.Temporal.TimeZone'sfinalizationdid not callClearTimeZoneCache.@threadvarresolves to the registering thread's copy, so a generic pointer-based clear can't release another thread's copy. Decision recorded in ADR 0077.Closes #885
Deferred follow-ups (tracked as separate issues)
ShutdownThreadRuntime(ImportMeta, Atomics, DisposableStack, Semver, TimeZone, + the 2 memos) into the registry, for one uniform mechanism — Migrate explicit ShutdownThreadRuntime cache clears into the thread-cleanup registry #893.GPublishedGetterHosts, compilerTListstate) and route any genuine leakers — Audit object-reference threadvars for worker-exit lifetime #892.Note on the leak gate
The grilled plan called for a "permanent heaptrc
-ghzero-leak CI job".build.pashas no-ghmode, and a full-engine worker harness under heaptrc does not report a literal zero (it has legitimate process-lifetime allocations), so a-gh"0 unfreed" assertion would be unreliable. Instead the gate is a deterministic Pascal heap-reclaim test (Goccia.ThreadCleanupLeak.Test) that CI's existing Pascal-test job runs automatically: it populates the member-definition threadvars, drains the registry, and asserts the live heap drops by a meaningful amount. It is cross-platform robust and verified to fail when the drain is disabled and pass when enabled — directly locking in the fix. Happy to add a literal-ghjob as a follow-up if preferred.Testing
GocciaTestRunner testsand--mode=bytecodeboth 10958/10958 (100%). (The fix has no JS-observable behaviour; it is internal threadvar lifetime, so it is confirmed via the Pascal tests below.)Goccia.Threading.Test(12/12), and a newGoccia.ThreadCleanupLeak.Testheap-reclaim gate (verified it fails without the fix). All 26 Pascal suites pass../format.pas --checkclean.