docs(D-W6.4): weak-metaclass reproducer + concrete hypotheses for the actual drift by fglock · Pull Request #606 · fglock/PerlOnJava

fglock · 2026-04-29T07:50:18Z

Summary

D-W6.4 investigation continued. The simple "store strong → weaken in
place → outer keepalive" pattern works correctly without the walker
gate — yet another shape ruled out as the drift source.

What landed

src/test/resources/unit/refcount/drift/weak_metaclass.t (14
tests) — direct weakened slot, weakened slot + outer @Keepalive,
20-entry loop variant, weak-ref → strong-ref "rescue"
(Schema::DESTROY shape). All 14 pass on master AND with the
walker gate disabled.

What we now know is not the drift source

After four reproducer files (sub_install, closure_capture,
hash_slot, weak_metaclass) and 53 total bare-Perl test cases:

✅ Sub installation (glob assign, named sub, loop install,
temp drop, nested install) — works without the gate
✅ Closure capture (single-, two-, three-, five-layer wrap, plus
20-closure chain) — works without the gate
✅ Hash-slot tracking (direct slot, package global, 50-entry
registry, slot overwrite) — works without the gate
✅ Weakened-hash + multi-holder — works without the gate

The Class::MOP drift is something more specific than any of
these shapes alone.

Three concrete next-step hypotheses

Documented in dev/modules/moose_support.md D-W6.4 section:

1. Audit args.push(self) and the rebalance walk in
   doCallDestroy. When the DESTROY body's first line is
   my $self = shift, the shift queues a deferred decrement.
   The rebalance walk after DESTROY may double-count this:
   refCount=1 (push) → 0 (rebalance) → -1 (drainPendingSince
   processing the queued shift decrement).
2. Guard drainPendingSince against entries with
   destroyFired=true. Such entries have already been handled
   by the cascading destroy, and re-processing them in
   drainPendingSince clears weak refs that downstream code (the
   weakened associated_class ref in Class::MOP::Attribute)
   relies on.
3. Instrument pending.add to log identity-hash + caller when
   the same RuntimeBase is added twice. This surfaces the
   duplicate-add path directly.
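
A toy Java model of the first two hypotheses above, as a sketch only. TrackedObject stands in for RuntimeBase, and every method body here is an assumption that mirrors the sequence described in the text, not the real doCallDestroy or drainPendingSince code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model: names echo the PR text, signatures are invented.
final class DrainGuardSketch {

    // Stand-in for RuntimeBase with only the two fields the hypotheses mention.
    static final class TrackedObject {
        int refCount;
        boolean destroyFired;
    }

    // Queue of objects that still owe one deferred decrement.
    private final Deque<TrackedObject> pending = new ArrayDeque<>();
    private final boolean guardDestroyed;

    DrainGuardSketch(boolean guardDestroyed) {
        this.guardDestroyed = guardDestroyed;
    }

    // Models the destroy call: push self, run the DESTROY body, rebalance.
    void callDestroy(TrackedObject self) {
        self.destroyFired = true;
        self.refCount++;      // args.push(self): 0 -> 1
        pending.add(self);    // `my $self = shift` queues a deferred decrement
        self.refCount--;      // rebalance walk after DESTROY: 1 -> 0
    }

    // Models the drain: apply every queued decrement.
    void drainPending() {
        TrackedObject o;
        while ((o = pending.poll()) != null) {
            if (guardDestroyed && o.destroyFired) {
                continue;     // hypothesis 2: skip entries the destroy path already handled
            }
            o.refCount--;     // unguarded: 0 -> -1
        }
    }

    // Runs one destroy + drain cycle and reports where refCount ends up.
    static int finalRefCount(boolean guardDestroyed) {
        DrainGuardSketch sketch = new DrainGuardSketch(guardDestroyed);
        TrackedObject obj = new TrackedObject(); // refCount 0, about to be destroyed
        sketch.callDestroy(obj);
        sketch.drainPending();
        return obj.refCount;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + finalRefCount(false)); // prints "unguarded: -1"
        System.out.println("guarded: " + finalRefCount(true));    // prints "guarded: 0"
    }
}
```

The model only shows that the described sequence is arithmetically consistent with a -1 refCount and that a destroyFired guard would stop it. Whether the real code follows this sequence is exactly what the audit has to establish.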

Hypothesis 3 is the cheapest experiment to try first: it's a one-line
IdentityHashMap check wrapped around pending.add.
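
A minimal sketch of what that instrumentation could look like, in Java. PendingAddProbe, its fields, and the log format are hypothetical. The real pending list lives inside the runtime, and its element type is RuntimeBase rather than Object.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around a pending list; not the PerlOnJava class.
final class PendingAddProbe {
    private final List<Object> pending = new ArrayList<>();
    // Identity-keyed so equal-but-distinct objects are never confused.
    private final Map<Object, StackTraceElement[]> firstAdd = new IdentityHashMap<>();
    private int duplicateAdds;

    void add(Object base) {
        StackTraceElement[] here = new Throwable().getStackTrace();
        StackTraceElement[] earlier = firstAdd.putIfAbsent(base, here);
        if (earlier != null) {
            duplicateAdds++;
            System.err.printf("duplicate pending.add: id=%08x first=%s second=%s%n",
                    System.identityHashCode(base), caller(earlier), caller(here));
        }
        pending.add(base); // behaviour of the wrapped add is unchanged
    }

    // Frame 0 is add() itself; frame 1 is whoever called it.
    private static String caller(StackTraceElement[] trace) {
        return trace.length > 1 ? trace[1].toString() : "<unknown>";
    }

    // Reset when the queue is drained so later legitimate re-adds are not flagged.
    void clear() {
        pending.clear();
        firstAdd.clear();
    }

    int duplicateAdds() {
        return duplicateAdds;
    }

    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingAddProbe probe = new PendingAddProbe();
        Object metaclass = new Object();
        probe.add(metaclass);
        probe.add(new Object());
        probe.add(metaclass); // logged to stderr as a duplicate
        System.out.println("duplicates: " + probe.duplicateAdds()); // prints "duplicates: 1"
    }
}
```

Capturing a stack trace on every add is expensive, so a probe like this would belong behind a diagnostic flag rather than on the default path.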

Open D-W6 PR backlog

#603 (D-W6.1): sub-install reproducer + diagnosis
#605 (D-W6.2): closure-capture + hash-slot reproducers + identification of real drift
this PR (D-W6.4): weak-metaclass reproducer + three concrete next-step hypotheses
#600: full D-W6 plan with empirical findings and current fallback recommendation (PR #599, universal walker)
#599: universal walker (no class-name dispatch)

Test plan

make (build + unit tests) green.
weak_metaclass.t 14/14 on master (gate active).
weak_metaclass.t 14/14 with the gate disabled (probe build).
All other drift reproducers still pass.

Generated with Devin

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

…drift

D-W6.2 investigation outcome: The simple closure-capture and hash-slot patterns all work correctly without the walker gate.

Reproducers landing in src/test/resources/unit/refcount/drift/:
- closure_capture.t (8 tests): single-, two-, three-, five-layer wrap, plus a 20-closure chain.
- hash_slot.t (14 tests): direct slot, package global, 50-entry registry, slot overwrite.
- sub_install.t (12 tests, copied from earlier branch): five sub-install patterns.

All pass on master AND with the walker gate disabled. Therefore the simple shapes of these three code paths have correct cooperative refCount semantics; they are NOT the source of the drift.

PJ_DESTROY_TRACE=1 instrumentation added to DestroyDispatch.callDestroy (zero-cost when off; prints Pkg::subname for RuntimeCode and the class name for blessed objects).

The actual drift, surfaced by `PJ_DESTROY_TRACE=1 ./jperl -e 'use Class::MOP'` (gate disabled), is in the metaclass-instance lifecycle: the same Class::MOP::Class instance is destroyed TWICE (same identity hash), once via MortalList.flush and once via MortalList.drainPendingSince in a cascading flush.

Investigation notes in dev/modules/moose_support.md (Phase D-W6.2) describe three concrete next leads:
1. Audit MortalList.deferDecrementIfTracked for double-add.
2. Audit MortalList.drainPendingSince for entries that have already been zeroed.
3. Trace which scope-exit on Class/MOP/Class.pm:260 puts the metaclass on the deferred queue.

D-W6.4 (a new sub-phase) is added to track this work; D-W6.1 and D-W6.2 are closed as "the simple patterns work, the actual drift is elsewhere".

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

…potheses

D-W6.4 investigation continued. Added one more reproducer:
- weak_metaclass.t (14 tests): store strong → weaken in place → outer keepalive holds strong; 20-entry loop variant; and the weak-ref → strong-ref "rescue" (Schema::DESTROY) pattern.

All 14 pass on master AND with the walker gate disabled. So the simple "store strong, weaken in place" pattern is also not the drift source.

Combined with D-W6.1 / D-W6.2 findings, the drift is something MORE specific than: sub installation, closure capture, hash-slot tracking, weakened-hash + multi-holder. Each of those simple shapes works correctly without the walker gate.

The trace data points to the destroyFired branch in DestroyDispatch.callDestroy as the cleanup path that actually clears the weak refs that break Class::MOP's bootstrap. The plausible path that re-enters callDestroy after the first destroy is `drainPendingSince` post-DESTROY: the `my $self = shift` inside Class::MOP::Class's DESTROY body queues a deferred decrement on a RuntimeBase that the rebalance walk thought it had already handled.

Three concrete next-step hypotheses recorded in moose_support.md:
1. Audit args.push(self) and the rebalance walk in doCallDestroy for the case where the DESTROY body's `shift @_` queues a decrement that drainPendingSince re-processes.
2. Guard drainPendingSince against entries with destroyFired=true.
3. Instrument pending.add to log identity-hash + caller when the same RuntimeBase is added twice.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

fglock and others added 2 commits April 29, 2026 09:40

fglock mentioned this pull request Apr 29, 2026

fix(D-W6): three diagnostic env-flags + sharper Class::MOP drift diagnosis #607

Open

3 tasks

fglock wants to merge 2 commits into master from fix/d-w6-4-pending-double-add
