Skip to content

Add incremental invalidation engine#641

Merged
st0012 merged 4 commits intomainfrom
add-incremental-invalidation-pr2
Mar 25, 2026
Merged

Add incremental invalidation engine#641
st0012 merged 4 commits intomainfrom
add-incremental-invalidation-pr2

Conversation

@st0012
Copy link
Copy Markdown
Member

@st0012 st0012 commented Mar 4, 2026

Replaces the old remove_definitions_for_document + invalidate_ancestor_chains approach with a targeted invalidation engine. When a file is updated or deleted, the engine traces through the name_dependents reverse index to invalidate only the affected declarations, names, and references — instead of requiring a full graph rebuild.

How it works

consume_document_changes() (renamed from update()) and delete_document() run a three-step pipeline:

  1. invalidate — read-only scan that seeds a worklist from old/new definitions and references, building a pending_detachments side table for definitions that need to be detached from their declarations
  2. remove_document_data — cleans up phase 1 data (definitions, references, names, strings) from maps
  3. extend — merges new content and queues work items for the resolver

The resolver still does clear_declarations + full rebuild. Wiring it to drain pending_work incrementally is a follow-up.

Invalidation worklist

The worklist processes three item types:

  • Declaration — two modes:
    • Remove: no definitions remain or owner was already removed (orphaned). Cascades to members, singleton class, and descendants. Orphaned definitions are re-queued for re-resolution (e.g. class Foo::Bar survives even if Foo changes from a module to an alias).
    • Update: declaration survives but its ancestor chain may have changed (mixin added/removed, superclass changed). Clears ancestors/descendants and re-queues ancestor resolution.
  • Name — structural dependency broken (name's nesting or parent scope removed). Unresolves the name and cascades to all dependents.
  • References — ancestor context changed, but the name itself is still valid. Needed for mixin-related invalidation:
class Foo < Bar
  CONST
end

# Another file adds:
class Foo
  include Baz # Foo's ancestors changed, so references like CONST need re-evaluation
end

Cascade differentiation

The name_dependents reverse index distinguishes ChildName (compact syntax Foo::Bar) from NestedName (nested syntax module Foo; class Bar; end; end):

  • Structural cascade (name removed): both ChildName and NestedNameName
  • Ancestor-triggered cascade (mixin changed): ChildNameName (resolves through parent), NestedNameReferences (only references need rechecking)

@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch from 4a7fc18 to f39ef05 Compare March 4, 2026 23:00
@st0012 st0012 marked this pull request as ready for review March 4, 2026 23:45
@st0012 st0012 requested a review from a team as a code owner March 4, 2026 23:45
@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch from f39ef05 to 267785b Compare March 4, 2026 23:50
self.remove_definitions_for_document(&document);
let old_document = self.documents.remove(&uri_id);

self.invalidate(old_document.as_ref(), Some(&other));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth adding a fast path for the initial indexing on boot (which can trigger no invalidation and no removal of data).

Maybe a boolean flag for skipping invalidation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated 👍

Comment on lines +865 to +866
&& let Some(nesting_id) = name_ref.nesting()
&& let Some(NameRef::Resolved(resolved)) = self.names.get(nesting_id)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify that document change this is accounting for? I'm having a hard time understanding it.

A constant reference was changed and we're enqueuing invalidation for the reference's nesting declaration.

Copy link
Copy Markdown
Member Author

@st0012 st0012 Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for

class Foo
  include Bar # constant reference, when added we invalidate Foo entirely for now
end

I've added comments for it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Similar to the other, I would say it's best to unwrap the option here to avoid hiding data inconsistencies in the graph.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't panic here because it's possible both the const reference and the name exists only in the local graph and haven't been extended into global graph yet.


self.declarations.remove(&decl_id);
} else {
// Ancestor-stale mode
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more comments. Basically, it's triggered in

class Foo
  include Bar # this is added/removed
end

So we simply change ancestors/descendents update in this branch.

@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch 4 times, most recently from de13a32 to d560ba6 Compare March 9, 2026 23:14
@st0012 st0012 requested a review from vinistock March 9, 2026 23:16
@st0012 st0012 self-assigned this Mar 9, 2026
@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch from d560ba6 to 6862eca Compare March 10, 2026 17:24
@st0012
Copy link
Copy Markdown
Member Author

st0012 commented Mar 10, 2026

If we adopt #654, we should carry the memoized name_depth caching into the incremental path here. The depth sort in prepare_units is a correctness requirement — without it, 13 resolution tests fail because the resolution loop's made_progress check gives up when children are processed before parents. This means the incremental prepare_units must also sort by depth, and for large invalidation cascades (e.g., a change to a widely-used module rippling through name_dependents), the non-memoized recursive name_depth in the sort comparator would hit the same bottleneck we fixed in #654. The fix is straightforward: call compute_name_depths before sorting the pending units, same as the full resolution path.

@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch 2 times, most recently from e37a74a to 9ae2496 Compare March 13, 2026 17:46
Introduces a worklist-based invalidation engine that cascades changes
through the graph when documents are updated or deleted. Uses
ChildName/NestedName edges from the name_dependents index to propagate
invalidation with two distinct modes:

- Structural cascade (UnresolveName): declaration removed or scope broken
- Ancestor cascade (UnresolveReferences): ancestor chain changed

Replaces the has_unresolved_dependency runtime check with explicit
invalidation variants determined at queue time.
@st0012 st0012 force-pushed the add-incremental-invalidation-pr2 branch from 9ae2496 to 2b9316b Compare March 23, 2026 17:00
Copy link
Copy Markdown
Member

@vinistock vinistock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. I think we can ship and continue to iterate, to unblock this significant portion of the work.

However, let's please add these test scenarios (which I believe are missing) and ensure that they are doing the right thing.

  • Switching the target of an include, prepend or extend
  • Removing an existing superclass
  • Changing the target of an alias (e.g.: first Foo = Bar then Foo = Baz)
  • Switching the order of mixin operations (which impacts the final ancestor chain order). For example:
# first
class Foo
  include Bar
  include Baz
end

# after modification
class Foo
  include Baz
  include Bar
end

// invalidate the nesting declaration.
// We can optimize this later by checking where the constant reference is used.
for cr in lg.constant_references().values() {
if let Some(name_ref) = self.names.get(cr.name_id())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for doing this here

let name_ref = self.names.get(cr.name_id()).unwrap();
// if statement

A constant reference that holds a name_id for a name that doesn't exist in the graph is a significant data consistency issue. It's best to crash immediately and discover those bugs early than to silently ignore.

Comment on lines +865 to +866
&& let Some(nesting_id) = name_ref.nesting()
&& let Some(NameRef::Resolved(resolved)) = self.names.get(nesting_id)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Similar to the other, I would say it's best to unwrap the option here to avoid hiding data inconsistencies in the graph.

@vinistock vinistock added the enhancement New feature or request label Mar 24, 2026
st0012 added 3 commits March 24, 2026 19:06
- Rename `cr` to `const_ref` for readability
- Unwrap `names.get()` in `remove_document_data` — a constant
  reference with a missing name is a data inconsistency
- Add comment explaining why `invalidate` can't unwrap the same
  lookup (runs before `extend`, names not in global graph yet)
On boot, no documents or declarations exist yet, so invalidation
iterates definitions/references doing hash lookups that always miss.
Skip it when there's no old document and no resolved declarations.

~5.4% indexing speedup on Shopify core (12.17s → 11.52s).
@st0012 st0012 merged commit b3d022b into main Mar 25, 2026
32 checks passed
@st0012 st0012 deleted the add-incremental-invalidation-pr2 branch March 25, 2026 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants