Skip to content
This repository was archived by the owner on Aug 25, 2025. It is now read-only.

junction-core: Fix an add-remove-add caching bug#179

Merged
blinsay merged 3 commits into
mainfrom
benl/smoke-bug
May 14, 2025
Merged

junction-core: Fix an add-remove-add caching bug#179
blinsay merged 3 commits into
mainfrom
benl/smoke-bug

Conversation

@blinsay
Copy link
Copy Markdown
Member

@blinsay blinsay commented May 14, 2025

We managed to turn up an add-remove-add bug where the client cache was marking an xDS Cluster as not-found when removing it instead of just removing it from the cache. This was, unfortunately, fairly easy to turn up - speeding up ingest and changing route names in the smoke test (a6cb862) were enough to trigger it reliably.

The fix is trivial - on removing a resource from the cache, don't create a subscription node if one doesn't already exist.

     fn remove(&mut self, rtype: ResourceType, name: &str) {
-        if let Some(sub) = self.find_subcribed(rtype, name) {
+        if let Some(sub) = self.find(rtype, name) {
            self.reset_refs(sub);
          }
     }

For resources that are not wildcards, this was a noop, but for Clusters this would both tombstone resource data and create a wildcard subscription, which marks the resource as not-found.

Most of the fix is a connection level test I wrote trying to reproduce, and a cache-level test that scopes this more tightly. To repro, it was necessary to interleave a call to cache.collect between removing a reference to the cluster from the RouteConfig and removing the cluster itself from the cache.

More Debugging Tools

While reproducing, it was useful to have tracing enabled in Rust. Enabling tracing doesn't happen automatically, but there is now a hook that Python clients can call to enable the standard tracing subscriber. This isn't the best interface but is an improvement.

@blinsay blinsay merged commit ba38801 into main May 14, 2025
11 checks passed
@blinsay blinsay deleted the benl/smoke-bug branch May 14, 2025 19:50
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant