
Add race-network-and-cache source to static routing api. #1764

Open
monica-ch wants to merge 10 commits into w3c:main from monica-ch:SW-Race-NetworkRace

Conversation

@monica-ch
Collaborator

monica-ch commented Apr 16, 2025

This PR adds a new source, race-network-and-cache, to the Service Worker Static Routing API, per https://github.com/WICG/service-worker-static-routing-api/blob/main/final-form.md.
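For illustration, a minimal sketch of how a service worker might register such a route (URL patterns are illustrative; the named-cache dictionary member is provisional and comes up in the review below):

```js
// Minimal sketch; assumes the InstallEvent.addRoutes() entry point from the
// Static Routing API. "race-network-and-cache" is the source this PR adds;
// the named-cache dictionary form below is provisional (see review comments).
addEventListener('install', (event) => {
  event.addRoutes([
    {
      // Race the network against a cache lookup for these requests.
      condition: { urlPattern: '/assets/*' },
      source: 'race-network-and-cache',
    },
    {
      // Hypothetical named-cache variant (raceNetworkAndCacheCacheName),
      // discussed in the review below; the exact shape may differ.
      condition: { urlPattern: '/images/*' },
      source: { raceNetworkAndCacheCacheName: 'images-v1' },
    },
  ]);
});
```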


Preview | Diff

@monica-ch
Collaborator Author

@yoshisatoyanagisawa PTAL when you get a chance, thanks!

@yoshisatoyanagisawa
Collaborator

@sisidovski Will you take a look?
I assume it is mostly fine, but I suspect that some failure scenarios may not work well for both race-network-and-fetch-event and race-network-and-cache.

monica-ch reopened this Jan 22, 2026
@monica-ch
Collaborator Author

@yoshisatoyanagisawa Can you please review the recent changes? I removed networkFetchCompleted and fetchHandlerCompleted, which were introduced earlier in this PR, and incorporated timingInfo per the recent changes to the cache/race-network-and-cache algorithm.

@yoshisatoyanagisawa
Collaborator

Thanks for the contribution. Overall it looks good, except for the raceNetworkAndCacheCacheName source support.
To reduce algorithm duplication, I would like the cache lookup algorithms to be factored out.

monica-ch force-pushed the SW-Race-NetworkRace branch from 59f6074 to 9260716 on January 29, 2026, 19:39
@yoshisatoyanagisawa
Collaborator

@youennf @asutherland FYI
We aim to add the race-network-and-cache source to address slow storage access. Please let us know your thoughts.

Thanks @monica-ch for addressing the comments; the PR looks good to me.

@monica-ch
Collaborator Author

@youennf @asutherland What do you think of this change?

@youennf

youennf commented Feb 5, 2026

I haven't looked at the PR, a few thoughts/questions:

  1. How is a web application knowledgeable enough to properly choose cache vs. race-network-and-cache?
  2. A UA can implement race-network-and-cache by using the existing cache route implementation. This approach might actually be useful in some conditions (e.g. on a metered connection). AFAIUI, the web application will have no way of identifying whether the UA did an actual race or not.
  3. A UA can already start the network load without waiting for the cache. This is a partial race (it cannot use the network response before knowing whether there is a cache entry), but can it solve the perf issue? Should we consider allowing the cache route to be fully raceable with network by default?
  4. Are there privacy concerns? For instance, can a web application infer, from the selected route, new information (slow hard drive vs. slow network)? Does this reveal new HW/config information? Could this be used by two pages running on the same device as a side channel (one using the hard drive intensively, for instance, and the other checking whether the network or the cache is used)?

@yoshisatoyanagisawa
Collaborator

yoshisatoyanagisawa commented Feb 6, 2026

Thank you for sharing your thoughts/questions, @youennf.
Let me answer them one by one.

I think the Resource Timing API should also be updated to help web developers use this API properly.

How is a web application knowledgeable enough to properly choose cache vs. race-network-and-cache?

If the elapsed time between cacheLookupStart and responseStart is large, cache storage access is likely slow, and race-network-and-cache can be chosen in that case.
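As a rough sketch of that heuristic (cacheLookupStart is the field name used in this discussion for the Static Routing API timing addition, not a shipped PerformanceResourceTiming attribute, so treat the property access as illustrative):

```js
// Sketch only: `cacheLookupStart` is the name used in this discussion, not a
// shipped PerformanceResourceTiming attribute; adjust to the final name.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    const lookupStart = entry.cacheLookupStart || 0;
    if (lookupStart > 0) {
      const cacheDelay = entry.responseStart - lookupStart;
      // If cacheDelay is consistently large, cache storage access is slow and
      // switching this route to race-network-and-cache may help.
      console.log(entry.name, 'cache lookup delay (ms):', cacheDelay);
    }
  }
}).observe({ type: 'resource', buffered: true });
```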

A UA can implement race-network-and-cache by using the existing cache route implementation. This approach might actually be useful in some conditions (e.g. on a metered connection). AFAIUI, the web application will have no way of identifying whether the UA did an actual race or not.

I think this can be guessed from the Resource Timing API, especially by looking at the workerFinalRouterSource ratio.
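For example, something along these lines could approximate that ratio on the client (workerFinalRouterSource and its values are the names used in this thread, not necessarily the attribute that ships):

```js
// Sketch: count which source ultimately served resources, to estimate whether
// the UA actually raced. `workerFinalRouterSource` and its values are the
// names used in this thread, not necessarily the shipped attribute.
const counts = { network: 0, cache: 0 };
for (const entry of performance.getEntriesByType('resource')) {
  if (entry.workerFinalRouterSource === 'network') counts.network += 1;
  else if (entry.workerFinalRouterSource === 'cache') counts.cache += 1;
}
// A high network share suggests the race is usually won by the network,
// i.e. the UA is racing and cache storage is the slower path on this device.
console.log(counts);
```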

A UA can already start the network load without waiting for the cache. This is a partial race (it cannot use the network response before knowing whether there is a cache entry), but can it solve the perf issue? Should we consider allowing the cache route to be fully raceable with network by default?

Since ServiceWorkerAutoPreload has been around for a while, there may not actually be much resistance to racing, but I think keeping the current cache behavior leaves web developers with an explicit choice to prioritize the cache, i.e. no network access if the cache is available.

Are there privacy concerns? For instance, can a web application infer, from the selected route, new information (slow hard drive vs. slow network)? Does this reveal new HW/config information? Could this be used by two pages running on the same device as a side channel (one using the hard drive intensively, for instance, and the other checking whether the network or the cache is used)?

I think this is an attack surface equivalent to that of the fetch handler. The Service Worker Static Routing API is an offloading of the traditional fetch handler, and we keep this offloading principle as much as possible. With race-network-and-cache, the web application can guess aspects of the user's device, such as disk access performance and network performance. However, the same thing can be done with the fetch handler.

@youennf

youennf commented Feb 9, 2026

If the elapsed time between cacheLookupStart and responseStart is large, cache storage access is likely slow, and race-network-and-cache can be chosen in that case.

If the decision is made this way, it would require reinstalling the service worker whenever a website wants to opt in or out of racing. This seems inadequate, as the usefulness of racing may depend on varying network conditions.

I would have expected that racing could be enabled/disabled without the need to reinstall the service worker.
Similarly, is there a use case for having different race strategies for different cache routes of a single service worker?

I think this can be guessed from the Resource Timing API, especially by looking at the workerFinalRouterSource ratio.

It can be guessed but not known for sure, especially if the UA disables racing in some specific cases (metered 5G vs. Wi-Fi, for instance).

However, the same thing can be done with the fetch handler.

Not with the same precision. A fetch handler adds JS, different IPC messages and so on.

@yoshisatoyanagisawa
Collaborator

Thanks for the follow-up. I feel I might not have addressed your points clearly in my previous response, so let me clarify my perspective on each.

Regarding the opt-in/out of racing and SW reinstallation: In my view, the decision to use racing is less about varying network conditions and more about whether the server-side is willing to accept the additional load. The motivation of the feature is that developers opt for race-network-and-cache because disk/storage access can be slow, rather than simply because the network is fast. However, providing the option not to race is important for those who need to prioritize minimizing server load.

Regarding different strategies for different routes: A potential use case for route-specific strategies would be a "niche" optimization where a developer races resources critical to LCP (Largest Contentful Paint) to ensure they arrive as fast as possible, while relying solely on the cache for other resources to reduce server pressure—even if those resources take slightly longer to load.
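For illustration, that split could look roughly like this (URL patterns are illustrative; cache is an existing source and race-network-and-cache is the source this PR adds):

```js
// Sketch: race LCP-critical assets, serve everything else from the cache only.
addEventListener('install', (event) => {
  event.addRoutes([
    { condition: { urlPattern: '/hero/*' }, source: 'race-network-and-cache' },
    { condition: { urlPattern: '/static/*' }, source: 'cache' },
  ]);
});
```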

Regarding the UA disabling racing (e.g., 5G metered vs. Wi-Fi): I understand your point now. I was previously thinking about how developers use the Resource Timing API to send data to the server for research and statistics. From that perspective, it's discernible on the server side. However, if you are referring to the web app knowing this on the client side, I’m not sure what behavior the app would change based on that information. Beyond server-side statistics, what specific use cases do you have in mind for the client-side web app?

Regarding precision: While I agree that bypassing the fetch handler reduces "noise" from JavaScript and IPC, I’m not sure how impactful that is compared to the much larger noise inherent in network or HDD access. In terms of measurement precision, both approaches use the same underlying timers, so the results should be comparable. If you believe the precision difference is significant, I would appreciate more details on that.
