Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback by mingmingtasd · Pull Request #923 · webmachinelearning/webnn

mingmingtasd · 2026-03-18T06:32:28Z

To fix #911

Description:
This PR refactors the device selection preference API to establish a more extensible framework by replacing MLPowerPreference with MLComputePolicy in MLContextOptions.

Key changes included:

Renamed preference enum: MLPowerPreference has been refactored to MLComputePolicy (and the corresponding MLContextOptions dictionary member updated to computePolicy).
Clarified extensibility: The spec now explicitly states that the policy is designed to be extensible to accommodate future execution profiles, moving beyond just power and performance metrics.
Removed "accelerated" option: Removed the accelerated value to eliminate ambiguity, as performance and power semantics are better covered by other profiles.
Introduced "fallback" policy: Added a new "fallback" compute policy. This option prioritizes maximum compatibility over other considerations (typically running on a CPU) and is highly useful for verifying a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.

The corresponding chromium CL is https://chromium-review.googlesource.com/c/chromium/src/+/7513189

PTAL, thanks! @huningxin

Preview | Diff

handellm · 2026-03-26T15:21:44Z

 <dt>"<dfn enum-value>low-power</dfn>"</dt>
 <dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
+<dt>"<dfn enum-value>fallback</dfn>"</dt>
+<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>


This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons. Can the text be updated to mention this case?

(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.

enum MLComputePolicy { "default", "high-performance", "low-power", "compatible", // or stable/local... };

This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons

Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.

Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).

+1 for both the "compatible" (in this PR), adding "low-latency" in a next PR, and adding a preference/hint for "precision" (to MLContextOptions?).

Let me throw in "non-accelerated" here or even "cpu" if we don't like "fallback". Reading "compatible" makes me be mildly suspecting an accelerator could still be in the mix :)

Not tied to "fallback" but it does align with existing WebGPU naming.

Some opposition to "cpu" since it describes an implementation detail and not a policy. Definition of fallback as used in the WebGPU specification https://www.w3.org/TR/webgpu/#fallback-adapter:

An adapter may be considered a fallback adapter if it has significant performance caveats in exchange for some combination of wider compatibility, more predictable behavior, or improved privacy.

whatever name chosen should help describe the behavior and not the implementation. The spec does not require cpu execution so having that in the name would seem incorrect.

zolkis

Thanks!

fdwr

Consider clearer naming, otherwise LGTM sir.

fdwr · 2026-03-27T23:07:01Z

 <dt>"<dfn enum-value>low-power</dfn>"</dt>
 <dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
+<dt>"<dfn enum-value>fallback</dfn>"</dt>
+<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>


(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.

enum MLComputePolicy { "default", "high-performance", "low-power", "compatible", // or stable/local... };

This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons

Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.

Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).

mwyrzykowski · 2026-03-30T16:14:00Z

 <dt>"<dfn enum-value>low-power</dfn>"</dt>
 <dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
+<dt>"<dfn enum-value>fallback</dfn>"</dt>
+<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>


I think this wording is too implementation specific:

Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs

and something more general similar to the WebGPU wording where either "compatibility, more predictable behavior, or improved privacy" can be prioritized would be more consistent with existing web specification.

mwyrzykowski

Would like to see the references to CPU removed from the non-normative wording https://github.com/webmachinelearning/webnn/pull/923/changes#r3010808209 but otherwise no objections to the API surface changes ✅

mingmingtasd · 2026-04-03T05:16:42Z

I have no preference for the naming; I'll wait for the decision. For the description wording, I'm OK with not mentioning CPU to avoid being implementation-specific.

mwyrzykowski

✅ with the current

 "low-power",
 "fallback"

values

anssiko · 2026-04-23T16:07:08Z

RESOLUTION: Use MLComputePolicy "fallback" as the name for maximum compatibility preference. (PR #923)

anssiko · 2026-04-23T16:08:45Z

We also acknowledged the following enhancements in our discussion today:

Anssi: other enhancements from this PR, to be addressed in a separate PR:
… - "low-latency" MLComputePolicy, example use case audio processing
… - "precision" MLContextOptions, to signal chopping off low bits is not preferred for this context

mingmingtasd · 2026-04-24T08:24:41Z

I have no preference for the naming; I'll wait for the decision. For the description wording, I'm OK with not mentioning CPU to avoid being implementation-specific.

Thanks for your discussion! Let's keep the "fallback".

And in the latest commit, I removed the "CPU" word in the description for "fallback". @mwyrzykowski

anssiko

This is ready to be merged by the editors. Thank you all.

The editors can track the discussed enhancements #923 (comment) as deemed appropriate.

huningxin · 2026-04-24T21:47:53Z

  undefined destroy();

-  readonly attribute boolean accelerated;
+  readonly attribute MLComputePolicy computePolicy;


Not sure whether compute policy should remain internal slot as previous power preference. It is not like accelerated attribute that can indicate whether the native AI runtime does fallback or not. @handellm , is there any use case of exposing it?

The intent for exposing it was that the implementation can change it (as it was a context creation hint). The create algorithm should contain clauses for how to do this. Also, after context creation this might be changed by the impl, right? One use case could be to check if the context still has the same policy it was created with.

But one could also argue that the impl got the hint, thank you, it will translate it into a set of internal states, and will take it into account in the best possible way henceforth, so no need for the app meddles with it later. In that case, it could be only an internal slot. The question is what problems might be if exposing it (other than impl complexity to translating back an internal state to the MLComputePolicy).

define compute policy

34fa320

huningxin requested review from RafaelCintron, fdwr, huningxin and reillyeon March 19, 2026 05:37

anssiko added the device selection label Mar 20, 2026

mwyrzykowski approved these changes Mar 26, 2026

View reviewed changes

handellm reviewed Mar 26, 2026

View reviewed changes

zolkis approved these changes Mar 26, 2026

View reviewed changes

fdwr approved these changes Mar 27, 2026

View reviewed changes

mwyrzykowski reviewed Mar 30, 2026

View reviewed changes

mwyrzykowski suggested changes Mar 30, 2026

View reviewed changes

mingmingtasd requested a review from mwyrzykowski April 3, 2026 05:12

anssiko added the Agenda+ label Apr 23, 2026

mwyrzykowski approved these changes Apr 23, 2026

View reviewed changes

anssiko removed the Agenda+ label Apr 23, 2026

remove CPU in description for fallback

79249ab

anssiko approved these changes Apr 24, 2026

View reviewed changes

huningxin reviewed Apr 24, 2026

View reviewed changes

Conversation

mingmingtasd commented Mar 18, 2026 • edited by pr-preview Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zolkis left a comment

Choose a reason for hiding this comment

Uh oh!

fdwr left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mwyrzykowski left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mingmingtasd commented Apr 3, 2026

Uh oh!

mwyrzykowski left a comment

Choose a reason for hiding this comment

Uh oh!

anssiko commented Apr 23, 2026

Uh oh!

anssiko commented Apr 23, 2026

Uh oh!

mingmingtasd commented Apr 24, 2026

Uh oh!

anssiko left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

mingmingtasd commented Mar 18, 2026 •

edited by pr-preview Bot

Loading

mwyrzykowski left a comment •

edited

Loading