Issue
For the UnknownError: Internal error opening backing store for indexedDB.open error class analyzed in #87862, neither retries (action item §8.2) nor degrading to MemoryOnlyProvider actually addresses the underlying problem. The session continues working from the in-memory cache so users don't see immediate degradation, but the storage layer is silently broken — and that broken state has two real costs:
Operational cost: log/Sentry volume from the silent retry storm (mitigated separately by §8.2).
Data loss risk on offline refresh: if the user is offline and accumulates queued writes (e.g. SequentialQueue items stored as an Onyx key), and the cache rebuilds empty from broken storage on refresh, those queued writes are gone. The user has no indication this happened.
The right strategy is to attempt to heal the IDB connection so writes get back onto disk. There is no need to swap providers, show user-visible UI, or call deleteDatabase() (which Chapter 2 of #87862 proved also fails when corrupt LevelDB files persist on disk). Just reopen the connection and let normal operation resume if it heals; if it doesn't, fall through to the cache-only behavior the session already exhibits, without further log noise.
Precedent: Dexie's workaround
Dexie ships a clean precedent for this approach — catch UnknownError from indexedDB.open() and retry up to 3 times. No backoff, no fallback, no provider swap:
```ts
.catch((err) => {
    switch (err?.name) {
        case 'UnknownError':
            if (state.PR1398_maxLoop > 0) {
                state.PR1398_maxLoop--;
                console.warn('Dexie: Workaround for Chrome UnknownError on open()');
                return tryOpenDB();
            }
            break;
        // ...
    }
    return Promise.reject(err);
});
```
There's also a sibling workaround in Dexie's temp-transaction.ts: when a transaction throws InvalidStateError while the DB reports as open, close and reopen the DB and retry once. Both are bounded by the same PR1398_maxLoop = 3 budget.
Solution
Scope: web (IDB) only. The error class this issue addresses is Chromium-IDB-specific. The native SQLite provider hits a different set of errors (disk I/O error, database is locked, database or disk is full) with categorically different root causes — filesystem-level issues, lock contention, or genuine capacity exhaustion — none of which benefit from a close+reopen heal pattern. If a SQLite-side mitigation is needed, it should be designed and tracked separately.
Implement two related healing mechanisms on IDBKeyValProvider, designed to work together as a single healing strategy with a shared retry budget — directly mirroring Dexie's PR1398_maxLoop pattern.
Shared retry counter
Maintain a single counter inside IDBKeyValProvider — call it healAttemptsRemaining — initialized to 3. The counter is:
Decremented on every heal attempt (both init retries and mid-session reopens).
Reset to 3 on every successful IDB operation (IDBKeyValProvider.setItem, multiSet, mergeItem, multiMerge, removeItem, etc.).
Checked before any heal attempt — if it's already at 0, fall through to cache-only behavior without further attempts.
The counter, the heal logic, and all the log messages live entirely inside IDBKeyValProvider. None of this code runs on native; the SQLite provider is untouched.
This naturally creates a circuit breaker for the permanent-corruption case (no successes → counter drains to 0 → no further heal attempts), while allowing a healthy session to recover from multiple separate transient incidents (each success replenishes the budget). It mirrors Dexie's PR1398_maxLoop directly.
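The counter rules above can be sketched as a small helper. This is only an illustration: `HealBudget`, `tryConsume`, `reset`, and `MAX_HEAL_ATTEMPTS` are hypothetical names, not existing Onyx APIs, and the real implementation may simply keep a module-level `healAttemptsRemaining` variable.

```typescript
// Hypothetical helper illustrating the shared heal budget; not an existing Onyx API.
const MAX_HEAL_ATTEMPTS = 3;

class HealBudget {
    private remaining = MAX_HEAL_ATTEMPTS;

    /** Spends one attempt and returns true, or returns false once the budget is drained. */
    tryConsume(): boolean {
        if (this.remaining <= 0) {
            return false; // circuit breaker open: skip healing, stay cache-only
        }
        this.remaining--;
        return true;
    }

    /** Call after any successful IDB operation to restore the full budget. */
    reset(): void {
        this.remaining = MAX_HEAL_ATTEMPTS;
    }

    get attemptsRemaining(): number {
        return this.remaining;
    }
}
```

Both the init retry and the mid-session reopen would call `tryConsume()` before healing and `reset()` after a successful operation, which is what makes the budget shared.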
1. On provider init: retry indexedDB.open() on UnknownError
When IDBKeyValProvider init throws UnknownError, retry the indexedDB.open() call up to the remaining budget. This is the direct Dexie-style workaround for the transient post-Clear cookies and site data class (Walexander's repro in Dexie #543).
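A minimal sketch of the init retry, under stated assumptions: `openDb` stands in for whatever wrapper the provider uses around `indexedDB.open()`, `budget` is the shared counter passed in as a plain object, and `console.warn` replaces the project's logger. The function name and log text are hypothetical.

```typescript
// Hypothetical sketch of the init-time retry; names and log text are assumptions.
type Budget = {remaining: number};

function isUnknownError(error: unknown): boolean {
    return typeof error === 'object' && error !== null && (error as {name?: unknown}).name === 'UnknownError';
}

async function openWithRetry<T>(openDb: () => Promise<T>, budget: Budget): Promise<T> {
    for (;;) {
        try {
            return await openDb();
        } catch (error) {
            // Only UnknownError is retried, and only while budget remains.
            if (!isUnknownError(error) || budget.remaining <= 0) {
                throw error;
            }
            budget.remaining--;
            console.warn(`IDB heal: UnknownError on open, retrying. attemptsRemaining=${budget.remaining}`);
        }
    }
}
```

As in Dexie's workaround, there is no backoff between attempts. With a budget of 3, an `openDb` that always rejects is called four times in total (one initial attempt plus three retries) before the error is rethrown.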
2. Mid-session: close + reopen on this error
When the error fires during a write operation in an active session, attempt a close + reopen of the IDB connection, up to the remaining budget, before considering the operation failed:

```ts
async function healAndRetry<T>(operation: () => Promise<T>): Promise<T> {
    try {
        const result = await operation();
        healAttemptsRemaining = 3; // reset budget on success (mirrors Dexie's pattern)
        return result;
    } catch (error) {
        if (!isBackingStoreError(error) || healAttemptsRemaining <= 0) {
            throw error;
        }
        healAttemptsRemaining--;
        Logger.logInfo(`IDB heal: backing store error during operation, attempting close + reopen. attemptsRemaining=${healAttemptsRemaining}`);
        await closeConnection();
        await openIDB(DB_NAME, DB_VERSION);
        return healAndRetry(operation); // recursive retry, bounded by shared counter
    }
}
```
If a heal succeeds and the subsequent operation completes, the counter resets to 3, so a fresh transient incident later in the session gets a full budget again. If 3 heal attempts fail in succession with no intervening success, the counter hits 0 and subsequent operations fall through to cache-only behavior without further heal attempts or log noise (the cache already absorbed the write per parent comment §5).
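The `healAndRetry` sketch relies on an `isBackingStoreError` predicate that is not spelled out. One possible shape is shown below. It assumes the error surfaces with `name === 'UnknownError'` and a message containing `Internal error opening backing store`, which matches the error string quoted in this issue; whether the real predicate should match on the message at all is an open design choice, and `BACKING_STORE_MESSAGE` is a hypothetical constant.

```typescript
// Hypothetical predicate; the matching rule is an assumption based on the error text quoted in this issue.
const BACKING_STORE_MESSAGE = 'Internal error opening backing store';

function isBackingStoreError(error: unknown): boolean {
    if (typeof error !== 'object' || error === null) {
        return false;
    }
    const {name, message} = error as {name?: unknown; message?: unknown};
    return name === 'UnknownError' && typeof message === 'string' && message.includes(BACKING_STORE_MESSAGE);
}
```

Keeping this check inside the provider means other error classes (quota errors, aborted transactions) pass straight through without consuming any heal budget.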
Important constraints
No provider swap. The cache already serves reads and absorbs writes (parent comment §7). Swapping to MemoryOnlyProvider changes nothing observable during the session and adds complexity without benefit.
No user-visible UI / notification. The session is already serving correctly from cache; there's nothing to surface.
No deleteDatabase() calls. Chapter 2 of #87862 demonstrated that deleteDatabase() also fails when corrupt LevelDB files persist, so it is not a viable healing primitive in this scenario.
Bounded heal attempts via shared counter. At most 3 consecutive attempts; the budget resets on success. Healing should be cheap and silent; if it doesn't work within the budget, further attempts are pure noise until something succeeds and refreshes the budget.
Coordination with action items §8.1 and §8.2
This issue can be worked on independently of the others: the heal logic lives inside the provider (e.g. IDBKeyValProvider.setItem wrapping the raw IDB call), so errors are caught and retried before they ever reach tryOrDegradePerformance or retryOperation. None of the other action items are strict prerequisites.
That said, they interact:
§8.1 (async-catch fix + removal of the 'Internal error opening backing store' string check). Currently the string check is dead code because of Bug 1, so it doesn't conflict with this issue. But §8.1 must remove the string check at the same time it fixes the catch; otherwise the live string check would route this error class to degradePerformance, which we don't want once heal is in place. As long as §8.1 ships its two changes together, the order with this issue doesn't matter.
§8.2 (NON_RETRIABLE_ERRORS classification for this error class) prevents OnyxUtils.retryOperation from retrying the operation independently when it eventually reaches that layer (after a failed heal). Without §8.2, a heal-failed write would still produce 5 retry log lines + an exhaustion alert from retryOperation. With §8.2, it produces a single log line and exits cleanly. Either order works; the user-visible behavior is the same, but §8.2 cuts the residual log noise that survives a failed heal.
Test plan
Unit test (init heal): mock indexedDB.open to reject with UnknownError twice, then resolve. Confirm init succeeds after 3 attempts total (1 initial + 2 retries), heal-attempt logs fire, and the counter ends at 1 (3 - 2 decrements).
Unit test (init heal exhaustion): mock indexedDB.open to always reject with UnknownError. Confirm init fails after 4 attempts total (1 initial + 3 retries), the counter is 0, and subsequent operations skip the heal path.
Unit test (mid-session heal): mock Storage.setItem to reject once with the backing-store error, then resolve. Confirm a close+reopen happens, the second setItem succeeds, the heal log fires, and the counter is reset to 3 after success.
Unit test (mid-session heal exhaustion): mock Storage.setItem to keep rejecting with the backing-store error. Confirm exactly 3 heal attempts happen, the counter drains to 0, and the next failing setItem does NOT trigger another heal attempt.
Unit test (counter reset after success): drain the counter to 1 with mid-session heals, then a successful setItem, then a new error. Confirm the counter was reset to 3 and a fresh heal attempt fires.
Verify in VictoriaLogs post-deploy: IDB heal log lines appear, and a fraction of users emit them and then continue without further Failed to save to storage errors (indicating successful heal). The ratio of post-heal successes to heal attempts gives a direct readout of how often this mechanism helps.
Coming from #87862 (comment).
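The exhaustion and reset cases in the test plan can be simulated without a real IndexedDB by injecting the reopen step. The harness below is a sketch only: `createHealer`, `healCount`, and `exhaustionScenario` are hypothetical names, `reopen` stands in for the provider's close + reopen, and an actual test suite would more likely mock `indexedDB.open` and the provider methods directly.

```typescript
// Hypothetical test harness; `reopen` stands in for closeConnection() + openIDB().
type Healer = {
    run<T>(operation: () => Promise<T>): Promise<T>;
    attemptsRemaining(): number;
    healCount(): number;
};

function createHealer(isHealable: (error: unknown) => boolean, reopen: () => Promise<void>, maxAttempts = 3): Healer {
    let remaining = maxAttempts;
    let heals = 0;

    async function run<T>(operation: () => Promise<T>): Promise<T> {
        for (;;) {
            try {
                const result = await operation();
                remaining = maxAttempts; // a success replenishes the budget
                return result;
            } catch (error) {
                if (!isHealable(error) || remaining <= 0) {
                    throw error; // not healable, or budget drained: give up quietly
                }
                remaining--;
                heals++;
                await reopen();
            }
        }
    }

    return {run, attemptsRemaining: () => remaining, healCount: () => heals};
}

// Exhaustion case: the operation never succeeds, so healing stops once the budget is spent.
async function exhaustionScenario(): Promise<{heals: number; remaining: number; rejected: boolean}> {
    const healer = createHealer(() => true, async () => {});
    let rejected = false;
    try {
        await healer.run(async () => {
            throw new Error('backing store');
        });
    } catch {
        rejected = true;
    }
    return {heals: healer.healCount(), remaining: healer.attemptsRemaining(), rejected};
}
```

The reset case follows the same pattern: run an operation that fails once and then succeeds, and check that the remaining budget is back at the maximum afterwards.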