# Design Deficiency: No timeout or recovery path on blocking startup hydration #158

@Midway65

Version

Introduced: v5.7.3 (PR #134, 849657d7)
Still present: v5.8.0 (b03e7630)


Summary

PR #134 introduced a blocking "Updating local chat index…" spinner that gates the chat UI on fullRebuild() completing. This is intentional behaviour for post-migration startup. However, there is no timeout on fullRebuild() and no in-app recovery path if it fails to complete. If fullRebuild() stalls for any reason, the spinner runs permanently and the plugin becomes unusable with no way for the user to recover without manually editing data.json.

This is not a confirmed widespread failure — for most users the spinner resolves normally. The concern is that the code has no defence against the scenario where it does not.


Background

Before PR #134, fullRebuild() was awaited in the background. The plugin was already marked initialized before the sync ran, so a slow or failed rebuild was tolerable — the chat UI remained usable. PR #134 added a blocking UI gate on top of the same operation without adding a corresponding timeout, creating a scenario where a stall becomes unrecoverable.


The gap

In HybridStorageAdapter.performInitialization():

const shouldBlockStartupHydration = await this.shouldBlockStartupHydration(storagePlan);
if (shouldBlockStartupHydration) {
    this.startBlockingStartupHydration();  // blocks chat UI
}

// ...

if (!syncState || actuallyMigrated || shouldBlockStartupHydration) {
    try {
        await this.syncCoordinator.fullRebuild({ onProgress: ... });  // no timeout
    } catch (rebuildError) {
        this.failStartupHydration(...);  // only reachable on a thrown error, not a stall
    }
}

if (shouldBlockStartupHydration && this.startupHydrationState.phase !== 'error') {
    this.completeStartupHydration();  // never called if fullRebuild() stalls
}

If fullRebuild() neither resolves nor throws — for example because Obsidian Sync holds a file lock on a shard being downloaded — completeStartupHydration() is never called and failStartupHydration() is never triggered. The spinner runs forever.
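The difference between a thrown error and a stall can be reproduced in isolation. The sketch below is a simplified stand-in, not the plugin's code: `hydrate`, `Phase`, and the three fake rebuild functions are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-ins; none of these names come from the plugin.
type Phase = 'blocking' | 'complete' | 'error';

async function hydrate(
    rebuild: () => Promise<void>,
    state: { phase: Phase }
): Promise<void> {
    state.phase = 'blocking';
    try {
        await rebuild();
    } catch {
        state.phase = 'error';    // reached only when rebuild() rejects
        return;
    }
    state.phase = 'complete';     // reached only when rebuild() resolves
}

// Three possible behaviours of the awaited operation.
const resolves = (): Promise<void> => Promise.resolve();
const rejects = (): Promise<void> => Promise.reject(new Error('boom'));
const neverSettles = (): Promise<void> =>
    new Promise<void>(() => { /* neither resolve nor reject is ever called */ });
```

With `neverSettles`, `hydrate` stays suspended at the `await`, so the phase remains `'blocking'` indefinitely: neither the `catch` branch nor the completion line runs.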

There is also no in-app escape: the Settings > Data tab has no migration reset control, so a user in this state has no recovery path short of manually editing data.json.


When the blocking spinner appears

shouldBlockStartupHydrationForVerifiedCutover returns true when:

| Condition | Notes |
| --- | --- |
| migrationState === 'verified' | Migration ran to completion |
| sourceOfTruthLocation === 'vault-root' | Set automatically when state is 'verified' |
| conversationFileCount > 0 | Shard files exist in vault-root |
| cachedConversationCount === 0 | SQLite has no conversations |
| cachedMessageCount === 0 | SQLite has no messages |
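The conditions listed above can be read as a single predicate. The sketch below assumes they are combined with a logical AND, which the issue implies but does not state outright; the `StartupSnapshot` interface and the function name are hypothetical and are not the plugin's actual signature.

```typescript
// Hypothetical input shape; field names mirror the conditions listed above.
interface StartupSnapshot {
    migrationState: string;
    sourceOfTruthLocation: string;
    conversationFileCount: number;
    cachedConversationCount: number;
    cachedMessageCount: number;
}

// Assumption: every condition must hold for the blocking gate to engage.
function shouldBlockForVerifiedCutover(s: StartupSnapshot): boolean {
    const cutoverVerified =
        s.migrationState === 'verified' && s.sourceOfTruthLocation === 'vault-root';
    const shardsExist = s.conversationFileCount > 0;
    const cacheEmpty = s.cachedConversationCount === 0 && s.cachedMessageCount === 0;
    return cutoverVerified && shardsExist && cacheEmpty;
}
```

Under this reading, a device that already has even one cached message would skip the blocking gate.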

The SQLite-empty condition arises legitimately in at least three common scenarios:

  • First launch after upgrading from a pre-v5.7.3 version: migration runs and completes in the same session, SQLite is empty, blocking condition is immediately met
  • New device via Obsidian Sync: data.json and shard files sync across but cache.db is local-only and does not sync, so SQLite is empty on arrival
  • After deleting cache.db: documented in various places as a safe reset action

For most users in these scenarios fullRebuild() completes and the spinner resolves normally. The risk is that if it does not, there is currently nothing to catch that.


Suggested fixes

Option A — Add a timeout (simplest safeguard)

Wrap fullRebuild() in Promise.race() so that a stall triggers failStartupHydration() rather than leaving the spinner up indefinitely:

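const REBUILD_TIMEOUT_MS = 30_000;

// Keep a handle on the timer so it can be cleared once the race settles.
let rebuildTimer: ReturnType<typeof setTimeout> | undefined;
const rebuildTimeout = new Promise<never>((_, reject) => {
    rebuildTimer = setTimeout(
        () => reject(new Error(`fullRebuild timed out after ${REBUILD_TIMEOUT_MS / 1000}s`)),
        REBUILD_TIMEOUT_MS
    );
});

try {
    await Promise.race([
        this.syncCoordinator.fullRebuild({ onProgress: ... }),
        rebuildTimeout
    ]);
} catch (rebuildError) {
    // Reached on a thrown error and on a timeout alike.
    console.error('[HybridStorageAdapter] Full rebuild failed:', rebuildError);
    this.failStartupHydration(rebuildError instanceof Error ? rebuildError.message : String(rebuildError));
} finally {
    clearTimeout(rebuildTimer);  // avoid a pending timer after a normal completion
}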
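For reference, the same pattern can be written as a small standalone helper. This is a minimal sketch, not plugin code: `withTimeout` and its parameters are hypothetical names.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms} ms`)),
            ms
        );
    });
    try {
        // Whichever promise settles first decides the outcome.
        return await Promise.race([work, timeout]);
    } finally {
        // Runs on success, failure, and timeout; stops the timer from lingering.
        clearTimeout(timer);
    }
}
```

One design caveat: Promise.race() does not cancel the losing promise. After a timeout the underlying operation may still be running in the background, so whatever runs in the failure path should tolerate the rebuild finishing late.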

Option B — Do not block startup on vault-root hydration

An alternative is to not gate the UI on fullRebuild() completing. The plugin is already marked initialized before the sync runs — the blocking gate is additive. Removing it restores the pre-PR-#134 behaviour where a slow rebuild is tolerable and the chat UI remains usable throughout:

// In performInitialization():
const shouldBlockStartupHydration = false;
this.clearStartupHydrationState();

To ensure reads fall back to plugin-scoped legacy paths while the rebuild runs in the background:

// In applyStoragePlan():
this.jsonlWriter.setVaultEventStoreReadEnabled(false);
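The non-blocking behaviour described in Option B amounts to starting the rebuild without awaiting it and recording the outcome for later. A minimal sketch, assuming nothing about the plugin's internals: `startBackgroundRebuild` and `RebuildStatus` are hypothetical names.

```typescript
// Hypothetical status record that a UI could poll or observe.
type RebuildStatus = { phase: 'running' | 'done' | 'failed'; error?: string };

function startBackgroundRebuild(rebuild: () => Promise<void>): RebuildStatus {
    const status: RebuildStatus = { phase: 'running' };
    // Promise.resolve().then(...) also captures a synchronous throw from rebuild().
    void Promise.resolve()
        .then(rebuild)
        .then(() => {
            status.phase = 'done';
        })
        .catch((e: unknown) => {
            status.phase = 'failed';
            status.error = e instanceof Error ? e.message : String(e);
        });
    return status; // returned immediately; the caller is never blocked
}
```

A stalled rebuild still leaves the status at `'running'` in this sketch, but because nothing awaits it, the chat UI stays usable in the meantime.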

Option C — Add a recovery control in Settings > Data

A "Reset migration state" button that clears migration.state and reloads the plugin would give any user caught in the spinner a self-service escape without needing to touch data.json.
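The core of such a button could be a pure function over the persisted data. The sketch below assumes data.json holds a `migration` object with a `state` key, as the issue text suggests; the `PluginData` shape and the `resetMigrationState` name are hypothetical, and the Settings UI wiring and plugin reload are left out.

```typescript
// Hypothetical shape: only the `migration.state` key is taken from the issue text.
interface PluginData {
    migration?: { state?: string; [key: string]: unknown };
    [key: string]: unknown;
}

// Returns a copy of the data with migration.state removed; the input is not mutated.
function resetMigrationState(data: PluginData): PluginData {
    if (!data.migration) {
        return { ...data };
    }
    const migration = { ...data.migration };
    delete migration.state;
    return { ...data, migration };
}
```

Keeping the reset separate from the UI makes it straightforward to unit test, and the button handler then only has to persist the result and trigger a reload.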


Affected files

  • src/database/adapters/HybridStorageAdapter.ts: performInitialization(), failStartupHydration()
  • src/database/sync/SyncCoordinator.ts: fullRebuild() (no timeout)
  • src/settings/tabs/DataTab.ts: no migration reset control
