#63 fix: self-heal stale watcher_item_id on WatchedItem 404 #64
Merged
Distinguish a permanently deleted WatchedItem (HTTP 404) from a transient Watcher outage in the dashboard read/action paths. On a confirmed WatcherNotFound, NULL the stale watcher_item_id and fall back to the not_watching state (re-exposing "Begin Watching") instead of sticking in degraded forever. Transient failures (network/5xx) still render degraded and retain the link, so a brief outage never drops it.

- _clear_stale_watcher_link helper clears the pointer and commits
- _render_status_partial / _render_watcher_section catch WatcherNotFound ahead of the broad except; gain a session param (threaded to all callers)
- check-now / toggle-watch-active flash an accurate "no longer watched" message and clear the link on 404

Keeps the one-way Archiver->Watcher control-plane model intact: Archiver reconciles lazily on its next read rather than requiring Watcher to notify it. Dashboard-only; no schema/SDK/API-contract change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- _clear_stale_watcher_link is now best-effort: it never raises. A failed reconcile commit is rolled back, the item refreshed, and the failure reported as False so the render/action paths degrade rather than 500 (preserving the "never 500 the partial" contract). [CR finding 1]
- Render helpers branch on the bool: render not_watching only when the link was durably cleared, else degraded. check-now/toggle flash an accurate message when the local update fails.
- Add tests: resync-watcher 404 self-heal; _clear_stale_watcher_link returns False (no raise) on commit failure; watcher-section degrades (200, link retained) when the reconcile commit fails. [CR findings 1, 2]
- UI.md: document resync-watcher's 404 self-heal-via-render. [CR finding 3]

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
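The best-effort contract described in this commit can be sketched roughly as below. This is a minimal illustration, not the code from the PR: `InfoItem` is reduced to a dataclass, `FakeSession` is a hypothetical stand-in for the real database session (it only mimics `commit`, `rollback`, and `refresh`), and the helper is named without the leading underscore to mark it as a sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InfoItem:
    # Stand-in for the ORM model; only the soft pointer matters here.
    id: int
    watcher_item_id: Optional[str] = None


class FakeSession:
    """Hypothetical session double: tracks one item and can fail on commit."""

    def __init__(self, item: InfoItem, fail_commit: bool = False) -> None:
        self._item = item
        self._persisted = item.watcher_item_id  # value "stored in the database"
        self._fail_commit = fail_commit

    def commit(self) -> None:
        if self._fail_commit:
            raise RuntimeError("simulated commit failure")
        self._persisted = self._item.watcher_item_id

    def rollback(self) -> None:
        pass  # nothing to undo in this double

    def refresh(self, item: InfoItem) -> None:
        # Reload the in-memory attribute from the persisted value.
        item.watcher_item_id = self._persisted


def clear_stale_watcher_link(session, item: InfoItem) -> bool:
    """Try to NULL the pointer; report success as a bool and never raise."""
    try:
        item.watcher_item_id = None
        session.commit()
        return True
    except Exception:
        try:
            session.rollback()
            session.refresh(item)  # restore the still-persisted link in memory
        except Exception:
            pass  # best-effort: a failed cleanup must not escape either
        return False
```

Callers can then branch on the returned bool: `True` means the link was durably cleared and `not_watching` is safe to render, while `False` means the link is still persisted and the path should degrade.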
…, PR #64)

Addresses CR findings 6-9:

- check-now / toggle: on a failed link-clear, render degraded straight from the path item_id instead of re-rendering through the (possibly expired) item. This removes the double clear-attempt and the not_watching/flash mismatch (finding 6), and closes the triple-fault path where the trailing render could touch an expired attribute and 500 (finding 9).
- Factor _status_degraded / _section_degraded (render degraded from an id alone) and shared message constants so the flash and degraded-panel copy stay aligned across render and action handlers (finding 8).
- Add action-handler commit-failure tests for check-now and toggle: degrade (200) with the reconcile flash, never 500, link retained (finding 7).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
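A rough sketch of the action-handler shape this commit describes: the degraded panel is rendered from the path id alone, and flash and panel copy come from shared constants. Everything here is illustrative. The message wording, the HTML, the `check_now` signature, and the `trigger` / `clear_link` callables are assumptions, not the project's actual code.

```python
from typing import Callable, Tuple


class WatcherNotFound(Exception):
    """Stand-in for the Watcher client's 404 error."""


# Shared copy so flash text and degraded-panel text cannot drift apart.
# The wording is illustrative, not the project's real strings.
MSG_DEGRADED = "Watcher is unavailable; try again shortly."
MSG_REMOVED = "This item is no longer watched."
MSG_RECONCILE_FAILED = "Watcher removed this item, but the local link could not be cleared."
MSG_CHECK_REQUESTED = "Check requested."


def status_degraded(item_id: int) -> str:
    """Render the degraded panel from the path id alone (no ORM attribute access)."""
    return f'<div id="status-{item_id}" class="degraded">{MSG_DEGRADED}</div>'


def status_not_watching(item_id: int) -> str:
    return f'<div id="status-{item_id}" class="not-watching">Begin Watching</div>'


def status_watching(item_id: int) -> str:
    return f'<div id="status-{item_id}" class="watching">Watching</div>'


def check_now(
    item_id: int,
    watcher_item_id: str,
    trigger: Callable[[str], None],
    clear_link: Callable[[], bool],
) -> Tuple[str, str]:
    """Return (flash message, status partial) and never raise."""
    try:
        trigger(watcher_item_id)
    except WatcherNotFound:
        if clear_link():
            return MSG_REMOVED, status_not_watching(item_id)
        # Single clear attempt; on failure degrade without touching the item.
        return MSG_RECONCILE_FAILED, status_degraded(item_id)
    except Exception:
        return MSG_DEGRADED, status_degraded(item_id)
    return MSG_CHECK_REQUESTED, status_watching(item_id)
```

Because the failure branches only use `item_id`, a session that expired the ORM object after a rolled-back commit cannot cause a second fault during rendering.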
This was referenced Jun 22, 2026
Closes #63.
Problem
The only coupling between an Archiver InfoItem and its sibling Watcher WatchedItem is the soft pointer InfoItem.watcher_item_id (nullable VARCHAR(50), no FK). The dashboard read path caught all Watcher errors with a broad except Exception and collapsed them into the degraded state, so a permanently deleted WatchedItem (HTTP 404) was indistinguishable from a transient outage. Consequences: the stale pointer was never cleared, and because not_watching / "Begin Watching" are gated on watcher_item_id IS NULL, the InfoItem was stuck in degraded forever with no recovery path.

This becomes load-bearing as the sibling Watcher service adds permanent deletion of WatchedItems.
Fix
Distinguish WatcherNotFound (404) from the generic failure path:

- _clear_stale_watcher_link(session, item): NULLs watcher_item_id and commits.
- _render_status_partial / _render_watcher_section: catch WatcherNotFound ahead of except Exception. On 404, clear the link and render not_watching; any other error still renders degraded and retains the id. Both gained a session param (threaded through all callers).
- check-now / toggle-watch-active: catch WatcherNotFound, then clear the link and flash "This item is no longer watched — it was removed in Watcher" instead of the misleading "try again shortly."

Only a confirmed 404 clears the pointer; transient failures keep it, so a brief Watcher outage never drops the link. Keeps the one-way Archiver→Watcher control-plane model intact: Archiver reconciles lazily on its next read; no inbound coupling added.
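The ordering of the two except clauses is the core of the fix, so a small sketch may help. It is not the code from info_items.py: `Item`, `resolve_state`, the `fetch` and `clear_link` callables, and the "watching" state name are assumptions made for illustration. It also folds in the bool-returning clear from the follow-up commit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class WatcherNotFound(Exception):
    """Stand-in for the 404 error raised by the Watcher client."""


@dataclass
class Item:
    # Minimal stand-in for InfoItem; only the soft pointer is modelled.
    watcher_item_id: Optional[str]


def resolve_state(
    item: Item,
    fetch: Callable[[str], object],
    clear_link: Callable[[Item], bool],
) -> str:
    """Pick the panel state for one item without ever raising."""
    if item.watcher_item_id is None:
        return "not_watching"
    try:
        fetch(item.watcher_item_id)
    except WatcherNotFound:
        # Confirmed 404: the WatchedItem is gone for good, so drop the pointer.
        # Only report not_watching if the clear actually stuck.
        return "not_watching" if clear_link(item) else "degraded"
    except Exception:
        # Network error, 5xx, anything else: keep the link and degrade.
        return "degraded"
    return "watching"  # hypothetical name for the healthy state
```

The specific clause has to come first. If WatcherNotFound is an Exception subclass, as assumed here, a leading `except Exception` would swallow it and reproduce the original stuck-in-degraded behaviour.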
Tests
tests/dashboard/test_watcher_404_reconcile.py (TDD, 5 cases), including:

- not_watching + link cleared
- degraded and retains the id

Full suite: 666 passed. ruff check / ruff format clean.

Scope
Dashboard-only (src/dashboard/routes/info_items.py). No schema, SDK, or API-contract change → no CHANGELOG entry required (per the changelog policy). Living docs (docs/UI.md) updated for the 404 self-heal.

Out of scope: resync-watcher routes through the core sync_on_source_swap helper (swallows all errors into WatcherSyncOutcome.FAILED); it still self-heals via the trailing _render_status_partial re-render, but its flash stays generic. Distinguishing 404 there would change a contract shared with the API routes, so it is deferred.

🤖 Generated with Claude Code