fix: reset task status to queued on intermediate BullMQ retry attempts #100

Merged: DaxServer merged 2 commits into main from fix/reset-status-on-retry on Jun 21, 2026

@DaxServer (Owner)

When a job fails with a retryable error (SourceCdnError, HashLockError) and BullMQ schedules a retry, the failed event handler returned early without updating the DB. Because the status is set to in_progress at job start (line 122), it stayed there until the next attempt ran, which made tasks appear stuck.

Fix: call updateUploadStatus(id, 'queued') in the early-return path before BullMQ retries. The access token is not touched, so retries continue to work.
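
The branch described above can be sketched roughly as follows. This is a minimal illustration, not the actual worker code: the `FailedJob` and `UploadStore` interfaces, the `willRetry` helper, and the error class bodies are assumptions made for the example, and the retry condition is taken from the check shown in the review's sequence diagram. BullMQ itself is not imported so the snippet stays self-contained.

```typescript
// Hypothetical stand-ins; the real worker's types and helpers are not shown in this PR.
type UploadStatus = 'queued' | 'in_progress' | 'failed';

class StorageError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working even when compiled down to ES5.
    Object.setPrototypeOf(this, StorageError.prototype);
  }
}

class SourceCdnError extends Error {}

// Only the job fields this sketch needs (BullMQ jobs expose attemptsMade and opts.attempts).
interface FailedJob {
  data: { uploadId: string };
  attemptsMade: number;
  opts: { attempts?: number };
}

interface UploadStore {
  updateUploadStatus(id: string, status: UploadStatus): Promise<void>;
  clearUploadAccessToken(id: string): Promise<void>;
}

// True when another attempt will run: the error is retryable and attempts remain.
function willRetry(job: FailedJob, err: Error): boolean {
  const maxAttempts = job.opts.attempts ?? 1;
  return !(err instanceof StorageError) && job.attemptsMade < maxAttempts;
}

async function onFailed(job: FailedJob, err: Error, store: UploadStore): Promise<void> {
  if (willRetry(job, err)) {
    // Reset the status so the task does not look stuck at in_progress.
    // The access token is left alone so the retry can still use it.
    await store.updateUploadStatus(job.data.uploadId, 'queued');
    return;
  }
  // Final failure path: mark the upload failed and clear its token.
  await store.updateUploadStatus(job.data.uploadId, 'failed');
  await store.clearUploadAccessToken(job.data.uploadId);
}
```

Passing the store in as a parameter is a choice made only to keep the sketch easy to exercise with a fake; the real handler may reach the DB helpers differently.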

— Claude Sonnet 4.6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

greptile-apps (Bot, Contributor) commented Jun 21, 2026

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped to the intermediate-retry branch, is wrapped in a try/catch, and is covered by new and updated tests.

The fix is small and self-contained: one new DB call in a guarded try/catch, the access-token logic is untouched, and both the happy path and the DB-error path are exercised by the updated test suite.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/src/workers/upload.worker.ts | Adds updateUploadStatus(uploadId, 'queued') inside a try-catch in the intermediate-retry path of the failed event handler, so tasks no longer appear stuck at in_progress between BullMQ retry attempts. Comment updated to match new behavior. |
| backend/src/tests/upload.worker.test.ts | Updates existing intermediate-retry test to assert updateUploadStatus is called with 'queued', and adds a new test proving DB errors are swallowed so the handler never rejects. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant BullMQ
    participant FailedHandler as failed event handler
    participant DB as uploads DB
    participant Logger

    BullMQ->>FailedHandler: emit('failed', job, err) [intermediate attempt]
    FailedHandler->>FailedHandler: "check !(err instanceof StorageError) && attemptsMade < attempts"
    FailedHandler->>Logger: warn — job attempt failed, will retry
    FailedHandler->>DB: updateUploadStatus(uploadId, 'queued')
    alt DB write succeeds
        DB-->>FailedHandler: ok
    else DB write fails
        DB-->>FailedHandler: throws
        FailedHandler->>Logger: error — failed to update db status
    end
    FailedHandler-->>BullMQ: return (early)
    BullMQ->>BullMQ: schedule next retry attempt

    Note over BullMQ,Logger: Final failure path (unchanged)
    BullMQ->>FailedHandler: emit('failed', job, err) [last attempt]
    FailedHandler->>DB: updateUploadStatus(uploadId, 'failed') + clearUploadAccessToken

Reviews (2): Last reviewed commit: "fix: guard queued status reset in failed..."

Comment thread backend/src/workers/upload.worker.ts Outdated
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@DaxServer (Owner, Author)

Both fixed in 313ef39.

  1. Wrapped the updateUploadStatus('queued') call in its own try-catch matching the existing error-handling pattern in the handler — a DB error now logs and the handler resolves cleanly.
  2. Updated the comment from "skip DB updates" to "reset status to queued" to match the new behavior.
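
As a rough illustration of the guard in item 1, the sketch below wraps the status reset in its own try-catch so that a DB error is logged and the call still resolves. The `Logger` and `UpdateStatus` shapes and the `resetToQueued` name are hypothetical, chosen for this example; the actual handler's structure and log message may differ.

```typescript
// Hypothetical logger and DB-call shapes, for illustration only.
interface Logger {
  error(message: string, meta?: unknown): void;
}

type UpdateStatus = (id: string, status: 'queued') => Promise<void>;

// Attempts the reset; on failure it logs and resolves, so the event handler never rejects.
// Returns true when the DB write succeeded and false when the error was swallowed.
async function resetToQueued(
  uploadId: string,
  updateStatus: UpdateStatus,
  logger: Logger,
): Promise<boolean> {
  try {
    await updateStatus(uploadId, 'queued');
    return true;
  } catch (err) {
    logger.error('failed to update db status', { uploadId, err });
    return false;
  }
}
```

Swallowing the error here is deliberate under these assumptions: BullMQ has already scheduled the retry, so a failed status write should cost only a stale status value rather than an unhandled rejection in the event handler.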

— Claude Sonnet 4.6

@DaxServer DaxServer merged commit 78e8f17 into main Jun 21, 2026
5 checks passed
@DaxServer DaxServer deleted the fix/reset-status-on-retry branch June 21, 2026 11:27