ref(eco): Adds report_timeout_errors options to Task definitions, allows ProcessingDeadlineExceeded to be retried #592
Changes from all commits: b7f98fc, 5482423, 91c9dff
@@ -229,14 +229,15 @@ def handle_alarm(signum: int, frame: FrameType | None) -> None:
         _execute_activation(task_func, inflight.activation, app.context_hooks)
         next_state = TASK_ACTIVATION_STATUS_COMPLETE
     except ProcessingDeadlineExceeded as err:
-        with sentry_sdk.isolation_scope() as scope:
-            scope.fingerprint = [
-                "taskworker.processing_deadline_exceeded",
-                inflight.activation.namespace,
-                inflight.activation.taskname,
-            ]
-            scope.set_transaction_name(inflight.activation.taskname)
-            sentry_sdk.capture_exception(err)
+        if task_func.report_timeout_errors:
+            with sentry_sdk.isolation_scope() as scope:
+                scope.fingerprint = [
+                    "taskworker.processing_deadline_exceeded",
+                    inflight.activation.namespace,
+                    inflight.activation.taskname,
+                ]
+                scope.set_transaction_name(inflight.activation.taskname)
+                sentry_sdk.capture_exception(err)
         metrics.incr(
             "taskworker.worker.processing_deadline_exceeded",
             tags={
@@ -245,7 +246,11 @@ def handle_alarm(signum: int, frame: FrameType | None) -> None:
                 "taskname": inflight.activation.taskname,
             },
         )
-        next_state = TASK_ACTIVATION_STATUS_FAILURE
+        retry = task_func.retry
+        if retry and retry.should_retry(inflight.activation.retry_state, err):
Member: Would you have tasks configured to do retries on …

Author (Member): The use case for us would be: we try a webhook dispatch, it fails with a timeout, so we retry x number of times total before bailing on the webhook attempt entirely. If there's another way to achieve this without hooking into the ProcessingDeadlineExceeded except block here, I'm game for that.

Member: You could have the task processing deadline be longer than the cumulative timeouts of your HTTP request (and retries). That would let you do some retries within the task, handle those failures, and request a full task retry without having to catch processing deadlines.

Author (Member): Discussed off issue, but the issue we're hitting is with DNS resolution in certain cases (ngrok is typically a culprit here), which we can't meaningfully apply a timeout to. This was a way for us to allow retries without reimplementing the same logic we already have here.
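A rough sketch of the in-task alternative suggested above, assuming a `requests`-based dispatch; the attempt count, timeout, and helper name are illustrative only, and the task's processing deadline would need to exceed the cumulative attempt budget:

```python
import requests

# Hypothetical budget: 3 attempts x 10s timeout = 30s worst case, so the
# task's processing deadline would need to sit comfortably above 30s.
ATTEMPTS = 3
TIMEOUT_SECONDS = 10.0


def post_with_retries(url: str, payload: dict) -> None:
    last_error: Exception | None = None
    for _ in range(ATTEMPTS):
        try:
            response = requests.post(url, json=payload, timeout=TIMEOUT_SECONDS)
            response.raise_for_status()
            return
        except requests.RequestException as err:
            last_error = err
    # Let the final failure escape: the worker's generic `except Exception`
    # branch (further down in this diff) can then consult the task's retry
    # policy and request a full task retry without any deadline overrun.
    assert last_error is not None
    raise last_error
```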
+            next_state = TASK_ACTIVATION_STATUS_RETRY
+        else:
+            next_state = TASK_ACTIVATION_STATUS_FAILURE
Cursor Bugbot (reviewed commit 91c9dff): Sentry error captured before retry check causes duplicate reports. Severity: Medium.
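One way the worker loop could avoid the duplicate reports flagged here, sketched with the same names as the diff above rather than as a drop-in change, is to decide the retry outcome before capturing:

```python
# Excerpt-style sketch (same names as the diff above): decide the outcome
# first, then report, so a timeout that is about to be retried is not also
# captured in Sentry on every attempt. Whether retried attempts should still
# be reported is a product decision; this variant only reports final failures.
retry = task_func.retry
if retry and retry.should_retry(inflight.activation.retry_state, err):
    next_state = TASK_ACTIVATION_STATUS_RETRY
else:
    next_state = TASK_ACTIVATION_STATUS_FAILURE

if task_func.report_timeout_errors and next_state == TASK_ACTIVATION_STATUS_FAILURE:
    with sentry_sdk.isolation_scope() as scope:
        scope.fingerprint = [
            "taskworker.processing_deadline_exceeded",
            inflight.activation.namespace,
            inflight.activation.taskname,
        ]
        scope.set_transaction_name(inflight.activation.taskname)
        sentry_sdk.capture_exception(err)
```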
     except Exception as err:
         retry = task_func.retry
         captured_error = False
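Taken together, the change lets a task opt out of Sentry reports for deadline overruns while still leaning on its retry policy. A hypothetical definition is sketched below; `define_task` and its keyword arguments are placeholders, since the PR only shows that the worker reads `task_func.report_timeout_errors` and `task_func.retry`:

```python
# Hypothetical usage sketch -- not this repository's real registration API.
from typing import Callable


def define_task(func: Callable, *, retry=None, report_timeout_errors: bool = True) -> Callable:
    # Attach the attributes the worker loop above consults.
    func.retry = retry
    func.report_timeout_errors = report_timeout_errors
    return func


def dispatch_webhook(payload: dict) -> None:
    ...  # perform the HTTP call to the webhook target


# Timeouts are expected for this task, so skip the Sentry capture and lean on
# the retry policy instead (the policy object itself comes from the library).
dispatch_webhook = define_task(dispatch_webhook, report_timeout_errors=False)
```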


There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: The retry logic does not implicitly handle `ProcessingDeadlineExceeded` as intended. It requires an explicit `on=(ProcessingDeadlineExceeded,)` configuration, causing tasks to fail instead of retrying on timeout. Severity: HIGH

Suggested Fix

Update the `should_retry` method in `retry.py` to also check for `isinstance(exc, ProcessingDeadlineExceeded)`. This will make the exception implicitly retriable, aligning the code's behavior with the pull request's description. Also, update the stale comment that incorrectly refers to `TimeoutError` being raised.
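A minimal sketch of that suggested fix, assuming `retry.py` exposes a `Retry` policy with the `should_retry(state, exc)` shape seen at the call site in the diff; every name here other than the `isinstance` check is a placeholder:

```python
class ProcessingDeadlineExceeded(Exception):
    """Stand-in for the worker's deadline exception."""


class Retry:
    def __init__(self, times: int = 3, on: tuple[type[BaseException], ...] = ()):
        self.times = times
        self.on = on

    def should_retry(self, state, exc: BaseException) -> bool:
        # Placeholder attempt accounting; the real retry_state shape may differ.
        if state.attempts >= self.times:
            return False
        # Suggested fix: treat deadline overruns as implicitly retriable so a
        # task does not need on=(ProcessingDeadlineExceeded,) spelled out.
        if isinstance(exc, ProcessingDeadlineExceeded):
            return True
        return isinstance(exc, self.on) if self.on else False
```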
I think that's what we want now.