
Add retry with exponential backoff for transient external service calls #950

Open
aliyudotdev wants to merge 1 commit into rinafcode:main from aliyudotdev:fix/886-retry-backoff


@aliyudotdev


Summary

  • Add a shared RetryPolicy utility (src/common/utils/retry-policy.ts): max 3 retries, 1s base delay, 2x multiplier, 30s max delay, full jitter. Client errors (4xx) are never retried; 5xx and network errors are.
  • Apply RetryPolicy to CdnService.invalidate, a new EmailService, and a new PaymentProviderService (wraps IPaymentProvider).
  • Add a Prometheus counter external_call_retry_total{service, attempt} incremented on every retry attempt.
  • Add unit tests covering: transparent retry-then-success, retry exhaustion/propagation, and 4xx short-circuiting.
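
The policy described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in src/common/utils/retry-policy.ts: the names `withRetry`, `backoffDelayMs`, `isRetryable`, and `RetryOptions`, the `status` field on errors, and the injectable `sleep`/`random` hooks are all assumptions made for the example. The `onRetry` callback marks where a counter such as external_call_retry_total could be incremented.

```typescript
// Hypothetical sketch of a retry policy: 3 retries, 1s base, 2x multiplier,
// 30s cap, full jitter. Names and error shape are assumptions, not the PR's API.
interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  multiplier: number;
  maxDelayMs: number;
  onRetry?: (attempt: number, delayMs: number) => void; // e.g. bump a metrics counter
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
  random?: () => number; // injectable so tests are deterministic
}

const DEFAULT_RETRY_OPTIONS: RetryOptions = {
  maxRetries: 3,
  baseDelayMs: 1000,
  multiplier: 2,
  maxDelayMs: 30000,
};

// Assumption: HTTP errors expose a numeric `status`; errors without one are
// treated as network failures. Only 5xx and status-less errors are retried.
function isRetryable(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  if (typeof status !== "number") return true;
  return status >= 500;
}

// Full jitter: pick a delay uniformly in [0, cap), where the cap grows
// exponentially with the retry number (1-based) and is clamped to maxDelayMs.
function backoffDelayMs(
  retry: number,
  o: RetryOptions,
  random: () => number = Math.random,
): number {
  const cap = Math.min(o.maxDelayMs, o.baseDelayMs * Math.pow(o.multiplier, retry - 1));
  return random() * cap;
}

async function withRetry<T>(
  fn: () => Promise<T>,
  options: Partial<RetryOptions> = {},
): Promise<T> {
  const o: RetryOptions = { ...DEFAULT_RETRY_OPTIONS, ...options };
  const sleep =
    o.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const random = o.random ?? Math.random;

  for (let retriesDone = 0; ; retriesDone++) {
    try {
      return await fn();
    } catch (err) {
      // Propagate immediately on non-retryable errors or when retries are exhausted.
      if (!isRetryable(err) || retriesDone >= o.maxRetries) throw err;
      const attempt = retriesDone + 1;
      const delay = backoffDelayMs(attempt, o, random);
      o.onRetry?.(attempt, delay);
      await sleep(delay);
    }
  }
}
```

Under these assumptions, a call that always fails with a 5xx runs four times in total (one initial call plus three retries), while a 4xx fails after a single call. Treating every status-less error as a network error is a simplification; a real implementation would likely inspect error codes more carefully.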

Test plan

  • npm test -- retry-policy cdn.service email.service payment-provider.service
  • Reviewed retry/backoff math and jitter bounds by hand
  • Verified 4xx errors short-circuit without retrying in all three wrapped services
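
The jitter bounds mentioned in the test plan can be worked through with a few lines of arithmetic. The helper `jitterCapMs` below is a hypothetical name used only for this illustration; it assumes the delay for the n-th retry is drawn uniformly from zero up to min(max delay, base × multiplier^(n−1)), which is the usual reading of "full jitter".

```typescript
// Upper bound of the full-jitter window for the n-th retry (1-based),
// using the parameters from the summary as defaults.
function jitterCapMs(
  retry: number,
  baseMs: number = 1000,
  multiplier: number = 2,
  maxMs: number = 30000,
): number {
  return Math.min(maxMs, baseMs * Math.pow(multiplier, retry - 1));
}

// Caps for the three retries the policy allows: 1000, 2000, 4000 ms.
const caps = [1, 2, 3].map((n) => jitterCapMs(n));

// Worst case: every delay lands at its cap, so total added wait is 7000 ms.
const worstCaseTotalMs = caps.reduce((sum, c) => sum + c, 0);

// A uniform draw averages half its cap, so the expected added wait is 3500 ms.
const expectedTotalMs = worstCaseTotalMs / 2;
```

With only three retries the 30s ceiling is never reached under these parameters; it would first apply at a sixth retry, where 1000 × 2^5 = 32000 ms is clamped to 30000 ms.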

Closes #886

…lures

Wrap CdnService.invalidate, EmailService, and PaymentProviderService with
a shared RetryPolicy utility (3 retries, 1s base delay, 2x multiplier,
30s max, full jitter). Transient 5xx/network errors are retried; 4xx
client errors propagate immediately. Tracks retries via the
external_call_retry_total{service,attempt} Prometheus counter.

Closes rinafcode#886
@drips-wave

drips-wave Bot commented Jun 29, 2026


@aliyudotdev Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits



Development

Successfully merging this pull request may close these issues.

Add retry with exponential backoff for transient failures in external service calls
