Webhook notification system has no delivery guarantee, deduplication, or retry idempotency — events are silently dropped under transient failures #30

Description

@Uchechukwu-Ekezie

Summary

The notification service delivers webhook payloads in a fire-and-forget pattern with no persistence, acknowledgment tracking, or idempotency key:

// backend/src/services/notificationService.ts
function signPayload(payload: any, secret: string): { signature: string; timestamp: string }
function verifyWebhookSignature(...): boolean

Webhook delivery is invoked directly within the rebalance execution path. If the consumer endpoint times out, returns 5xx, or the backend process restarts mid-delivery, the event is permanently lost with no retry and no record that delivery was attempted.

Impact

1. Silent event loss

A rebalance or circuitBreaker notification that fails delivery is never retried. Downstream systems (trading bots, monitoring dashboards, compliance audit trails) built on webhook events will have invisible gaps.

2. Duplicate delivery on process restart

If the backend crashes after sending the webhook but before recording that it was sent, any recovery path that replays the rebalance on the next startup sends the event again. Consumer systems then receive the same rebalance event twice, potentially triggering double actions.

3. No idempotency key on payload

The signed payload contains no stable event ID. Consumers cannot distinguish a legitimate second event from a duplicate retry, making safe deduplication impossible on the consumer side.
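To illustrate what a stable event ID would enable, here is a minimal consumer-side sketch. It assumes the payload carries the proposed eventId field; the EventDeduplicator class, its TTL parameter, and the in-memory map are hypothetical and stand in for whatever store a real consumer would use.

```typescript
// Hypothetical consumer-side deduplicator keyed on the proposed eventId field.
// In-memory only: a real consumer would likely use a database or cache instead.
class EventDeduplicator {
  // eventId -> expiry time in milliseconds since the epoch
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  /** Returns true the first time an eventId is seen within the TTL, false for duplicates. */
  firstSeen(eventId: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.seen.forEach((expiry, id) => {
      if (expiry <= nowMs) this.seen.delete(id);
    });
    if (this.seen.has(eventId)) return false;
    this.seen.set(eventId, nowMs + this.ttlMs);
    return true;
  }
}
```

A consumer would call firstSeen(payload.eventId) before acting on a webhook and skip the action when it returns false. Without an eventId in the payload there is no key to pass in, which is the gap this item describes.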

4. Timestamp-only replay protection is insufficient

verifyWebhookSignature rejects payloads older than 300 seconds. This blocks replays of messages older than 5 minutes but does nothing to stop a captured, validly signed payload from being replayed within that window, a common attack vector in webhook security.
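The gap can be shown with a small sketch of a timestamp-only freshness check. This is not the code in notificationService.ts: the function name isFresh and the exact comparison are assumptions, and only the 300-second window comes from the description above.

```typescript
// Window stated in the issue; everything else in this sketch is assumed.
const TOLERANCE_SECONDS = 300;

// Assumed shape of a timestamp-only check: accept anything not older than the window.
function isFresh(sentAtSeconds: number, receivedAtSeconds: number): boolean {
  return receivedAtSeconds - sentAtSeconds <= TOLERANCE_SECONDS;
}

// One signed message, captured once and replayed at several later times.
const sentAt = 1_700_000_000;
const replayOffsets = [0, 10, 299, 301];
const accepted = replayOffsets.map((offset) => isFresh(sentAt, sentAt + offset));
console.log(accepted);
```

Every replay inside the window passes the check, because nothing in the check identifies the individual message. Only the replay that arrives after the window closes is rejected.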

Steps to Reproduce

  1. Configure a webhook endpoint that returns 500 on the first attempt.
  2. Trigger a rebalance.
  3. Observe that no retry occurs, no failure is recorded in any delivery table, and the event is lost.
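For step 1, a throwaway endpoint along these lines might be enough. It is a sketch only: the helper names makeFlakyHandler and startFlakyServer and the port number are made up for illustration, and it uses nothing beyond Node's built-in http module.

```typescript
import { createServer, IncomingMessage, Server, ServerResponse } from "node:http";

// Builds a handler that answers 500 for the first `failures` requests and 200 afterwards.
// The attempt counter makes a missing retry easy to spot.
function makeFlakyHandler(failures: number) {
  let attempts = 0;
  const handler = (req: IncomingMessage, res: ServerResponse): void => {
    attempts += 1;
    req.resume(); // discard the request body
    res.statusCode = attempts <= failures ? 500 : 200;
    console.log(`attempt ${attempts} -> ${res.statusCode}`);
    res.end();
  };
  return { handler, attemptCount: (): number => attempts };
}

// Starts the flaky endpoint on the given port (hypothetical helper, not called here).
function startFlakyServer(port: number, failures: number): Server {
  const flaky = makeFlakyHandler(failures);
  return createServer(flaky.handler).listen(port);
}
```

Calling startFlakyServer(4000, 1) and pointing the webhook URL at http://localhost:4000 should log exactly one attempt with status 500 if the behavior described here holds. A second logged attempt would indicate that a retry exists.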

Suggested Fix

Short-term

  • Persist outbound webhook attempts to a webhook_deliveries table with (event_id, status, attempts, next_retry_at).
  • Add a stable eventId (UUID v4) to every NotificationPayload.
  • Integrate a BullMQ retry queue for failed deliveries with exponential backoff (max 5 attempts).
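The bookkeeping behind these three items could look roughly like the sketch below. The field names mirror the proposed webhook_deliveries columns. The DeliveryRecord type, the recordAttempt helper, and the 1-second base delay are assumptions for illustration, and the queue itself is left out.

```typescript
type DeliveryStatus = "pending" | "delivered" | "failed";

// Mirrors the proposed (event_id, status, attempts, next_retry_at) columns.
interface DeliveryRecord {
  eventId: string;
  status: DeliveryStatus;
  attempts: number;
  nextRetryAt: Date | null;
}

const MAX_ATTEMPTS = 5; // from the suggested fix
const BASE_DELAY_MS = 1000; // assumed base delay

// Returns the updated record after one delivery attempt, without mutating the input.
function recordAttempt(rec: DeliveryRecord, succeeded: boolean, now: Date): DeliveryRecord {
  const attempts = rec.attempts + 1;
  if (succeeded) {
    return { ...rec, attempts, status: "delivered", nextRetryAt: null };
  }
  if (attempts >= MAX_ATTEMPTS) {
    // Retry budget exhausted: keep the row so the failure stays visible.
    return { ...rec, attempts, status: "failed", nextRetryAt: null };
  }
  // Exponential backoff: 1s, 2s, 4s, 8s after attempts 1 through 4.
  const delayMs = BASE_DELAY_MS * Math.pow(2, attempts - 1);
  return { ...rec, attempts, status: "pending", nextRetryAt: new Date(now.getTime() + delayMs) };
}
```

Writing the row before the first send, then updating it after each attempt, is what turns a silent drop into a recorded failure. If BullMQ is used for scheduling, its job options attempts and backoff with type 'exponential' and a delay express the same policy, so the table would mainly serve as the audit record.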

Long-term

  • Expose a GET /webhooks/deliveries?eventId= endpoint so consumers can verify receipt.
  • Include the X-SentientFi-Event-Id header value in the signature input so consumers can trust it as an idempotency key.

Proposed payload type:
export interface NotificationPayload {
  eventId: string;          // ← stable UUID, missing today
  userId: string;
  eventType: 'rebalance' | 'circuitBreaker' | 'priceMovement' | 'riskChange';
  // ...
}
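One way to bind the event ID into the signature is sketched below. The actual input format used by signPayload is not shown in this issue, so the `timestamp.eventId.body` layout, the HMAC-SHA256 choice, and the function names signEvent and verifyEvent are all assumptions.

```typescript
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

// Assumed signature input: "<timestamp>.<eventId>.<body>", HMAC-SHA256, hex encoded.
function signEvent(eventId: string, timestamp: string, body: string, secret: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${eventId}.${body}`).digest("hex");
}

// Recomputes the signature and compares it in constant time.
function verifyEvent(
  eventId: string,
  timestamp: string,
  body: string,
  secret: string,
  signature: string,
): boolean {
  const expected = Buffer.from(signEvent(eventId, timestamp, body, secret), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// The sender generates the ID once per event and reuses it on every retry.
const exampleEventId = randomUUID(); // UUID v4
const exampleSignature = signEvent(exampleEventId, "1700000000", '{"eventType":"rebalance"}', "secret");
```

Because the ID is part of the signed input, a consumer that verifies the signature also knows the ID was not altered in transit, which is what makes it usable as an idempotency key. The ID must be created when the event is first recorded, not when it is sent, or retries would carry different IDs.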

References

  • backend/src/services/notificationService.ts: signPayload, verifyWebhookSignature, notification providers
  • backend/src/db/notificationDb.ts — missing delivery tracking table
  • backend/src/db/migrations/004_add_webhook_secret.up.sql — current webhook schema
  • docs/NOTIFICATIONS.md — documented delivery guarantees

Severity: High — rebalance and circuit-breaker events are silently dropped under any transient network or server failure
