feat(outbox): implement transactional outbox and async event relay#3

Merged: AlpNuhoglu merged 3 commits into main from feat/transactional-outbox on Jun 17, 2026
@AlpNuhoglu (Owner)

Eliminates the dual-write problem in the player service by collapsing business logic and event writes into a single Postgres transaction. Events are asynchronously polled and dispatched to NATS via a dedicated outbox-relay service.

  • Atomicity: Injected OutboxPublisher to route player events into the outbox_events table within the same ACID transaction.
  • Relay Architecture: Implemented safe concurrent polling using FOR UPDATE SKIP LOCKED with a bounded worker pool in outbox-relay.
  • Observability: Captured and replayed W3C trace contexts inside JSONB carrier to ensure unbroken distributed tracing spans.
  • Resilience: Delivery is at-least-once, so the same event can arrive more than once; consumers de-duplicate on the stable Event.ID. Added integration tests for rollbacks and retries.
  • Scope Limitation: Documented that Redis-backed services (Matchmaking/Leaderboard) remain direct-publish due to lack of shared ACID transactions.
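
Because the relay retries until a publish succeeds, a consumer may see the same event twice. The following is a minimal sketch of consumer-side de-duplication on `Event.ID`, not the code from this PR: the `Event` fields, the `Deduper` type, and the in-memory set are all assumptions made for illustration (a real consumer would presumably persist the set of seen IDs).

```go
package main

import "sync"

// Event mirrors only the fields this sketch needs. The real outbox Event
// type is not shown in the PR, so this shape is hypothetical.
type Event struct {
	ID      string
	Subject string
	Payload []byte
}

// Deduper remembers which Event.IDs have already been handled.
// Assumption: an in-memory set is enough for illustration.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

// NewDeduper returns an empty Deduper.
func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// Handle invokes fn at most once per Event.ID and reports whether fn ran.
// The ID is recorded only after fn succeeds, so a failed attempt is
// processed again when the event is redelivered.
func (d *Deduper) Handle(ev Event, fn func(Event) error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	if _, dup := d.seen[ev.ID]; dup {
		return false, nil // duplicate delivery: skip silently
	}
	if err := fn(ev); err != nil {
		return false, err
	}
	d.seen[ev.ID] = struct{}{}
	return true, nil
}
```

Holding the mutex while `fn` runs keeps the sketch short but serializes handlers; a production consumer would more likely rely on a unique constraint or an idempotent write keyed by the event ID.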

…vent

Eliminates redundant exported name stuttering in the `outbox` package
by renaming `OutboxEvent` to `Event`, adhering to idiomatic Go naming
conventions.

- Updated type definition, TableName, and model refs in `store.go`.
- Refactored internal channels and helper signatures in `relay.go`.
- Updated GORM `AutoMigrate` call in `cmd/player/main.go`.
- Refactored `relay_test.go`, `postgres_test.go`, and `outbox_test.go`.
- Kept `pkg/metrics` fields unchanged as they do not stutter.
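
As a rough illustration of why `TableName` is part of this rename: GORM derives a table name from the struct name by default, so a struct called `Event` would map to `events` unless the name is pinned. The sketch below assumes the model looks roughly like this; every field name is hypothetical, since the PR does not list them.

```go
package main

import "time"

// Event sketches the renamed model (formerly OutboxEvent). Callers outside
// the package now write outbox.Event instead of outbox.OutboxEvent.
// All fields here are hypothetical placeholders.
type Event struct {
	ID           string
	Subject      string
	Payload      []byte
	Status       string
	AttemptCount int
	CreatedAt    time.Time
}

// TableName pins the model to the existing outbox_events table. GORM calls
// this method when present; without it the default naming strategy would
// derive "events" from the struct name.
func (Event) TableName() string { return "outbox_events" }
```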

Guarantee that a committed Postgres transaction can never lose its
domain event. The player service now writes business rows and the
event to outbox_events in one transaction; a dedicated outbox-relay
process polls committed rows and publishes them to NATS JetStream,
retrying until success (at-least-once).

- outbox_events table (migration 0003 + schema.sql + GORM model)
- OutboxPublisher implements events.Publisher, inserting on the
  business tx instead of publishing inline
- Relay: bounded worker pool, FOR UPDATE SKIP LOCKED batching,
  graceful shutdown, PENDING->PUBLISHED lifecycle with attempt_count
- New PlayerRegistered/PlayerUpdated events on the events.player stream
- Trace continuity via stored W3C carrier; outbox.* spans
- Prometheus metrics: pending, published_total, failures_total,
  publish_duration_seconds
- OUTBOX_* config, docker-compose service, Prometheus scrape target
- Unit tests (relay publish/retry/replay/trace) + integration tests
  (atomic write, rollback, RunBatch)
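
The relay lifecycle described above (bounded worker pool, PENDING->PUBLISHED, `attempt_count`, retry on failure) can be sketched without a database. This is an illustration under stated assumptions, not the relay from this PR: `Row`, `Publisher`, `PublishFunc`, `ErrBrokerDown`, and `RunBatchSketch` are hypothetical names, and the slice of rows stands in for whatever a `FOR UPDATE SKIP LOCKED` query would have claimed.

```go
package main

import (
	"errors"
	"sync"
)

const (
	StatusPending   = "PENDING"
	StatusPublished = "PUBLISHED"
)

// ErrBrokerDown is a hypothetical sentinel for a failed publish.
var ErrBrokerDown = errors.New("broker unavailable")

// Row is an in-memory stand-in for one outbox_events row (hypothetical).
type Row struct {
	ID           string
	Status       string
	AttemptCount int
}

// Publisher stands in for the NATS JetStream publish call (hypothetical).
type Publisher interface {
	Publish(id string) error
}

// PublishFunc adapts a plain function to the Publisher interface.
type PublishFunc func(id string) error

// Publish calls f(id).
func (f PublishFunc) Publish(id string) error { return f(id) }

// RunBatchSketch publishes every PENDING row using at most `workers`
// goroutines and returns how many rows reached PUBLISHED. A row whose
// publish fails keeps its PENDING status, so the next batch retries it;
// that retry loop is what makes delivery at-least-once.
func RunBatchSketch(rows []*Row, pub Publisher, workers int) int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan *Row)
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		published int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each row is received by exactly one worker, so mutating
			// it here does not race with the other workers.
			for r := range jobs {
				r.AttemptCount++
				if err := pub.Publish(r.ID); err != nil {
					continue // leave PENDING for the next batch
				}
				r.Status = StatusPublished
				mu.Lock()
				published++
				mu.Unlock()
			}
		}()
	}
	for _, r := range rows {
		if r.Status == StatusPending {
			jobs <- r
		}
	}
	close(jobs)
	wg.Wait()
	return published
}
```

Calling `RunBatchSketch` repeatedly on the same rows mimics the polling loop: rows that failed are attempted again and their `AttemptCount` grows, while rows already marked PUBLISHED are skipped.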
@AlpNuhoglu merged commit a4cc670 into main on Jun 17, 2026
11 checks passed