A case study from Wayfair demonstrates using PostgreSQL, rather than Kafka topics, as a Dead Letter Queue (DLQ) for event-driven systems. Events that fail in Kafka consumers are stored in a PostgreSQL table along with metadata such as the error reason, retry count, and status. A scheduled retry mechanism combines ShedLock with PostgreSQL's FOR UPDATE SKIP LOCKED clause so that retries run safely across multiple instances. The approach gives better visibility, queryability, and operational simplicity when handling failures in distributed event processing pipelines: Kafka handles high-throughput ingestion, and PostgreSQL manages durable failure recovery.
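
As a rough illustration of the pattern summarized above, here is a minimal Python sketch of a DLQ table and a retry pass that claims rows with FOR UPDATE SKIP LOCKED. It is not the article's implementation: the table name `dlq_events`, its columns, the status values, and the `retry_batch` helper are all hypothetical. It assumes a DB-API connection with `%s` placeholders (psycopg style) and leaves out the ShedLock scheduling layer entirely.

```python
# Hypothetical schema: table, column, and index names are illustrative only.
DLQ_DDL = """
CREATE TABLE IF NOT EXISTS dlq_events (
    id            BIGSERIAL PRIMARY KEY,
    topic         TEXT        NOT NULL,
    payload       JSONB       NOT NULL,
    error_reason  TEXT,
    retry_count   INT         NOT NULL DEFAULT 0,
    status        TEXT        NOT NULL DEFAULT 'PENDING',
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS dlq_events_status_idx
    ON dlq_events (status, created_at);
"""

# Rows already locked by another instance are skipped instead of waited on,
# so concurrent workers each claim a disjoint batch.
CLAIM_SQL = """
SELECT id, topic, payload, retry_count
FROM dlq_events
WHERE status = 'PENDING' AND retry_count < %s
ORDER BY created_at
LIMIT %s
FOR UPDATE SKIP LOCKED
"""

# retry_count is incremented on every attempt, successful or not.
MARK_SQL = (
    "UPDATE dlq_events "
    "SET status = %s, retry_count = retry_count + 1 "
    "WHERE id = %s"
)


def retry_batch(conn, handler, batch_size=50, max_retries=5):
    """Claim one batch of pending events and reprocess each with `handler`.

    `conn` is assumed to be a DB-API connection with autocommit off, so the
    row locks taken by CLAIM_SQL are held until commit or rollback.
    Returns a (succeeded, failed) tuple.
    """
    succeeded = failed = 0
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL, (max_retries, batch_size))
        rows = cur.fetchall()
        for row in rows:
            event_id = row[0]
            try:
                handler(row)
                cur.execute(MARK_SQL, ("RESOLVED", event_id))
                succeeded += 1
            except Exception:
                # Leave the event pending so a later pass can retry it,
                # until retry_count reaches max_retries.
                cur.execute(MARK_SQL, ("PENDING", event_id))
                failed += 1
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
    return succeeded, failed
```

In this sketch a scheduler would call `retry_batch(conn, handler)` periodically on every instance. Events that reach `max_retries` simply stop being selected; a real system would likely move them to a terminal status for manual inspection.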

6m read time
From diljitpr.net
Table of contents
DLQ Table Schema and Indexing Strategy
DLQ Retry Mechanism with ShedLock
Operational Benefits
My Thoughts
