State drift occurs when two components synchronize state via deltas and the observer's state diverges from the producer's because of lost messages or failures. The core problem is that, without a self-correction mechanism, a single lost delta can permanently corrupt the observer's state. The recommended solution is to apply delta updates opportunistically while periodically performing a full state reconciliation, similar to how Git uses Merkle trees, rsync compares directory listings, or video codecs use keyframes. Strong delivery guarantees (e.g., Kafka's durable commit log) are a complementary strategy. Webhooks are criticized for lacking redelivery guarantees, requiring inbound HTTP servers, and making debugging difficult, though a durable queue proxy service is proposed as a mitigation.
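
The "opportunistic deltas plus periodic reconciliation" idea can be sketched in a few lines of Python. This is a minimal illustration, not the original post's code: the `Producer`, `Observer`, `digest`, and `simulate` names are hypothetical, the lost delta is simulated by skipping one update, and the observer reads the producer's state directly where a real system would make a remote call.

```python
import hashlib
import json


def digest(state: dict) -> str:
    """Stable hash of a state dict, used to detect drift cheaply."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


class Producer:
    def __init__(self):
        self.state = {}

    def set(self, key, value):
        """Update state and return the delta describing the change."""
        self.state[key] = value
        return {"key": key, "value": value}

    def snapshot(self):
        """Full copy of the current state (the 'keyframe')."""
        return dict(self.state)


class Observer:
    def __init__(self):
        self.state = {}

    def apply_delta(self, delta):
        """Apply one delta; nothing here can detect a missed delta."""
        self.state[delta["key"]] = delta["value"]

    def reconcile(self, producer):
        """Compare digests; on mismatch, replace local state with a snapshot.

        Returns True if drift was detected and repaired.
        """
        if digest(self.state) == digest(producer.state):
            return False
        self.state = producer.snapshot()
        return True


def simulate():
    producer, observer = Producer(), Observer()
    for i in range(10):
        delta = producer.set(f"k{i}", i)
        if i != 3:  # assumption for the demo: delta number 3 is lost in transit
            observer.apply_delta(delta)
    drifted_before = observer.state != producer.state
    repaired = observer.reconcile(producer)
    in_sync_after = observer.state == producer.state
    return drifted_before, repaired, in_sync_after
```

Calling `simulate()` loses one delta, so the observer drifts until `reconcile` runs. Comparing digests first keeps the common in-sync case cheap, and the full snapshot is only transferred when the digests differ.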

6m read time. From erikbern.com
Table of contents
What is state drift?
Why can’t you just write code with 100% uptime? :trollface:
How to solve it?
Speaking of webhooks…
