A personal story about a streaming subscription that kept cancelling itself minutes after being reactivated. Through black-box debugging, the author identified a sync-vs-async race condition: the bank's account unlinking was processed asynchronously, so when the user quickly re-linked the accounts, the delayed async cancellation event arrived after the new subscription was established, deactivating it. Waiting overnight before re-linking allowed the async workflow to complete first, resolving the issue. The post reflects on the complexity of cross-organizational systems and the engineering effort required to make them work reliably.
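The race described above can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions: `StreamingService`, `enqueue_cancellation`, `activate`, `drain`, and `relink` are hypothetical names, and the model assumes the delayed cancellation event is not tied to a specific subscription instance, so it deactivates whatever subscription is current when it arrives. The real systems involved are a black box, and the post itself only says what "probably" happened.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingService:
    """Hypothetical stand-in for the streaming provider's side."""
    active: bool = True
    pending: list = field(default_factory=list)  # delayed events from the bank

    def enqueue_cancellation(self) -> None:
        # Assumption: the bank's unlink is delivered later, not immediately.
        self.pending.append("cancel")

    def activate(self) -> None:
        # Assumption: re-linking takes effect synchronously.
        self.active = True

    def drain(self) -> None:
        # Process every delayed event that has arrived so far.
        while self.pending:
            if self.pending.pop(0) == "cancel":
                self.active = False


def relink(wait_for_async: bool) -> bool:
    """Return whether the subscription is still active after re-linking."""
    svc = StreamingService()
    svc.enqueue_cancellation()  # user unlinks the accounts at the bank
    if wait_for_async:
        svc.drain()             # e.g. waiting overnight lets the event land first
    svc.activate()              # user re-links the accounts
    svc.drain()                 # any late cancellation is applied now
    return svc.active
```

In this model, `relink(wait_for_async=False)` ends with the subscription deactivated, because the stale cancellation is applied after the new activation, while `relink(wait_for_async=True)` leaves it active.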

11m read time
From predr.ag
Table of contents
Easy to fix, surely?
"No issues on our end"
Support ping-pong and debugging steps for someone else
Black-box debugging to fall asleep
What (probably) happened
Systems are hard
