Adding a queue to handle traffic spikes doesn't solve capacity problems; it defers them. Using concrete numbers, this post shows how a 2x traffic spike lasting just one hour can create 3.6 million queued requests that never drain unless capacity is added. Server-side latency metrics look healthy while clients wait hours, so dashboards actively mislead engineers. The post compares FIFO, random, and weighted quartile queue selection strategies, showing that reshaping the latency distribution is a zero-sum game. The only real fix is adding capacity: 50% excess capacity drains a spike-induced queue in about an hour. The conclusion is blunt: queues make capacity problems invisible, not solved.
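
The backlog figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the post's own calculation: the 1,000 requests-per-second capacity is an assumption chosen because it yields the 3.6 million figure, and the function names are hypothetical.

```python
def backlog_after_spike(capacity_rps, spike_rps, spike_seconds):
    """Requests left in the queue at the moment the spike ends."""
    # Only arrivals above capacity accumulate in the queue.
    excess_rps = max(spike_rps - capacity_rps, 0)
    return excess_rps * spike_seconds


def fifo_wait_seconds(backlog, capacity_rps):
    """Seconds the newest queued request waits under FIFO before service starts."""
    return backlog / capacity_rps


# Assumed capacity (not stated in the summary): 1,000 requests per second.
CAPACITY_RPS = 1_000

# A 2x spike lasting one hour (3,600 seconds).
backlog = backlog_after_spike(CAPACITY_RPS, 2 * CAPACITY_RPS, 3_600)
wait = fifo_wait_seconds(backlog, CAPACITY_RPS)

print(f"backlog: {backlog:,} requests")
print(f"FIFO wait for the newest request: {wait / 60:.0f} minutes")
```

Under these assumptions, if traffic returns to exactly the service capacity after the spike, arrivals and completions cancel out and the backlog stays at the same size indefinitely, which is why the queue never drains without extra capacity.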

10m read time. From pushtoprod.substack.com
Table of contents
When Things Are Slow, Look for Queues
A Few Numbers
Perceived Latency and Queue Sizes
Queuing and Surrealism
Alternative Approaches to Queue Processing
Queue Selection Approaches In The Wild
The Road to Recovery
Conclusion
