Queue backlogs in distributed systems can be managed with a small set of practical formulas rather than guesswork. The core insight is that a system provisioned exactly for steady-state traffic has zero surplus capacity and will never drain a backlog on its own. Key formulas covered include drain time (backlog size / surplus capacity), the headroom formula for sizing consumer fleets against a recovery time objective (RTO), and auto-scaling triggers based on queue growth rate rather than depth alone. The article also explains three critical failure modes: stale message degradation, which calls for applying a degradation factor to drain estimates; retry amplification, which can lead to metastable failure states where recovery generates more load than it resolves; and cascading bottlenecks in multi-stage pipelines, where scaling the wrong stage provides zero benefit. Load shedding via TTL-based message expiry is presented as a cost-effective alternative to over-provisioning. Post-incident measurement of the degradation factor, retry amplification, and actual drain time is recommended to calibrate future estimates.
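As a rough illustration of the drain-time formula mentioned above, here is a minimal Python sketch. The function and parameter names are hypothetical. Modeling the degradation factor as a multiplier between 0 and 1 on the processing rate is an assumption, not necessarily how the article defines it. The capacity function is simply the drain-time formula rearranged for a target RTO, and it may differ in form from the article's headroom formula.

```python
import math


def drain_time_seconds(backlog, arrival_rate, processing_rate, degradation=1.0):
    """Estimate drain time as backlog size / surplus capacity.

    degradation is an assumed multiplier (0 < degradation <= 1) on the
    processing rate, standing in for slower handling of stale messages.
    Returns math.inf when there is no surplus capacity.
    """
    surplus = processing_rate * degradation - arrival_rate
    if surplus <= 0:
        return math.inf  # provisioned at or below steady state: never drains
    return backlog / surplus


def required_processing_rate(backlog, arrival_rate, rto_seconds, degradation=1.0):
    """Rearrange the drain-time formula to find the rate needed to drain within an RTO."""
    return (arrival_rate + backlog / rto_seconds) / degradation


# A fleet sized exactly for steady-state traffic never drains a backlog.
print(drain_time_seconds(60_000, 1_000, 1_000))        # inf
# With 200 msg/s of surplus, 60,000 queued messages drain in 300 seconds.
print(drain_time_seconds(60_000, 1_000, 1_200))        # 300.0
# Rate needed to clear the same backlog within a 300-second RTO.
print(required_processing_rate(60_000, 1_000, 300))    # 1200.0
```

The first call shows the summary's central point: with zero surplus the estimate is infinite, however small the backlog.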
Table of contents
Introduction
The Three Numbers That Matter
Little's Law: The One Formula Everyone Should Know
How Backlogs Form and Drain
The Complications That Actually Matter
Cascading Backlogs in Multi-Stage Pipelines
When to Shed Load Instead of Draining
Capacity Planning: Turning Formulas Into Decisions
Caveat: Unprocessable Messages and Dead-Letter Queues
What to Measure and Record
Conclusion
About the Author