Teams running Kubernetes can easily identify overprovisioned workloads, yet services managed by the Horizontal Pod Autoscaler (HPA) rarely get optimized. The core reason is that resource requests and HPA scaling behavior are tightly coupled: utilization targets are measured relative to requests, so changing requests shifts when and how aggressively the autoscaler responds, which makes the change feel riskier than a code deploy. Engineers who get paged during incidents are the same ones who set those resource values, so the personal accountability calculus favors leaving waste in place over introducing unpredictability. Standard rightsizing loops break here because they treat rightsizing as a math problem rather than a resilience problem. The right approach requires adjusting requests and HPA targets atomically, providing transparent reasoning, respecting existing SLOs, offering fast rollback, and building trust incrementally before moving to automation.
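To make the coupling concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, applied to CPU utilization measured as a percentage of the request. The function names and all the numbers are illustrative assumptions, not values from a real cluster, and the sketch deliberately ignores the HPA's tolerance band, stabilization windows, and pod readiness handling.

```python
def desired_replicas(current_replicas: int, usage_m: int,
                     request_m: int, target_pct: int) -> int:
    """Simplified HPA replica calculation for a CPU utilization target.

    usage_m and request_m are per-pod CPU in millicores; target_pct is the
    HPA target utilization as a percentage of the request.
    Integer ceiling division avoids floating-point rounding surprises.
    """
    numerator = current_replicas * usage_m * 100
    denominator = request_m * target_pct
    return -(-numerator // denominator)  # ceil(numerator / denominator)


def equivalent_target(old_target_pct: int, old_request_m: int,
                      new_request_m: int) -> int:
    """Hypothetical helper: the target that keeps the same absolute
    per-pod usage threshold after the request changes."""
    return round(old_target_pct * old_request_m / new_request_m)


# Assumed workload: 10 pods, each using 300m CPU, requesting 1000m, target 60%.
before = desired_replicas(10, usage_m=300, request_m=1000, target_pct=60)

# Rightsize the request to 400m but leave the HPA target untouched.
requests_only = desired_replicas(10, usage_m=300, request_m=400, target_pct=60)

# Adjust the target together with the request.
new_target = equivalent_target(60, old_request_m=1000, new_request_m=400)
atomic = desired_replicas(10, usage_m=300, request_m=400, target_pct=new_target)

print(before, requests_only, new_target, atomic)
```

Under these assumed numbers, lowering the request alone turns a steady-state scale-down recommendation into a scale-up, with no change in actual load. Moving the target in the same change keeps the absolute per-pod threshold (600m here) where it was, which is one way to read "adjusting requests and HPA targets atomically".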
Table of contents
The problem isn’t finding the waste
What teams are actually protecting
Why standard rightsizing stops here
What would need to be true for teams to act