HPA-managed workloads: Why the obvious waste stays

Teams running Kubernetes can easily identify overprovisioned workloads, yet HPA-managed services rarely get optimized. The core reason is that resource requests and HPA scaling behavior are tightly coupled: when the HPA targets resource utilization, that utilization is measured against the requests, so changing requests shifts when and how aggressively the autoscaler responds. That makes a request change feel riskier than a code deploy. The engineers who get paged during incidents are the same ones who set those resource values, so the personal accountability calculus favors leaving waste in place over introducing unpredictability. Standard rightsizing loops break down here because they treat the change as a math problem rather than a resilience problem. The right approach requires adjusting requests and HPA targets atomically, providing transparent reasoning, respecting existing SLOs, offering fast rollback, and building trust incrementally before moving to automation.
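The coupling described above can be made concrete with a little arithmetic. The following is a minimal sketch, not the autoscaler's actual code: it uses the replica formula from the Kubernetes HPA documentation (desired = ceil(current replicas × current utilization / target utilization), with the default 10% tolerance) and ignores min/max replica bounds, stabilization windows, and pod readiness handling. The function names and the workload numbers are hypothetical, chosen only for illustration.

```python
import math


def utilization_pct(usage_m: float, request_m: float) -> float:
    """CPU utilization as the HPA sees it: usage relative to the request."""
    return 100.0 * usage_m / request_m


def desired_replicas(current_replicas: int, current_util: float,
                     target_util: float, tolerance: float = 0.1) -> int:
    """Documented HPA formula; no change if the ratio is within tolerance."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def rescaled_target(old_target_pct: float, old_request_m: float,
                    new_request_m: float) -> float:
    """Target that keeps the same absolute per-pod usage threshold.

    Assumption for this sketch: preserving old_target * old_request
    (millicores per pod at which scaling kicks in) is the goal.
    """
    return old_target_pct * old_request_m / new_request_m


# Hypothetical workload: 4 pods, each using 600m CPU, requesting 1000m,
# with an HPA target of 70% average utilization.
usage_m, old_request_m, new_request_m = 600.0, 1000.0, 700.0
replicas, target = 4, 70.0

before = desired_replicas(replicas, utilization_pct(usage_m, old_request_m), target)

# Rightsize the request alone: same traffic, but utilization jumps to ~86%.
request_only = desired_replicas(
    replicas, utilization_pct(usage_m, new_request_m), target)

# Rightsize the request and move the target in the same change.
new_target = rescaled_target(target, old_request_m, new_request_m)
atomic = desired_replicas(
    replicas, utilization_pct(usage_m, new_request_m), new_target)

print(before, request_only, atomic, new_target)
```

With these numbers, lowering the request alone pushes the HPA to add a pod under unchanged traffic, while changing the request and the target together leaves the replica count where it was. The rescaled target here comes out at 100%, which is valid because utilization is relative to requests rather than limits; whether that leaves enough headroom for a given SLO is exactly the resilience question the article raises, and this sketch does not answer it.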

#kubernetes

#finops

Apr 12•5m read time•From thenewstack.io

Table of contents

The problem isn’t finding the waste
What teams are actually protecting
Why standard rightsizing stops here
What would need to be true for teams to act
