The 2026 State of Kubernetes Optimization Report reveals worsening resource utilization across tens of thousands of production clusters: CPU utilization dropped to 8% (from 10%), memory to 20% (from 23%), and GPU sits at just 5%. CPU overprovisioning jumped from 40% to 69% year-over-year. The report argues overprovisioning is structural — driven by padded resource requests, Helm chart conservatism, and lack of post-deployment review — and that it doesn't actually improve reliability. One case study showed OOM kills dropped to near zero after automated rightsizing cut provisioned CPUs by half. GPU economics are highlighted as especially problematic given rising prices (AWS raised H200 Capacity Block prices 15% in Jan 2026). GPU sharing via time-slicing and Spot instances can yield 70%+ savings. The key differentiator for organizations that closed the efficiency gap was continuous automated rightsizing rather than one-time or manual approaches.
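To make the rightsizing idea concrete, here is a minimal Python sketch of how a request could be recomputed from observed usage. It is not the report's method: the function names, the 95th-percentile target, the 20% headroom, and the sample data are all illustrative assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def utilization(used_cores, requested_cores):
    """Fraction of the requested CPU that is actually used."""
    return used_cores / requested_cores


def recommend_cpu_request(usage_samples, headroom=0.2, pct=95):
    """Hypothetical policy: chosen usage percentile plus a safety margin."""
    return round(percentile(usage_samples, pct) * (1 + headroom), 3)


# Made-up usage samples (in cores) for a container that requests 1 full core.
usage = [0.05, 0.06, 0.08, 0.07, 0.10, 0.09, 0.06, 0.08, 0.07, 0.12]
requested = 1.0

mean_used = sum(usage) / len(usage)
print(f"utilization: {utilization(mean_used, requested):.1%}")
print(f"recommended request: {recommend_cpu_request(usage)} cores")
```

Running a calculation like this on a schedule, rather than once at deployment, is the difference between continuous and one-time rightsizing that the summary points to.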

5m read time. From cast.ai
Table of contents
The overprovisioning problem is structural, not accidental
GPUs deserve a separate conversation
GPU sharing is well understood, and almost nobody uses it
What the organizations that closed the gap did differently
