Why is your Kubernetes cluster adding nodes when the dashboards look fine?

Kubernetes clusters often add nodes even when CPU and memory dashboards show available headroom. The root cause is typically stale resource requests: values set with safety buffers and never revisited as workloads evolved. Because the scheduler places pods based on declared requests rather than actual usage, inflated requests make nodes appear full before real utilization gets there, and the autoscaler adds capacity that is not needed. The fix is to compare requests with observed usage over time, identify the few namespaces that dominate capacity, and gradually bring requests back in line with reality. Inference workloads amplify the problem because their replica counts scale quickly, multiplying every inflated request. Practical steps include checking for persistent gaps between requests and usage, using cost allocation views to identify the biggest offenders, and rolling out changes incrementally to maintain trust and avoid pager noise.
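As a rough illustration of the request-versus-usage comparison described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `PodSample` type, the `request_gap_by_namespace` helper, the namespace names, and the millicore figures are hypothetical and do not come from the article. In practice the inputs would come from whatever metrics source the cluster already has.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PodSample:
    """Hypothetical per-pod record: declared CPU request vs. observed p95 usage."""
    namespace: str
    cpu_request_millicores: int
    cpu_usage_p95_millicores: int


def request_gap_by_namespace(samples):
    """Aggregate requests and usage per namespace.

    Returns tuples of (namespace, requested, used, gap, used/requested ratio),
    sorted so the namespace with the most requested-but-unused CPU comes first.
    """
    requested = defaultdict(int)
    used = defaultdict(int)
    for s in samples:
        requested[s.namespace] += s.cpu_request_millicores
        used[s.namespace] += s.cpu_usage_p95_millicores

    rows = []
    for ns, req in requested.items():
        gap = req - used[ns]
        ratio = used[ns] / req if req else 0.0
        rows.append((ns, req, used[ns], gap, ratio))

    # Largest absolute gap first: these namespaces hold the most idle capacity.
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows


# Made-up sample data, for illustration only.
SAMPLES = [
    PodSample("inference", 4000, 900),
    PodSample("inference", 4000, 1100),
    PodSample("batch", 2000, 1800),
    PodSample("web", 1000, 400),
]

if __name__ == "__main__":
    for ns, req, use, gap, ratio in request_gap_by_namespace(SAMPLES):
        print(f"{ns}: requested={req}m used={use}m gap={gap}m ratio={ratio:.2f}")
```

Sorting by absolute gap rather than by ratio reflects the point that a few namespaces usually dominate capacity: a small namespace with a poor ratio matters less to the autoscaler than a large one holding thousands of idle millicores.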

#kubernetes

Mar 08 • 6m read time • From thenewstack.io

Table of contents

The confusing part is that metrics look fine
This shows up in well-run clusters too
The first question to ask when scaling feels wrong
How to confirm drift without turning it into a project
Where cost allocation helps without turning this into a billing conversation
Getting unstuck is usually smaller than it feels
What changes when drift is under control
