A detailed guide to the key Prometheus-formatted metrics for monitoring Karpenter, the Kubernetes autoscaler. Covers five categories: scheduling and pod lifecycle metrics (startup duration, queue depth), disruption and consolidation metrics (eligible nodes, termination duration, NodeClaim counters), cloud provider metrics (errors and API latency), controller internals (reconcile time, work queue depth), and cost/interruption metrics (instance pricing estimates, Spot interruption events). For each metric, the guide explains what it measures, what abnormal values indicate, and how to correlate metrics to diagnose root causes like cloud API throttling, PodDisruptionBudget contention, or misconfigured NodePools.
Track Karpenter metrics to monitor performance
Gain visibility into your just-in-time provisioning