Linux pressure stall information (PSI) can be enabled in Red Hat OpenShift 4.21 via a MachineConfig object, exposing CPU, memory, and I/O stall metrics at the node, pod, and container level. Unlike traditional utilization metrics, PSI reveals actual resource contention and stalled work. Performance testing with 500+ containers shows no measurable impact on kubelet CPU or memory, but Prometheus pod RSS increases by up to 1.3 GB (42% above baseline) due to higher metric cardinality. Each pod generates 18 PSI metric series (3 container types × 6 metric types), including pause and POD cgroup containers. Metric relabeling to drop non-application container PSI series could reduce Prometheus memory, but this configuration is currently unsupported in OpenShift's built-in cluster monitoring operator. Recommendations include pre-allocating an additional 1.4 GB RSS per Prometheus pod before enabling PSI in production.
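
The cardinality arithmetic above can be sketched in a few lines of Python. This is a rough illustration, not a sizing tool: the function names and the 500-pod figure in the usage lines are hypothetical, and the "after relabeling" case assumes only one container type per pod (the application container) would be kept.

```python
# Rough estimate of PSI series added to Prometheus, using the per-pod
# figures quoted in the summary (3 container types x 6 metric types).
# Function names and the example pod count are hypothetical.

CONTAINER_TYPES_PER_POD = 3  # includes pause and POD cgroup containers
PSI_METRIC_TYPES = 6


def psi_series_per_pod(container_types=CONTAINER_TYPES_PER_POD,
                       metric_types=PSI_METRIC_TYPES):
    """Number of PSI series one pod contributes."""
    return container_types * metric_types


def psi_series_total(pods, container_types=CONTAINER_TYPES_PER_POD,
                     metric_types=PSI_METRIC_TYPES):
    """Number of PSI series for a given pod count."""
    return pods * psi_series_per_pod(container_types, metric_types)


if __name__ == "__main__":
    pods = 500  # hypothetical cluster size
    print("series per pod:", psi_series_per_pod())
    print("series, all container types:", psi_series_total(pods))
    # Assumption: relabeling keeps only the application container's series.
    print("series, application only:", psi_series_total(pods, container_types=1))
```

Under these assumptions, dropping the non-application container series would cut the PSI series count to one third, which is why relabeling is attractive where it is supported.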

Table of contents
Vocabulary for PSI
Test Methodology
Prometheus memory impact
Kubelet CPU and memory
Understanding PSI metric cardinality
Reducing Prometheus resource usage
Test conclusion
Learn more