Unlocking efficiency: A guide to operator cache configuration on Red Hat OpenShift and Kubernetes

When building Kubernetes operators with controller-runtime, the default caching behavior can cause unexpected memory spikes and OOM errors in large production clusters. By default, controller-runtime caches every resource of a watched type cluster-wide, not just the subset the operator cares about, because filtering happens after the objects are already in the cache. There are two main problem areas: resources watched through the controller configuration, and resources fetched during reconciliation through client Get/List calls, which can create informers implicitly. Solutions include disabling the cache for specific resource types via CacheOptions.DisableFor, using the dynamic client from client-go to bypass the cache entirely, and configuring cache.ByObject with label, field, or namespace selectors to limit which resources are loaded into memory. The ReaderFailOnMissingInformer option can prevent accidental informer creation. To audit for implicit informers, scan the operator logs for 'Starting EventSource' messages.

#kubernetes

#openshift

Mar 31 • 11m read time • From developers.redhat.com

Table of contents

Why is the cost there
The role of the controller-runtime
How it works
Startups and running compared
How the problem can be solved
Resources watched by the controller
When not to filter the cache
Identifying the scale of the problem
Conclusion
