When building Kubernetes operators with controller-runtime, the default caching behavior can cause unexpected memory spikes and OOM errors in large production clusters. By default, controller-runtime caches every resource of a watched type across the whole cluster, not just the filtered subset, because filtering is applied after the objects are already in the cache. There are two main problem areas: resources watched via the controller configuration, and resources fetched during reconciliation via client Get/List calls. Solutions include disabling the cache for specific resource types via CacheOptions.DisableFor, using the dynamic client from client-go to bypass the cache entirely, and configuring cache.ByObject with label, field, or namespace selectors to limit which resources are loaded into memory. The ReaderFailOnMissingInformer option can prevent accidental informer creation. Implicit informers can be audited by scanning logs for 'Starting EventSource' messages.
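The three cache-related options named above can be combined in a single manager configuration. The following is a minimal sketch, not a definitive setup: it assumes controller-runtime v0.17 or later (where ReaderFailOnMissingInformer is available), and the choice of Pods and Secrets as well as the label key and value are hypothetical placeholders.

```go
package main

import (
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// managerOptions returns manager options that limit what ends up in the cache.
// The resource types and the label below are illustrative assumptions.
func managerOptions() ctrl.Options {
	return ctrl.Options{
		Cache: cache.Options{
			// Return an error from cached Get/List calls for types that
			// have no informer yet, instead of silently creating one.
			ReaderFailOnMissingInformer: true,
			// Only load Pods carrying a (hypothetical) label into memory.
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Pod{}: {
					Label: labels.SelectorFromSet(labels.Set{
						"app.kubernetes.io/managed-by": "my-operator",
					}),
				},
			},
		},
		Client: client.Options{
			Cache: &client.CacheOptions{
				// Read Secrets directly from the API server, never from the cache.
				DisableFor: []client.Object{&corev1.Secret{}},
			},
		},
	}
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), managerOptions())
	if err != nil {
		log.Fatalf("creating manager: %v", err)
	}
	// Controllers would be registered here before starting the manager.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		log.Fatalf("running manager: %v", err)
	}
}
```

With ReaderFailOnMissingInformer enabled, any type read through the cached client needs either a watch, an informer requested up front, or an entry in DisableFor, so an accidental informer surfaces as an error rather than as memory growth.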

Table of contents
Why is the cost there
The role of the controller-runtime
How it works
Startups and running compared
How the problem can be solved
Resources watched by the controller
When not to filter the cache
Identifying the scale of the problem
Conclusion