Grab engineers improved image cache management in their Android app by replacing a standard LRU cache with a Time-Aware LRU (TLRU) cache. The existing 100 MB LRU cache had two failure modes: for heavy users it filled up quickly, which degraded performance, and for light users it retained images for months, which wasted storage. TLRU adds three parameters: a TTL for expiration, a minimum cache size threshold to retain essential images, and a maximum cache size cap. Rather than building from scratch, they forked Glide and extended its DiskLruCache, preserving its crash recovery, thread safety, and performance optimizations. Key engineering challenges included persisting last-access timestamps across restarts, implementing time-based eviction on each cache access, and migrating existing LRU caches by assigning a uniform migration timestamp to existing entries. Bidirectional compatibility was maintained so that rollbacks remain safe. The result: 95% of users saw a 50 MB reduction in cache size, reclaiming terabytes of storage fleet-wide while keeping cache hit ratio degradation within a 3 percentage point threshold and avoiding increased server costs.
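
To make the interplay of the three parameters concrete, here is a minimal in-memory sketch in Python. It is not Grab's implementation and does not use Glide's DiskLruCache API. The class name `TLRUCache`, the injected `clock` function, and the exact eviction rule (expired entries are dropped only while the cache is above the minimum size, and the least recently used entries are dropped while it is above the maximum) are assumptions made for illustration. Sizes are in bytes and times in seconds.

```python
from collections import OrderedDict


class TLRUCache:
    """Hypothetical toy TLRU cache; tracks entry sizes only, not image data."""

    def __init__(self, ttl, min_size, max_size, clock):
        self.ttl = ttl                # seconds an entry may go unaccessed
        self.min_size = min_size      # at or below this size, expired entries are kept
        self.max_size = max_size      # hard cap on total size
        self.clock = clock            # injected so the example is deterministic
        self.entries = OrderedDict()  # key -> (size, last_access), oldest first
        self.total = 0

    def put(self, key, size):
        # Replace any previous entry, then insert as most recently used.
        if key in self.entries:
            self.total -= self.entries.pop(key)[0]
        self.entries[key] = (size, self.clock())
        self.total += size
        self._evict()

    def get(self, key):
        # Time-based eviction runs on every access (assumed behaviour).
        self._evict()
        if key not in self.entries:
            return False
        size, _ = self.entries.pop(key)
        self.entries[key] = (size, self.clock())  # refresh last-access time
        return True

    def _evict(self):
        now = self.clock()
        while self.entries:
            # The first item is the least recently used one.
            key, (size, last_access) = next(iter(self.entries.items()))
            over_cap = self.total > self.max_size
            expired = now - last_access > self.ttl and self.total > self.min_size
            if not (over_cap or expired):
                break
            del self.entries[key]
            self.total -= size


now = [0]
cache = TLRUCache(ttl=10, min_size=50, max_size=100, clock=lambda: now[0])
cache.put("a", 40)
cache.put("b", 40)
cache.put("c", 40)        # total would be 120, so "a" is evicted by the size cap
now[0] = 20               # both remaining entries are now older than the TTL
hit_b = cache.get("b")    # "b" is expired and the cache is above min_size
hit_c = cache.get("c")    # "c" is expired too, but the cache is now below min_size
```

In this run the size cap removes `a`, the TTL removes `b`, and `c` survives despite being expired because the cache has already shrunk below the minimum threshold. A disk-backed version would also need to persist each entry's last-access time across restarts, which this sketch leaves out.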