Netflix has improved reliability by extending prioritized load shedding from the API gateway to individual services, specifically in the video streaming control plane. Using partitioned concurrency limiters, a service prioritizes critical user-initiated requests over non-critical pre-fetch requests, so pre-fetch traffic is shed first when the service is overloaded. The strategy proved effective during high-traffic incidents, maintaining high availability for user-initiated requests. Netflix also developed an internal library for prioritized load shedding based on predefined priority buckets, and incorporated CPU- and IO-based load shedding to maintain system performance under stress.
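
To make the idea of a partitioned concurrency limiter concrete, here is a minimal Python sketch. It is not Netflix's implementation or its internal library. The names `Priority`, `PartitionedLimiter`, `total_limit`, and `prefetch_limit` are hypothetical, and the admission rule is a simplifying assumption: pre-fetch requests are capped at a small share of the in-flight slots, while user-initiated requests may use any remaining capacity.

```python
import threading
from enum import Enum


class Priority(Enum):
    USER_INITIATED = 1
    PREFETCH = 2


class PartitionedLimiter:
    """Illustrative limiter (hypothetical, not Netflix's library).

    Pre-fetch traffic is capped at `prefetch_limit` in-flight requests;
    user-initiated traffic may use any free slot up to `total_limit`.
    """

    def __init__(self, total_limit: int, prefetch_limit: int):
        self._lock = threading.Lock()
        self._total_limit = total_limit
        self._prefetch_limit = prefetch_limit
        self._inflight = {Priority.USER_INITIATED: 0, Priority.PREFETCH: 0}

    def try_acquire(self, priority: Priority) -> bool:
        """Return True if the request is admitted, False if it is shed."""
        with self._lock:
            total = sum(self._inflight.values())
            if total >= self._total_limit:
                return False  # service is saturated: shed everything
            if (priority is Priority.PREFETCH
                    and self._inflight[Priority.PREFETCH] >= self._prefetch_limit):
                return False  # pre-fetch partition is full: shed pre-fetch only
            self._inflight[priority] += 1
            return True

    def release(self, priority: Priority) -> None:
        """Free a slot once an admitted request has finished."""
        with self._lock:
            if self._inflight[priority] > 0:
                self._inflight[priority] -= 1
```

In this sketch a handler would call `try_acquire` before doing any work, return an overload response when it yields `False`, and call `release` in a `finally` block after an admitted request completes. A production limiter would typically also adjust `total_limit` dynamically from observed latency or utilization, which is omitted here.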