Netflix's Compute and Performance Engineering teams are tackling the 'noisy neighbor' problem in their multi-tenant environment by using eBPF for continuous, low-overhead instrumentation of the Linux scheduler. The approach supports self-serve monitoring that identifies resource-heavy containers degrading the performance of their neighbors. The key signal is run queue latency, the time a runnable task waits before it is given a CPU, which they measure through eBPF hooks such as sched_wakeup and sched_switch. The gathered data helps refine CPU isolation strategies and improve infrastructure observability. The initiative has also inspired tools such as bpftop for optimizing eBPF code.
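
To make the run queue latency idea concrete, here is a minimal user-space sketch in Python. It is not Netflix's eBPF program: it only simulates the bookkeeping that pairing a wakeup event with the later switch-in event implies. The function name `runq_latencies`, the event tuple layout, and the `container` label are all assumptions made for illustration.

```python
from collections import defaultdict


def runq_latencies(events):
    """Pair each wakeup with the next switch-in for the same task.

    `events` is an iterable of (timestamp_ns, kind, pid, container) tuples,
    a hypothetical layout. `kind` is "wakeup" (task became runnable) or
    "switch" (task was given a CPU). Returns {container: [latency_ns, ...]}.
    """
    enqueued_at = {}                 # pid -> timestamp of the latest wakeup
    latencies = defaultdict(list)    # container -> observed waits in ns

    for ts, kind, pid, container in events:
        if kind == "wakeup":
            enqueued_at[pid] = ts
        elif kind == "switch":
            start = enqueued_at.pop(pid, None)
            if start is not None:    # ignore switch-ins with no recorded wakeup
                latencies[container].append(ts - start)

    return dict(latencies)


# Illustrative, made-up events: pid 2 waits far longer than pid 1.
events = [
    (100, "wakeup", 1, "a"),
    (150, "switch", 1, "a"),
    (200, "wakeup", 2, "b"),
    (900, "switch", 2, "b"),
    (1000, "switch", 3, "b"),  # no matching wakeup, so it is skipped
]
result = runq_latencies(events)
```

Grouping the waits per container is what lets a long wait be attributed to a specific tenant; in a real deployment this pairing would run inside the kernel hooks rather than over a recorded event list.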