Running Kubernetes across multiple availability zones improves resilience but can silently inflate cloud bills through unnecessary cross-zone traffic. This post explains how to use Cilium to keep traffic local by default while preserving multi-AZ failover. The techniques covered are: Kubernetes trafficDistribution with Cilium's kube-proxy replacement to prefer same-zone service backends, Local Redirect Policy to keep high-volume platform traffic (DNS, metrics, logs) on the same node, zone-aligned Egress Gateway to stop egress traffic from hairpinning across zones, and Bandwidth Manager to prevent retry storms under load. Hubble serves as the observability layer for validating that the locality controls work as intended. Operational guidance covers phased rollout, handling zone skew, testing fallback paths, and explicitly marking exceptions such as shared data stores.

12m read time, from isovalent.com
Table of contents
Designing for locality in a multi AZ cluster
The hidden cost drivers in multi AZ clusters
The Cilium feature set that enables cost conscious multi AZ
Operational guidance
Wrap up
