A real-world walkthrough of diagnosing and fixing a Go memory leak that caused OOM crashes in a Kafka consumer service. It covers Go memory fundamentals (stack vs. heap, the tricolor garbage-collection algorithm, TCMalloc-based arenas), then details two concrete steps: using Uber's automaxprocs to set GOMAXPROCS correctly in Kubernetes containers, and using pprof heap profiling to identify two root causes. The first, repeated time.LoadLocation disk reads, was fixed with sync.Once; the second, a memory leak in the Confluent Kafka Go library, was resolved by switching to the Segmentio kafka-go package, reducing memory usage from hundreds of MB to a stable 25 MB.