Best of Apache Kafka — March 2026

1
Article
Confluent Blog·12w
Real-Time Decisioning and Autonomous Data Systems
Real-time decisioning and autonomous data systems represent a shift from passive reporting to active, automated action on live data. Real-time decisioning follows a signal-logic-action loop that triggers immediate responses without human intervention. Autonomous data systems extend this with closed feedback loops that let systems self-correct and adapt over time. The post outlines four architectural components: event ingestion, stateful enrichment, a decision engine (rules or ML models), and action connectors with feedback capture. Concrete use cases include anomaly-triggered authentication, dynamic supply-demand balancing, predictive failure prevention, and intelligent event routing. Safeguards covered include kill switches, schema validation, audit logs, and monitoring for model drift. A phased adoption path is recommended: start by identifying slow decisions, then incrementally introduce event streaming (Kafka), enrichment, rule automation, and eventually AI models.
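The signal-logic-action loop and its safeguards can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the post: the `DecisionLoop` class, the event fields (`user_id`, `device`, `failed_logins`), and the MFA rule are assumptions chosen to mirror the anomaly-triggered authentication use case. In a real deployment the events would arrive from a Kafka topic rather than a function call.

```python
from dataclasses import dataclass, field
from typing import Callable

REQUIRED_FIELDS = ("user_id", "failed_logins")  # assumed event schema


@dataclass
class DecisionLoop:
    enrich: Callable[[dict], dict]     # stateful enrichment step
    decide: Callable[[dict], str]      # decision engine (a rule here, could be a model)
    act: Callable[[str, dict], None]   # action connector
    kill_switch: bool = False          # when True, decisions are logged but not executed
    audit_log: list = field(default_factory=list)

    def handle(self, event: dict) -> str:
        # Schema validation: reject events missing required fields.
        if any(name not in event for name in REQUIRED_FIELDS):
            decision = "rejected_invalid_schema"
        else:
            enriched = self.enrich(event)
            decision = self.decide(enriched)
            if self.kill_switch:
                decision = "suppressed:" + decision
            else:
                self.act(decision, enriched)
        # Audit log: record every event together with its outcome.
        self.audit_log.append((event, decision))
        return decision


# Hypothetical state and rule for anomaly-triggered authentication.
known_devices = {"alice": {"laptop-1"}}
actions_taken = []


def enrich(event: dict) -> dict:
    seen = known_devices.get(event["user_id"], set())
    return {**event, "new_device": event.get("device") not in seen}


def decide(event: dict) -> str:
    risky = event["new_device"] or event["failed_logins"] >= 3
    return "require_mfa" if risky else "allow"


def act(decision: str, event: dict) -> None:
    actions_taken.append((decision, event["user_id"]))


loop = DecisionLoop(enrich, decide, act)
print(loop.handle({"user_id": "alice", "device": "phone-9", "failed_logins": 0}))
```

Keeping the kill switch inside the loop, rather than stopping the consumer, means decisions are still computed and audited while actions are paused, which leaves a record of what the system would have done.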
2
Article
System Design Codex·11w
How Agoda Load Balanced Kafka
Agoda processes hundreds of terabytes of Kafka data daily for real-time price updates from suppliers. Standard round-robin partitioning caused over-provisioning due to heterogeneous hardware and uneven message workloads. Static solutions like identical pod deployments and weighted load balancing were rejected as impractical. Instead, Agoda built a dynamic lag-aware system with two components: a lag-aware producer that routes fewer messages to high-lag partitions using Same-Queue Length and Outlier Detection algorithms, and lag-aware consumers that proactively unsubscribe to trigger rebalancing when experiencing high lag, leveraging Kafka 2.4's incremental cooperative rebalance protocol.
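As a rough illustration of the lag-aware producer idea, the Python sketch below picks a partition with probability inversely related to its consumer lag and skips partitions whose lag is an outlier. This is not Agoda's Same-Queue Length or Outlier Detection algorithm, whose details the summary does not give. The function names, the `outlier_factor` threshold, and the inverse-lag weighting are all assumptions made for illustration.

```python
import random
from statistics import median


def partition_weights(lags: dict[int, int], outlier_factor: float = 3.0) -> dict[int, float]:
    """Return a normalized send weight per partition, given its current lag.

    Assumed heuristic: a partition is an outlier when its lag exceeds
    outlier_factor times the median lag; outliers receive no new messages.
    """
    threshold = outlier_factor * max(median(lags.values()), 1)
    healthy = {p: lag for p, lag in lags.items() if lag <= threshold}
    # Lower lag means a higher weight; the +1 avoids division by zero.
    raw = {p: 1.0 / (1 + lag) for p, lag in healthy.items()}
    total = sum(raw.values())
    return {p: w / total for p, w in raw.items()}


def choose_partition(lags: dict[int, int], rng=random) -> int:
    """Pick a partition at random according to the lag-aware weights."""
    weights = partition_weights(lags)
    partitions = list(weights)
    return rng.choices(partitions, weights=[weights[p] for p in partitions], k=1)[0]


# Hypothetical lag snapshot: partition 2 is served by a slow consumer.
current_lags = {0: 10, 1: 12, 2: 500}
print(partition_weights(current_lags))
print(choose_partition(current_lags))
```

With this snapshot, partition 2 is excluded and partition 0 is slightly preferred over partition 1. A real producer would refresh the lag snapshot periodically and plug a function like `choose_partition` into its partitioning hook.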
3
Article
Debezium·11w
Hello Debezium Team!
Vincenzo Santonastaso introduces himself as a new core contributor to the Debezium open source project. He shares his background as a Senior Product Engineer at lastminute.com working on distributed systems for flight booking, and prior experience at BMC Software with time-series data and forecasting. His interests center on distributed systems and event-driven architectures, and he expresses enthusiasm for contributing more deeply to Debezium.
4
Article
ByteByteGo·10w
How Reddit Migrated Petabyte-Scale Kafka from EC2 to Kubernetes
Reddit's engineering team migrated its entire Apache Kafka fleet — over 500 brokers and more than a petabyte of live data — from Amazon EC2 to Kubernetes using Strimzi, with zero downtime and no client-side changes. The migration was executed in six phases: introducing a DNS abstraction layer to decouple clients from broker addresses, freeing up broker ID space by reshuffling EC2 brokers, running a mixed EC2/Kubernetes cluster via a forked Strimzi operator, gradually shifting partition leadership and data using Cruise Control, migrating the control plane from ZooKeeper to KRaft, and finally handing off to the standard Strimzi operator. Key lessons include using abstraction layers to decouple clients from infrastructure, treating logical state as the primary asset to protect, and designing every migration step to be reversible.
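The idea of shifting data gradually while keeping each step reversible can be sketched in Python. Reddit used Cruise Control for this, and the code below is not their tooling. It only shows, under assumed names (`plan_batches`, `rollback_plan`, and a hypothetical old-to-new broker ID map), how replica moves could be split into small batches with a matching rollback plan. The JSON shape is the one accepted by the `kafka-reassign-partitions` tool's reassignment file.

```python
import json


def plan_batches(assignments, broker_map, batch_size=2):
    """Split replica moves into small reassignment plans.

    assignments: {(topic, partition): [broker ids]} for the current state.
    broker_map:  {old broker id: new broker id} (hypothetical mapping).
    """
    moves = []
    for (topic, partition), replicas in sorted(assignments.items()):
        target = [broker_map.get(b, b) for b in replicas]
        if target != replicas:  # skip partitions already on the new brokers
            moves.append({"topic": topic, "partition": partition, "replicas": target})
    return [
        {"version": 1, "partitions": moves[i:i + batch_size]}
        for i in range(0, len(moves), batch_size)
    ]


def rollback_plan(batch, assignments):
    """Build the plan that restores the original replicas for one batch."""
    return {
        "version": 1,
        "partitions": [
            {
                "topic": m["topic"],
                "partition": m["partition"],
                "replicas": list(assignments[(m["topic"], m["partition"])]),
            }
            for m in batch["partitions"]
        ],
    }


# Hypothetical cluster state: brokers 1-3 are old hosts, 101-103 are new ones.
current = {
    ("prices", 0): [1, 2, 3],
    ("prices", 1): [2, 3, 1],
    ("prices", 2): [101, 102, 103],  # already migrated
}
old_to_new = {1: 101, 2: 102, 3: 103}

for batch in plan_batches(current, old_to_new, batch_size=1):
    print(json.dumps(batch))
    print(json.dumps(rollback_plan(batch, current)))
```

Generating the rollback plan alongside each forward batch is one way to apply the "every step reversible" lesson: an operator can apply a batch, check cluster health, and revert only that batch if something looks wrong.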
