Best of Kafka — November 2025

1
Article
Debezium·25w
CQRS Design Pattern
Command Query Responsibility Segregation (CQRS) separates read and write operations into distinct data models and databases, enabling independent scaling, improved security, and better performance. The pattern addresses challenges in both monolithic and microservice architectures by decoupling data access patterns. Implementation approaches range from database-native replication (such as PostgreSQL streaming replication) to Change Data Capture solutions using Debezium. A practical voting application demonstrates three replication scenarios: PostgreSQL-to-PostgreSQL using native replication, PostgreSQL-to-MySQL using the Debezium JDBC sink connector, and PostgreSQL-to-QuestDB using vendor-specific sink connectors. Debezium simplifies CQRS implementation in heterogeneous environments by capturing database changes in real time and propagating them across different database technologies through Kafka Connect.
118
1
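The read/write split described in the CQRS summary above can be sketched in a few lines. This is a minimal illustration that assumes an in-memory change log standing in for Debezium and Kafka Connect; the names `WriteModel`, `ReadModel`, and `cast_vote` are hypothetical and do not come from the article.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WriteModel:
    """Command side: records votes and emits one change event per write."""
    votes: list = field(default_factory=list)
    change_log: list = field(default_factory=list)  # stand-in for a CDC stream

    def cast_vote(self, voter: str, option: str) -> None:
        self.votes.append((voter, option))
        self.change_log.append({"op": "insert", "voter": voter, "option": option})


@dataclass
class ReadModel:
    """Query side: a denormalized tally built only from change events."""
    tally: Counter = field(default_factory=Counter)
    applied: int = 0  # index of the next change event to apply

    def sync(self, change_log: list) -> None:
        # Apply only events that have not been seen yet, then advance the offset.
        for event in change_log[self.applied:]:
            if event["op"] == "insert":
                self.tally[event["option"]] += 1
        self.applied = len(change_log)

    def results(self) -> dict:
        return dict(self.tally)


write_side = WriteModel()
read_side = ReadModel()
write_side.cast_vote("alice", "kafka")
write_side.cast_vote("bob", "kafka")
write_side.cast_vote("carol", "rabbitmq")
read_side.sync(write_side.change_log)
print(read_side.results())
```

The read model never touches the write model's tables; it only consumes change events, which is the property that lets the two sides use different database technologies.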
2
Article
Anubhav Bhatt·28w
Pub/Sub Model Saved Our Insurance System from Collapse
A tightly coupled insurance policy activation system was failing when any downstream service experienced an outage. By refactoring from sequential service calls to an event-driven pub/sub architecture using Kafka, the system became resilient and decoupled. Each service now independently subscribes to policy activation events, allowing failures to be isolated and new services to be added without modifying core logic.
108
12
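The failure isolation described in the pub/sub summary above can be illustrated with a small in-process sketch. It assumes a synchronous event bus in place of Kafka (real consumers pull from a topic independently), and the handler names and the `policy.activated` topic are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """In-process stand-in for a topic with independent subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> List[str]:
        """Deliver to every subscriber; return the names of handlers that failed."""
        failed = []
        for handler in self._subscribers[topic]:
            try:
                handler(event)
            except Exception:
                # One failing subscriber must not block the others.
                failed.append(handler.__name__)
        return failed


processed: List[str] = []


def send_welcome_email(event: dict) -> None:
    processed.append(f"email:{event['policy_id']}")


def start_billing(event: dict) -> None:
    raise ConnectionError("billing service is down")  # simulated outage


def update_crm(event: dict) -> None:
    processed.append(f"crm:{event['policy_id']}")


bus = EventBus()
for handler in (send_welcome_email, start_billing, update_crm):
    bus.subscribe("policy.activated", handler)

failed = bus.publish("policy.activated", {"policy_id": "P-100"})
print(processed, failed)
```

Adding a new downstream service is one more `subscribe` call; the publisher's code does not change.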
3
Article
Platformatic·26w
kafka 223% Faster (And What We Learned Along the Way)
Platformatic improved the performance of their Kafka client for Node.js by 223% after discovering that their benchmark methodology was flawed. By fixing measurement issues (per-operation timing, proper delivery tracking, larger sample sizes), they identified real bottlenecks including CRC32C computation, error handling, and metadata request bugs. Key optimizations included switching to a native Rust CRC32C implementation, refactoring async error handling, and fixing connection handling. The pure JavaScript implementation now achieves 92,441 op/sec for single messages and 159,828 op/sec for consumption with <2% variance, outperforming native librdkafka bindings by avoiding cross-boundary overhead while maintaining minimal buffer copying and non-blocking event loop usage.
73
4
Article
Grafana Labs·29w
Grafana Mimir 3.0 release: performance improvements, a new query engine, and more
Grafana Mimir 3.0 introduces a redesigned architecture that separates read and write operations using Apache Kafka as an asynchronous buffer, eliminating performance bottlenecks between ingestion and queries. The release features the Mimir Query Engine (MQE), which processes queries in a streaming fashion rather than bulk loading, reducing peak memory usage by up to 92%. These improvements deliver 15% lower resource usage in large clusters while maintaining faster query execution and higher reliability. The new ingest storage component ensures query spikes won't slow down data ingestion and vice versa, enabling independent scaling of each path.
47
1
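The difference between bulk loading and streaming evaluation mentioned in the Mimir summary above can be shown with a toy aggregation. This is only a sketch of the general idea, not the Mimir Query Engine; `read_samples`, `bulk_average`, and `streaming_average` are hypothetical names.

```python
from typing import Iterable, Iterator


def read_samples(n: int) -> Iterator[float]:
    """Yield n samples one at a time, as a storage reader might."""
    for i in range(n):
        yield float(i)


def bulk_average(samples: Iterable[float]) -> float:
    """Load every sample into memory first, then aggregate."""
    loaded = list(samples)  # peak memory grows with the number of samples
    if not loaded:
        raise ValueError("no samples")
    return sum(loaded) / len(loaded)


def streaming_average(samples: Iterable[float]) -> float:
    """Aggregate while reading; only a running total and count are kept."""
    total = 0.0
    count = 0
    for value in samples:
        total += value
        count += 1
    if count == 0:
        raise ValueError("no samples")
    return total / count


print(bulk_average(read_samples(1000)), streaming_average(read_samples(1000)))
```

Both functions return the same result; the streaming version simply never holds the full input at once, which is where the reduction in peak memory comes from.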
5
Article
ByteByteGo·29w
How Datadog Built a Custom Database to Ingest Billions of Metrics Per Second
Datadog built Monocle, a custom time-series database in Rust, to handle billions of metrics per second. The system uses Kafka for data distribution and replication, separates metadata storage from time-series data, and employs a thread-per-core architecture with LSM-tree storage. Key optimizations include arena allocators, time-based file pruning, and cost-based query scheduling. The platform splits storage into real-time (24 hours) and long-term systems, with the real-time database handling 99% of queries. Future plans include dynamic load balancing and merging separate databases into a unified columnar format.
37
6
Article
System Design Codex·29w
Key Concepts of Kafka
Kafka is a distributed event store and streaming platform that has become essential for large-scale data pipelines at companies like Netflix and Uber. The core architecture consists of messages organized into topics and partitions, with producers writing data and consumers reading it in groups. Brokers form clusters that handle message storage and replication for reliability. Key advantages include support for multiple producers and consumers, disk-based retention for durability, and horizontal scalability. However, challenges include complex configuration options, inconsistent tooling, limited client library maturity outside Java/C, and lack of true multi-tenancy.
25
7
Article
Architecture Weekly·27w
Requeuing Roulette in Event-Driven Architecture and Messaging
The article explores the "Requeuing Roulette" pattern in event-driven systems, where messages are put back into queues in the hope that they will eventually be processed in the correct order. While this technique can work when messages aren't causally correlated and consumers are stable, it creates risks under load: messages may be reprocessed out of order, causing race conditions and wasted CPU. The pattern attempts to maintain strict ordering while maximizing throughput, but this trade-off often fails in distributed systems. Better alternatives include using message grouping features (RabbitMQ routing keys, SQS message groups, Service Bus sessions) or streaming solutions like Kafka that handle ordering through partitions. Understanding actual ordering requirements and choosing simpler solutions typically beats trying to make requeuing work reliably.
20
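The partition-based ordering mentioned in the summary above can be sketched as follows. The example assumes a CRC32 hash as a stand-in for a real partitioner (Kafka's default partitioner uses a different hash), and the event data is made up for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition; the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Each partition is an append-only list, so arrival order is preserved within it.
partitions = defaultdict(list)

events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
for key, action in events:
    partitions[partition_for(key)].append((key, action))


def history(key: str) -> list:
    """Read back the events for one key from its partition, in log order."""
    return [action for k, action in partitions[partition_for(key)] if k == key]


print(history("order-1"), history("order-2"))
```

Because all events for a key land in one append-only partition, a consumer sees them in the order they were produced without any requeuing.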
8
Video
ByteByteGo·28w
System Design: Why is Kafka Popular?
Kafka enables companies like LinkedIn, Netflix, and Uber to handle billions of messages daily through its distributed log architecture. It decouples services by allowing producers and consumers to communicate asynchronously, absorbs traffic spikes, and enables event replay for debugging. Messages are written to append-only partitions organized into topics across broker clusters. Key features include consumer groups for parallel processing, replication for durability, and three delivery guarantees (at-most-once, at-least-once, exactly-once). Partitioning strategy is critical: poor key selection creates hot partitions, while compound keys distribute load effectively. Trade-offs include added operational complexity, a design that favors throughput over latency, and ordering guarantees limited to single partitions. Event sourcing patterns use Kafka as the source of truth by appending state changes as events.
10
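The hot-partition point in the summary above can be made concrete with a small simulation. It assumes a CRC32 hash in place of a real partitioner, and the `tenant-1` key and `tenant:user` compound-key format are hypothetical.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 6


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash a message key to a partition number."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Keying every message by tenant alone sends all traffic to one partition.
single_key = Counter(partition_for("tenant-1") for _ in range(1000))

# A compound key (tenant plus user) spreads the same traffic across partitions.
compound_key = Counter(
    partition_for(f"tenant-1:{user_id}") for user_id in range(1000)
)

print("single key:  ", dict(single_key))
print("compound key:", dict(compound_key))
```

The cost of the compound key is that ordering is then guaranteed only per tenant and user pair, not across the whole tenant.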
