Best of Kafka: November 2024

  1. Article · Architecture Weekly · 1y

    Deduplication in Distributed Systems: Myths, Realities, and Practical Solutions

    Duplication in distributed systems is a common issue due to retries, processing failures, and fault tolerance mechanisms. Deduplication aims to identify and eliminate duplicate messages, but it comes with challenges that impact scalability, performance, and reliability. The post explores how deduplication is implemented in technologies like Kafka and RabbitMQ, and discusses the trade-offs and complexities involved. It also highlights the concept of exactly-once processing as a more realistic goal than exactly-once delivery, emphasizing patterns like idempotency and transactional outboxes to achieve robust message handling.
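
    The idempotency pattern mentioned above can be sketched as a consumer that remembers which message IDs it has already applied. This is a minimal illustration, not the article's implementation: the `IdempotentHandler` class, the `id` and `amount` fields, and the in-memory set (standing in for a durable store) are all assumptions.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Wraps a handler so that each message ID is applied at most once."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        # Assumption: an in-memory set stands in for a durable store
        # (for example a database table updated in the same transaction).
        self._seen: Set[str] = set()

    def handle(self, message: dict) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self._seen:
            return False
        self._handler(message)
        self._seen.add(msg_id)
        return True


balance = {"total": 0}


def apply_payment(message: dict) -> None:
    balance["total"] += message["amount"]


consumer = IdempotentHandler(apply_payment)
deliveries = [
    {"id": "m-1", "amount": 10},
    {"id": "m-2", "amount": 5},
    {"id": "m-1", "amount": 10},  # redelivery after a retry
]
results = [consumer.handle(m) for m in deliveries]
```

    The broker may still deliver `m-1` twice; the handler simply makes the second delivery a no-op, which is the sense in which exactly-once processing is achievable where exactly-once delivery is not.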

  2. Article · Cerbos · 2y

    How to pick the right inter-service communication pattern for your microservices

    Efficient inter-service communication is essential for a successful microservices architecture. Different communication patterns, such as synchronous, asynchronous, and event-driven, offer various benefits and challenges. Strategies like retries, circuit breakers, timeouts, and bulkheads can enhance fault tolerance and resilience. Spotify's adoption of Apache Kafka for event-driven communication illustrates a scalable and decoupled microservices environment, supporting independent service evolution and robust failure management.
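
    As a rough illustration of one of the resilience strategies listed above, here is a small circuit breaker. It is a sketch under stated assumptions: the class name, the threshold and timeout parameters, and the injectable `clock` are hypothetical and do not come from the article or from any particular library.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open; failing fast")
            # Half-open: let one trial call through; a single failure
            # re-opens the circuit immediately.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit again
        return result
```

    Wrapping a downstream call as `breaker.call(fetch_profile, user_id)` (where `fetch_profile` is a placeholder for any remote call) lets the caller fail fast while the dependency is down instead of queueing up timeouts.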

  3. Article · Baeldung · 2y

    Consuming Messages in Batch in Kafka Using @KafkaListener

    The post discusses handling Kafka messages in batches using the @KafkaListener annotation from the Spring Kafka library. It covers the advantages of batch processing, such as increased system efficiency and fault tolerance, and presents a use case involving an IT infrastructure's KPIs. The article demonstrates the implementation of a basic listener and a batch-processing listener, explaining the configuration details and differences between the two approaches.
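
    The difference between per-record and batch handling can be sketched without any Kafka client. The snippet below is plain Python, not Spring Kafka's `@KafkaListener` API; the `in_batches` helper and the sample KPI values are assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(records: Iterable[T], max_size: int) -> Iterator[List[T]]:
    """Group a stream of records into lists of at most max_size items."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == max_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


# Hypothetical KPI readings; a batch handler processes each group in one call.
kpi_readings = [10, 20, 30, 40, 50]
batches = list(in_batches(kpi_readings, max_size=2))
averages = [sum(b) / len(b) for b in batches]
```

    A batch handler pays per-call costs (a database round trip, a commit) once per group rather than once per record, which is where the efficiency gain comes from.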

  4. Article · Data Engineer Things · 1y

    Change Data Capture (CDC): Comprehensive Guide-PostgreSQL To S3(MinIO) Using NiFi

    Change Data Capture (CDC) technology is essential for real-time database updates, ensuring data integrity and quick access. This guide provides a step-by-step approach to using CDC with PostgreSQL, Debezium, Apache NiFi, and storing data in MinIO. The process involves setting up Docker Compose, configuring Debezium to monitor PostgreSQL, using Kafka as a message broker, and employing NiFi for data flow management to transfer data to MinIO for real-time analysis.
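
    To make the shape of the data in such a pipeline concrete, the sketch below replays Debezium-style change events against an in-memory table. It assumes simplified event payloads with the `op`, `before`, and `after` fields Debezium emits (`c` create, `u` update, `d` delete, `r` snapshot read); the `id` primary key and the sample rows are hypothetical, and the guide itself lands events in MinIO rather than applying them like this.

```python
from typing import Dict, Iterable, Optional


def apply_change_events(events: Iterable[dict]) -> Dict[int, dict]:
    """Rebuild current table state from an ordered stream of change events."""
    table: Dict[int, dict] = {}
    for event in events:
        op: str = event["op"]
        before: Optional[dict] = event.get("before")
        after: Optional[dict] = event.get("after")
        if op in ("c", "u", "r") and after is not None:
            # Create, update, and snapshot read all carry the new row state.
            table[after["id"]] = after
        elif op == "d" and before is not None:
            # Deletes carry the last known row state in "before".
            table.pop(before["id"], None)
    return table


events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "a"}},
    {"op": "u", "before": {"id": 1, "name": "a"}, "after": {"id": 1, "name": "b"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "x"}},
    {"op": "d", "before": {"id": 2, "name": "x"}, "after": None},
]
state = apply_change_events(events)
```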

  5. Article · Data Engineer Things · 2y

    AutoMQ: Achieving Auto Partition Reassignment In Kafka Without Cruise Control

    AutoMQ provides a cloud-native solution for Kafka that addresses the challenges of rebalancing partitions without the need for data transfer between brokers. By leveraging object storage and a shared storage architecture, AutoMQ achieves compute-storage separation, making the rebalancing process more efficient. Unlike tools like Cruise Control, AutoMQ's self-balancing feature, AutoBalancer, simplifies cluster management through stateless brokers and metadata adjustments, ensuring optimal resource utilization and quick, effective decision-making.

  6. Article · Trendyol Tech · 1y

    Ensuring Client Continuity in Kafka: Handling Broker Restarts with No Disruptions

    Trendyol's Data Streaming team addresses challenges in maintaining uninterrupted Kafka services by leveraging Confluent Stretch Kafka across multiple data centers. The team ensures high availability and fault tolerance by configuring replication factors and monitoring topic configurations. By implementing custom alert mechanisms and offering different topic creation options, they reduce downtime and errors during broker restarts, ensuring client applications remain unaffected.
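
    One kind of topic-configuration check that fits this description can be sketched as follows. The rule is standard Kafka behaviour rather than something the summary states: with `acks=all`, a topic stays writable during a single broker restart only if the replicas left over still satisfy `min.insync.replicas`. The `TopicConfig` type, the function names, and the sample topics are assumptions, not Trendyol's tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TopicConfig:
    name: str
    replication_factor: int
    min_insync_replicas: int


def survives_single_broker_restart(topic: TopicConfig) -> bool:
    """True if losing one replica still leaves min.insync.replicas in sync."""
    return topic.replication_factor - 1 >= topic.min_insync_replicas


def topics_at_risk(topics: Iterable[TopicConfig]) -> List[str]:
    """Names of topics whose producers (acks=all) would be blocked by a restart."""
    return [t.name for t in topics if not survives_single_broker_restart(t)]


# Hypothetical topic configurations used only to exercise the check.
topics = [
    TopicConfig("orders", replication_factor=3, min_insync_replicas=2),
    TopicConfig("audit", replication_factor=2, min_insync_replicas=2),
    TopicConfig("scratch", replication_factor=1, min_insync_replicas=1),
]
at_risk = topics_at_risk(topics)
```

    A check like this could feed an alert so that risky topics are flagged before a rolling restart rather than discovered during one.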