Best of Kafka: October 2024

  1. Article
    ByteByteGo·2y

    The Trillion Message Kafka Setup at Walmart

    Walmart's Apache Kafka deployment processes trillions of messages per day at 99.99% availability, supporting critical data movement, event-driven microservices, and streaming analytics. To address consumer rebalancing, poison pill messages, and cost, the team designed a Message Proxy Service (MPS). The MPS decouples message consumption from Kafka's partition-based model, so consumer applications can scale independently and consumer failures are handled effectively.
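
    The summary does not describe how the MPS works internally, so the following is only a minimal sketch of the general idea, not Walmart's design. It assumes messages from all partitions are fed into one partition-agnostic work queue, and that a message which keeps failing is retried a few times and then set aside. The names `proxy_dispatch`, `handler`, and `max_attempts`, and the dead-letter list, are hypothetical.

```python
from collections import deque


def proxy_dispatch(partitions, handler, max_attempts=3):
    """Drain partitioned messages through a single partition-agnostic queue.

    partitions: dict of partition id -> list of messages (stand-in for Kafka).
    handler: callable standing in for a consumer application; raises on failure.
    Returns (processed_results, dead_letters).
    """
    # Flatten all partitions into one queue of (message, attempt number).
    work = deque((msg, 1) for msgs in partitions.values() for msg in msgs)
    processed, dead_letters = [], []
    while work:
        msg, attempt = work.popleft()
        try:
            processed.append(handler(msg))
        except Exception:
            if attempt < max_attempts:
                # Retry later instead of blocking the messages behind it.
                work.append((msg, attempt + 1))
            else:
                # Assumed policy: set the poison pill aside after the last try.
                dead_letters.append(msg)
    return processed, dead_letters


def times_ten(msg):
    """Hypothetical consumer logic that fails on one malformed message."""
    if msg == "bad":
        raise ValueError("cannot process message")
    return msg * 10


processed, dead_letters = proxy_dispatch({0: [1, 2], 1: ["bad", 3]}, times_ten)
```

    Because the handler sees messages rather than partitions, the number of handler instances is not tied to the partition count in this sketch.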

  2. Article
    System Design Codex·2y

    3 Kafka Messaging Strategies

    Covers three Kafka messaging strategies: Fire and Forget, Synchronous Send, and Asynchronous Send. The article explains how Kafka producers work, the trade-offs between the strategies, and when to use each one.
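
    The difference between the three strategies can be sketched without a broker. The example below assumes a producer whose `send` returns a future right away; `FakeProducer` is a hypothetical in-memory stand-in, not a real Kafka client, and the helper function names are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeProducer:
    """Hypothetical in-memory stand-in for a Kafka producer (no broker)."""

    def __init__(self):
        # One worker keeps the pretend offsets deterministic.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self.log = []  # records that were "written"

    def send(self, topic, value):
        # Returns immediately with a future, like many producer clients.
        return self._pool.submit(self._write, topic, value)

    def _write(self, topic, value):
        if value is None:
            raise ValueError("empty record rejected")
        self.log.append((topic, value))
        return len(self.log) - 1  # pretend offset

    def close(self):
        self._pool.shutdown(wait=True)


def fire_and_forget(producer, topic, value):
    # The future is dropped, so a failed send goes unnoticed.
    producer.send(topic, value)


def synchronous_send(producer, topic, value, timeout=5.0):
    # Blocks until the write finishes; re-raises the error on failure.
    return producer.send(topic, value).result(timeout=timeout)


def asynchronous_send(producer, topic, value, on_done):
    # Does not block; on_done(error, offset) runs when the write finishes.
    future = producer.send(topic, value)

    def _callback(f):
        err = f.exception()
        on_done(err, None if err else f.result())

    future.add_done_callback(_callback)
    return future
```

    Fire and Forget gives the least overhead but loses errors, Synchronous Send surfaces errors at the cost of blocking per record, and Asynchronous Send keeps throughput while still reporting the outcome through the callback.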

  3. Article
    System Design Codex·2y

    Kafka Load Balancing at Agoda for Terabytes of Data

    Agoda uses Kafka to manage hundreds of terabytes of data across various supply systems, including hotels and restaurants, ensuring real-time price updates. Traditional round-robin partitioning and consumer assignment worked poorly with heterogeneous hardware and uneven workloads, which led to over-provisioning. Agoda addressed this with dynamic, lag-aware strategies, including a lag-aware producer and a lag-aware consumer, to optimize message distribution and reduce latency.
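
    The summary does not give Agoda's algorithm, so the following is only one possible sketch of a lag-aware producer. It assumes the producer knows the current consumer lag per partition and picks partitions at random with weight `1 / (1 + lag)`; that weighting formula and the function names are assumptions for illustration.

```python
import random


def lag_aware_weights(lags):
    """Map per-partition consumer lag to weights (lower lag, higher weight)."""
    return {partition: 1.0 / (1 + lag) for partition, lag in lags.items()}


def choose_partition(lags, rng=random):
    """Pick a partition at random, favouring partitions with less lag."""
    weights = lag_aware_weights(lags)
    partitions = list(weights)
    return rng.choices(
        partitions, weights=[weights[p] for p in partitions], k=1
    )[0]


# Partition 0 is caught up, partition 2 is far behind.
current_lags = {0: 0, 1: 9, 2: 99}
rng = random.Random(42)
counts = {p: 0 for p in current_lags}
for _ in range(10_000):
    counts[choose_partition(current_lags, rng)] += 1
```

    Compared with round-robin, which would send a third of the traffic to each partition, this steers most new messages toward the partition whose consumer is keeping up.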

  4. Article
    Trendyol Tech·2y

    Turning Millions of Kafka Events Into Meaningful Reports for Sellers

    Trendyol’s Export Center developed the Seller Reporting API to transform Kafka event data into insightful reports for sellers. The team stores the data in Elasticsearch and uses its Date Histogram Aggregation to group time-based data into buckets. The implementation indexes order events and queries the indexed data to build detailed reports. The reports cover various time periods and compare sales across different regions and currencies.
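
    As a rough illustration of the query side, the sketch below builds an Elasticsearch search body that uses a `date_histogram` aggregation with a nested `sum`. The index layout is not given in the summary, so the field names `sellerId`, `orderDate`, and `totalPrice`, and the function name, are hypothetical.

```python
def daily_sales_query(seller_id, start, end, interval="day"):
    """Build a search body that sums revenue per time bucket for one seller."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"sellerId": seller_id}},
                    {"range": {"orderDate": {"gte": start, "lt": end}}},
                ]
            }
        },
        "aggs": {
            "sales_over_time": {
                "date_histogram": {
                    "field": "orderDate",
                    "calendar_interval": interval,
                },
                # Sub-aggregation computed inside every time bucket.
                "aggs": {"revenue": {"sum": {"field": "totalPrice"}}},
            }
        },
    }


query = daily_sales_query(42, "2024-10-01", "2024-11-01")
```

    Changing `interval` to `"week"` or `"month"` would produce coarser buckets for reports over longer periods.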

  5. Article
    Three Dots Labs·2y

    Watermill 1.4 Released (Event-Driven Go Library)

    Watermill 1.4, a new release of the event-driven Go library, brings significant updates. Highlights include a new universal requeuer component that works with all Pub/Subs (requiring PostgreSQL for now), a pq CLI tool for managing poison queues, and a delay package that adds delay metadata to messages. There is also enhanced support for AWS SNS/SQS Pub/Subs and major updates to watermill-amqp and watermill-sql. The release includes refreshed documentation, a new logo, and several new examples showcasing features like delayed messages and distributed transactions.

  6. Article
    Hacker News·2y

    FrigadeHQ/trench: Trench — Open-Source Analytics Infrastructure. A single production-ready Docker image built on ClickHouse, Kafka, and Node.js for tracking events, users, page views, and interactions

    Trench is an event tracking system built on Apache Kafka and ClickHouse, capable of handling large volumes of events and providing real-time analytics. It is compliant with GDPR and PECR, allowing users full control over their data. Trench can be deployed quickly using a production-ready Docker image and offers both self-hosted and fully-managed cloud solutions. It supports Segment API and can process thousands of events per second on a single node. Users can query their data in real-time and connect it to other destinations using webhooks.