Best of Apache Kafka 2024

  1. Article
    ByteByteGo · 2y

    EP110: Top 5 Strategies to Reduce Latency

    Learn about strategies to reduce latency, use cases for load balancers, and commonly used data sharding algorithms.

  2. Article
    Towards Dev · 2y

    Microservices with NodeJs Using NestJs and Apache Kafka

    Learn how to use NestJS and Apache Kafka to build microservices in Node.js. Understand Kafka's capabilities and its integration with real-time stream analysis tools like Apache Storm and Spark.

  3. Video
    Community Picks · 2y

    Apache Kafka vs RabbitMQ Performance (Latency - Throughput - Saturation)

    A detailed comparison between Apache Kafka and RabbitMQ, focusing on latency, throughput, CPU and memory usage, and disk operations. The test reveals that RabbitMQ offers lower latency, while Kafka provides higher throughput and resilience. The comparison extends to RabbitMQ Streams, where Kafka again shows superior performance. The post provides a comprehensive analysis suited for selecting the right messaging broker based on specific needs.

  4. Article
    DEV · 2y

    Introducing AutoMQ: a cloud-native replacement of Apache Kafka

    AutoMQ is a cloud-native replacement for Apache Kafka, designed to address the evolving needs of modern data architectures with a focus on efficiency, scalability, and cost-effectiveness. Originating from a team of open-source pioneers, it offers a unique architecture that decouples storage and computation, leveraging cloud storage to provide significant cost savings and operational efficiency. AutoMQ maintains full compatibility with Kafka, supports multi-cloud environments, and aims to integrate stream data into data lakes to enhance data access and break down silos. The growing community and successful funding highlight its potential impact on the stream storage industry.

  5. Article
    ByteByteGo · 2y

    How Uber Manages Petabytes of Real-Time Data

    Uber's real-time data infrastructure processes petabytes of data daily, supporting features like customer incentives and fraud detection. The system relies on Apache Kafka for streaming data, Apache Flink for stream processing, and Apache Pinot for real-time OLAP. Key requirements include consistency, availability, data freshness, scalability, and cost efficiency. Customizations and tools like FlinkSQL and uReplicator enhance reliability and performance. This enables real-time decisions such as dynamic pricing and operational insights. Scalability strategies, including Active-Active and Active-Passive Kafka setups, ensure high availability and fault tolerance.

  6. Video
    Community Picks · 2y

    Top Kafka Use Cases You Should Know

    Explore five top use cases of Apache Kafka: log analysis, real-time machine learning pipelines, system monitoring and alerting, change data capture (CDC), and system migration. Kafka excels at ingesting and processing high-volume data from multiple sources with low latency, making it invaluable in modern software architecture. Key integrations include the ELK stack for log analysis and Apache Flink and Spark for stream processing.

  7. Article
    Hacker News · 2y

    AutoMQ/automq: A cloud native implementation for Apache Kafka, reducing your cloud infrastructure bill by up to 90%.

    AutoMQ is a cloud-native, serverless implementation of Apache Kafka that reduces cloud infrastructure costs by up to 90%.

  8. Article
    Community Picks · 1y

    Kafka Architecture & Troubleshooting Quiz

    Gauge your knowledge of Apache Kafka with a quiz that covers multi-datacenter deployments, event sourcing patterns, performance optimization, and troubleshooting in production. Enhance your understanding of Kafka's internals and best practices for building reliable, scalable distributed systems.

  9. Article
    Medium · 2y

    How We Solve Load Balancing Challenges in Apache Kafka

    This post discusses the challenges of load balancing in Apache Kafka and presents solutions such as lag-aware producers and consumers.

  10. Article
    Community Picks · 2y

    How To Set Up a Multi-Node Kafka Cluster using KRaft

    Learn how to set up a multi-node Kafka cluster using the KRaft consensus protocol. Configure nodes to be part of the cluster, observe topic partition assignments, and assign topics to specific brokers. Explore how to connect to the cluster, produce and consume messages, and handle node unavailability. Finally, discover how to migrate topics between nodes in the cluster.

  11. Article
    Quastor Daily · 2y

    Tech Dive on Apache Kafka

    This post provides a tech dive into Apache Kafka, discussing its design goals, components, messaging patterns, and more. It also explores the differences between message queues and publish/subscribe systems.

  12. Article
    InfoWorld · 1y

    3 data engineering trends riding Kafka, Flink, and Iceberg

    Apache Kafka, Apache Flink, and Apache Iceberg are revolutionizing data management. Kafka enables real-time data movement, Flink processes this data efficiently, and Iceberg structures stored data for query accessibility. Innovations in these open-source tools are shaping data engineering practices, particularly in microservices, AI integration, and community-driven Iceberg tools. Staying informed on these trends ensures proficiency in a rapidly evolving field.

  13. Article
    Confluent Blog · 2y

    Inside the Kafka Black Box—How Producers Prepare Event Data for Brokers

    Apache Kafka is a robust distributed event streaming platform ideal for real-time data handling. This detailed guide explores the inner workings of Kafka, focusing on Kafka producers, consumers, and brokers. Key insights include the path data takes from producer to broker, essential configurations, partitioning strategies, batching techniques, and performance metrics to monitor. The aim is to equip developers with the knowledge needed to debug and optimize their Kafka applications.

  14. Article
    Databricks · 2y

    Supernovas, Black Holes and Streaming Data

    Learn how publicly available NASA satellite data on supernovas and black holes can be consumed and processed with Apache Kafka and Databricks. This guide covers the setup and use of the Databricks Data Intelligence Platform, Delta Live Tables, and AI/BI Genie to simplify the ingestion, transformation, and visualization of streaming data. Detailed steps for using open-source Apache Spark and Kafka, along with Databricks enhancements for serverless compute and natural language querying, make this complex data accessible to data scientists and engineers.

  15. Article
    InfoQ · 2y

    Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka

    Uber built a scalable chat system using microservices, GraphQL subscriptions, and Kafka, replacing its legacy architecture to improve reliability, scalability, and maintainability.