Best of Apache Flink — 2025

1
Article
Materialized View·1y
Kafka: The End of the Beginning
Apache Kafka has dominated streaming data for over a decade, but innovation has stagnated while batch processing has evolved rapidly. The streaming ecosystem faces challenges with slow growth, long sales cycles, and lack of new ideas. While Kafka's protocol has become the de facto standard, its architecture shows limitations for modern cloud-native requirements. New solutions like S2 are emerging with fresh approaches, and the next decade could see a transition similar to how batch processing moved beyond Hadoop, potentially ushering in a truly cloud-native streaming era.
293
6
2
Article
Confluent Blog·1y
Designing Event-Driven, Multi-Agent AI Architectures w/Kafka and Flink
This post shows how to build an event-driven, multi-agent AI architecture that simplifies meal planning. Using Kafka, Flink, LangChain, and Claude, the system coordinates multiple AI agents, each with a specialized task such as planning child-friendly or adult meals, into a cohesive meal plan. The approach is designed for real-time responsiveness, adaptability, and fault tolerance, making complex daily tasks more manageable.
120
3
Article
Towards Dev·43w
Building a Scalable Real-Time ETL Pipeline with Kafka, Debezium, Flink, Airflow, MinIO, and ClickHouse
A comprehensive guide to building a scalable real-time ETL pipeline using open-source tools including Kafka for data streaming, Debezium for change data capture, Flink for stream processing, ClickHouse as a lakehouse solution, Airflow for orchestration, and MinIO for object storage. The architecture separates hot and cold data layers, with real-time data stored locally for performance and historical data in remote storage for cost optimization. Includes practical implementation steps, Docker configurations, and dashboard creation using Apache Superset.
102
4
Article
Netflix TechBlog·32w
How and Why Netflix Built a Real-Time Distributed Graph: Part 1 — Ingesting and Processing Data Streams at Internet Scale
Netflix built a Real-Time Distributed Graph (RDG) to analyze member interactions across different business verticals like streaming, gaming, and live events. The system processes over 1 million Kafka messages per second using Apache Flink jobs that transform events into graph nodes and edges, writing more than 5 million records per second to storage. The architecture evolved from a monolithic Flink job to a 1:1 mapping between Kafka topics and Flink jobs for better operational stability and tuning. This first part covers the ingestion and processing pipeline, with future posts planned for storage and serving layers.
82
5
Article
Confluent Blog·1y
How to build a multi-agent orchestrator using Flink and Kafka
The post explores the creation of multi-agent systems using an orchestrator pattern, with Apache Flink and Kafka as key technologies. It highlights the necessity of dividing complex tasks among specialized AI agents for better collaboration and problem-solving. The orchestrator facilitates efficient message routing and real-time decision-making by interpreting and distributing tasks dynamically. The combination of Flink's real-time processing and Kafka's event-driven messaging creates a scalable, adaptable system without rigid dependencies.
69
6
Article
ByteByteGo·34w
How OpenAI Uses Kubernetes And Apache Kafka for GenAI
OpenAI built a stream processing platform using Apache Flink (PyFlink) on Kubernetes to handle real-time data for AI model training and experimentation. The architecture addresses three key challenges: providing Python-first APIs for ML practitioners, handling cloud capacity constraints, and managing multi-primary Kafka clusters. The system features a control plane for multi-cluster failover, per-namespace isolation in Kubernetes, watchdog services for Kafka topology monitoring, and decoupled state management using RocksDB with highly available blob storage. Custom Kafka connectors enable reading from multiple primary clusters simultaneously while maintaining resilience during outages.
56
7
Article
ByteByteGo·1y
How Airbnb Powers Personalization With 1M Events Per Second
Airbnb's User Signals Platform (USP) powers real-time personalization, processing over 1 million events per second. USP ingests and processes user actions with sub-second latency, stores both real-time and historical data, and provides a configurable interface for developers. Key features include Flink for event-driven processing, an append-only data model for resilience, and hot standby Task Managers for operational stability.
37
8
Article
Flink·1y
Apache Flink 2.0.0: A new Era of Real-Time Data Processing
Apache Flink 2.0.0 is a major release that introduces new features and architectural enhancements for real-time data processing. Key highlights include Disaggregated State Management, Materialized Tables, and deep integration with Apache Paimon for streaming lakehouse architectures. The release focuses on improving performance, scalability, and resource efficiency, making real-time computing more accessible and practical for diverse use cases. It also includes a new DataStream V2 API and removes several deprecated APIs, which makes the release backward-incompatible.
30
9
Article
Tinybird·32w
Flink is a 95% problem
Apache Flink is marketed as essential for real-time data processing, but the post argues it is overkill for 95% of use cases. Most real-time problems can be solved with simpler solutions: HTTP services with Postgres (65%), OLAP databases like ClickHouse (25%), or custom solutions (5%). Only about 5% of companies actually need Flink's complexity. The platform introduces substantial operational overhead: new APIs to learn, additional infrastructure (Kafka, ZooKeeper/K8s), 700+ configuration parameters, complex observability requirements, and a JVM dependency. The post adds that even Flink's creators acknowledge its limitations and that recent acquisitions of Flink-based companies suggest limited market traction. For most organizations under 100 developers, simpler alternatives such as ClickHouse with SQL or Kafka consumers written directly in the team's own programming language provide better cost-benefit tradeoffs without the engineering complexity.
27
1
10
Article
The New Stack·1y
A2A, MCP, Kafka and Flink: The New Stack for AI Agents
The post discusses the need for a new infrastructure stack to enable AI agents to collaborate effectively. This stack includes four open components: Google’s Agent2Agent (A2A) protocol for agent communication, Anthropic’s Model Context Protocol (MCP) for tool access, Apache Kafka for event-driven communication, and Apache Flink for real-time data processing. By integrating these technologies, AI agents can operate beyond isolated silos, scaling to complex ecosystems that facilitate collaboration, observability, and resilience.
23
1
11
Article
Google Open Source Blog·35w
Apache Iceberg 1.10: Maturing the V3 spec, the REST API and Google contributions
Apache Iceberg 1.10.0 introduces major improvements including full Spark 4.0 and Flink 2.0 compatibility, production-ready Deletion Vectors for faster row-level updates, and a hardened REST Catalog API. The release matures the V3 specification with features like row lineage and variant types. Google contributed native BigQuery Metastore Catalog support and Google AuthManager, enabling seamless integration with BigLake-managed tables through open REST protocols.
16
12
Article
Confluent Blog·50w
Build an AI Personalization Engine with Confluent & Databricks
Confluent and Databricks can be combined to build real-time AI applications by bridging operational and analytical data systems. The tutorial demonstrates creating an AI-powered marketing personalization engine using Tableflow to convert Kafka topics into Delta Lake tables, Apache Flink for stream processing, and Oracle CDC connectors for real-time data ingestion. The example implementation helps a fictional hotel brand identify low-booking properties and generate targeted promotional campaigns using AI-generated content.
10
2
