A comprehensive guide to building a real-time flight data pipeline using Kafka for streaming, Spark for processing, and Airflow for orchestration. The pipeline fetches live flight data from a custom API, streams it through Kafka to MongoDB for storage, then uses Airflow to schedule daily ETL jobs that extract landed flight
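The daily extraction step mentioned above — pulling only flights that have landed out of the stored records — can be sketched as a plain filter function. This is a minimal illustration, not the project's actual code: the record fields (`status`, `landing_date`) and the function name are assumptions for the example.

```python
def extract_landed(flights, on_date):
    """Daily ETL step: keep only flights that landed on the given date.

    `flights` is a list of dicts as they might come out of storage;
    the field names here are illustrative, not from the project.
    """
    return [
        f for f in flights
        if f.get("status") == "landed" and f.get("landing_date") == on_date
    ]


# Hypothetical sample records for demonstration
sample = [
    {"flight": "TK123", "status": "landed", "landing_date": "2024-01-15"},
    {"flight": "TK456", "status": "en-route", "landing_date": None},
    {"flight": "TK789", "status": "landed", "landing_date": "2024-01-14"},
]

print(extract_landed(sample, "2024-01-15"))
# → [{'flight': 'TK123', 'status': 'landed', 'landing_date': '2024-01-15'}]
```

In the real pipeline, a step like this would run inside an Airflow task on a daily schedule, reading from the store the Kafka consumer writes to and handing results to the reporting stage.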

16 min read · From blog.det.life
Table of contents

- 🚀 Running Docker Services
- 🧩 Project Structure
- Inserting Data into PostgreSQL
- Reporting and CSV Output
- Triggering the Airflow Process with the DAG File
- Project Summary