A comprehensive guide to building a real-time flight data pipeline using Kafka for streaming, Spark for processing, and Airflow for orchestration. The pipeline fetches live flight data from a custom API, streams it through Kafka to MongoDB for storage, then uses Airflow to schedule daily ETL jobs that extract landed flights, load them into PostgreSQL, and generate CSV reports.
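As a rough sketch of the daily ETL step described above, the job would pull flight records, keep only those that have landed, and serialize each one as a JSON payload for downstream storage. The field names (`flight_id`, `status`) and the `"landed"` status value are assumptions for illustration, not the article's actual API schema:

```python
import json
from datetime import datetime, timezone

def extract_landed(flights):
    """Keep only flights whose status is 'landed' (hypothetical field name)."""
    return [f for f in flights if f.get("status") == "landed"]

def to_message(flight):
    """Serialize one flight record to a JSON payload suitable for a Kafka topic."""
    return json.dumps({
        "flight_id": flight["flight_id"],
        "status": flight["status"],
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }).encode("utf-8")

# Simulated API response standing in for the live flight API:
flights = [
    {"flight_id": "TK101", "status": "landed"},
    {"flight_id": "TK202", "status": "en-route"},
]
landed = extract_landed(flights)
print([f["flight_id"] for f in landed])  # → ['TK101']
```

In the real pipeline these payloads would be published to a Kafka topic and consumed by Spark; the filtering and serialization logic itself stays the same.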
Table of contents
- 🚀 Running Docker Services
- 🧩 Project Structure
- Inserting Data into PostgreSQL
- Reporting and CSV Output
- Triggering the Airflow Process with the DAG File
- Project Summary