PoC Data Platform project using a modern data stack (Airflow, Spark, dbt, Trino, Hive Metastore, Lightdash, Delta Lake)
The PoC Data Platform demonstrates how to extract, load, and transform data with Airflow, Spark, dbt, Trino, Hive Metastore, Lightdash, and Delta Lake. It uses AdventureWorks data in a data lake environment and shows how to configure these tools for data engineering and system design. The platform includes a complete Docker setup with detailed instructions, which makes it useful to both beginners and professionals working with data systems.
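The extract, load, and transform flow described above can be pictured as a small dependency graph. The following is only a minimal sketch using the Python standard library, not the project's actual Airflow DAG: the stage names are hypothetical, and the tool noted beside each stage is an assumption about a typical division of work in this kind of stack.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names mapped to the stages they depend on.
# The tool named in each comment is an assumption, not taken from
# the project's configuration.
STAGES = {
    "extract_adventureworks": set(),                # source data
    "load_to_delta_lake": {"extract_adventureworks"},  # assumed: Spark
    "transform_models": {"load_to_delta_lake"},        # assumed: dbt via Trino
    "serve_dashboards": {"transform_models"},          # assumed: Lightdash
}


def run_order(stages):
    """Return stage names in an order that respects their dependencies."""
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    for name in run_order(STAGES):
        print(name)
```

In a real deployment an orchestrator such as Airflow would own this ordering; the sketch only illustrates that each stage runs after the one it depends on.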