This post provides a reference implementation for Write-Audit-Publish (WAP) patterns on a data lake using Apache Iceberg and Project Nessie, all running in Python without the need for a JVM. It discusses the concept of WAP, the architecture and workflow of the implementation, and the advantages of using tables,
8 min read, from towardsdatascience.com
Table of contents
Write-Audit-Publish for Data Lakes in Pure Python (no JVM)
Introduction
What on earth is WAP?
WAP on a data lake in Python
Architecture and workflow
Visualize
From the lake to the Lakehouse
Conclusions