Google Cloud now offers public datasets accessible through the Apache Iceberg REST Catalog on BigLake, enabling read-only querying from any compute engine (Spark, Trino, Flink, BigQuery) without infrastructure setup. The NYC Taxi dataset is available as a production-grade Iceberg table, demonstrating features like partition pruning
From opensource.googleblog.com
Table of contents
How to Access Public Datasets
Exploring the Data: Sample Queries
Coming Soon: An Iceberg V3 Playground
Start Building Today