Decathlon migrated data pipelines processing small to mid-size datasets (under 50 GiB) from Apache Spark clusters to Polars running on single Kubernetes pods. The switch reduced compute launch time from 8 to 2 minutes and significantly lowered infrastructure costs. Polars' streaming engine enables processing datasets larger

3m read timeFrom infoq.com
Post cover image

Sort: