Apache Spark 3.4 and 3.5, both released in 2023, introduced several new features and improvements to PySpark, including Spark Connect for remote connectivity, Arrow-optimized Python UDFs for better performance, and the English SDK to simplify PySpark programming.

5m read time
From databricks.com
Table of contents
Apache Spark 3.5 and 3.4: Feature Deep Dives
Arrow-optimized Python UDFs: Boosting the performance of Python UDFs
Python UDTFs
New SQL Features
Python arbitrary stateful processing
TorchDistributor: Native PyTorch Integration
Testing API: Easier Testing for PySpark DataFrames
English SDK: English as a Programming Language
Other Notable Improvements
Reflections and the Road Ahead
Getting Started with the New Features
