This post walks through the Apache Spark scheduling process. It covers the anatomy of a Spark job (stages and tasks) and the Directed Acyclic Graph (DAG) scheduler. It explains how SparkContext initiates scheduling, what TaskScheduler and SchedulerBackend each do, and the role of data locality in task execution. It also discusses speculative execution as a way to handle slow tasks, and ties the pieces together into the end-to-end scheduling process.
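To make the job, stage, and task vocabulary concrete, here is a minimal toy model in plain Python. It is not Spark code and not taken from the post: `Op`, `split_into_stages`, and the example lineage are hypothetical names introduced for illustration. It assumes a purely linear lineage and relies on one well-known behaviour of the DAG scheduler, namely that a new stage begins at each shuffle (wide) dependency.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """One transformation in a (hypothetical) linear lineage."""
    name: str
    wide: bool  # True if the operation requires a shuffle


def split_into_stages(ops: List[Op]) -> List[List[str]]:
    """Group consecutive narrow operations; start a new stage at each wide one.

    Simplification: the wide operation is placed entirely in the new stage,
    whereas a real shuffle has a map side and a reduce side.
    """
    stages: List[List[str]] = [[]]
    for op in ops:
        # A shuffle boundary closes the current stage, unless it is still empty.
        if op.wide and stages[-1]:
            stages.append([])
        stages[-1].append(op.name)
    return stages


# A word-count-like lineage: only reduceByKey needs a shuffle.
lineage = [
    Op("textFile", wide=False),
    Op("flatMap", wide=False),
    Op("map", wide=False),
    Op("reduceByKey", wide=True),
    Op("mapValues", wide=False),
]

stages = split_into_stages(lineage)
for index, stage in enumerate(stages):
    print(f"stage {index}: {' -> '.join(stage)}")
```

In this sketch the five operations collapse into two stages, because narrow operations are pipelined together and only the shuffle forces a boundary. In Spark itself, each stage would then run as one task per partition.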

8m read time. From blog.det.life
Table of contents
SparkContext
DagScheduler
TaskScheduler
SchedulerBackend
Task Execution on Executors
Things go on
