Speculative tasks (also speculatable tasks or task stragglers) are tasks that run slower than most (FIXME the setting) of all the tasks in a job.
Speculative execution of tasks is a health-check procedure that looks for tasks to be speculated, i.e. tasks that run slower in a stage than the median of all successfully completed tasks in a taskset (FIXME the setting). Such slow tasks are re-submitted to another worker. Spark does not stop the slow tasks, but runs a new copy in parallel.
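The median-based check described above can be sketched in plain Java. This is only an illustration, not Spark's code: the class SpeculationCheckSketch, its methods, and the multiplier parameter are hypothetical names introduced here, and the multiplier stands in for the setting the text leaves open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Spark.
final class SpeculationCheckSketch {

    // Median of the durations (in ms) of successfully completed tasks.
    static double median(long[] durations) {
        long[] sorted = durations.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1
            ? sorted[n / 2]
            : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Returns the ids of running tasks whose elapsed time exceeds
    // multiplier * median of the completed tasks (assumed rule).
    static List<Integer> speculatable(long[] completedMs,
                                      Map<Integer, Long> runningElapsedMs,
                                      double multiplier) {
        List<Integer> result = new ArrayList<>();
        if (completedMs.length == 0) {
            return result; // nothing to compare against yet
        }
        double threshold = multiplier * median(completedMs);
        for (Map.Entry<Integer, Long> e : runningElapsedMs.entrySet()) {
            if (e.getValue() > threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] completed = {100, 110, 120, 130};   // median is 115 ms
        Map<Integer, Long> running = new TreeMap<>();
        running.put(4, 150L);                      // below 1.5 * 115 = 172.5
        running.put(5, 400L);                      // above the threshold
        System.out.println(speculatable(completed, running, 1.5)); // prints [5]
    }
}
```

Only the task that has run well past the threshold is flagged. In the procedure described above, the flagged task would keep running while a second copy is submitted elsewhere.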
The speculative execution thread starts when TaskSchedulerImpl starts in clustered deployment modes with spark.speculation enabled. It then executes periodically every spark.speculation.interval, after an initial delay of spark.speculation.interval.
When enabled, you should see the following INFO message in the logs:
INFO TaskSchedulerImpl: Starting speculative execution thread
It works as the task-scheduler-speculation daemon thread pool, using a j.u.c.ScheduledThreadPoolExecutor with a core pool size of 1.
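A minimal sketch of such a thread pool, using only java.util.concurrent, might look as follows. The class SpeculationThreadSketch and its methods are hypothetical. The single core thread and the initial delay equal to the interval are assumptions made for this illustration, and the code is not how Spark itself builds the pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from Spark.
final class SpeculationThreadSketch {

    // A scheduled executor with one core thread, created as a daemon
    // and named after the thread pool mentioned in the text.
    static ScheduledThreadPoolExecutor newSpeculationScheduler() {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "task-scheduler-speculation");
            t.setDaemon(true);
            return t;
        };
        return new ScheduledThreadPoolExecutor(1, factory);
    }

    // Runs `check` repeatedly with intervalMs between runs, after an
    // initial delay of intervalMs (assumed to equal the interval).
    static ScheduledThreadPoolExecutor start(Runnable check, long intervalMs) {
        ScheduledThreadPoolExecutor scheduler = newSpeculationScheduler();
        scheduler.scheduleWithFixedDelay(
            check, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // Demo helper: true if the check ran at least `times` within 5 seconds.
    static boolean runsAtLeast(int times, long intervalMs) {
        CountDownLatch latch = new CountDownLatch(times);
        ScheduledThreadPoolExecutor scheduler = start(latch::countDown, intervalMs);
        try {
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the periodic check
        }
    }

    public static void main(String[] args) {
        System.out.println(runsAtLeast(3, 100));
    }
}
```

Because the worker is a daemon thread, it does not keep the JVM alive on its own once the rest of the application has finished.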
A job with speculatable tasks may finish while speculative tasks are still running, and it will leave these tasks running, since there is no KILL command yet.
The thread calls the checkSpeculatableTasks method, which asks rootPool to check for speculatable tasks. If there are any, SchedulerBackend is called for reviveOffers.
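The control flow of one such round can be sketched as below. The Pool and Backend interfaces are hypothetical stand-ins for rootPool and SchedulerBackend, reduced to the single method each one needs here, so their signatures are assumptions rather than Spark's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the interfaces only mimic the roles named in the text.
final class SpeculationRoundSketch {

    // Stand-in for rootPool: reports whether any task is speculatable.
    interface Pool {
        boolean checkSpeculatableTasks();
    }

    // Stand-in for SchedulerBackend: asked to make resource offers again.
    interface Backend {
        void reviveOffers();
    }

    // One round of the periodic check: revive offers only when the pool
    // found at least one speculatable task.
    static boolean runRound(Pool rootPool, Backend backend) {
        boolean shouldRevive = rootPool.checkSpeculatableTasks();
        if (shouldRevive) {
            backend.reviveOffers();
        }
        return shouldRevive;
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        runRound(() -> false, () -> calls.add("reviveOffers")); // nothing to speculate
        runRound(() -> true, () -> calls.add("reviveOffers"));  // triggers reviveOffers
        System.out.println(calls); // prints [reviveOffers]
    }
}
```

Reviving offers only when something is speculatable avoids a scheduling pass in rounds where no task has fallen behind.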
|FIXME How does Spark handle repeated results of speculative tasks since there are copies launched?|
|Spark Property||Default Value||Description|
|spark.speculation.interval||100ms||The time interval between checks for speculative tasks.|
|spark.speculation.quantile||0.75||The fraction of tasks in a stage that must have finished before speculation starts.|