TaskLocation

TaskLocation is a location where a task should run.

TaskLocation can either be a host alone or a (host, executorID) pair (as ExecutorCacheTaskLocation).

With ExecutorCacheTaskLocation, the Spark scheduler prefers to launch the task on the given executor. If that is not possible, its next level of preference is any executor on the same host.
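The two levels of preference described above can be sketched as follows. This is a minimal illustration, not Spark's scheduler code: the names LocationPreferenceSketch, Offer, Preferred and pick are hypothetical and exist only for this example.

```scala
// Hypothetical model of the executor-then-host fallback; not Spark's scheduler code.
object LocationPreferenceSketch {
  // A free executor offered to the scheduler (illustrative type).
  final case class Offer(host: String, executorId: String)

  // The preferred (host, executorId) pair for a task (illustrative type).
  final case class Preferred(host: String, executorId: String)

  // First choice: the exact executor. Second choice: any executor on the same host.
  def pick(pref: Preferred, offers: Seq[Offer]): Option[Offer] =
    offers
      .find(o => o.host == pref.host && o.executorId == pref.executorId)
      .orElse(offers.find(_.host == pref.host))
}
```

Given offers for executors 3 and 1 on host1, a preference for (host1, 1) selects executor 1, while a preference for (host1, 9) falls back to the first offered executor on host1. The sketch stops at the host level and returns None when no executor on the preferred host is offered.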

TaskLocation is a Scala private[spark] sealed trait (i.e. all the available implementations of the TaskLocation trait are in a single Scala file).

Table 1. Available TaskLocations

Name                        Description
HostTaskLocation            A location on a host.
ExecutorCacheTaskLocation   A location that includes both a host and an executor id on that host.
HDFSCacheTaskLocation       A location on a host that is cached by Hadoop HDFS. Used exclusively when HadoopRDD and NewHadoopRDD are requested for their placement preferences (aka preferred locations).
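
Because TaskLocation is private[spark], application code cannot reference it directly. The shape of the hierarchy in Table 1 can still be illustrated with a simplified stand-in. The sketch below is an assumption-laden model, not Spark's source: Location, HostLocation, ExecutorCacheLocation, HdfsCacheLocation and describe are hypothetical names chosen for this example. It shows what the sealed trait buys: since all implementations live in one file, the compiler can check that a pattern match covers every case.

```scala
// Simplified stand-in for a sealed location hierarchy; names are illustrative only.
object TaskLocationSketch {
  // Sealed: every implementation must be defined in this same file.
  sealed trait Location { def host: String }

  // A location on a host.
  final case class HostLocation(host: String) extends Location

  // A location with both a host and an executor id on that host.
  final case class ExecutorCacheLocation(host: String, executorId: String) extends Location

  // A location on a host where the data is cached by HDFS.
  final case class HdfsCacheLocation(host: String) extends Location

  // The compiler can warn if a case is missing, because Location is sealed.
  def describe(loc: Location): String = loc match {
    case HostLocation(h)             => s"any executor on $h"
    case ExecutorCacheLocation(h, e) => s"executor $e on $h"
    case HdfsCacheLocation(h)        => s"HDFS-cached data on $h"
  }
}
```

Removing one of the case branches in describe would typically produce a "match may not be exhaustive" compiler warning, which is the practical benefit of keeping the trait sealed.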