SchedulerBackend

SchedulerBackend is an abstraction of task scheduling systems that can revive resource offers from cluster managers.

The SchedulerBackend abstraction allows TaskSchedulerImpl to use a variety of cluster managers (each with its own resource offers and task scheduling modes).

A scheduler backend assumes an Apache Mesos-like scheduling model in which an application receives resource offers as machines become available, so that tasks can be launched on them. Once the required resource allocation is obtained, the scheduler backend can start executors.
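The offer-based model can be illustrated with a simplified sketch. Note this is not Spark's actual API: the ResourceOffer and TaskDesc types and the first-fit matching are assumptions for illustration only.

```scala
// Simplified sketch of an offer-based scheduling model (illustration only,
// not Spark's real API): the cluster manager offers free resources and the
// scheduler decides which tasks to launch on them.
case class ResourceOffer(executorId: String, host: String, freeCores: Int)

case class TaskDesc(taskId: Long, coresNeeded: Int)

// Launch as many pending tasks as the offers can accommodate (first fit),
// returning taskId -> executorId assignments.
def launchOnOffers(
    offers: Seq[ResourceOffer],
    pending: Seq[TaskDesc]): Map[Long, String] = {
  val free = scala.collection.mutable.Map(
    offers.map(o => o.executorId -> o.freeCores): _*)
  val assignments = scala.collection.mutable.Map.empty[Long, String]
  for (task <- pending) {
    // Pick the first executor with enough free cores for this task.
    free.find { case (_, cores) => cores >= task.coresNeeded }.foreach {
      case (execId, cores) =>
        free(execId) = cores - task.coresNeeded
        assignments(task.taskId) = execId
    }
  }
  assignments.toMap
}
```

In this toy model, "reviving offers" corresponds to re-running the matching loop whenever new tasks arrive or resources free up.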

Direct Implementations and Extensions

SchedulerBackend                    Description
--------------------------------------------------------------------------------
CoarseGrainedSchedulerBackend       Base SchedulerBackend for coarse-grained
                                    scheduling systems
LocalSchedulerBackend               Spark local
MesosFineGrainedSchedulerBackend    Fine-grained scheduling system for
                                    Apache Mesos

Starting SchedulerBackend

start(): Unit

Starts the SchedulerBackend

Used when TaskSchedulerImpl is requested to start

Contract

applicationAttemptId

applicationAttemptId(): Option[String]

Execution attempt ID of the Spark application

Default: None (undefined)

Used exclusively when TaskSchedulerImpl is requested for the execution attempt ID of a Spark application

applicationId

applicationId(): String

Unique identifier of the Spark Application

Default: spark-application-[currentTimeMillis]

Used exclusively when TaskSchedulerImpl is requested for the unique identifier of a Spark application
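The default identifier format can be sketched as follows (a sketch based on the default stated above; the helper name is hypothetical):

```scala
// Sketch of the default applicationId, per the contract above:
// "spark-application-" followed by the current time in milliseconds.
def defaultApplicationId(nowMillis: Long = System.currentTimeMillis): String =
  s"spark-application-$nowMillis"
```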

defaultParallelism

defaultParallelism(): Int

Default parallelism, i.e. a hint for the number of tasks in a stage, used while sizing jobs

Used exclusively when TaskSchedulerImpl is requested for the default parallelism
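As an illustration, a coarse-grained backend typically derives the default from the total core count with a lower bound, unless an explicit value is configured (a sketch; the exact fallback logic here is an assumption, mirroring the common `spark.default.parallelism` behavior):

```scala
// Sketch: default parallelism from total cores with a lower bound of 2,
// unless an explicitly configured value overrides it (assumption for
// illustration, not Spark's exact code).
def defaultParallelism(totalCores: Int, configured: Option[Int]): Int =
  configured.getOrElse(math.max(totalCores, 2))
```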

getDriverLogUrls

getDriverLogUrls: Option[Map[String, String]]

Driver log URLs

Default: None (undefined)

Used exclusively when SparkContext is requested to postApplicationStart

isReady

isReady(): Boolean

Indicates whether the SchedulerBackend is ready (true) or not (false)

Default: true

Used exclusively when TaskSchedulerImpl is requested to wait until scheduling backend is ready
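Waiting for readiness can be sketched as a polling loop (an illustration only; the polling interval, timeout handling, and helper name are assumptions, not TaskSchedulerImpl's actual code):

```scala
// Sketch: poll an isReady check until it returns true or a timeout expires.
// Returns the final readiness state.
def waitUntilReady(
    isReady: () => Boolean,
    timeoutMs: Long,
    intervalMs: Long = 10): Boolean = {
  val deadline = System.currentTimeMillis + timeoutMs
  var ready = isReady()
  while (!ready && System.currentTimeMillis < deadline) {
    Thread.sleep(intervalMs)
    ready = isReady()
  }
  ready
}
```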

killTask

killTask(
  taskId: Long,
  executorId: String,
  interruptThread: Boolean,
  reason: String): Unit

Kills a given task

Default: Throws an UnsupportedOperationException

Used when TaskSchedulerImpl is requested to kill a task attempt or all attempts of a task

maxNumConcurrentTasks

maxNumConcurrentTasks(): Int

Maximum number of concurrent tasks that can currently be launched

Used exclusively when SparkContext is requested to maxNumConcurrentTasks

reviveOffers

reviveOffers(): Unit

Handles resource allocation offers (from the scheduling system)

Used when TaskSchedulerImpl is requested to submit tasks, handle a task status update, or check for speculatable tasks

stop

stop(): Unit

Stops the SchedulerBackend

Used when TaskSchedulerImpl is requested to stop
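Putting the contract together, the methods and defaults above can be mirrored in a simplified trait (a sketch only, not Spark's actual SchedulerBackend trait; method names and defaults follow the descriptions in this page):

```scala
// Simplified mirror of the SchedulerBackend contract (sketch only, not the
// real Spark trait). Defaults follow the contract descriptions above.
trait SimpleSchedulerBackend {
  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit
  def defaultParallelism(): Int

  // Default: spark-application-[currentTimeMillis]
  def applicationId(): String = s"spark-application-${System.currentTimeMillis}"
  // Default: None (undefined)
  def applicationAttemptId(): Option[String] = None
  // Default: None (undefined)
  def getDriverLogUrls: Option[Map[String, String]] = None
  // Default: true
  def isReady(): Boolean = true
  // Default: throws an UnsupportedOperationException
  def killTask(
      taskId: Long,
      executorId: String,
      interruptThread: Boolean,
      reason: String): Unit =
    throw new UnsupportedOperationException
}

// A no-op backend that relies entirely on the defaults.
class NoopBackend extends SimpleSchedulerBackend {
  def start(): Unit = ()
  def stop(): Unit = ()
  def reviveOffers(): Unit = ()
  def defaultParallelism(): Int = 2
}
```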