
= LocalSchedulerBackend

LocalSchedulerBackend is a <<../SchedulerBackend.md#, SchedulerBackend>> and an executor:ExecutorBackend.md[] for the Spark local deployment mode.

LocalSchedulerBackend is <<creating-instance, created>> when SparkContext is requested to ROOT:SparkContext.md#createTaskScheduler[create the SchedulerBackend with the TaskScheduler] for the following master URLs:

  • local (with exactly 1 CPU core)

  • local[n] (with exactly n CPU cores)

  • local[*] (with the <<totalCores, total number of CPU cores>> that is the number of available CPU cores on the local machine)

  • local[n, m] (with exactly n CPU cores)

  • local[*, m] (with the <<totalCores, total number of CPU cores>> that is the number of available CPU cores on the local machine)
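The mapping from master URL to core count can be sketched in plain Scala. This is a minimal illustration and not Spark's own code: the object name LocalMasterUrls, the method totalCores and the two regular expressions are assumptions made for this example. The second value in local[n, m] does not affect the core count and is ignored here.

```scala
object LocalMasterUrls {
  // Patterns modelled on the master URL forms listed above (assumed for this sketch).
  private val LocalN = """local\[([0-9]+|\*)\]""".r
  private val LocalNAndM = """local\[([0-9]+|\*)\s*,\s*([0-9]+)\]""".r

  // Number of CPU cores available to the JVM, used for the `*` forms.
  def availableCores: Int = Runtime.getRuntime.availableProcessors()

  private def toCores(threads: String): Int =
    if (threads == "*") availableCores else threads.toInt

  // Total number of CPU cores for a local master URL, or None for any other URL.
  def totalCores(master: String): Option[Int] = master match {
    case "local"          => Some(1)
    case LocalN(n)        => Some(toCores(n))
    case LocalNAndM(n, _) => Some(toCores(n))
    case _                => None
  }
}
```

For example, LocalMasterUrls.totalCores("local[4]") gives Some(4), while a non-local master URL such as "yarn" gives None.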

While being <<creating-instance, created>>, LocalSchedulerBackend requests the <<launcherBackend, LauncherBackend>> to <<../spark-LauncherBackend.md#connect, connect>>.

When an executor sends task status updates (using ExecutorBackend.statusUpdate), they are passed along as StatusUpdate messages to the <<localEndpoint, LocalEndpoint>>.

.Task status updates flow in local mode
image::LocalSchedulerBackend-LocalEndpoint-Executor-task-status-updates.png[align="center"]

[[appId]] [[applicationId]] When requested for the <<../SchedulerBackend.md#applicationId, applicationId>>, LocalSchedulerBackend uses local-[currentTimeMillis].

[[maxNumConcurrentTasks]] When requested for the <<../SchedulerBackend.md#maxNumConcurrentTasks, maxNumConcurrentTasks>>, LocalSchedulerBackend simply divides the <<totalCores, total number of CPU cores>> by the scheduler:TaskSchedulerImpl.md#CPUS_PER_TASK[spark.task.cpus] configuration property (default: 1).
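The division can be illustrated with a small sketch. It assumes integer division, so cores that do not fill a whole task slot stay unused. The object name ConcurrentTasks and the parameter names are hypothetical and only serve this example.

```scala
object ConcurrentTasks {
  // totalCores: cores given to the local backend; cpusPerTask: value of spark.task.cpus.
  // Integer division: cores left over after the last full task slot are not counted.
  def maxNumConcurrentTasks(totalCores: Int, cpusPerTask: Int = 1): Int = {
    require(cpusPerTask > 0, "spark.task.cpus must be positive")
    totalCores / cpusPerTask
  }
}
```

With 8 cores and spark.task.cpus set to 3, this sketch allows 2 concurrent tasks.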

[[defaultParallelism]] When requested for the <<../SchedulerBackend.md#defaultParallelism, defaultParallelism>>, LocalSchedulerBackend uses the <<../configuration-properties.md#spark.default.parallelism, spark.default.parallelism>> configuration property (if defined) or the <<totalCores, total number of CPU cores>>.
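The fallback rule can be sketched as follows. To keep the example free of Spark dependencies, a plain Map stands in for SparkConf; the object name DefaultParallelism and the settings parameter are assumptions of this sketch.

```scala
object DefaultParallelism {
  // `settings` is a stand-in for SparkConf (hypothetical, for illustration only).
  // The configured value wins; otherwise the total number of CPU cores is used.
  def defaultParallelism(settings: Map[String, String], totalCores: Int): Int =
    settings.get("spark.default.parallelism").map(_.toInt).getOrElse(totalCores)
}
```

With no setting and 4 cores the result is 4; with spark.default.parallelism set to 16 the result is 16 regardless of the core count.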

[[userClassPath]] When <<creating-instance, created>>, LocalSchedulerBackend <<getUserClasspath, reads>> the <<../configuration-properties.md#spark.executor.extraClassPath, spark.executor.extraClassPath>> configuration property (in the given <<conf, SparkConf>>) to build the user-defined class path for executors. The class path is used exclusively when LocalSchedulerBackend is requested to <<start, start>> (and creates a LocalEndpoint that in turn uses it to create the one Executor).

[[creating-instance]] LocalSchedulerBackend takes the following to be created:

  • [[conf]] <<../SparkConf.md#, SparkConf>>
  • [[scheduler]] scheduler:TaskSchedulerImpl.md[TaskSchedulerImpl]
  • [[totalCores]] Total number of CPU cores (aka totalCores)

[[internal-registries]]
.LocalSchedulerBackend's Internal Properties (e.g. Registries, Counters and Flags)
[cols="1m,3",options="header",width="100%"]
|===
| Name | Description

| localEndpoint a| [[localEndpoint]] rpc:RpcEndpointRef.md[RpcEndpointRef] to LocalSchedulerBackendEndpoint RPC endpoint (that is the LocalEndpoint which LocalSchedulerBackend registers when <<start, started>>)

Used when LocalSchedulerBackend is requested for the following:

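  • <<killTask, killTask>> (and sends a KillTask one-way asynchronous message)

  • <<reviveOffers, reviveOffers>> (and sends a ReviveOffers one-way asynchronous message)

  • <<statusUpdate, statusUpdate>> (and sends a StatusUpdate one-way asynchronous message)

  • <<stop, stop>> (and sends a StopExecutor asynchronous message)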

| launcherBackend a| [[launcherBackend]] <<../spark-LauncherBackend.md#, LauncherBackend>>

Used when LocalSchedulerBackend is <<creating-instance, created>>, <<start, started>> and <<stop, stopped>>

| listenerBus a| [[listenerBus]] scheduler:LiveListenerBus.md[] that is used exclusively when LocalSchedulerBackend is requested to <<start, start>>

|===
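The way requests are forwarded through localEndpoint can be modelled with a toy sketch. Everything here is a simplified stand-in: the message names follow the ones used in Spark's source (ReviveOffers, StatusUpdate, KillTask, StopExecutor), but their fields, the RecordingEndpoint class and the ToyBackend class are assumptions of this example. The sketch also treats every message as fire-and-forget, which ignores any reply the real endpoint may give.

```scala
import scala.collection.mutable.ArrayBuffer

object LocalMessages {
  // Simplified stand-ins for the messages sent to the local endpoint.
  sealed trait Message
  case object ReviveOffers extends Message
  final case class StatusUpdate(taskId: Long, state: String) extends Message
  final case class KillTask(taskId: Long, interruptThread: Boolean, reason: String) extends Message
  case object StopExecutor extends Message

  // A toy endpoint reference that only records what it was sent.
  final class RecordingEndpoint {
    val received: ArrayBuffer[Message] = ArrayBuffer.empty[Message]
    def send(message: Message): Unit = { received += message; () }
  }

  // A toy backend: each request is turned into one message for the endpoint.
  final class ToyBackend(endpoint: RecordingEndpoint) {
    def reviveOffers(): Unit = endpoint.send(ReviveOffers)
    def statusUpdate(taskId: Long, state: String): Unit =
      endpoint.send(StatusUpdate(taskId, state))
    def killTask(taskId: Long, interruptThread: Boolean, reason: String): Unit =
      endpoint.send(KillTask(taskId, interruptThread, reason))
    def stop(): Unit = endpoint.send(StopExecutor)
  }
}
```

Keeping the backend as a thin forwarder means all scheduling decisions happen in one place, on the endpoint's side.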

[[logging]]
[TIP]
====
Enable INFO logging level for org.apache.spark.scheduler.local.LocalSchedulerBackend logger to see what happens inside.

Add the following line to conf/log4j.properties:

----
log4j.logger.org.apache.spark.scheduler.local.LocalSchedulerBackend=INFO
----

Refer to <<../spark-logging.md#, Logging>>.
====

== [[start]] Starting Scheduling Backend -- start Method

[source, scala]
----
start(): Unit
----

NOTE: start is part of the <<../SchedulerBackend.md#start, SchedulerBackend Contract>> to start the scheduling backend.

start requests the SparkEnv object for the current core:SparkEnv.md#rpcEnv[RpcEnv].

start then creates a LocalEndpoint and requests the RpcEnv to rpc:RpcEnv.md#setupEndpoint[register it] as LocalSchedulerBackendEndpoint RPC endpoint.

start requests the <<listenerBus, LiveListenerBus>> to scheduler:LiveListenerBus.md#post[post] a ROOT:SparkListener.md#SparkListenerExecutorAdded[SparkListenerExecutorAdded] event.

In the end, start requests the <<launcherBackend, LauncherBackend>> to <<../spark-LauncherBackend.md#setAppId, setAppId>> as the <<appId, appId>> and <<../spark-LauncherBackend.md#setState, setState>> as RUNNING.
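The order of these steps can be modelled with a toy class that only records what it would do. This is a sketch under stated assumptions, not Spark code: StartSketch, ToyLocalBackend and the step descriptions are hypothetical, and the application ID follows the local-[currentTimeMillis] format described earlier.

```scala
import scala.collection.mutable.ListBuffer

object StartSketch {
  // A stand-in backend that records each step of start() in order.
  final class ToyLocalBackend {
    val steps: ListBuffer[String] = ListBuffer.empty[String]

    // Application ID in the local-[currentTimeMillis] format.
    val appId: String = "local-" + System.currentTimeMillis()

    def start(): Unit = {
      steps += "get RpcEnv from SparkEnv"
      steps += "register LocalSchedulerBackendEndpoint"
      steps += "post SparkListenerExecutorAdded"
      steps += "setAppId"
      steps += "setState(RUNNING)"
      ()
    }
  }
}
```

Reporting the RUNNING state last means the launcher is told the application is up only after the endpoint is registered and the executor has been announced.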

== [[reviveOffers]] reviveOffers Method

[source, scala]
----
reviveOffers(): Unit
----

NOTE: reviveOffers is part of the <<../SchedulerBackend.md#reviveOffers, SchedulerBackend Contract>> to...FIXME.

reviveOffers...FIXME

== [[killTask]] killTask Method

[source, scala]
----
killTask(
  taskId: Long,
  executorId: String,
  interruptThread: Boolean,
  reason: String): Unit
----

NOTE: killTask is part of the <<../SchedulerBackend.md#killTask, SchedulerBackend Contract>> to kill a task.

killTask...FIXME

== [[statusUpdate]] statusUpdate Method

[source, scala]
----
statusUpdate(
  taskId: Long,
  state: TaskState,
  data: ByteBuffer): Unit
----

NOTE: statusUpdate is part of the executor:ExecutorBackend.md#statusUpdate[ExecutorBackend] abstraction.

statusUpdate...FIXME

== [[stop]] Stopping Scheduling Backend -- stop Method

[source, scala]
----
stop(): Unit
----

NOTE: stop is part of the <<../SchedulerBackend.md#stop, SchedulerBackend Contract>> to stop a scheduling backend.

stop...FIXME

== [[getUserClasspath]] User-Defined Class Path for Executors -- getUserClasspath Method

[source, scala]
----
getUserClasspath(conf: SparkConf): Seq[URL]
----

getUserClasspath simply requests the given SparkConf for the <<../configuration-properties.md#spark.executor.extraClassPath, spark.executor.extraClassPath>> configuration property and converts the entries (separated by the system-dependent path separator) to URLs.

NOTE: getUserClasspath is used exclusively when LocalSchedulerBackend is <<creating-instance, created>>.
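The conversion from a class path string to URLs can be sketched in plain Scala. To avoid a Spark dependency, the sketch takes the optional property value directly instead of a SparkConf; the object name UserClasspath and the extraClassPath parameter are assumptions of this example.

```scala
import java.io.File
import java.net.URL

object UserClasspath {
  // `extraClassPath` stands in for the optional spark.executor.extraClassPath value.
  // Entries are split on the system-dependent path separator and turned into file URLs.
  def getUserClasspath(extraClassPath: Option[String]): Seq[URL] =
    extraClassPath.toSeq
      .flatMap(_.split(File.pathSeparator).toSeq)
      .map(entry => new File(entry).toURI.toURL)
}
```

An undefined property gives an empty sequence, and two entries joined with the path separator give two file URLs in the same order.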


Last update: 2020-10-06