Standalone Worker

Standalone Worker (aka standalone slave) is a logical node in a Spark Standalone cluster.

Worker is a ThreadSafeRpcEndpoint that is registered under the RPC endpoint name Worker.

You can have one or many standalone workers in a standalone cluster. They can be started and stopped using management scripts.

Worker is created when…​FIXME

When started, Worker…​FIXME

Table 1. Worker’s Internal Properties (e.g. Registries, Counters and Flags)
Name Description

workDir

Working directory of the executors that the Worker manages

Initialized when Worker is requested to createWorkDir (when the Worker RPC Endpoint is requested to start on an RPC environment).

Used when Worker is requested to handleRegisterResponse and receives a WorkDirCleanup message.

Used when Worker is requested to onStart (to create a WorkerWebUI) and when it receives LaunchExecutor or LaunchDriver messages.

receive Method

receive: PartialFunction[Any, Unit]

receive is part of the RpcEndpoint Contract to process messages.
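To illustrate what a receive of type PartialFunction[Any, Unit] looks like, here is a minimal sketch. The message classes (ToyLaunchExecutor, ToyWorkDirCleanup) and the handled buffer are hypothetical stand-ins invented for this example. They are not Spark's own message types, and the bodies do not reflect what Worker actually does for each message.

```scala
object ReceiveSketch {
  // Hypothetical toy messages; NOT Spark's message classes.
  sealed trait ToyMessage
  final case class ToyLaunchExecutor(appId: String, execId: Int) extends ToyMessage
  case object ToyWorkDirCleanup extends ToyMessage

  // Records what was handled so the effect of receive can be observed.
  val handled = scala.collection.mutable.ArrayBuffer.empty[String]

  // Same shape as the signature above: pattern-match on the messages
  // the endpoint understands and leave everything else undefined.
  val receive: PartialFunction[Any, Unit] = {
    case ToyLaunchExecutor(appId, execId) => handled += s"launch $appId/$execId"
    case ToyWorkDirCleanup                => handled += "cleanup"
  }

  def main(args: Array[String]): Unit = {
    receive(ToyLaunchExecutor("app-1", 0))
    receive(ToyWorkDirCleanup)
    // isDefinedAt tells a dispatcher whether a message is handled at all.
    println(receive.isDefinedAt("unknown message"))
    println(handled.mkString(", "))
  }
}
```

Because the handler is a partial function, a dispatcher can call isDefinedAt first and decide separately what to do with messages the endpoint does not handle.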


handleRegisterResponse Internal Method

handleRegisterResponse(msg: RegisterWorkerResponse): Unit


handleRegisterResponse is used when…​FIXME

Launching Worker Standalone Application — main Method

main(argStrings: Array[String]): Unit


Starting RPC Environment And Registering Worker RPC Endpoint — startRpcEnvAndEndpoint Method

startRpcEnvAndEndpoint(
  host: String,
  port: Int,
  webUiPort: Int,
  cores: Int,
  memory: Int,
  masterUrls: Array[String],
  workDir: String,
  workerNumber: Option[Int] = None,
  conf: SparkConf = new SparkConf): RpcEnv


startRpcEnvAndEndpoint creates an RpcEnv for the input host and port.

startRpcEnvAndEndpoint creates a Worker RPC endpoint (for the RPC environment and the input webUiPort, cores, memory, masterUrls, workDir and conf).

startRpcEnvAndEndpoint requests the RpcEnv to register the Worker RPC endpoint under the name Worker.

startRpcEnvAndEndpoint is used when:

  • Worker is launched from a command line

  • LocalSparkCluster is requested to start
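The three steps that startRpcEnvAndEndpoint performs (create the RPC environment, create the Worker endpoint, register it under the name Worker) can be sketched with toy types. ToyRpcEnv, ToyEndpoint and ToyWorker below are hypothetical stand-ins written for this example, and the environment name "toyWorkerEnv" is made up. The sketch only mirrors the order of the steps and does not use or reproduce Spark's RPC classes.

```scala
object StartSketch {
  // Hypothetical stand-ins for illustration; NOT Spark's RpcEnv or Worker.
  trait ToyEndpoint

  final class ToyRpcEnv(val name: String, val host: String, val port: Int) {
    private val endpoints = scala.collection.mutable.Map.empty[String, ToyEndpoint]

    // Registers an endpoint under a name.
    def setupEndpoint(endpointName: String, endpoint: ToyEndpoint): Unit =
      endpoints(endpointName) = endpoint

    def endpointNames: Set[String] = endpoints.keySet.toSet
  }

  final class ToyWorker(
      val rpcEnv: ToyRpcEnv,
      val webUiPort: Int,
      val cores: Int,
      val memory: Int,
      val masterUrls: Array[String],
      val workDir: String) extends ToyEndpoint

  def startRpcEnvAndEndpoint(
      host: String,
      port: Int,
      webUiPort: Int,
      cores: Int,
      memory: Int,
      masterUrls: Array[String],
      workDir: String): ToyRpcEnv = {
    // Step 1: create the RPC environment for the input host and port.
    val rpcEnv = new ToyRpcEnv("toyWorkerEnv", host, port)
    // Step 2: create the worker endpoint for that environment.
    val worker = new ToyWorker(rpcEnv, webUiPort, cores, memory, masterUrls, workDir)
    // Step 3: register the endpoint under the name Worker.
    rpcEnv.setupEndpoint("Worker", worker)
    rpcEnv
  }

  def main(args: Array[String]): Unit = {
    val env = startRpcEnvAndEndpoint(
      "localhost", 0, 8081, 2, 1024, Array("spark://localhost:7077"), "/tmp/work")
    println(env.endpointNames.mkString(", "))
  }
}
```

Returning the environment rather than the endpoint matches the RpcEnv return type in the signature above and lets the caller decide how long to keep the environment alive.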

Creating Worker Instance

Worker takes the following when created:

  • RpcEnv

  • Port of the administrative web UI

  • Number of cores

  • Amount of memory

  • standalone Master’s RpcAddresses

  • RPC endpoint name

  • Path to the working directory

  • SparkConf

  • SecurityManager

Worker initializes the internal registries and counters.

createWorkDir Internal Method

createWorkDir(): Unit

createWorkDir sets workDir to workDirPath if defined, or otherwise to the work subdirectory of sparkHome.

In the end, createWorkDir creates the workDir directory (including any necessary but nonexistent parent directories).

createWorkDir reports…​FIXME

createWorkDir is used exclusively when the Worker RPC Endpoint is requested to start on an RPC environment.
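The directory selection and creation described for createWorkDir can be sketched as follows. This is a standalone approximation built on java.io.File, not the method's actual source. The helper names resolveWorkDir and createDir and the Boolean result are assumptions of this sketch. How the real method reports a failure is not covered here.

```scala
import java.io.File
import java.nio.file.Files

object WorkDirSketch {
  // Picks the explicit path when one is defined, otherwise <sparkHome>/work.
  def resolveWorkDir(workDirPath: Option[String], sparkHome: File): File =
    workDirPath.map(new File(_)).getOrElse(new File(sparkHome, "work"))

  // Creates the directory together with any missing parent directories.
  // Returns true when the directory exists afterwards (sketch-only convention).
  def createDir(workDir: File): Boolean =
    workDir.isDirectory || workDir.mkdirs()

  def main(args: Array[String]): Unit = {
    // A temporary directory stands in for sparkHome in this example.
    val sparkHome = Files.createTempDirectory("spark-home").toFile
    val workDir = resolveWorkDir(None, sparkHome)
    println(s"${workDir.getPath} exists: ${createDir(workDir)}")
  }
}
```

File.mkdirs is used because it creates missing parents in one call, which corresponds to the "necessary but nonexistent parent directories" wording above.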

onStart Method

onStart(): Unit

onStart is part of the RpcEndpoint Contract to activate an endpoint and start accepting messages.