SparkEnv takes the following to be created:
SparkEnv is created when:
get returns the SparkEnv on the driver and executors.
```scala
import org.apache.spark.SparkEnv

assert(SparkEnv.get.isInstanceOf[SparkEnv])
```
create( conf: SparkConf, executorId: String, bindAddress: String, advertiseAddress: String, port: Option[Int], isLocal: Boolean, numUsableCores: Int, ioEncryptionKey: Option[Array[Byte]], listenerBus: LiveListenerBus = null, mockOutputCommitCoordinator: Option[OutputCommitCoordinator] = None): SparkEnv
create is a utility that creates the "base" SparkEnv (that is later "enhanced" for the driver and executors).
create creates a Serializer (based on the spark.serializer setting). You should see the following DEBUG message in the logs:

DEBUG SparkEnv: Using serializer: [serializer]
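For illustration (this snippet is not from the original page), the serializer can be switched by setting spark.serializer on a SparkConf before the SparkContext, and hence the SparkEnv, is created. KryoSerializer is one of Spark's built-in serializers; the application name and master below are arbitrary example values.

```scala
import org.apache.spark.SparkConf

// spark.serializer selects the Serializer that SparkEnv.create instantiates.
// org.apache.spark.serializer.KryoSerializer is a built-in implementation.
val conf = new SparkConf()
  .setAppName("serializer-demo")   // hypothetical application name
  .setMaster("local[*]")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
```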
create creates a closure Serializer (based on spark.closure.serializer).
create creates a NettyBlockTransferService with the following ports:
create uses the
FIXME A picture with SparkEnv,
create creates a BlockManagerMaster object with the BlockManagerMaster RPC endpoint reference (by registering or looking it up by name and BlockManagerMasterEndpoint), the input SparkConf, and the input isDriver flag.
create registers the BlockManagerMaster RPC endpoint for the driver and looks it up for executors.
create creates a BroadcastManager.
The choice of the real implementation of MapOutputTracker is based on whether the input executorId is the driver's or not.
create registers or looks up an RpcEndpoint as MapOutputTracker. It registers the MapOutputTrackerMasterEndpoint on the driver and creates an RPC endpoint reference on executors. The RPC endpoint reference gets assigned as the MapOutputTracker RPC endpoint.
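As a rough illustration (my own simplification, not Spark's private API), the choice can be thought of as a function of the executor id: the driver gets the master-side tracker that owns the shuffle map output state, while executors get a worker-side tracker that queries it.

```scala
// Simplified sketch of the driver-vs-executor decision described above.
sealed trait TrackerRole
case object MasterRole extends TrackerRole  // driver: owns the shuffle map output state
case object WorkerRole extends TrackerRole  // executor: asks the master RPC endpoint

// "driver" stands in for the special driver executor id.
def trackerRole(executorId: String): TrackerRole =
  if (executorId == "driver") MasterRole else WorkerRole

assert(trackerRole("driver") == MasterRole)
assert(trackerRole("7") == WorkerRole)
```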
It creates a CacheManager.
It creates a MetricsSystem for a driver and a worker separately.
The userFiles temporary directory is used for downloading dependencies on the driver, while on an executor it is the executor's current working directory.
An OutputCommitCoordinator is created.
registerOrLookupEndpoint( name: String, endpointCreator: => RpcEndpoint)
registerOrLookupEndpoint registers or looks up an RPC endpoint by name.
If called from the driver, you should see the following INFO message in the logs:

Registering [name]
And the RPC endpoint is registered in the RPC environment.
Otherwise, it obtains an RPC endpoint reference by name.
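The following self-contained sketch mimics that register-or-lookup behaviour with a plain in-memory registry standing in for the RPC environment. All names here are hypothetical, and in Spark the executor-side lookup actually resolves the endpoint remotely against the driver's RpcEnv rather than a local map.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for an RPC endpoint.
final case class Endpoint(name: String)

object Registry {
  private val endpoints = TrieMap.empty[String, Endpoint]

  // Driver: create and register the endpoint. Executor: only look it up by name.
  def registerOrLookup(name: String, isDriver: Boolean)(create: => Endpoint): Endpoint =
    if (isDriver) endpoints.getOrElseUpdate(name, create)
    else endpoints(name)
}

val onDriver   = Registry.registerOrLookup("MapOutputTracker", isDriver = true)(Endpoint("MapOutputTracker"))
val onExecutor = Registry.registerOrLookup("MapOutputTracker", isDriver = false)(Endpoint("MapOutputTracker"))
assert(onDriver == onExecutor)
```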
createDriverEnv( conf: SparkConf, isLocal: Boolean, listenerBus: LiveListenerBus, numCores: Int, mockOutputCommitCoordinator: Option[OutputCommitCoordinator] = None): SparkEnv
createDriverEnv creates a SparkEnv execution environment for the driver.
createDriverEnv accepts an instance of SparkConf, whether it runs in local mode or not, a LiveListenerBus, the number of cores to use for execution in local mode or 0 otherwise, and an OutputCommitCoordinator (default: none).
It then passes the call straight on to the create helper method (with the driver executor id, isDriver enabled, and the input parameters).
createExecutorEnv( conf: SparkConf, executorId: String, hostname: String, numCores: Int, ioEncryptionKey: Option[Array[Byte]], isLocal: Boolean): SparkEnv
createExecutorEnv creates the Spark execution environment (SparkEnv) for an executor.
The number of cores
stop checks the isStopped internal flag and does nothing when it is already enabled.
Otherwise, stop turns the isStopped flag on, stops all pythonWorkers and requests the following services to stop:
Only on the driver, stop deletes the temporary directory. You can see the following WARN message in the logs if the deletion fails:
Exception while deleting Spark temp dir: [path]
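A minimal sketch of that idempotent shutdown, assuming a simple class with an isStopped flag (this is not Spark's actual code): repeated calls become no-ops, and a failed temp-dir deletion is only logged, never rethrown.

```scala
import java.nio.file.{Files, Path}

// Sketch only: the flag makes stop idempotent; cleanup failures are logged, not rethrown.
class StoppableEnv(tempDir: Option[Path]) {
  @volatile private var isStopped = false

  def stop(): Unit =
    if (!isStopped) {
      isStopped = true
      // ... stop sub-services (block manager, RPC environment, metrics system, ...) ...
      tempDir.foreach { dir =>
        try Files.deleteIfExists(dir)
        catch {
          case _: Exception => println(s"Exception while deleting Spark temp dir: $dir")
        }
      }
    }
}
```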
set(e: SparkEnv): Unit
set saves the input SparkEnv to the env internal registry (as the default SparkEnv).
environmentDetails( conf: SparkConf, schedulingMode: String, addedJars: Seq[String], addedFiles: Seq[String]): Map[String, Seq[(String, String)]]
environmentDetails is used when SparkContext is requested to post a SparkListenerEnvironmentUpdate event.
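Although environmentDetails itself is internal, the map it builds can be observed from user code through the SparkListenerEnvironmentUpdate event it ends up in. Below is a hedged sketch of such a listener; the class and variable names are my own.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerEnvironmentUpdate}

// Prints every section of the environment details map carried by the event.
class EnvPrinter extends SparkListener {
  override def onEnvironmentUpdate(update: SparkListenerEnvironmentUpdate): Unit =
    update.environmentDetails.foreach { case (section, entries) =>
      println(s"== $section ==")
      entries.foreach { case (name, value) => println(s"$name = $value") }
    }
}

// Usage (in an application with a SparkContext named sc):
//   sc.addSparkListener(new EnvPrinter)
```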
Enable ALL logging level for the org.apache.spark.SparkEnv logger to see what happens inside.

Add the following line to your log4j properties configuration:

log4j.logger.org.apache.spark.SparkEnv=ALL

Refer to Logging.