# Utils Utility

## Local URI Scheme
`Utils` defines a **local URI scheme** for files that are locally available on worker nodes in the cluster.

The `local` URL scheme is used when:

- `Utils` is used to isLocalUri
- `Client` (Spark on YARN) is used
## isLocalUri

```scala
isLocalUri(
  uri: String): Boolean
```

`isLocalUri` is `true` when the URI is a `local:` URI (i.e. the given `uri` starts with the `local:` scheme).

`isLocalUri` is used when:

- FIXME
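The check itself is a simple prefix test. A minimal sketch, assuming the `local:` literal per the description above (the helper name mirrors the method):

```scala
// Sketch of the local: scheme test described above.
// Spark's actual implementation may normalize the URI first.
def isLocalUri(uri: String): Boolean =
  uri.startsWith("local:")
```

For example, `isLocalUri("local:/opt/libs/app.jar")` yields `true`, while an `hdfs://` URL does not.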
## getCurrentUserName

```scala
getCurrentUserName(): String
```

`getCurrentUserName` computes the name of the user who has started the SparkContext.md[SparkContext] instance.

NOTE: It is later available as SparkContext.md#sparkUser[SparkContext.sparkUser].

Internally, `getCurrentUserName` reads the SparkContext.md#SPARK_USER[SPARK_USER] environment variable and, if not set, falls back to Hadoop Security API's `UserGroupInformation.getCurrentUser().getShortUserName()`.

NOTE: It is another place where Spark relies on the Hadoop API for its operation.
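The precedence above can be sketched as follows. To stay self-contained, the Hadoop call (`UserGroupInformation.getCurrentUser().getShortUserName()`) is passed in as a by-name argument here, which is an assumption of this sketch, not Spark's actual signature:

```scala
// Sketch: SPARK_USER wins; otherwise the Hadoop Security API's
// UserGroupInformation.getCurrentUser().getShortUserName is used
// (passed in as the hadoopUserName thunk in this sketch).
def currentUserName(hadoopUserName: => String): String =
  sys.env.getOrElse("SPARK_USER", hadoopUserName)
```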
## localHostName

```scala
localHostName(): String
```

`localHostName` computes the local host name.

`localHostName` starts by checking the `SPARK_LOCAL_HOSTNAME` environment variable for the value. If it is not defined, it uses the `SPARK_LOCAL_IP` environment variable to find the name (using `InetAddress.getByName`). If that is not defined either, it calls `InetAddress.getLocalHost` for the name.

NOTE: `Utils.localHostName` is executed while a SparkContext.md#creating-instance[SparkContext is created] and also to compute the default value of the spark-driver.md#spark_driver_host[spark.driver.host Spark property].
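The three-step resolution order can be sketched like this (a simplification: the real method also caches the result and handles IPv6 address formatting):

```scala
import java.net.InetAddress

// Sketch of the resolution order described above:
// SPARK_LOCAL_HOSTNAME, then SPARK_LOCAL_IP (resolved via
// InetAddress.getByName), then InetAddress.getLocalHost.
def localHostName(): String =
  sys.env.get("SPARK_LOCAL_HOSTNAME")
    .orElse(sys.env.get("SPARK_LOCAL_IP")
      .map(ip => InetAddress.getByName(ip).getHostAddress))
    .getOrElse(InetAddress.getLocalHost.getHostAddress)
```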
## getUserJars

```scala
getUserJars(
  conf: SparkConf): Seq[String]
```

`getUserJars` is the non-empty entries of the spark.jars configuration property.

`getUserJars` is used when:

- `SparkContext` is created
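A minimal sketch of the filtering, with a plain `Map` standing in for `SparkConf` (an assumption to keep the example self-contained):

```scala
// Split spark.jars on commas and keep only the non-empty entries,
// as described above.
def getUserJars(conf: Map[String, String]): Seq[String] =
  conf.get("spark.jars").toSeq
    .flatMap(_.split(","))
    .filter(_.nonEmpty)
```

A duplicated comma (e.g. `"a.jar,,b.jar"`) thus contributes no empty entry.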
## extractHostPortFromSparkUrl

```scala
extractHostPortFromSparkUrl(
  sparkUrl: String): (String, Int)
```

`extractHostPortFromSparkUrl` creates a Java `URI` with the input `sparkUrl` and takes the host and port parts.

`extractHostPortFromSparkUrl` asserts that the input `sparkUrl` uses the `spark` scheme.

`extractHostPortFromSparkUrl` throws a `SparkException` for unparseable Spark URLs:

```text
Invalid master URL: [sparkUrl]
```

`extractHostPortFromSparkUrl` is used when:

- `StandaloneSubmitRequestServlet` is requested to `buildDriverDescription`
- `RpcAddress` is requested to extract an RpcAddress from a Spark master URL
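The parsing and validation described above can be sketched as follows (an assumption of this sketch: a plain `IllegalArgumentException` stands in for `SparkException` to stay self-contained):

```scala
import java.net.URI

// Parse the URL with java.net.URI, require the spark scheme and
// valid host/port parts, and return (host, port).
def extractHostPort(sparkUrl: String): (String, Int) = {
  val uri = new URI(sparkUrl)
  if (uri.getScheme != "spark" || uri.getHost == null || uri.getPort < 0)
    throw new IllegalArgumentException(s"Invalid master URL: $sparkUrl")
  (uri.getHost, uri.getPort)
}
```

For example, `extractHostPort("spark://localhost:7077")` gives `("localhost", 7077)`.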
## isDynamicAllocationEnabled

```scala
isDynamicAllocationEnabled(
  conf: SparkConf): Boolean
```

`isDynamicAllocationEnabled` is `true` when the following all hold:

- spark.dynamicAllocation.enabled configuration property is `true`
- spark.master is non-`local`

`isDynamicAllocationEnabled` is used when:

- `SparkContext` is created (to start an ExecutorAllocationManager)
- `DAGScheduler` is requested to `checkBarrierStageWithDynamicAllocation`
- `SchedulerBackendUtils` is requested to `getInitialTargetExecutorNumber`
- `StandaloneSchedulerBackend` (Spark Standalone) is requested to `start`
- `ExecutorPodsAllocator` (Spark on Kubernetes) is requested to `onNewSnapshots`
- `ApplicationMaster` (Spark on YARN) is created
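The two conditions can be sketched like this, with a plain `Map` standing in for `SparkConf` and a simple prefix check for the "non-`local`" master test (both assumptions of this sketch):

```scala
// true only when dynamic allocation is enabled AND the master is
// not a local[...] master, per the two conditions above.
def isDynamicAllocationEnabled(conf: Map[String, String]): Boolean =
  conf.getOrElse("spark.dynamicAllocation.enabled", "false").toBoolean &&
    !conf.getOrElse("spark.master", "").startsWith("local")
```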
## checkAndGetK8sMasterUrl

```scala
checkAndGetK8sMasterUrl(
  rawMasterURL: String): String
```

`checkAndGetK8sMasterUrl`...FIXME

`checkAndGetK8sMasterUrl` is used when:

- `SparkSubmit` is requested to `prepareSubmitEnvironment` (for the Kubernetes cluster manager)
## getLocalDir

```scala
getLocalDir(
  conf: SparkConf): String
```

`getLocalDir`...FIXME

`getLocalDir` is used when:

- `Utils` is requested to <>
- `SparkEnv` is core:SparkEnv.md#create[created] (on the driver)
- spark-shell.md[spark-shell] is launched
- Spark on YARN's `Client` is requested to spark-yarn-client.md#prepareLocalResources[prepareLocalResources] and spark-yarn-client.md#createConfArchive[create ++spark_conf.zip++ archive with configuration files and Spark configuration]
- PySpark's `PythonBroadcast` is requested to `readObject`
- PySpark's `EvalPythonExec` is requested to `doExecute`
## Fetching File

```scala
fetchFile(
  url: String,
  targetDir: File,
  conf: SparkConf,
  securityMgr: SecurityManager,
  hadoopConf: Configuration,
  timestamp: Long,
  useCache: Boolean): File
```

`fetchFile`...FIXME

`fetchFile` is used when:

- `SparkContext` is requested to SparkContext.md#addFile[addFile]
- `Executor` is requested to executor:Executor.md#updateDependencies[updateDependencies]
- Spark Standalone's `DriverRunner` is requested to `downloadUserJar`
## getOrCreateLocalRootDirs

```scala
getOrCreateLocalRootDirs(
  conf: SparkConf): Array[String]
```

`getOrCreateLocalRootDirs`...FIXME

`getOrCreateLocalRootDirs` is used when:

- `Utils` is requested to <>
- `Worker` is requested to spark-standalone-worker.md#receive[handle a LaunchExecutor message]
## getOrCreateLocalRootDirsImpl

```scala
getOrCreateLocalRootDirsImpl(
  conf: SparkConf): Array[String]
```

`getOrCreateLocalRootDirsImpl`...FIXME

`getOrCreateLocalRootDirsImpl` is used when `Utils` is requested to getOrCreateLocalRootDirs.