SparkSubmit¶
`SparkSubmit` is the entry point to the `spark-submit` shell script.
Actions¶
`SparkSubmit` executes actions (based on the `action` argument).
Killing Submission¶
kill(
args: SparkSubmitArguments): Unit
`kill`...FIXME
Displaying Version¶
printVersion(): Unit
`printVersion`...FIXME
Submission Status¶
requestStatus(
args: SparkSubmitArguments): Unit
`requestStatus`...FIXME
Submission¶
submit(
args: SparkSubmitArguments,
uninitLog: Boolean): Unit
`submit`...FIXME
Running Main Class¶
runMain(
args: SparkSubmitArguments,
uninitLog: Boolean): Unit
`runMain` calls `prepareSubmitEnvironment` with the given `SparkSubmitArguments`, which gives a 4-element tuple of `childArgs`, `childClasspath`, `sparkConf` and `childMainClass`.
With `verbose` enabled, `runMain` prints out the following INFO messages to the logs:
Main class:
[childMainClass]
Arguments:
[childArgs]
Spark config:
[sparkConf_redacted]
Classpath elements:
[childClasspath]
`runMain` creates and sets a context classloader (based on the `spark.driver.userClassPathFirst` configuration property) and adds the jars (from `childClasspath`).
`runMain` loads the main class (`childMainClass`).
`runMain` creates a `SparkApplication` (if the main class is a subtype of `SparkApplication`) or creates a `JavaMainApplication` (with the main class).
In the end, `runMain` requests the `SparkApplication` to start (with the `childArgs` and `sparkConf`).
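The steps above can be sketched in plain Scala without Spark on the classpath. This is a minimal sketch, not the actual implementation: `SketchApplication`, `JavaMainSketch`, `HelloApp` and `RunMainSketch` are hypothetical stand-ins (for `SparkApplication`, `JavaMainApplication`, a user application and `SparkSubmit`), the configuration is a plain `Map` rather than a `SparkConf`, and the effect of `spark.driver.userClassPathFirst` on the classloader is not modelled.

```scala
import java.io.File
import java.net.URLClassLoader

// Hypothetical stand-in for SparkApplication.
trait SketchApplication {
  def start(args: Array[String], conf: Map[String, String]): Unit
}

// Hypothetical stand-in for JavaMainApplication:
// wraps a class that only has a static main(Array[String]) method.
class JavaMainSketch(klass: Class[_]) extends SketchApplication {
  override def start(args: Array[String], conf: Map[String, String]): Unit = {
    val mainMethod = klass.getMethod("main", classOf[Array[String]])
    // The whole array is passed as the single argument of main.
    mainMethod.invoke(null, args)
  }
}

// Example application that is a subtype of the application trait.
class HelloApp extends SketchApplication {
  override def start(args: Array[String], conf: Map[String, String]): Unit =
    println(s"HelloApp started with ${args.mkString(", ")}")
}

object RunMainSketch {
  // Instantiate the class directly if it is a SketchApplication, wrap it otherwise.
  def createApp(mainClass: Class[_]): SketchApplication =
    if (classOf[SketchApplication].isAssignableFrom(mainClass)) {
      mainClass.getConstructor().newInstance().asInstanceOf[SketchApplication]
    } else {
      new JavaMainSketch(mainClass)
    }

  def runMain(
      childArgs: Seq[String],
      childClasspath: Seq[String],
      conf: Map[String, String],
      childMainClass: String): Unit = {
    // Create and set a context classloader that includes the jars.
    val parent = Thread.currentThread.getContextClassLoader
    val urls = childClasspath.map(p => new File(p).toURI.toURL).toArray
    val loader = new URLClassLoader(urls, parent)
    Thread.currentThread.setContextClassLoader(loader)
    // Load the main class, then create and start the application.
    val mainClass = Class.forName(childMainClass, true, loader)
    createApp(mainClass).start(childArgs.toArray, conf)
  }
}
```

The design point worth noting is that both kinds of main class end up behind the same `start` call, so the final step does not need to know which one it was given.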
Cluster Managers¶
`SparkSubmit` has built-in support for some cluster managers (that are selected based on the `master` argument).
Nickname | Master URL |
---|---|
KUBERNETES | `k8s://`-prefix |
LOCAL | `local`-prefix |
MESOS | `mesos`-prefix |
STANDALONE | `spark`-prefix |
YARN | yarn |
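The table can be read as a simple match on the master URL. The following is a minimal sketch that follows the table only: `ClusterManagerSketch` and its `clusterManager` method are hypothetical names, nicknames are modelled as strings, and throwing `IllegalArgumentException` for an unrecognized master URL is an assumption made for the example.

```scala
object ClusterManagerSketch {
  // Maps a master URL to a cluster manager nickname, following the table above.
  def clusterManager(master: String): String = master match {
    case "yarn"                      => "YARN"
    case m if m.startsWith("spark")  => "STANDALONE"
    case m if m.startsWith("mesos")  => "MESOS"
    case m if m.startsWith("k8s://") => "KUBERNETES"
    case m if m.startsWith("local")  => "LOCAL"
    case other =>
      // Assumption: unsupported master URLs are rejected.
      throw new IllegalArgumentException(s"Unsupported master URL: $other")
  }
}
```

For example, `ClusterManagerSketch.clusterManager("local[*]")` gives `LOCAL`, while `ClusterManagerSketch.clusterManager("spark://host:7077")` gives `STANDALONE`.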
Launching Standalone Application¶
main(
args: Array[String]): Unit
`main`...FIXME
doSubmit¶
doSubmit(
args: Array[String]): Unit
`doSubmit`...FIXME
`doSubmit` is used when:
- `InProcessSparkSubmit` standalone application is started
- `SparkSubmit` standalone application is started
prepareSubmitEnvironment¶
prepareSubmitEnvironment(
args: SparkSubmitArguments,
conf: Option[HadoopConfiguration] = None): (Seq[String], Seq[String], SparkConf, String)
`prepareSubmitEnvironment` creates a 4-element tuple made up of the following:
- `childArgs` for arguments
- `childClasspath` for Classpath elements
- `sparkConf` for Spark properties
- `childMainClass`
Tip
Use the `--verbose` command-line option to have the elements of the tuple printed out to the standard output.
`prepareSubmitEnvironment`...FIXME
`prepareSubmitEnvironment` determines the cluster manager based on the `master` argument.
For KUBERNETES, `prepareSubmitEnvironment` calls `checkAndGetK8sMasterUrl`.
`prepareSubmitEnvironment`...FIXME
`prepareSubmitEnvironment` is used when...FIXME
childMainClass¶
`childMainClass` is the fourth (and last) element of the result tuple of `prepareSubmitEnvironment`.
// (childArgs, childClasspath, sparkConf, childMainClass)
(Seq[String], Seq[String], SparkConf, String)
`childMainClass` can be as follows:
Deploy Mode | Master URL | childMainClass |
---|---|---|
client | any | mainClass |
cluster | KUBERNETES | KubernetesClientApplication |
cluster | MESOS | RestSubmissionClientApp (for REST submission API) |
cluster | STANDALONE | RestSubmissionClientApp (for REST submission API) |
cluster | STANDALONE | ClientApp |
cluster | YARN | YarnClusterApplication |
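The table can be expressed as a match on the deploy mode and the cluster manager. This is a minimal sketch derived from the table only: `ChildMainClassSketch` is a hypothetical name, class names are given as simple strings, and the `useRest` flag (used to pick between the two STANDALONE rows) is an assumed parameter, not the actual argument name.

```scala
object ChildMainClassSketch {
  // Picks the class to run for a deploy mode and a cluster manager, following the table above.
  def childMainClass(
      deployMode: String,
      clusterManager: String,
      mainClass: String,
      useRest: Boolean): String =
    (deployMode, clusterManager) match {
      case ("client", _)             => mainClass
      case ("cluster", "KUBERNETES") => "KubernetesClientApplication"
      case ("cluster", "MESOS")      => "RestSubmissionClientApp"
      case ("cluster", "STANDALONE") =>
        // Assumption: useRest selects the REST submission API row.
        if (useRest) "RestSubmissionClientApp" else "ClientApp"
      case ("cluster", "YARN")       => "YarnClusterApplication"
      case (mode, cm) =>
        // Assumption: combinations that are not in the table are rejected.
        throw new IllegalArgumentException(s"Unsupported combination: $mode, $cm")
    }
}
```

In client mode the user's own main class runs directly, whereas every cluster-mode row swaps in a wrapper class that hands the submission over to the cluster manager.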
isKubernetesClient¶
`prepareSubmitEnvironment` uses the `isKubernetesClient` flag to indicate that:
isKubernetesClusterModeDriver¶
`prepareSubmitEnvironment` uses the `isKubernetesClusterModeDriver` flag to indicate that:
- `isKubernetesClient`
- `spark.kubernetes.submitInDriver` configuration property is enabled (Spark on Kubernetes)
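The two conditions combine into a single boolean. Below is a minimal sketch, assuming the configuration is a plain `Map[String, String]` rather than a `SparkConf`, that the property is treated as disabled when it is not set, and that `isKubernetesClient` is computed elsewhere and passed in. `K8sFlagsSketch` is a hypothetical name.

```scala
object K8sFlagsSketch {
  // True only when both conditions from the list above hold.
  def isKubernetesClusterModeDriver(
      isKubernetesClient: Boolean,
      conf: Map[String, String]): Boolean =
    isKubernetesClient &&
      // Assumption: a missing property counts as disabled.
      conf.getOrElse("spark.kubernetes.submitInDriver", "false").toBoolean
}
```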
renameResourcesToLocalFS¶
renameResourcesToLocalFS(
resources: String,
localResources: String): String
`renameResourcesToLocalFS`...FIXME
`renameResourcesToLocalFS` is used for `isKubernetesClusterModeDriver` mode.
downloadResource¶
downloadResource(
resource: String): String
`downloadResource`...FIXME
Checking Whether Resource is Internal¶
isInternal(
res: String): Boolean
`isInternal` is `true` when the given `res` is `spark-internal`.
`isInternal` is used when:
- `SparkSubmit` is requested to `isUserJar`
- `SparkSubmitArguments` is requested to `handleUnknown`
Checking Whether Resource is User Jar¶
isUserJar(
res: String): Boolean
`isUserJar` is `true` when the given `res` is none of the following:
- `isShell`
- `isPython`
- `isInternal`
- `isR`
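The resource checks compose as plain predicates. In the minimal sketch below, only `isInternal` and the shape of `isUserJar` follow the text above. The bodies of `isShell`, `isPython` and `isR` (shell resource names and file extensions) are assumptions made so that the example is self-contained, and `ResourceChecksSketch` is a hypothetical name.

```scala
object ResourceChecksSketch {
  val SparkInternal = "spark-internal"

  // From the text: true when the resource is spark-internal.
  def isInternal(res: String): Boolean = res == SparkInternal

  // Assumption: the shells are identified by these special resource names.
  def isShell(res: String): Boolean =
    res == "spark-shell" || res == "pyspark-shell" || res == "sparkr-shell"

  // Assumption: Python resources end with .py (or are the PySpark shell).
  def isPython(res: String): Boolean =
    res != null && (res.endsWith(".py") || res == "pyspark-shell")

  // Assumption: R resources end with .R or .r (or are the SparkR shell).
  def isR(res: String): Boolean =
    res != null && (res.endsWith(".R") || res.endsWith(".r") || res == "sparkr-shell")

  // From the text: a user jar is none of shell, Python, internal or R.
  def isUserJar(res: String): Boolean =
    !isShell(res) && !isPython(res) && !isInternal(res) && !isR(res)
}
```

Under these assumptions, `ResourceChecksSketch.isUserJar("app.jar")` is `true`, while `job.py` and `spark-internal` are both rejected.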
`isUserJar` is used when:
- FIXME