spark-submit Shell Script¶

spark-submit shell script is used to submit Spark applications for execution (and, on some cluster managers, to kill them or request their status).

spark-submit is a command-line frontend to SparkSubmit.

Command-Line Options¶

archives¶

Comma-separated list of archives to be extracted into the working directory of each executor

Command-Line Option: --archives
Internal Property: archives

deploy-mode¶

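Deploy mode of the driver: client (the default; the driver runs locally, inside spark-submit) or cluster (the driver runs on a node inside the cluster)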

Command-Line Option: --deploy-mode
Spark Property: spark.submit.deployMode
Environment Variable: DEPLOY_MODE
Internal Property: deployMode
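
The three ways of selecting the deploy mode listed above can be sketched as follows. This is a minimal illustration, not a recommended setup: the application class org.example.MyApp and the jar my-app.jar are hypothetical, and dry_run is a helper introduced here that only prints each command line, so the snippet can be tried without a Spark installation.

```shell
# Dry-run helper (an assumption of this sketch): prints the command line
# instead of executing it. Change the body to "$@" to really run the command
# (that needs a Spark installation).
dry_run() { printf '%s\n' "$*"; }

# Hypothetical application, used only for illustration.
APP_CLASS=org.example.MyApp
APP_JAR=my-app.jar

# 1. Command-line option
dry_run ./bin/spark-submit --deploy-mode cluster --class "$APP_CLASS" "$APP_JAR"

# 2. Spark property, passed with --conf
dry_run ./bin/spark-submit --conf spark.submit.deployMode=cluster --class "$APP_CLASS" "$APP_JAR"

# 3. Environment variable, set for this one command through env
dry_run env DEPLOY_MODE=cluster ./bin/spark-submit --class "$APP_CLASS" "$APP_JAR"
```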

driver-class-path¶

--driver-class-path

--driver-class-path command-line option sets the extra class path entries (e.g. jars and directories) to add to the driver's JVM.

Tip

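In client deploy mode, use --driver-class-path (not SparkConf) to make sure that the entries end up on the driver's CLASSPATH.

In client deploy mode the driver runs in the same JVM as spark-submit, so that JVM has already started by the time the application sets anything through SparkConf.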

Internal Property: driverExtraClassPath

Spark Property: spark.driver.extraClassPath

Note

Command-line options (e.g. --driver-class-path) take precedence over the corresponding Spark settings in a Spark properties file (e.g. spark.driver.extraClassPath). You can therefore override a setting from the properties file by passing the matching option on the command line.
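
As a sketch of the precedence rule in the note, the snippet below writes a properties file that sets spark.driver.extraClassPath and then passes --driver-class-path with a different value on the command line. Everything named here is hypothetical (the jar paths, org.example.MyApp, my-app.jar), and dry_run is a helper introduced for this sketch that only prints the command line instead of running it.

```shell
# Dry-run helper (an assumption of this sketch): prints the command line
# instead of executing it. Change the body to "$@" to really run the command
# (that needs a Spark installation).
dry_run() { printf '%s\n' "$*"; }

# Hypothetical properties file with one driver class path entry.
PROPS_FILE=$(mktemp)
cat > "$PROPS_FILE" <<'EOF'
spark.driver.extraClassPath  /opt/libs/from-properties-file.jar
EOF

# The command-line option has higher precedence than the property in the
# file, so the driver's class path would get /opt/libs/from-cli.jar.
dry_run ./bin/spark-submit \
  --properties-file "$PROPS_FILE" \
  --driver-class-path /opt/libs/from-cli.jar \
  --class org.example.MyApp \
  my-app.jar
```

Dropping the --driver-class-path line would leave the value from the properties file in effect.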

driver-cores¶

--driver-cores NUM

--driver-cores command-line option sets the number of cores for the driver to NUM in cluster deploy mode.

Spark Property: spark.driver.cores

Note

Only available for cluster deploy mode (when the driver is executed outside spark-submit).

Internal Property: driverCores
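
A minimal sketch of requesting driver cores in cluster deploy mode, first with the option and then with the Spark property. The value 2, org.example.MyApp and my-app.jar are hypothetical, and dry_run is a helper introduced here that only prints the command line.

```shell
# Dry-run helper (an assumption of this sketch): prints the command line
# instead of executing it. Change the body to "$@" to really run the command
# (that needs a Spark installation).
dry_run() { printf '%s\n' "$*"; }

# Option form: two cores for the driver, cluster deploy mode.
dry_run ./bin/spark-submit \
  --deploy-mode cluster \
  --driver-cores 2 \
  --class org.example.MyApp \
  my-app.jar

# Property form of the same setting.
dry_run ./bin/spark-submit \
  --deploy-mode cluster \
  --conf spark.driver.cores=2 \
  --class org.example.MyApp \
  my-app.jar
```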

properties-file¶

--properties-file FILE

--properties-file command-line option sets the path to the file FILE from which Spark loads extra Spark properties.

Note

When --properties-file is not given, Spark loads the properties from conf/spark-defaults.conf.

queue¶

--queue QUEUE_NAME

The YARN queue to submit the application to (default: default)

Spark Property: spark.yarn.queue
Internal Property: queue
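
A sketch of submitting to a specific YARN queue, with the option and with the Spark property. The queue name analytics, org.example.MyApp and my-app.jar are hypothetical, and dry_run is a helper introduced here that only prints the command line.

```shell
# Dry-run helper (an assumption of this sketch): prints the command line
# instead of executing it. Change the body to "$@" to really run the command
# (that needs a Spark installation and a YARN cluster).
dry_run() { printf '%s\n' "$*"; }

# Hypothetical queue name.
QUEUE_NAME=analytics

# Option form
dry_run ./bin/spark-submit \
  --master yarn \
  --queue "$QUEUE_NAME" \
  --class org.example.MyApp \
  my-app.jar

# Property form of the same setting
dry_run ./bin/spark-submit \
  --master yarn \
  --conf spark.yarn.queue="$QUEUE_NAME" \
  --class org.example.MyApp \
  my-app.jar
```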

version¶

Command-Line Option: --version

$ ./bin/spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0-SNAPSHOT
      /_/

Branch master
Compiled by user jacek on 2016-09-30T07:08:39Z
Revision 1fad5596885aab8b32d2307c0edecbae50d5bd7a
Url https://github.com/apache/spark.git
Type --help for more information.

SPARK_PRINT_LAUNCH_COMMAND¶

When set, SPARK_PRINT_LAUNCH_COMMAND environment variable makes Spark print the complete launch command (to standard error) before running it.

$ SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/spark-shell
Spark Command: /Library/Ja...