The spark-class shell script is the Spark application command-line launcher. It is responsible for setting up the JVM environment and executing a Spark application.
Ultimately, any shell script in Spark, e.g. spark-submit, calls spark-class.
You can find the spark-class script in the bin directory of the Spark distribution.
spark-class first loads $SPARK_HOME/bin/load-spark-env.sh, collects the Spark assembly jars, and executes org.apache.spark.launcher.Main.
Depending on the Spark distribution (or rather lack thereof), i.e. whether the RELEASE file exists or not, spark-class sets the SPARK_JARS_DIR environment variable to the jars directory of the distribution or to [SPARK_HOME]/assembly/target/scala-[SPARK_SCALA_VERSION]/jars, respectively (the latter being a local build).
If SPARK_JARS_DIR does not exist, spark-class prints the following error message and exits with a non-zero exit code:
Failed to find Spark jars directory ([SPARK_JARS_DIR]).
You need to build Spark with the target "package" before running this program.
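The directory selection and the existence check described above can be sketched in bash roughly as follows. This is an illustration only, not the actual spark-class source: the function name resolve_jars_dir is hypothetical, and the use of ${SPARK_HOME}/jars for a release distribution is an assumption that the text above does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): prints the jars directory for the
# given SPARK_HOME and Scala version, or reports an error and returns 1.
resolve_jars_dir() {
  local spark_home="$1" scala_version="$2" jars_dir
  if [ -f "${spark_home}/RELEASE" ]; then
    # Assumption: a release distribution keeps its jars under jars/.
    jars_dir="${spark_home}/jars"
  else
    # Local build layout, as described in the text.
    jars_dir="${spark_home}/assembly/target/scala-${scala_version}/jars"
  fi
  if [ ! -d "${jars_dir}" ]; then
    # The message is sent to standard error in this sketch.
    echo "Failed to find Spark jars directory (${jars_dir})." 1>&2
    echo "You need to build Spark with the target \"package\" before running this program." 1>&2
    return 1
  fi
  printf '%s\n' "${jars_dir}"
}
```

A caller would typically capture the result, e.g. SPARK_JARS_DIR="$(resolve_jars_dir "$SPARK_HOME" "$SPARK_SCALA_VERSION")", and stop when the function fails.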
spark-class sets the LAUNCH_CLASSPATH environment variable to include all the jars under SPARK_JARS_DIR.
If SPARK_PREPEND_CLASSES is enabled, the [SPARK_HOME]/launcher/target/scala-[SPARK_SCALA_VERSION]/classes directory is added to LAUNCH_CLASSPATH as the first entry.
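As a rough sketch of how such a classpath could be assembled in bash: the function name build_launch_classpath is hypothetical, and treating "enabled" as "set to a non-empty value" is an assumption. The dir/* form is the standard JVM classpath wildcard that matches every jar in a directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): prints the launcher classpath.
# Arguments: SPARK_HOME, Scala version, jars directory.
build_launch_classpath() {
  local spark_home="$1" scala_version="$2" jars_dir="$3"
  # JVM classpath wildcard: all jars directly under jars_dir.
  local classpath="${jars_dir}/*"
  # Assumption: the variable counts as enabled when it is non-empty.
  if [ -n "${SPARK_PREPEND_CLASSES:-}" ]; then
    # Prepend the launcher classes directory so it becomes the first entry.
    classpath="${spark_home}/launcher/target/scala-${scala_version}/classes:${classpath}"
  fi
  printf '%s\n' "${classpath}"
}
```

Putting the classes directory first means freshly compiled launcher classes shadow the copies packaged in the jars, which is convenient during development.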
Environment variables such as SPARK_SQL_TESTING enable a special test mode.
FIXME What’s so special about the env vars?
spark-class uses the org.apache.spark.launcher.Main command-line application to compute the Spark command to launch. The Main class programmatically computes the command that spark-class executes afterwards.
org.apache.spark.launcher.Main is a Java standalone application used in spark-class to prepare the Spark command to execute.
Main expects the first parameter to be the class name that acts as the "operation mode". Without it, Main fails with an IllegalArgumentException:
$ ./bin/spark-class org.apache.spark.launcher.Main
Exception in thread "main" java.lang.IllegalArgumentException: Not enough arguments: missing class name.
    at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
    at org.apache.spark.launcher.Main.main(Main.java:51)
Main uses the buildCommand method on the builder to build a Spark command.
If the SPARK_PRINT_LAUNCH_COMMAND environment variable is enabled, Main prints the final Spark command to standard error.
Spark Command: [cmd]
========================================
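Main itself is a JVM application, so the following bash function only imitates the described behaviour for illustration. The name print_launch_command is hypothetical, and treating "enabled" as "set to a non-empty value" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical imitation of the described behaviour: when
# SPARK_PRINT_LAUNCH_COMMAND is non-empty, print the command to standard error.
print_launch_command() {
  if [ -n "${SPARK_PRINT_LAUNCH_COMMAND:-}" ]; then
    # "$*" joins all arguments with a single space (default IFS).
    echo "Spark Command: $*" 1>&2
    echo "========================================" 1>&2
  fi
}
```

Because the output goes to standard error, it does not interfere with whatever reads the computed command from standard output.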
On Windows, Main calls prepareWindowsCommand, while on non-Windows OSes it calls prepareBashCommand, with tokens separated by the NUL character (\0).
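On the shell side, a token stream like this can be read back into an array. The sketch below assumes the separator is the NUL character, which cannot occur inside a command-line argument. The function names emit_tokens and read_tokens are hypothetical stand-ins, and the command being tokenized is a made-up example. It requires bash (arrays, read -d, process substitution).

```shell
#!/usr/bin/env bash
# Stand-in for the launcher output: writes each token followed by a NUL.
emit_tokens() {
  printf '%s\0' "$@"
}

# Reads NUL-terminated tokens from standard input into the global array CMD.
read_tokens() {
  CMD=()
  local arg
  # IFS= and -r keep whitespace and backslashes intact; -d '' splits on NUL.
  while IFS= read -r -d '' arg; do
    CMD+=("$arg")
  done
}

# Process substitution keeps read_tokens in the current shell, so CMD survives.
read_tokens < <(emit_tokens java -cp "/path with spaces/*" org.example.App)
```

Splitting on NUL rather than whitespace preserves arguments that contain spaces or glob characters, so the array can later be run as "${CMD[@]}" without re-quoting.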