= ExternalShuffleService
:navtitle: External Shuffle Service

`ExternalShuffleService` is a Spark service that can serve shuffle blocks from outside an executor:Executor.md[Executor] process. It runs as a standalone application and manages shuffle output files so they are available to executors at all times. Because the shuffle output files are managed outside the executors, the service offers uninterrupted access to them even when executors are killed or down.
You start `ExternalShuffleService` using the <<start-script, start-shuffle-service.sh shell script>>.
There is a custom external shuffle service for Spark on YARN -- spark-on-yarn:spark-yarn-YarnShuffleService.md[YarnShuffleService].
== [[start-script]] start-shuffle-service.sh Shell Script
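The `start-shuffle-service.sh` shell script allows you to launch `ExternalShuffleService`. The script is under the `sbin` directory.

When executed, it runs shell scripts including `bin/load-spark-env.sh`. It then executes `sbin/spark-daemon.sh` with the `start` command to launch `org.apache.spark.deploy.ExternalShuffleService`, as the following session shows: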
----
$ ./sbin/start-shuffle-service.sh
starting org.apache.spark.deploy.ExternalShuffleService, logging to ...logs/spark-jacek-org.apache.spark.deploy.ExternalShuffleService-1-japila.local.out

$ tail -f ...logs/spark-jacek-org.apache.spark.deploy.ExternalShuffleService-1-japila.local.out
Spark Command: /Library/Java/JavaVirtualMachines/Current/Contents/Home/bin/java -cp /Users/jacek/dev/oss/spark/conf/:/Users/jacek/dev/oss/spark/assembly/target/scala-2.11/jars/* -Xmx1g org.apache.spark.deploy.ExternalShuffleService
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/06/07 08:02:02 INFO ExternalShuffleService: Started daemon with process name: email@example.com
16/06/07 08:02:03 INFO ExternalShuffleService: Starting shuffle service on port 7337 with useSasl = false
----
You can also use tools:spark-class.md[spark-class] to launch ExternalShuffleService.
== [[main]] Launching ExternalShuffleService Standalone Application
When started, it executes `Utils.initDaemon(log)`.

CAUTION: FIXME `Utils.initDaemon(log)`? See spark-submit.
It loads default Spark properties and creates a
It sets ROOT:configuration-properties.md#spark.shuffle.service.enabled[spark.shuffle.service.enabled] to `true`.
An `ExternalShuffleService` is then <<creating-instance, created>> and <<start, started>>.
A shutdown hook is registered so that when `ExternalShuffleService` is shut down, it prints the following INFO message to the logs and the <<stop, stop>> method is executed:
----
Shutting down shuffle service.
----
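The shutdown-hook behaviour can be illustrated with a small, self-contained sketch. It is not Spark's code: it uses the plain JVM `Runtime.addShutdownHook` API, `ShutdownHookSketch` and its `Service` class are hypothetical stand-ins, and calling `stop()` from the hook is an assumption made for the illustration.

```scala
object ShutdownHookSketch {
  // Hypothetical stand-in for the service: it only tracks whether it is running.
  final class Service {
    @volatile private var running = false
    def start(): Unit = running = true
    def stop(): Unit = running = false
    def isRunning: Boolean = running
  }

  // Builds the hook thread: log the message first, then stop the service.
  // The thread is returned unregistered so it can also be run directly.
  def shutdownHook(service: Service, log: String => Unit): Thread =
    new Thread(new Runnable {
      override def run(): Unit = {
        log("Shutting down shuffle service.")
        service.stop()
      }
    })

  def main(args: Array[String]): Unit = {
    val service = new Service
    service.start()
    // Plain JVM API: the hook runs when the JVM begins an orderly shutdown.
    Runtime.getRuntime.addShutdownHook(
      shutdownHook(service, msg => println(s"INFO $msg")))
  }
}
```

Building the hook separately from registering it keeps the log-then-stop ordering easy to exercise without terminating the JVM.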
You should see the following INFO message in the logs:
----
Registered executor [AppExecId] with [executorInfo]
----
You should also see the following messages when a `SparkContext` is closed:

----
Application [appId] removed, cleanupLocalDirs = [cleanupLocalDirs]
Cleaning up executor [AppExecId]'s [executor.localDirs.length] local dirs
Successfully cleaned up directory: [localDir]
----
== [[creating-instance]] Creating Instance
ExternalShuffleService requires a ROOT:SparkConf.md[SparkConf] and spark-security.md[SecurityManager].
When created, it reads the ROOT:configuration-properties.md#spark.shuffle.service.enabled[spark.shuffle.service.enabled] configuration property (disabled by default) and the `spark.shuffle.service.port` configuration property (defaults to `7337`). It also checks whether authentication is enabled.
CAUTION: FIXME Review
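The settings read at creation time, and their defaults, can be sketched as follows. This is a minimal illustration rather than Spark's code: a plain `Map` stands in for `SparkConf`, `ShuffleServiceSettings` and its members are hypothetical names, and the port property is assumed to be Spark's `spark.shuffle.service.port` setting.

```scala
object ShuffleServiceSettings {
  // Property names and defaults used for the illustration.
  val EnabledKey = "spark.shuffle.service.enabled"
  val PortKey = "spark.shuffle.service.port"
  val DefaultPort = 7337

  // Disabled unless the property is explicitly set to "true".
  def enabled(conf: Map[String, String]): Boolean =
    conf.get(EnabledKey).exists(_.trim.equalsIgnoreCase("true"))

  // Falls back to the default port when the property is absent.
  // A non-numeric value makes toInt throw NumberFormatException.
  def port(conf: Map[String, String]): Int =
    conf.get(PortKey).map(_.trim.toInt).getOrElse(DefaultPort)
}
```

With an empty configuration, `ShuffleServiceSettings.enabled(Map.empty)` is `false` and `ShuffleServiceSettings.port(Map.empty)` is `7337`.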
ExternalShuffleService creates a network:TransportConf.md (as
It creates a deploy:ExternalShuffleBlockHandler.md (as
CAUTION: FIXME TransportContext?
server) is created.
== [[start]] Starting ExternalShuffleService
start starts an ExternalShuffleService.
When start is executed, you should see the following INFO message in the logs:
----
Starting shuffle service on port [port] with useSasl = [useSasl]
----
If `useSasl` is enabled, a `SaslServerBootstrap` is created.
CAUTION: FIXME SaslServerBootstrap?
The internal `server` reference (a `TransportServer`) is created (which will attempt to bind to `port`).

NOTE: `port` is <<creating-instance, set to `7337` by default when `ExternalShuffleService` is created>>.
== [[stop]] Stopping ExternalShuffleService
`stop` closes the internal `server` reference and clears it (i.e. sets it to `null`).
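The close-and-clear pattern can be sketched in isolation. This is a hedged illustration, not the actual implementation: `StopSketch`, its `Service` class, and the `open` factory are hypothetical, and a generic `java.io.Closeable` stands in for the server.

```scala
import java.io.Closeable

object StopSketch {
  // `open` is a hypothetical factory standing in for creating the server.
  final class Service(open: () => Closeable) {
    // null means "not running", mirroring a cleared reference.
    private var server: Closeable = null

    // Creates the server only if none is currently held.
    def start(): Unit = if (server == null) server = open()

    // Closes the server once and clears the reference,
    // so repeated calls to stop() are harmless no-ops.
    def stop(): Unit =
      if (server != null) {
        server.close()
        server = null
      }

    def isRunning: Boolean = server != null
  }
}
```

Clearing the reference after closing means a second `stop()` call does nothing, which matters if a shutdown hook and an explicit call both try to stop the service.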
== [[logging]] Logging
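Enable the `ALL` logging level for the `org.apache.spark.deploy.ExternalShuffleService` logger to see what happens inside.

Add the following line to your log4j configuration file (for example, `conf/log4j.properties`): `log4j.logger.org.apache.spark.deploy.ExternalShuffleService=ALL`.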
Refer to ROOT:spark-logging.md[Logging].