YarnClusterScheduler — TaskScheduler for Cluster Deploy Mode

YarnClusterScheduler is the TaskScheduler for Spark on YARN in cluster deploy mode.

It is a custom YarnScheduler that makes sure that appropriate initialization of ApplicationMaster is performed, i.e. SparkContext is initialized and stopped.

While being created, you should see the following INFO message in the logs:

INFO YarnClusterScheduler: Created YarnClusterScheduler

Enable INFO logging level for org.apache.spark.scheduler.cluster.YarnClusterScheduler to see what happens inside YarnClusterScheduler.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.cluster.YarnClusterScheduler=INFO

Refer to Logging.

postStartHook Callback

You should see the following INFO message in the logs:

INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done

Stopping YarnClusterScheduler (stop method)