Causing Stage to Fail
== Exercise: Causing Stage to Fail
The example shows how Spark re-executes a stage in case of stage failure.
=== Recipe
Start a Spark cluster, e.g. 1-node Hadoop YARN.
start-yarn.sh
// 2-stage job -- it _appears_ that a stage can be failed only when there is a shuffle
sc.parallelize(0 to 3e3.toInt, 2).map(n => (n % 2, n)).groupByKey.count
Use 2 executors at least so you can kill one and keep the application up and running (on one executor).
YARN_CONF_DIR=hadoop-conf ./bin/spark-shell --master yarn \
-c spark.shuffle.service.enabled=true \
--num-executors 2