Skip to content


== [[RDDBarrier]] RDDBarrier

RDDBarrier is used to mark the current stage as a <> in <>.

RDDBarrier is <> exclusively as the result of <> transformation (which is new in Spark 2.4.0).

[source, scala]

barrier(): RDDBarrier[T]

[[creating-instance]] [[rdd]] RDDBarrier takes a single[RDD] to be created and gives the single <> transformation (on the RDD) that simply changes the regular[RDD.mapPartitions] transformation to create a[MapPartitionsRDD] with the[isFromBarrier] flag enabled.

[[mapPartitions]] [source, scala]

mapPartitionsS: ClassTag: RDD[S]


val rdd = sc.parallelize(0 to 3, 1)

scala> :type rdd.barrier

val barrierRdd = rdd
scala> :type barrierRdd

scala> println(barrierRdd.toDebugString)
(1) MapPartitionsRDD[5] at mapPartitions at <console>:26 []
 |  ParallelCollectionRDD[3] at parallelize at <console>:25 []

// MapPartitionsRDD is private[spark]
// so is RDD.isBarrier
// Use org.apache.spark package then
// :paste -raw the following code in spark-shell / Scala REPL
package org.apache.spark
object IsBarrier {
  import org.apache.spark.rdd.RDD
  implicit class BypassPrivateSpark[T](rdd: RDD[T]) {
    def myIsBarrier = rdd.isBarrier
// END

import org.apache.spark.IsBarrier._