== [[InterruptibleIterator]] InterruptibleIterator -- Iterator With Support For Task Cancellation

InterruptibleIterator is a custom Scala Iterator that supports task cancellation, i.e. it <<hasNext, stops iteration when the task is killed>>.

Quoting the official Scala Iterator documentation:

Iterators are data structures that allow to iterate over a sequence of elements. They have a hasNext method for checking if there is a next element available, and a next method which returns the next element and discards it from the iterator.
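
The contract quoted above can be illustrated with a minimal plain-Scala sketch (no Spark involved). The object and value names (`IteratorContractDemo`, `seen`, `exhausted`) are made up for this illustration only.

```scala
object IteratorContractDemo {
  // A plain Scala iterator over three elements.
  val it: Iterator[Int] = Iterator(1, 2, 3)

  private val buffer = scala.collection.mutable.ListBuffer.empty[Int]

  // hasNext checks whether another element is available;
  // next() returns that element and discards it from the iterator.
  while (it.hasNext) {
    buffer += it.next()
  }

  // Elements in the order the iterator produced them.
  val seen: List[Int] = buffer.toList

  // Once every element has been consumed, hasNext reports false.
  val exhausted: Boolean = !it.hasNext
}
```

An iterator is single-pass: once the loop finishes, the elements cannot be read from `it` again.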

InterruptibleIterator is <<creating-instance, created>> when:

  • RDD is requested to get or compute an RDD partition

  • CoGroupedRDD, HadoopRDD, NewHadoopRDD, and ParallelCollectionRDD are requested to compute a partition

  • BlockStoreShuffleReader is requested to read combined key-value records for a reduce task

  • PairRDDFunctions is requested to combineByKeyWithClassTag

  • Spark SQL's DataSourceRDD and JDBCRDD are requested to compute a partition

  • Spark SQL's RangeExec physical operator is requested to doExecute

  • PySpark's BasePythonRunner is requested to compute

[[creating-instance]] InterruptibleIterator takes the following when created:

  • [[context]] TaskContext
  • [[delegate]] Scala Iterator[T]

NOTE: InterruptibleIterator is a Developer API, i.e. a lower-level, unstable API intended for Spark developers that may change or be removed in minor versions of Apache Spark.

=== [[hasNext]] hasNext Method

[source, scala]
----
hasNext: Boolean
----

NOTE: hasNext is part of the Scala Iterator contract to test whether this iterator can provide another element.

hasNext requests the <<context, TaskContext>> to kill the task if it has been interrupted. That simply throws a TaskKilledException, which in turn breaks the task execution.

In the end, hasNext requests the <<delegate, delegate Iterator>> to hasNext.

=== [[next]] next Method

[source, scala]
----
next(): T
----

NOTE: next is part of the Scala Iterator contract to produce the next element of this iterator.

next simply requests the <<delegate, delegate Iterator>> to next.
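
The behaviour of hasNext and next described in this section can be sketched in plain Scala without a Spark dependency. This is only an illustration, not Spark's source code: `TaskContextSketch`, `TaskKilledSketch`, `InterruptibleIteratorSketch`, `markInterrupted` and `InterruptibleIteratorDemo` are hypothetical stand-ins invented here, and the stand-in `killTaskIfInterrupted` method merely mimics the "kill the task if interrupted" step.

```scala
// Hypothetical stand-in for Spark's TaskKilledException (NOT the Spark class).
final class TaskKilledSketch(reason: String) extends RuntimeException(reason)

// Hypothetical stand-in for Spark's TaskContext (NOT the Spark class).
final class TaskContextSketch {
  @volatile private var killReason: Option[String] = None

  // Marks the task as interrupted, as a cancellation request would.
  def markInterrupted(reason: String): Unit = {
    killReason = Some(reason)
  }

  // Throws if the task has been marked as interrupted; otherwise does nothing.
  def killTaskIfInterrupted(): Unit =
    killReason.foreach(reason => throw new TaskKilledSketch(reason))
}

// Sketch of an iterator that checks for cancellation before every hasNext.
final class InterruptibleIteratorSketch[T](
    context: TaskContextSketch,
    delegate: Iterator[T]) extends Iterator[T] {

  override def hasNext: Boolean = {
    context.killTaskIfInterrupted() // breaks the iteration if interrupted
    delegate.hasNext                // otherwise defer to the delegate
  }

  // next does no cancellation check; it only delegates.
  override def next(): T = delegate.next()
}

object InterruptibleIteratorDemo {
  // Consumes elements, interrupts the "task" after element 2, and
  // returns the consumed elements and whether the iteration was killed.
  def run(): (List[Int], Boolean) = {
    val context = new TaskContextSketch
    val it = new InterruptibleIteratorSketch(context, Iterator(1, 2, 3, 4))
    val consumed = scala.collection.mutable.ListBuffer.empty[Int]
    val killed =
      try {
        while (it.hasNext) {
          val n = it.next()
          consumed += n
          if (n == 2) context.markInterrupted("cancelled by the sketch")
        }
        false
      } catch {
        case _: TaskKilledSketch => true
      }
    (consumed.toList, killed)
  }
}
```

In this sketch the interruption is noticed only at the hasNext call that follows it, because next merely delegates. A consumer that calls next without first calling hasNext would therefore bypass the check.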