A ResultStage is the final stage in a job. It applies a function to one or more partitions of the target RDD to compute the result of an action.

Figure 1. Job creates ResultStage as the first stage

A ResultStage is given the partitions to compute as a collection of partition ids (partitions) and the function to apply to each of them, func: (TaskContext, Iterator[_]) ⇒ _.

Figure 2. ResultStage and partitions

See TaskContext for more about the task context that is passed to func.
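
To make the relationship between partitions and func more concrete, here is a small dependency-free sketch that applies a function of that shape to a chosen subset of partitions. SketchTaskContext, runOnPartitions and the in-memory data are hypothetical names made up for illustration; they are not Spark classes, and the real TaskContext carries much more than a partition id.

```scala
object ResultStageFuncSketch {
  // Hypothetical stand-in for Spark's TaskContext; it only carries the partition id.
  final case class SketchTaskContext(partitionId: Int)

  // Applies func to every requested partition and collects one result per partition.
  def runOnPartitions[T, U](
      data: IndexedSeq[Seq[T]],                      // one Seq per partition
      func: (SketchTaskContext, Iterator[T]) => U,   // same shape as func above
      partitions: Seq[Int]): Seq[U] =
    partitions.map(id => func(SketchTaskContext(id), data(id).iterator))

  // Three partitions, but the "action" only needs partitions 0 and 2.
  val data: IndexedSeq[Seq[Int]] = Vector(Seq(1, 2), Seq(3, 4, 5), Seq(6))
  val sums: Seq[Int] = runOnPartitions[Int, Int](data, (_, it) => it.sum, Seq(0, 2))
}
```

In this sketch only the listed partition ids are visited, which mirrors how an action may need just some of the partitions of the target RDD.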

Finding Missing Partitions

findMissingPartitions(): Seq[Int]
findMissingPartitions is part of the Stage abstraction. For a ResultStage, it returns the ids of the partitions that the ActiveJob has not finished yet.


Figure 3. ResultStage.findMissingPartitions and ActiveJob

In the figure above, partitions 1 and 2 are not finished yet (F stands for false and T for true), so they are the missing partitions.
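
The idea can be sketched without Spark as follows. SketchActiveJob is a hypothetical, stripped-down model of ActiveJob that keeps only a number of partitions and a per-partition finished flag; the three-partition job with partition 0 already finished is an assumption chosen to resemble the figure.

```scala
object FindMissingPartitionsSketch {
  // Hypothetical model of ActiveJob: finished(i) becomes true once partition i is computed.
  final class SketchActiveJob(val numPartitions: Int) {
    val finished: Array[Boolean] = Array.fill(numPartitions)(false)
  }

  // The missing partitions are those whose finished flag is still false.
  def findMissingPartitions(job: SketchActiveJob): Seq[Int] =
    (0 until job.numPartitions).filter(id => !job.finished(id))

  // Assumed scenario: three partitions, only partition 0 has finished.
  val job = new SketchActiveJob(numPartitions = 3)
  job.finished(0) = true
  val missing: Seq[Int] = findMissingPartitions(job)
}
```

Once every flag is true, the same function returns an empty collection, which in this model means there is nothing left to compute for the stage.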

func Property


setActiveJob Method


removeActiveJob Method


activeJob Method

activeJob: Option[ActiveJob]

activeJob returns the optional ActiveJob associated with a ResultStage.
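
How activeJob, setActiveJob and removeActiveJob could fit together is sketched below, assuming the stage simply keeps the job in an optional field. SketchActiveJob and SketchResultStage are hypothetical names for illustration only, and the bodies are an assumption about the behaviour rather than a copy of Spark's code.

```scala
object ActiveJobSketch {
  // Hypothetical placeholder for ActiveJob; only the job id matters here.
  final case class SketchActiveJob(jobId: Int)

  final class SketchResultStage {
    // No job is associated until setActiveJob is called.
    private var _activeJob: Option[SketchActiveJob] = None

    // Returns the associated job, if any.
    def activeJob: Option[SketchActiveJob] = _activeJob

    // Associates a job with the stage (Option(...) maps null to None).
    def setActiveJob(job: SketchActiveJob): Unit = { _activeJob = Option(job) }

    // Clears the association.
    def removeActiveJob(): Unit = { _activeJob = None }
  }
}
```

In this model activeJob is empty before setActiveJob is called and again after removeActiveJob.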

FIXME When and why would that be None (empty)?