Skip to content

ShuffleQueryStageExec Adaptive Leaf Physical Operator

ShuffleQueryStageExec is a QueryStageExec with either a ShuffleExchangeExec or a ReusedExchangeExec child operators.

Creating Instance

ShuffleQueryStageExec takes the following to be created:

ShuffleQueryStageExec is created when:


shuffle: ShuffleExchangeLike

ShuffleQueryStageExec initializes the shuffle internal registry when created.

ShuffleQueryStageExec assumes that the given physical operator is either a ShuffleExchangeLike or a ReusedExchangeExec and extracts the ShuffleExchangeLike.

If not, ShuffleQueryStageExec throws an IllegalStateException:

wrong plan for shuffle stage:

shuffle is used when:

Shuffle MapOutputStatistics Future

shuffleFuture: Future[MapOutputStatistics]

shuffleFuture requests the ShuffleExchangeLike to submit a shuffle job (and eventually produce a MapOutputStatistics (Apache Spark)).

Lazy Value

shuffleFuture is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.

shuffleFuture is used when:


doMaterialize(): Future[Any]

doMaterialize returns the Shuffle MapOutputStatistics Future.

doMaterialize is part of the QueryStageExec abstraction.


cancel(): Unit

cancel cancels the Shuffle MapOutputStatistics Future (unless already completed).

cancel is part of the QueryStageExec abstraction.


  newStageId: Int,
  newOutput: Seq[Attribute]): QueryStageExec

newReuseInstance is...FIXME

newReuseInstance is part of the QueryStageExec abstraction.


mapStats: Option[MapOutputStatistics]

mapStats assumes that the MapOutputStatistics is already available or throws an AssertionError:

assertion failed: ShuffleQueryStageExec should already be ready

mapStats returns the MapOutputStatistics.

mapStats is used when: