Skip to content

CollectLimitExec Physical Operator

CollectLimitExec is a unary physical operator that represents GlobalLimit unary logical operator at execution time.

Creating Instance

CollectLimitExec takes the following to be created:

CollectLimitExec is created when SpecialLimits execution planning strategy is executed (and plans a GlobalLimit unary logical operator).

Executing Physical Operator

doExecute(): RDD[InternalRow]

doExecute requests the child operator to execute and (maps over every partition to) takes the given number of rows from every partition. That gives a RDD[InternalRow].

doExecute prepares a ShuffleDependency (for the RDD[InternalRow] and SinglePartition partitioning) and creates a ShuffledRowRDD.

In the end, doExecute (maps over every partition to) takes the given number of rows from the single partition.

doExecute is part of the SparkPlan abstraction.