CollectLimitExec Physical Operator¶
CollectLimitExec is a unary physical operator that represents the GlobalLimit unary logical operator at execution time.
CollectLimitExec takes the following to be created:
- Number of rows (to collect from the child operator)
- Child physical operator
CollectLimitExec is created when the SpecialLimits execution planning strategy is executed (and plans a GlobalLimit unary logical operator).
Executing Physical Operator¶
doExecute requests the child operator to execute and (maps over every partition to) takes the given number of rows from every partition. That gives a locally limited RDD (with at most the given number of rows in every partition).
doExecute prepares a ShuffleDependency (for the SinglePartition partitioning) and creates a ShuffledRowRDD.
In the end, doExecute (maps over every partition to) takes the given number of rows from the single partition.
doExecute is part of the SparkPlan abstraction.
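The three steps of doExecute (local limit per partition, shuffle to a single partition, final limit) can be illustrated with a minimal sketch in plain Python. This is not Spark code: partitions are modeled as plain lists, the names `take` and `collect_limit` are hypothetical, and flattening the lists is only a stand-in for the ShuffleDependency and ShuffledRowRDD step.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def take(rows: Iterable[Row], limit: int) -> Iterator[Row]:
    """Lazily yield at most `limit` rows (hypothetical helper, not a Spark API)."""
    return islice(rows, max(limit, 0))


def collect_limit(partitions: List[List[Row]], limit: int) -> List[Row]:
    """Simulate the local-limit, single-partition, final-limit sequence."""
    # Step 1: take at most `limit` rows from every partition independently.
    locally_limited = [list(take(partition, limit)) for partition in partitions]

    # Step 2: stand-in for the shuffle that moves all rows into one partition.
    # Assumption: rows are concatenated in partition order; a real shuffle
    # does not promise any particular order.
    single_partition = [row for partition in locally_limited for row in partition]

    # Step 3: take at most `limit` rows from the single partition.
    return list(take(single_partition, limit))


partitions = [[1, 2, 3, 4], [5, 6], [7, 8, 9]]
print(collect_limit(partitions, 3))  # prints [1, 2, 3]
```

The local limit in step 1 is what keeps the shuffle small: at most the given number of rows per partition are moved to the single partition, instead of the whole dataset.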