ProjectExec Unary Physical Operator
ProjectExec is a unary physical operator (with support for Java code generation) that represents the Project logical operator at execution time.
Creating Instance
ProjectExec takes the following to be created:
- NamedExpressions
- Child Physical Operator
ProjectExec is created when:
- BasicOperators execution planning strategy is executed (to plan Project logical operator)
- SparkPlanner is requested to pruneFilterProject
- DataSourceStrategy execution planning strategy is executed
- FileSourceStrategy execution planning strategy is executed
- DataSourceV2Strategy execution planning strategy is executed
- FileFormatWriter is requested to write
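To see the operator in a plan, the following minimal sketch builds a simple projection query and collects the ProjectExec operators from its physical plan. It assumes Spark SQL is on the classpath and runs in local mode. The object name, the query, and the `doubled` alias are made up for illustration. The sketch inspects `queryExecution.sparkPlan`, that is, the plan before preparation rules such as whole-stage code generation are applied.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.ProjectExec

// Hypothetical demo object; not part of Spark.
object ProjectExecInPlan {

  // Builds an illustrative query and returns the ProjectExec operators
  // found in its physical plan (before preparation rules are applied).
  def projectOperators(spark: SparkSession): Seq[ProjectExec] = {
    val query = spark.range(5).selectExpr("id * 2 AS doubled")
    query.queryExecution.sparkPlan.collect { case p: ProjectExec => p }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("ProjectExecInPlan")
      .getOrCreate()
    try {
      projectOperators(spark).foreach { p =>
        // projectList holds the NamedExpressions; child is the child physical operator.
        println(s"projectList = ${p.projectList.mkString(", ")}; child = ${p.child.nodeName}")
      }
    } finally {
      spark.stop()
    }
  }
}
```

The two values printed for each operator correspond to the two arguments ProjectExec takes to be created.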
Executing Physical Operator
doExecute(): RDD[InternalRow]
doExecute is part of the SparkPlan abstraction.
doExecute requests the child physical operator to execute and then applies mapPartitionsWithIndexInternal to the resulting RDD.
mapPartitionsWithIndexInternal
doExecute uses RDD.mapPartitionsWithIndexInternal.
mapPartitionsWithIndexInternal[U](
f: (Int, Iterator[T]) => Iterator[U],
preservesPartitioning: Boolean = false)
doExecute creates an UnsafeProjection for the named expressions and (the output of) the child physical operator.
doExecute requests the UnsafeProjection to initialize and maps over the internal rows (of a partition) using the projection.
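The per-partition steps above can be sketched outside of an RDD. The example below is an illustration under stated assumptions, not the operator's actual source. The object name, the single `id` column standing in for the child's output, and the `id * 2 AS doubled` project list are all hypothetical. It uses Catalyst's `UnsafeProjection.create(exprs, inputSchema)` and `initialize(partitionIndex)`, and a plain `Iterator[InternalRow]` plays the role of one partition.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Literal, Multiply, NamedExpression, UnsafeProjection}
import org.apache.spark.sql.types.LongType

// Hypothetical demo object; not part of Spark.
object ProjectionSketch {

  // Assumed child output: a single non-nullable LongType column named "id".
  val id: AttributeReference = AttributeReference("id", LongType, nullable = false)()

  // Assumed project list: one named expression, `id * 2 AS doubled`.
  val projectList: Seq[NamedExpression] =
    Seq(Alias(Multiply(id, Literal(2L)), "doubled")())

  // Mirrors the described steps for one partition: create the projection,
  // initialize it with the partition index, then map the rows through it.
  def projectPartition(index: Int, rows: Iterator[InternalRow]): Iterator[Long] = {
    val projection = UnsafeProjection.create(projectList, Seq(id))
    projection.initialize(index)
    // An UnsafeProjection reuses its result row, so the value is read out
    // immediately instead of holding on to the returned row.
    rows.map(row => projection(row).getLong(0))
  }

  def main(args: Array[String]): Unit = {
    val partition = Iterator(InternalRow(1L), InternalRow(2L), InternalRow(3L))
    projectPartition(0, partition).foreach(println)
  }
}
```

Creating the projection inside the per-partition function, rather than once on the driver, keeps the projection object local to the task that uses it.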
Output Attributes
output: Seq[Attribute]
output is part of the QueryPlan abstraction.
output is the NamedExpressions converted to Attributes.
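As a small illustration of that conversion, the sketch below assumes it goes through `NamedExpression.toAttribute`. The object name, the column names, and the `id + 1 AS next` expression are hypothetical. An Alias yields an attribute that carries the alias's name and expression ID, while an attribute reference converts to itself.

```scala
import org.apache.spark.sql.catalyst.expressions.{Add, Alias, Attribute, AttributeReference, Literal, NamedExpression}
import org.apache.spark.sql.types.LongType

// Hypothetical demo object; not part of Spark.
object OutputSketch {

  // Assumed input column.
  val id: AttributeReference = AttributeReference("id", LongType, nullable = false)()

  // Assumed named expression: `id + 1 AS next`.
  val next: Alias = Alias(Add(id, Literal(1L)), "next")()

  val projectList: Seq[NamedExpression] = Seq(next, id)

  // Converts every named expression to the attribute it produces.
  val output: Seq[Attribute] = projectList.map(_.toAttribute)

  def main(args: Array[String]): Unit = {
    output.foreach(a => println(s"${a.name}: ${a.dataType.simpleString}"))
  }
}
```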