ProjectExec Unary Physical Operator¶
ProjectExec is a unary physical operator with support for Java code generation that represents the Project logical operator at execution time.
Creating Instance¶
ProjectExec takes the following to be created:
- NamedExpressions
- Child Physical Operator
ProjectExec is created when:
- BasicOperators execution planning strategy is executed (to plan Project logical operator)
- SparkPlanner is requested to pruneFilterProject
- DataSourceStrategy execution planning strategy is executed
- FileSourceStrategy execution planning strategy is executed
- DataSourceV2Strategy execution planning strategy is executed
- FileFormatWriter is requested to write
Executing Physical Operator¶
doExecute(): RDD[InternalRow]
doExecute is part of the SparkPlan abstraction.
doExecute requests the child physical operator to execute and applies mapPartitionsWithIndexInternal to the resulting RDD.
mapPartitionsWithIndexInternal¶
doExecute uses RDD.mapPartitionsWithIndexInternal.
mapPartitionsWithIndexInternal[U](
  f: (Int, Iterator[T]) => Iterator[U],
  preservesPartitioning: Boolean = false)
doExecute creates an UnsafeProjection for the named expressions and (the output of) the child physical operator.
doExecute requests the UnsafeProjection to initialize (with the partition index) and maps over the internal rows (of a partition) using the projection.
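The per-partition flow described above can be sketched with plain Scala collections. This is a toy model, not Spark's code: `Row`, `Projection`, `mapPartitionsWithIndex` and `project` are hypothetical stand-ins for InternalRow, UnsafeProjection, RDD.mapPartitionsWithIndexInternal and doExecute, and partitions are modelled as in-memory sequences.

```scala
// Toy model only: none of these names are Spark classes.
object ProjectModel {
  type Row = Vector[Any]

  // Stand-in for a projection: created once per partition, initialized with
  // the partition index, then applied to every row of that partition.
  final class Projection(exprs: Seq[Row => Any]) extends (Row => Row) {
    var partitionIndex: Int = -1
    def initialize(index: Int): Unit = partitionIndex = index
    def apply(row: Row): Row = exprs.map(e => e(row)).toVector
  }

  // Stand-in for mapPartitionsWithIndexInternal over in-memory partitions.
  def mapPartitionsWithIndex[T, U](partitions: Seq[Seq[T]])(
      f: (Int, Iterator[T]) => Iterator[U]): Seq[Seq[U]] =
    partitions.zipWithIndex.map { case (part, index) =>
      f(index, part.iterator).toVector
    }

  // Mirrors the steps above: build the projection, initialize it, map the rows.
  def project(child: Seq[Seq[Row]], exprs: Seq[Row => Any]): Seq[Seq[Row]] =
    mapPartitionsWithIndex(child) { (index, rows) =>
      val projection = new Projection(exprs)
      projection.initialize(index)
      rows.map(projection)
    }
}
```

In this model the projection is created inside the per-partition function, so each partition gets its own instance and its own index. Row counts and partition boundaries are unchanged; only the columns of each row differ.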
Output Attributes¶
output: Seq[Attribute]
output is part of the QueryPlan abstraction.
output is the NamedExpressions converted to Attributes.
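As a rough illustration of that conversion, the sketch below models named expressions that each know how to present themselves as an attribute. The case classes are hypothetical simplifications (Catalyst's real NamedExpression and Attribute carry more, such as expression IDs and nullability), and data types are plain strings here.

```scala
// Toy model only: simplified stand-ins, not Catalyst's classes.
object OutputModel {
  final case class Attribute(name: String, dataType: String)

  sealed trait NamedExpression { def toAttribute: Attribute }

  // A reference to a child output column is already an attribute.
  final case class AttributeRef(attr: Attribute) extends NamedExpression {
    def toAttribute: Attribute = attr
  }

  // An alias names a computed expression; its attribute carries the alias name.
  final case class Alias(sql: String, dataType: String, name: String)
      extends NamedExpression {
    def toAttribute: Attribute = Attribute(name, dataType)
  }

  // The operator's output schema: one attribute per named expression, in order.
  def output(projectList: Seq[NamedExpression]): Seq[Attribute] =
    projectList.map(_.toAttribute)
}
```

For a projection list of a column reference `id` and an alias `id + 1 AS next`, this model yields the attributes `id` and `next`, in the same order as the list.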