BaseAggregateExec Unary Physical Operators¶
BaseAggregateExec
is an extension of the UnaryExecNode abstraction for aggregate unary physical operators.
BaseAggregateExec
is a PartitioningPreservingUnaryExecNode physical operator.
Contract¶
Aggregate Attributes¶
aggregateAttributes: Seq[Attribute]
Aggregate Attributes
Used when:
- AggregateCodegenSupport is requested to doProduceWithoutKeys
- BaseAggregateExec is requested to verboseStringWithOperatorId, producedAttributes, toSortAggregate
Aggregate Functions¶
aggregateExpressions: Seq[AggregateExpression]
Grouping Keys¶
groupingExpressions: Seq[NamedExpression]
NamedExpressions of the grouping keys
initialInputBufferOffset¶
initialInputBufferOffset: Int
isStreaming¶
isStreaming: Boolean
Used when:
- BaseAggregateExec is requested for the requiredChildDistribution and to toSortAggregate
numShufflePartitions¶
numShufflePartitions: Option[Int]
Used when:
- BaseAggregateExec is requested for the requiredChildDistribution and to toSortAggregate
Required Child Distribution Expressions¶
requiredChildDistributionExpressions: Option[Seq[Expression]]
Used when:
- BaseAggregateExec is requested for the requiredChildDistribution
- DisableUnnecessaryBucketedScan physical optimization is executed
Result Expressions¶
resultExpressions: Seq[NamedExpression]
NamedExpressions of the result
Implementations¶
PartitioningPreservingUnaryExecNode¶
BaseAggregateExec
is a PartitioningPreservingUnaryExecNode.
Detailed Description (with Operator Id)¶
verboseStringWithOperatorId(): String
verboseStringWithOperatorId
is part of the QueryPlan abstraction.
verboseStringWithOperatorId
returns the following text (the fields are described in the table that follows):
[formattedNodeName]
Input [size]: [output]
Keys [size]: [groupingExpressions]
Functions [size]: [aggregateExpressions]
Aggregate Attributes [size]: [aggregateAttributes]
Results [size]: [resultExpressions]
Field | Description |
---|---|
formattedNodeName | (operatorId) nodeName [codegen id : $id] |
Input | Output schema of the single child operator |
Keys | Grouping Keys |
Functions | Aggregate Functions |
Aggregate Attributes | Aggregate Attributes |
Results | Result Expressions |
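The layout above can be sketched without Spark as plain string formatting. This is only an illustration of the text shape, not the Spark implementation: `VerboseStringSketch` and `render` are hypothetical names, expressions are modelled as plain strings, and the `, ` separator and the handling of empty sequences are assumptions.

```scala
// A simplified, Spark-free model of the text layout described above.
// VerboseStringSketch and render are hypothetical names (not Spark API).
object VerboseStringSketch {
  def render(
      formattedNodeName: String,
      childOutput: Seq[String],
      groupingExpressions: Seq[String],
      aggregateExpressions: Seq[String],
      aggregateAttributes: Seq[String],
      resultExpressions: Seq[String]): String = {
    // Each field is rendered as "<name> [<size>]: [<values>]".
    // The ", " separator is an assumption made for this sketch.
    def field(name: String, values: Seq[String]): String = {
      val rendered = values.mkString("[", ", ", "]")
      s"$name [${values.size}]: $rendered"
    }

    Seq(
      formattedNodeName,
      field("Input", childOutput),
      field("Keys", groupingExpressions),
      field("Functions", aggregateExpressions),
      field("Aggregate Attributes", aggregateAttributes),
      field("Results", resultExpressions)
    ).mkString("\n")
  }
}
```

Calling `VerboseStringSketch.render` with one grouping key and one aggregate function gives six lines of text, one per row of the table plus the node name.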
Required Child Output Distribution¶
requiredChildDistribution: List[Distribution]
requiredChildDistribution
is part of the SparkPlan abstraction.
requiredChildDistribution
...FIXME
Produced Attributes (Schema)¶
producedAttributes: AttributeSet
producedAttributes
is part of the QueryPlan abstraction.
producedAttributes
is the union of the following:
- Aggregate Attributes
- Result Expressions that are not Grouping Keys
- Aggregate Buffer Attributes
- inputAggBufferAttributes without the output attributes of the single child operator
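The four parts listed above can be combined as in the following minimal sketch. It models attributes as plain strings and the attribute set as a `Set[String]`, so it only illustrates the set arithmetic and is not Spark's code. `ProducedAttributesSketch` and its parameter `childOutput` are hypothetical names.

```scala
// A Spark-free sketch of how the produced attributes are put together.
// Attributes are modelled as strings; the object name is hypothetical.
object ProducedAttributesSketch {
  def producedAttributes(
      aggregateAttributes: Seq[String],
      resultExpressions: Seq[String],
      groupingExpressions: Seq[String],
      aggregateBufferAttributes: Seq[String],
      inputAggBufferAttributes: Seq[String],
      childOutput: Seq[String]): Set[String] =
    aggregateAttributes.toSet ++
      // Result expressions that are not grouping keys
      resultExpressions.diff(groupingExpressions) ++
      aggregateBufferAttributes ++
      // Input buffer attributes not already produced by the child operator
      inputAggBufferAttributes.filterNot(childOutput.contains)
}
```

In this model, a grouping key that also appears in the result expressions is left out, because the child operator produces it, not the aggregate.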
Aggregate Buffer Attributes (Schema)¶
aggregateBufferAttributes: Seq[AttributeReference]
aggregateBufferAttributes
is the aggBufferAttributes of the AggregateFunctions of all the Aggregate Functions.
aggregateBufferAttributes
is used when:
- AggregateCodegenSupport is requested to supportCodegen, doProduceWithoutKeys
- BaseAggregateExec is requested for the produced attributes
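Collecting the buffer attributes of every aggregate function amounts to a `flatMap` over the aggregate expressions. The sketch below uses simplified stand-in case classes that only mirror the names used in this page. They are assumptions made for illustration, not the Spark classes, and the attribute names are made up.

```scala
// Simplified stand-ins for illustration only (not the Spark classes).
object AggregateBufferSketch {
  final case class AggregateFunction(name: String, aggBufferAttributes: Seq[String])
  final case class AggregateExpression(aggregateFunction: AggregateFunction)

  // Flattens the buffer attributes of all the aggregate functions, in order.
  def aggregateBufferAttributes(
      aggregateExpressions: Seq[AggregateExpression]): Seq[String] =
    aggregateExpressions.flatMap(_.aggregateFunction.aggBufferAttributes)
}
```

For example, an average that keeps a sum and a count in its buffer, followed by a plain count, would contribute three buffer attributes in total.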
Converting This Node to SortAggregateExec¶
toSortAggregate: SortAggregateExec
toSortAggregate
creates a SortAggregateExec physical operator with the same arguments as this node (and hence producing the same result).
toSortAggregate
is used when:
- ReplaceHashWithSortAgg physical optimization is executed (and replaceHashAgg)
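The idea of "same arguments, different operator" can be shown with a small sketch. The case classes below are hypothetical stand-ins whose fields are named after the contract members on this page. They are not the Spark operators and omit the child operator.

```scala
// Hypothetical stand-ins: the shared arguments are bundled in AggArgs
// and handed over unchanged when converting to the sort-based variant.
object ToSortAggregateSketch {
  final case class AggArgs(
      requiredChildDistributionExpressions: Option[Seq[String]],
      isStreaming: Boolean,
      numShufflePartitions: Option[Int],
      groupingExpressions: Seq[String],
      aggregateExpressions: Seq[String],
      aggregateAttributes: Seq[String],
      initialInputBufferOffset: Int,
      resultExpressions: Seq[String])

  final case class SortAggregate(args: AggArgs)

  final case class HashAggregate(args: AggArgs) {
    // Same arguments, so the sort-based operator computes the same result.
    def toSortAggregate: SortAggregate = SortAggregate(args)
  }
}
```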