Aggregate Logical Operator¶
Aggregate is a unary logical operator for Aggregation Queries and can represent the following high-level operators in a logical query plan:
- AstBuilder is requested to visitCommonSelectQueryClausePlan (HAVING clause without GROUP BY) and parse GROUP BY clause
- KeyValueGroupedDataset is requested to agg (and aggUntyped)
- RelationalGroupedDataset is requested to toDF
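To see how two of these high-level operators end up as Aggregate, the following minimal sketch builds one query with the Dataset API and one with SQL, then collects the Aggregate nodes from the analyzed plans. It assumes Spark SQL is on the classpath (e.g. in spark-shell); the view name nums and the column aliases are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.functions.{col, count, lit}

// Reuses the active session in spark-shell, or starts a local one otherwise.
val spark = SparkSession.builder().master("local[*]").appName("aggregate-demo").getOrCreate()

// Dataset API: groupBy gives a RelationalGroupedDataset, agg turns it into a DataFrame.
val grouped = spark.range(10)
  .groupBy((col("id") % 2).as("parity"))
  .agg(count(lit(1)).as("cnt"))

// SQL: a GROUP BY clause parsed by the SQL parser.
// "nums" is a hypothetical view name used only in this sketch.
spark.range(10).createOrReplaceTempView("nums")
val sqlGrouped = spark.sql(
  "SELECT id % 2 AS parity, count(*) AS cnt FROM nums GROUP BY id % 2")

// Collect the Aggregate nodes from the analyzed logical plans.
val fromApi = grouped.queryExecution.analyzed.collect { case a: Aggregate => a }
val fromSql = sqlGrouped.queryExecution.analyzed.collect { case a: Aggregate => a }

println(fromApi.headOption.map(_.simpleString(100)))
println(fromSql.headOption.map(_.simpleString(100)))
```

Both plans should contain an Aggregate node, regardless of which front end created the query.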
Internal Use
Aggregate logical operator is also used internally as part of logical and physical optimizations.
Creating Instance¶
Aggregate takes the following to be created:
- Grouping Expressions
- Aggregate NamedExpressions
- Child LogicalPlan
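As an illustration of these three parts, the sketch below pulls an Aggregate out of an analyzed plan and prints each part. It assumes Spark SQL is on the classpath; the aliases bucket and total are arbitrary names chosen for this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.functions.{col, sum}

val spark = SparkSession.builder().master("local[*]").appName("aggregate-parts").getOrCreate()

// "bucket" and "total" are illustrative aliases, not names used by Spark.
val q = spark.range(6)
  .groupBy((col("id") % 3).as("bucket"))
  .agg(sum("id").as("total"))

// Take the first Aggregate found in the analyzed plan.
val agg = q.queryExecution.analyzed.collectFirst { case a: Aggregate => a }.get

// The three parts Aggregate is created with.
println(s"Grouping expressions:  ${agg.groupingExpressions}")
println(s"Aggregate expressions: ${agg.aggregateExpressions}")
println(s"Child operator:        ${agg.child.nodeName}")
```

With the default settings the grouping column is retained in the result, so the aggregate expressions here include both the grouping column and the sum.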
Aggregate is created when:
- ResolveGroupingAnalytics is requested to constructAggregate
- ResolvePivot logical resolution rule is executed
- GlobalAggregates logical resolution rule is executed
- Catalyst DSL's groupBy operator is used
- DecorrelateInnerQuery is requested to rewriteDomainJoins
- InjectRuntimeFilter is requested to injectBloomFilter and injectInSubqueryFilter
- ReplaceDistinctWithAggregate logical optimization is executed
- ReplaceDeduplicateWithAggregate logical optimization is executed
- RewriteExceptAll logical optimization is executed
- RewriteIntersectAll logical optimization is executed
- RewriteAsOfJoin logical optimization is executed
- AstBuilder is requested to visitCommonSelectQueryClausePlan (for a global aggregate, i.e. HAVING without GROUP BY) and withAggregationClause
- KeyValueGroupedDataset is requested to aggUntyped
- RelationalGroupedDataset is requested to toDF
- PlanAdaptiveDynamicPruningFilters physical optimization is executed
- PlanDynamicPruningFilters physical optimization is executed
- CommandUtils is requested to computeColumnStats and computePercentiles
- RowLevelOperationRuntimeGroupFiltering logical optimization is executed
- others
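Some of these Aggregates only appear after optimization. As a rough demonstration (assuming Spark SQL on the classpath and default optimizer settings), a distinct query has no Aggregate in its analyzed plan but should have one in its optimized plan, where a Deduplicate operator has been replaced with an Aggregate:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[*]").appName("aggregate-distinct").getOrCreate()

// distinct() is planned as a deduplication over all columns.
// The alias "n" is an arbitrary name for this sketch.
val distinctQuery = spark.range(10).select((col("id") % 3).as("n")).distinct()

// Before optimization: no Aggregate yet.
val analyzedAggs = distinctQuery.queryExecution.analyzed.collect { case a: Aggregate => a }

// After optimization: the deduplication is expressed as an Aggregate.
val optimizedAggs = distinctQuery.queryExecution.optimizedPlan.collect { case a: Aggregate => a }

println(s"Aggregates in analyzed plan:  ${analyzedAggs.size}")
println(s"Aggregates in optimized plan: ${optimizedAggs.size}")
```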
Output Schema¶
output is the Attributes of the aggregate expressions.
Metadata Output Schema¶
metadataOutput is empty (Nil).
Node Patterns¶
nodePatterns is AGGREGATE.
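The three properties above can be checked on a live plan. This is a minimal sketch that assumes Spark SQL on the classpath and a Spark version that has tree patterns (TreePattern and containsPattern); since nodePatterns itself is not publicly accessible, the sketch uses containsPattern instead.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.catalyst.trees.TreePattern
import org.apache.spark.sql.functions.{col, max}

val spark = SparkSession.builder().master("local[*]").appName("aggregate-output").getOrCreate()

// "bucket" and "largest" are illustrative aliases.
val q = spark.range(8)
  .groupBy((col("id") % 4).as("bucket"))
  .agg(max("id").as("largest"))
val agg = q.queryExecution.analyzed.collectFirst { case a: Aggregate => a }.get

// output: the attributes of the aggregate expressions
val sameOutput = agg.output == agg.aggregateExpressions.map(_.toAttribute)

// metadataOutput: empty
val noMetadata = agg.metadataOutput.isEmpty

// AGGREGATE tree pattern is registered for the node
val hasPattern = agg.containsPattern(TreePattern.AGGREGATE)

println(s"output matches: $sameOutput, no metadata: $noMetadata, AGGREGATE pattern: $hasPattern")
```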
Checking Requirements for HashAggregateExec¶
supportsHashAggregate(
  aggregateBufferAttributes: Seq[Attribute]): Boolean
supportsHashAggregate builds a StructType for the given aggregateBufferAttributes.
In the end, supportsHashAggregate checks whether that schema is mutable (using isAggregateBufferMutable) and returns the result.
supportsHashAggregate is used when:
- MergeScalarSubqueries is requested to supportedAggregateMerge
- AggUtils is requested to create a physical operator for aggregation
- HashAggregateExec physical operator is created (to assert that the aggregateBufferAttributes are supported)
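A quick way to try supportsHashAggregate is to call it with hand-made buffer attributes. The sketch below assumes the single-argument signature shown above (other Spark versions may differ) and only needs the Catalyst classes on the classpath; the attribute names are invented for the example.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.types.{LongType, StringType}

// Buffer attributes with fixed-width types only (illustrative names).
val fixedWidth = Seq(
  AttributeReference("sum", LongType)(),
  AttributeReference("count", LongType)())

// A buffer attribute with a variable-length type.
val withString = Seq(AttributeReference("max_name", StringType)())

val hashOk = Aggregate.supportsHashAggregate(fixedWidth)
val hashNotOk = Aggregate.supportsHashAggregate(withString)

println(s"Long buffers supported:   $hashOk")
println(s"String buffers supported: $hashNotOk")
```

Long values can be updated in place in an aggregation buffer, while strings cannot, which is why only the first buffer layout is expected to qualify for hash-based aggregation.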
isAggregateBufferMutable¶
isAggregateBufferMutable(
  schema: StructType): Boolean
isAggregateBufferMutable is enabled (true) when the types of all the fields (in the given schema) are mutable.
isAggregateBufferMutable is used when:
- Aggregate is requested to check the requirements for HashAggregateExec
- UnsafeFixedWidthAggregationMap is requested to supportsAggregationBufferSchema
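The all-fields-mutable rule can be tried directly on a schema. This is a small sketch, assuming the Catalyst classes are on the classpath and that isAggregateBufferMutable is callable on the Aggregate companion object with the signature shown above; the field names are made up.

```scala
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.types.{DecimalType, DoubleType, LongType, StringType, StructType}

// Every field has a type that can be updated in place.
val mutableSchema = new StructType()
  .add("cnt", LongType)
  .add("avg", DoubleType)
  .add("total", DecimalType(38, 18))

// Adding a single string field is enough to fail the check.
val immutableSchema = mutableSchema.add("name", StringType)

val allMutable = Aggregate.isAggregateBufferMutable(mutableSchema)
val notAllMutable = Aggregate.isAggregateBufferMutable(immutableSchema)

println(s"mutableSchema:   $allMutable")
println(s"immutableSchema: $notAllMutable")
```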