Aggregate Logical Operator¶
Aggregate is a unary logical operator for Aggregation Queries and can represent the following high-level operators in a logical query plan:
- AstBuilder is requested to visitCommonSelectQueryClausePlan (HAVING clause without GROUP BY) and parse GROUP BY clause
- KeyValueGroupedDataset is requested to agg (and aggUntyped)
- RelationalGroupedDataset is requested to toDF
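As a rough illustration of the RelationalGroupedDataset entry, the sketch below builds a small grouped query with the Dataset API and prints the Aggregate nodes found in its logical plan. It assumes a local Spark 3.x session with spark-sql on the classpath; the object name AggregateInPlanDemo, the helper aggregatesIn, and the parity column are made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan}

object AggregateInPlanDemo {
  // Hypothetical helper: collects every Aggregate node found in a logical plan.
  def aggregatesIn(plan: LogicalPlan): Seq[Aggregate] =
    plan.collect { case a: Aggregate => a }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("aggregate-in-plan")
      .getOrCreate()
    import spark.implicits._
    try {
      // groupBy(...) gives a RelationalGroupedDataset; count() goes through toDF.
      val grouped = spark.range(10).groupBy(($"id" % 2).as("parity")).count()

      // Print the Aggregate operators of the (unanalyzed) logical plan.
      aggregatesIn(grouped.queryExecution.logical)
        .foreach(a => println(a.numberedTreeString))
    } finally {
      spark.stop()
    }
  }
}
```

The same helper could be pointed at queryExecution.analyzed or queryExecution.optimizedPlan to see how the operator looks after later phases.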
Internal Use
Aggregate logical operator is also used internally as part of logical and physical optimizations.
Creating Instance¶
Aggregate takes the following to be created:
- Grouping Expressions
- Aggregate NamedExpressions
- Child LogicalPlan
Aggregate is created when:
- ResolveGroupingAnalytics is requested to constructAggregate
- ResolvePivot logical resolution rule is executed
- GlobalAggregates logical resolution rule is executed
- Catalyst DSL's groupBy operator is used
- DecorrelateInnerQuery is requested to rewriteDomainJoins
- InjectRuntimeFilter is requested to injectBloomFilter and injectInSubqueryFilter
- ReplaceDistinctWithAggregate logical optimization is executed
- ReplaceDeduplicateWithAggregate logical optimization is executed
- RewriteExceptAll logical optimization is executed
- RewriteIntersectAll logical optimization is executed
- RewriteAsOfJoin logical optimization is executed
- AstBuilder is requested to visitCommonSelectQueryClausePlan (for a global aggregate, i.e. HAVING without GROUP BY) and withAggregationClause
- KeyValueGroupedDataset is requested to aggUntyped
- RelationalGroupedDataset is requested to toDF
- PlanAdaptiveDynamicPruningFilters physical optimization is executed
- PlanDynamicPruningFilters physical optimization is executed
- CommandUtils is requested to computeColumnStats and computePercentiles
- RowLevelOperationRuntimeGroupFiltering logical optimization is executed
- others
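One of the creation paths listed above, the Catalyst DSL's groupBy operator, can be tried without a Spark session. The following is a minimal sketch, assuming spark-catalyst is on the classpath; the object name CatalystDslGroupByDemo and the id and name columns are hypothetical. It reads the three components through accessors rather than a constructor pattern, since the exact constructor shape may differ between Spark versions.

```scala
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LocalRelation, LogicalPlan}
import org.apache.spark.sql.types.{LongType, StringType}

object CatalystDslGroupByDemo {
  // Hypothetical input columns; the names are made up for the example.
  val id = AttributeReference("id", LongType)()
  val name = AttributeReference("name", StringType)()
  val relation = LocalRelation(id, name)

  // groupBy(grouping expressions)(aggregate named expressions) from the Catalyst DSL.
  val plan: LogicalPlan = relation.groupBy(name)(name, count(id).as("cnt"))

  def main(args: Array[String]): Unit = plan match {
    case a: Aggregate =>
      // The three parts an Aggregate takes to be created.
      println(s"grouping:  ${a.groupingExpressions}")
      println(s"aggregate: ${a.aggregateExpressions}")
      println(s"child:     ${a.child}")
    case other =>
      println(s"unexpected plan: $other")
  }
}
```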
Output Schema¶
output is the Attributes of the aggregate expressions.
Metadata Output Schema¶
metadataOutput is empty (Nil).
Node Patterns¶
nodePatterns is AGGREGATE.
Checking Requirements for HashAggregateExec¶
supportsHashAggregate(
aggregateBufferAttributes: Seq[Attribute]): Boolean
supportsHashAggregate builds a StructType for the given aggregateBufferAttributes.
In the end, supportsHashAggregate returns the result of isAggregateBufferMutable for that StructType.
supportsHashAggregate is used when:
- MergeScalarSubqueries is requested to supportedAggregateMerge
- AggUtils is requested to create a physical operator for aggregation
- HashAggregateExec physical operator is created (to assert that the aggregateBufferAttributes are supported)
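The check can be exercised directly on the Aggregate companion object. This is a sketch under the assumption that the single-argument signature shown above is the one available in the Spark version on the classpath (other versions may take additional parameters); the object name and the buffer attribute names are hypothetical.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.Aggregate
import org.apache.spark.sql.types.{LongType, StringType}

object SupportsHashAggregateDemo {
  // Hypothetical aggregation buffer attributes.
  val sumBuffer = AttributeReference("sum", LongType)()       // fixed-width type
  val textBuffer = AttributeReference("first", StringType)()  // variable-length type

  def main(args: Array[String]): Unit = {
    // A buffer made only of fixed-width fields.
    println(Aggregate.supportsHashAggregate(Seq(sumBuffer)))
    // A buffer that also holds a string field.
    println(Aggregate.supportsHashAggregate(Seq(sumBuffer, textBuffer)))
  }
}
```

With a buffer that contains only a LongType field the check should pass, while adding a StringType field should make it fail, because a string cannot be updated in place in a fixed-width buffer.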
isAggregateBufferMutable¶
isAggregateBufferMutable(
schema: StructType): Boolean
isAggregateBufferMutable is enabled (true) when the types of all the fields (in the given schema) are mutable.
isAggregateBufferMutable is used when:
- Aggregate is requested to check the requirements for HashAggregateExec
- UnsafeFixedWidthAggregationMap is requested to supportsAggregationBufferSchema
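The "all fields are mutable" rule can be sketched as a fold over the schema fields. The per-type test used here, UnsafeRow.isMutable, is an assumption about how mutability is decided and is not stated on this page; the object name MutableBufferSketch and the helper allFieldsMutable are hypothetical and only model the behaviour described above.

```scala
import org.apache.spark.sql.catalyst.expressions.UnsafeRow
import org.apache.spark.sql.types.{DoubleType, LongType, StringType, StructType}

object MutableBufferSketch {
  // Assumption: a field type counts as mutable when UnsafeRow.isMutable accepts it.
  def allFieldsMutable(schema: StructType): Boolean =
    schema.forall(field => UnsafeRow.isMutable(field.dataType))

  def main(args: Array[String]): Unit = {
    // Only fixed-width fields.
    val numericBuffer = new StructType().add("sum", LongType).add("avg", DoubleType)
    // One variable-length field is enough to change the answer.
    val mixedBuffer = numericBuffer.add("label", StringType)

    println(allFieldsMutable(numericBuffer))
    println(allFieldsMutable(mixedBuffer))
  }
}
```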