AggregateExpression¶
AggregateExpression is an unevaluable expression that acts as a container (wrapper) for an AggregateFunction.
Creating Instance¶
AggregateExpression takes the following to be created:
- AggregateFunction
- AggregateMode
- isDistinct flag
- (optional) Filter Expression
- Result ExprId
AggregateExpression is created using apply utility.
AggregateMode¶
AggregateExpression is given an AggregateMode when created.
- For PartialMerge or Final modes, the input to the AggregateFunction is immutable input aggregation buffers, and the actual children of the AggregateFunction are not used
- AggregateExpressions of an AggregationIterator cannot have more than 2 distinct modes, and the modes cannot mix Partial or PartialMerge with Final or Complete
- Partial and PartialMerge or Final and Complete pairs are supported
Complete¶
Final¶
Partial¶
- Partial aggregation
- partial_ prefix (in toString)
PartialMerge¶
- merge_ prefix (in toString)
Creating AggregateExpression¶
apply(
aggregateFunction: AggregateFunction,
mode: AggregateMode,
isDistinct: Boolean,
filter: Option[Expression] = None): AggregateExpression
apply creates an AggregateExpression with a new autogenerated ExprId.
apply is used when:
- AggregateFunction is requested to toAggregateExpression
- others
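The following is a minimal sketch of calling apply directly, meant to be pasted into spark-shell (it assumes spark-catalyst is on the classpath). The choice of Count over a literal and the value names (countFn, partialCount, anotherCount, viaFunction) are illustrative only.

```scala
// Intended for spark-shell (needs spark-catalyst on the classpath)
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Count, Partial}

// An aggregate function to wrap: count(1) (illustrative choice)
val countFn = Count(Literal(1))

// apply with an explicit mode; filter is left at its default (None)
val partialCount = AggregateExpression(countFn, Partial, isDistinct = false)

// A second call to apply gets its own autogenerated ExprId
val anotherCount = AggregateExpression(countFn, Partial, isDistinct = false)
assert(partialCount.resultId != anotherCount.resultId)

// The AggregateFunction can also be requested to wrap itself
val viaFunction = countFn.toAggregateExpression()
```

Because the ExprId is generated inside apply, two expressions built from the same function and mode still differ in their resultId.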
Human-Friendly Textual Representation¶
toString: String
toString returns the following text:
[prefix][name]([args]) FILTER (WHERE [predicate])
toString converts the mode to a prefix.
mode | prefix |
---|---|
Partial | partial_ |
PartialMerge | merge_ |
Final or Complete | (empty) |
toString requests the AggregateFunction for the toAggString (with the isDistinct flag).
In the end, toString adds FILTER (WHERE [predicate]) based on the optional filter expression.
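The steps above can be sketched in plain Scala without a Spark dependency. This is a simplified model, not Spark's code: the Mode objects are stand-ins for the real AggregateMode objects, and render with its aggString parameter (standing in for the result of toAggString) is a hypothetical helper.

```scala
object AggregateToStringSketch {
  // Simplified stand-ins for Spark's AggregateMode objects (not the real classes)
  sealed trait Mode
  case object Partial extends Mode
  case object PartialMerge extends Mode
  case object Final extends Mode
  case object Complete extends Mode

  // Mode-to-prefix mapping, following the table above
  def prefix(mode: Mode): String = mode match {
    case Partial          => "partial_"
    case PartialMerge     => "merge_"
    case Final | Complete => ""
  }

  // Hypothetical helper: aggString stands in for what toAggString would return,
  // filter is the optional predicate already rendered as text
  def render(mode: Mode, aggString: String, filter: Option[String]): String = {
    val filterClause = filter.map(p => s" FILTER (WHERE $p)").getOrElse("")
    prefix(mode) + aggString + filterClause
  }
}
```

For example, render(Partial, "count(1)", None) gives partial_count(1), while render(Complete, "sum(x)", Some("(x > 0)")) gives sum(x) FILTER (WHERE (x > 0)).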
Review Me¶
AggregateExpression contains the following:
- AggregateFunction
- AggregateMode
- isDistinct flag indicating whether this aggregation is distinct or not (e.g. whether SQL's DISTINCT keyword was used for the aggregate function)
- Result ExprId
AggregateExpression is created when:
- Analyzer is requested to resolve AggregateFunctions (and creates an AggregateExpression with Complete aggregate mode for the functions)
- UserDefinedAggregateFunction is created with isDistinct flag disabled or enabled
- AggUtils is requested to planAggregateWithOneDistinct (and creates AggregateExpressions with Partial and Final aggregate modes for the functions)
- Aggregator is requested for a TypedColumn (using Aggregator.toColumn)
- AggregateFunction is wrapped in an AggregateExpression (using toAggregateExpression)
toString's Prefixes per AggregateMode
Prefix | AggregateMode |
---|---|
partial_ | Partial |
merge_ | PartialMerge |
(empty) | Final or Complete |
AggregateExpression's Properties
Name | Description |
---|---|
canonicalized | AggregateExpression with the AggregateFunction expression canonicalized and the special ExprId as 0 |
children | AggregateFunction expression (for which AggregateExpression was created) |
dataType | DataType of the AggregateFunction expression |
foldable | Disabled (i.e. false) |
nullable | Whether or not the AggregateFunction expression is nullable |
references | AttributeSet with references of the AggregateFunction when the AggregateMode is Partial or Complete, or aggBufferAttributes of the AggregateFunction when PartialMerge or Final |
resultAttribute | Attribute that is an AttributeReference when the AggregateFunction is itself resolved, or an UnresolvedAttribute otherwise |
sql | Requests the AggregateFunction to generate SQL output (with the isDistinct flag) |
toString | Prefix per AggregateMode followed by the AggregateFunction's toAggString (with the isDistinct flag) |