ScalaAggregator¶
ScalaAggregator[IN, BUF, OUT] is a TypedImperativeAggregate (of BUF values).
ScalaAggregator is a UserDefinedExpression.
Creating Instance¶
ScalaAggregator takes the following to be created:
- Children Expressions
- Aggregator
- Input ExpressionEncoder (of
INs) - Buffer ExpressionEncoder (of
BUFs) -
nullableflag (default:false) -
isDeterministicflag (default:true) -
mutableAggBufferOffset(default:0) -
inputAggBufferOffset(default:0) - Aggregator Name (default: undefined)
ScalaAggregator is created when:
UserDefinedAggregatoris requested to scalaAggregator
Aggregator¶
ScalaAggregator is given an Aggregator when created.
| ScalaAggregator | Aggregator |
|---|---|
| outputEncoder | outputEncoder |
| createAggregationBuffer | zero |
| update | reduce |
| merge | merge |
| Interpreted Execution | finish |
| name | Simple class name |
Interpreted Execution¶
eval(
buffer: BUF): Any
eval is part of the TypedImperativeAggregate abstraction.
eval requests the Aggregator to finish with the given (reduction) buffer.
eval requests the outputSerializer to convert the result (of type OUT to an InternalRow).
In the end, eval returns one of the following:
- The row if isSerializedAsStruct (per the outputEncoder)
- The object at the 0th index (that is assumed to be of DataType)
Logical Analysis¶
The input and buffer encoders are resolved and bound using ResolveEncodersInScalaAgg logical resolution rule.
Execution Planning¶
ScalaAggregator (as a TypedImperativeAggregate) uses aggBufferAttributes with BinaryType.
BinaryType is among unsupported types of HashAggregateExec and makes the physical operator out of scope for aggregation planning.
Because of this BinaryType (in an aggregation buffer) ScalaAggregator will always be planned as ObjectHashAggregateExec or SortAggregateExec physical operators.