AggregationIterators¶
AggregationIterator
is an abstraction of aggregation iterators of UnsafeRows.
abstract class AggregationIterator(...)
extends Iterator[UnsafeRow]
From scala.collection.Iterator:
Iterators are data structures that allow to iterate over a sequence of elements. They have a
hasNext
method for checking if there is a next element available, and anext
method which returns the next element and discards it from the iterator.
Implementations¶
Creating Instance¶
AggregationIterator
takes the following to be created:
- Partition ID
- Grouping NamedExpressions
- Input Attributes
- AggregateExpressions
- Aggregate Attributes
- Initial input buffer offset
- Result NamedExpressions
- Function to create a new
MutableProjection
given expressions and attributes ((Seq[Expression], Seq[Attribute]) => MutableProjection
)
Abstract Class
AggregationIterator
is an abstract class and cannot be created directly. It is created indirectly for the concrete AggregationIterators.
AggregateModes¶
When created, AggregationIterator
makes sure that there are at most 2 distinct AggregateMode
s of the AggregateExpressions.
The AggregateMode
s have to be a subset of the following mode pairs:
Partial
andPartialMerge
Final
andComplete
AggregateFunctions¶
aggregateFunctions: Array[AggregateFunction]
When created, AggregationIterator
initializes AggregateFunctions in the aggregateExpressions (with initialInputBufferOffset).
initializeAggregateFunctions¶
initializeAggregateFunctions(
expressions: Seq[AggregateExpression],
startingInputBufferOffset: Int): Array[AggregateFunction]
initializeAggregateFunctions
...FIXME
initializeAggregateFunctions
is used when:
AggregationIterator
is requested for the aggregateFunctionsObjectAggregationIterator
is requested for the mergeAggregationBuffersTungstenAggregationIterator
is requested to switchToSortBasedAggregation
Generating Process Row Function¶
generateProcessRow(
expressions: Seq[AggregateExpression],
functions: Seq[AggregateFunction],
inputAttributes: Seq[Attribute]): (InternalRow, InternalRow) => Unit
generateProcessRow
is a procedure
generateProcessRow
creates a Scala function (procedure) that takes two InternalRows and produces no output.
def generateProcessRow(currentBuffer: InternalRow, row: InternalRow): Unit = {
...
}
generateProcessRow
creates a mutable JoinedRow
(of two InternalRows).
generateProcessRow
branches off based on the given AggregateExpressions (expressions
).
With no AggregateExpressions (expressions
), generateProcessRow
creates a function that does nothing (and "swallows" the input).
functions
Argument
generateProcessRow
works differently based on the type of the given AggregateFunctions:
Otherwise, with some AggregateExpressions (expressions
), generateProcessRow
...FIXME
generateProcessRow
is used when:
AggregationIterator
is requested for the processRow functionObjectAggregationIterator
is requested for the mergeAggregationBuffers functionTungstenAggregationIterator
is requested to switchToSortBasedAggregation
generateOutput¶
generateOutput: (UnsafeRow, InternalRow) => UnsafeRow
When created, AggregationIterator
creates a ResultProjection function.
generateOutput
is used when:
ObjectAggregationIterator
is requested for the next element and to outputForEmptyGroupingKeyWithoutInputSortBasedAggregationIterator
is requested for the next element and to outputForEmptyGroupingKeyWithoutInputTungstenAggregationIterator
is requested for the next element and to outputForEmptyGroupingKeyWithoutInput
generateResultProjection¶
generateResultProjection(): (UnsafeRow, InternalRow) => UnsafeRow
generateResultProjection
...FIXME
initializeBuffer¶
initializeBuffer(
buffer: InternalRow): Unit
initializeBuffer
requests the expressionAggInitialProjection to store an execution result of an empty row in the given InternalRow (buffer
).
initializeBuffer
requests all the ImperativeAggregate functions to initialize with the buffer
internal row.
initializeBuffer
is used when:
MergingSessionsIterator
is requested tonewBuffer
,initialize
,next
,outputForEmptyGroupingKeyWithoutInput
SortBasedAggregationIterator
is requested to newBuffer, initialize, next and outputForEmptyGroupingKeyWithoutInput