BloomFilterAggregate is a TypedImperativeAggregate expression that uses BloomFilter for an aggregation buffer.
BloomFilterAggregate takes the following to be created:
- Child Expression
- Estimated Number of Items
- Number of Bits
- Mutable Agg Buffer Offset (default:
- Input Agg Buffer Offset (default:
BloomFilterAggregate is created when:
- InjectRuntimeFilter logical optimization is requested to inject a BloomFilter
Estimated Number of Items Expression
BloomFilterAggregate can be given Estimated Number of Items (as an Expression) when created.
When not given, BloomFilterAggregate defaults to the value of the spark.sql.optimizer.runtime.bloomFilter.expectedNumItems configuration property.
Number of Bits Expression
BloomFilterAggregate can be given Number of Bits (as an Expression) when created.
The number of bits expression must be a constant literal (i.e., foldable) that evaluates to a long value.
The maximum value for the number of bits is spark.sql.optimizer.runtime.bloomFilter.maxNumBits configuration property.
The number of bits expression is the third expression (in this TernaryLike tree node).
Number of Bits
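The constraints above (a foldable expression in the third position that evaluates to a long) can be pictured with a small, Spark-free model. This is only a sketch: Expr, Lit, ColumnRef, ToyBloomFilterAggregate and validateNumBits are hypothetical names made up for illustration and are not Spark's Catalyst classes.

```scala
// Toy expression model (hypothetical types, not Spark's Catalyst API).
sealed trait Expr {
  def foldable: Boolean // true when the value is known without an input row
  def eval(): Any
}

// A constant: always foldable.
final case class Lit(value: Any) extends Expr {
  val foldable: Boolean = true
  def eval(): Any = value
}

// A column reference: needs an input row, so it is not foldable.
final case class ColumnRef(name: String) extends Expr {
  val foldable: Boolean = false
  def eval(): Any =
    throw new UnsupportedOperationException(s"$name needs an input row")
}

// Three children in fixed positions, in the spirit of a ternary tree node.
final case class ToyBloomFilterAggregate(first: Expr, second: Expr, third: Expr) {
  def child: Expr = first
  def estimatedNumItemsExpression: Expr = second
  def numBitsExpression: Expr = third

  // Right(numBits) when the third expression is a constant long, Left(reason) otherwise.
  def validateNumBits(): Either[String, Long] =
    if (!numBitsExpression.foldable) {
      Left("The number of bits must be a constant literal")
    } else {
      numBitsExpression.eval() match {
        case n: Long => Right(n)
        case other   => Left(s"The number of bits must be a long value, but got $other")
      }
    }
}
```

In this model, ToyBloomFilterAggregate(ColumnRef("id"), Lit(1000L), Lit(8192L)).validateNumBits() succeeds, while passing a column reference or an Int literal as the third expression is rejected.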
numBits is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.
Learn more in the Scala Language Specification.
numBits is the smaller of the value of the numBitsExpression (after evaluating it to a number) and spark.sql.optimizer.runtime.bloomFilter.maxNumBits.
numBits must be a positive value.
numBits is used to create an aggregation buffer.
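The following is a minimal sketch of the rules above (evaluate once, cap at the maximum, require a positive value). It does not use Spark: NumBitsSketch, its parameters and the evaluations counter are hypothetical names, and the function argument merely stands in for evaluating the number of bits expression.

```scala
// Simplified, Spark-free model of how numBits could be derived.
// All names here are hypothetical and exist only for illustration.
final class NumBitsSketch(numBitsExpression: () => Long, maxNumBits: Long) {
  // Counts how often the initializer below runs (demonstration only).
  var evaluations: Int = 0

  // A lazy val is initialized on first access and the result is cached.
  lazy val numBits: Long = {
    evaluations += 1
    val requested = numBitsExpression()
    // Mirrors the "must be a positive value" rule.
    require(requested > 0, s"The number of bits must be a positive value, but got $requested")
    // Cap the requested size at the configured maximum.
    math.min(requested, maxNumBits)
  }
}
```

With a requested value of 100 and a maximum of 64, numBits in this model is 64, and reading it repeatedly runs the initializer only once.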
Creating Aggregation Buffer
createAggregationBuffer is part of the TypedImperativeAggregate abstraction.
createAggregationBuffer creates a BloomFilter (with the estimated number of items and the number of bits).
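To make the idea of a Bloom filter as an aggregation buffer concrete, here is a toy, Spark-free sketch. ToyBloomFilter and ToyBloomFilterBuffer are hypothetical stand-ins and not Spark's BloomFilter; the hash-function count uses the textbook formula k = (m / n) * ln 2, and the tuple-hash scheme is an assumption chosen for brevity.

```scala
import java.util.BitSet

// Toy stand-in for a Bloom filter (NOT Spark's BloomFilter); names are hypothetical.
// numBits is an Int here, so this toy cannot model filters above Int.MaxValue bits.
final class ToyBloomFilter(val numBits: Int, val numHashFunctions: Int) {
  private val bits = new BitSet(numBits)

  // Derives numHashFunctions bit positions for an item by hashing (item, seed) pairs.
  private def positions(item: Long): Seq[Int] =
    (0 until numHashFunctions).map(seed => Math.floorMod((item, seed).hashCode(), numBits))

  def put(item: Long): Unit = positions(item).foreach(i => bits.set(i))

  // false means "definitely absent"; true means "possibly present".
  def mightContain(item: Long): Boolean = positions(item).forall(i => bits.get(i))
}

object ToyBloomFilterBuffer {
  // Textbook choice of hash-function count: k = (m / n) * ln 2, and at least 1.
  def optimalNumHashFunctions(estimatedNumItems: Long, numBits: Long): Int =
    math.max(1, math.round(numBits.toDouble / estimatedNumItems * math.log(2)).toInt)

  // Models createAggregationBuffer: a fresh, empty filter sized from the two inputs.
  def createAggregationBuffer(estimatedNumItems: Long, numBits: Long): ToyBloomFilter =
    new ToyBloomFilter(numBits.toInt, optimalNumHashFunctions(estimatedNumItems, numBits))
}
```

Each call to createAggregationBuffer in this sketch returns a new, empty filter, which matches the role of an aggregation buffer: it starts empty and is filled as input rows are processed.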
eval(buffer: BloomFilter): Any
eval is part of the TypedImperativeAggregate abstraction.