StatefulOpClusteredDistribution¶
StatefulOpClusteredDistribution is a Distribution (Spark SQL).
StatefulOpClusteredDistribution requires that the expressions are specified (non-empty). Otherwise, it throws an exception:
The expressions for hash of a StatefulOpClusteredDistribution should not be Nil.
An AllTuples should be used to represent a distribution that only has a single partition.
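As a minimal sketch of this check, the snippet below uses Scala's require to reject an empty list of expressions at construction time. It is a simplified stand-in and not Spark's source: Expr and ClusteredSketch are hypothetical names, and a plain case class takes the place of a Catalyst Expression. It is written as a Scala script (for example, for the REPL).

```scala
// Hypothetical stand-in for a Catalyst Expression (an assumption, not Spark's type).
final case class Expr(name: String)

// Simplified model of a distribution that refuses an empty list of expressions.
final case class ClusteredSketch(expressions: Seq[Expr], requiredNumPartitions: Int) {
  require(
    expressions.nonEmpty,
    "The expressions for hash of a StatefulOpClusteredDistribution should not be Nil. " +
      "An AllTuples should be used to represent a distribution that only has a single partition.")
}

// A non-empty list of expressions is accepted.
val ok = ClusteredSketch(Seq(Expr("id")), 200)

// An empty list fails at construction time; require throws IllegalArgumentException.
val failure = scala.util.Try(ClusteredSketch(Nil, 200))
```

Because the check runs in the constructor, an invalid distribution can never be observed by later planning steps.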
Creating Instance¶
StatefulOpClusteredDistribution takes the following to be created:
- Expressions (Spark SQL)
- Required number of partitions
StatefulOpClusteredDistribution is created when:
- StatefulOperatorPartitioning is requested to getCompatibleDistribution
- StreamingSymmetricHashJoinExec is requested for the required child output distribution
Required Number of Partitions¶
StatefulOpClusteredDistribution is given a required number of partitions when created.
requiredNumPartitions is part of the Distribution (Spark SQL) abstraction.
Partitioning¶
createPartitioning(
numPartitions: Int): Partitioning
createPartitioning is part of the Distribution (Spark SQL) abstraction.
createPartitioning asserts that the given numPartitions is exactly the required number of partitions. Otherwise, it throws an exception:
This StatefulOpClusteredDistribution requires [requiredNumPartitions] partitions,
but the actual number of partitions is [numPartitions].
createPartitioning creates a HashPartitioning (Spark SQL) with the expressions and the given numPartitions.
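The two steps of createPartitioning (assert the partition count, then build a hash partitioning) can be sketched as follows. This is an illustrative model under stated assumptions, not Spark's implementation: KeyExpr, HashPartitioningSketch and DistributionSketch are hypothetical stand-ins for the Catalyst types, written as a Scala script.

```scala
// Hypothetical stand-ins for Catalyst types (assumptions, not Spark's classes).
final case class KeyExpr(name: String)
final case class HashPartitioningSketch(expressions: Seq[KeyExpr], numPartitions: Int)

final case class DistributionSketch(expressions: Seq[KeyExpr], requiredNumPartitions: Int) {
  def createPartitioning(numPartitions: Int): HashPartitioningSketch = {
    // Reject any partition count other than the one fixed at creation.
    assert(
      requiredNumPartitions == numPartitions,
      s"This StatefulOpClusteredDistribution requires $requiredNumPartitions partitions, " +
        s"but the actual number of partitions is $numPartitions.")
    // Hash-partition on the same expressions with the validated partition count.
    HashPartitioningSketch(expressions, numPartitions)
  }
}

val dist = DistributionSketch(Seq(KeyExpr("id")), 200)

// Matching partition count: a hash partitioning is returned.
val partitioning = dist.createPartitioning(200)

// Mismatched partition count: the assertion fails (captured here with Try).
val mismatch = scala.util.Try(dist.createPartitioning(10))
```

In this sketch, the mismatch surfaces as a java.lang.AssertionError, which scala.util.Try captures because it is not one of the fatal error types.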