StatefulOpClusteredDistribution

StatefulOpClusteredDistribution is a Distribution (Spark SQL).

StatefulOpClusteredDistribution requires that the Expressions are specified (not Nil) or throws an IllegalArgumentException:

The expressions for hash of a StatefulOpClusteredDistribution should not be Nil.
An AllTuples should be used to represent a distribution that only has a single partition.
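The requirement above can be tried out in spark-shell. This is a minimal sketch, assuming a Spark version that ships StatefulOpClusteredDistribution (3.3 or later is assumed here); the key attribute and the value 200 are hypothetical and used only for illustration.

```scala
import scala.util.Try
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Expression}
import org.apache.spark.sql.catalyst.plans.physical.StatefulOpClusteredDistribution
import org.apache.spark.sql.types.StringType

// Hypothetical grouping key (an assumption, not taken from any real query)
val key = AttributeReference("key", StringType)()

// Non-empty expressions: the distribution is created
val ok = Try(StatefulOpClusteredDistribution(Seq(key), 200))

// Empty expressions: the requirement fails with an IllegalArgumentException
val failed = Try(StatefulOpClusteredDistribution(Seq.empty[Expression], 200))
```

Wrapping the calls in Try keeps the failure as a value, so the exception can be inspected without aborting the shell session.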

Creating Instance

StatefulOpClusteredDistribution takes the following to be created: the Expressions and the required number of partitions.

StatefulOpClusteredDistribution is created when:

Required Number of Partitions

StatefulOpClusteredDistribution is given a required number of partitions when created.

requiredNumPartitions is part of the Distribution (Spark SQL) abstraction.

Partitioning

createPartitioning(
  numPartitions: Int): Partitioning

createPartitioning is part of the Distribution (Spark SQL) abstraction.


createPartitioning asserts that the given numPartitions equals the required number of partitions and throws an exception otherwise:

This StatefulOpClusteredDistribution requires [requiredNumPartitions] partitions,
but the actual number of partitions is [numPartitions].

createPartitioning creates a HashPartitioning (Spark SQL) with the expressions and the given numPartitions.
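The behavior of requiredNumPartitions and createPartitioning described above can be sketched in spark-shell as follows. The sketch assumes Spark 3.3 or later; the key attribute and the partition counts (200 and 10) are hypothetical values chosen for illustration.

```scala
import scala.util.Try
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, StatefulOpClusteredDistribution}
import org.apache.spark.sql.types.StringType

// Hypothetical grouping key (an assumption for this example)
val key = AttributeReference("key", StringType)()
val dist = StatefulOpClusteredDistribution(Seq(key), 200)

// The required number of partitions is exposed as an Option[Int]
val required = dist.requiredNumPartitions

// Matching number of partitions: a HashPartitioning is created
val partitioning = dist.createPartitioning(200)

// Mismatching number of partitions: the assertion fails.
// Try captures the resulting AssertionError as a Failure.
val mismatch = Try(dist.createPartitioning(10))
```

Unlike a regular clustered distribution that may leave the number of partitions open, the distribution here insists on the exact count, which is why the mismatching call fails instead of producing a partitioning with 10 partitions.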