HashClusteredDistribution
HashClusteredDistribution is a Distribution.md[Distribution] that creates a HashPartitioning for the hash expressions and a requested number of partitions.
[[requiredNumPartitions]] HashClusteredDistribution specifies None for the Distribution.md#requiredNumPartitions[required number of partitions].
Note
None for the required number of partitions means that any number of partitions is acceptable (possibly the value of the spark.sql.shuffle.partitions configuration property).
HashClusteredDistribution is the required child distribution of the following physical operators: CoGroupExec, ShuffledHashJoinExec.md[ShuffledHashJoinExec], SortMergeJoinExec.md[SortMergeJoinExec] and Spark Structured Streaming's StreamingSymmetricHashJoinExec.
[[creating-instance]][[expressions]] HashClusteredDistribution takes hash expressions/Expression.md[expressions] when created.
HashClusteredDistribution requires that the hash expressions are not empty (i.e. Nil).
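The creation-time contract can be sketched in plain Scala. This is a minimal model and does not use Spark's classes: Expr and SketchHashClusteredDistribution are hypothetical stand-ins, and the error message is illustrative only.

```scala
// Simplified stand-ins, NOT Spark's actual classes.
// Expr is a hypothetical placeholder for a Catalyst Expression.
final case class Expr(name: String)

// Models a distribution that takes only hash expressions when created.
final case class SketchHashClusteredDistribution(expressions: Seq[Expr]) {
  // The hash expressions must not be empty (i.e. Nil).
  require(expressions.nonEmpty, "The hash expressions should not be Nil")

  // None: any number of partitions is acceptable.
  val requiredNumPartitions: Option[Int] = None
}
```

In this sketch, creating the distribution with Nil fails with an IllegalArgumentException (thrown by require), while any non-empty sequence of expressions is accepted.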
HashClusteredDistribution is used when:
- EnsureRequirements is executed (for Adaptive Query Execution)
- HashPartitioning is requested to satisfies
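One way to picture the satisfies check is sketched below. It assumes that a hash partitioning satisfies the distribution only when its expressions match the distribution's hash expressions one by one and in the same order. All types are hypothetical stand-ins, and plain equality is used in place of Catalyst's semantic equality.

```scala
// Simplified stand-ins, NOT Spark's actual classes.
final case class Expr(name: String)
final case class SketchHashClusteredDistribution(expressions: Seq[Expr])

final case class SketchHashPartitioning(expressions: Seq[Expr], numPartitions: Int) {
  // Assumption: plain equality stands in for Catalyst's semantic equality.
  // The expressions must match pairwise and in the same order.
  def satisfies(required: SketchHashClusteredDistribution): Boolean =
    expressions.length == required.expressions.length &&
      expressions.zip(required.expressions).forall { case (l, r) => l == r }
}
```

Under these assumptions, a partitioning on (a, b) satisfies a distribution on (a, b), but not one on (b, a) or on (a) alone.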
=== [[createPartitioning]] createPartitioning Method
[source, scala]
----
createPartitioning(
  numPartitions: Int): Partitioning
----
createPartitioning creates a HashPartitioning for the hash expressions and the input numPartitions.
createPartitioning is part of the Distribution abstraction.
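The behavior of createPartitioning can be illustrated with a small self-contained sketch. The types below are hypothetical stand-ins rather than Spark's classes, and the customer_id expression used in the usage note is made up for illustration.

```scala
// Simplified stand-ins, NOT Spark's actual classes.
final case class Expr(name: String)

sealed trait Partitioning
final case class SketchHashPartitioning(expressions: Seq[Expr], numPartitions: Int)
  extends Partitioning

final case class SketchHashClusteredDistribution(expressions: Seq[Expr]) {
  require(expressions.nonEmpty, "The hash expressions should not be Nil")

  // Pairs the distribution's own hash expressions with the requested partition count.
  def createPartitioning(numPartitions: Int): Partitioning =
    SketchHashPartitioning(expressions, numPartitions)
}
```

For example, calling createPartitioning(200) on SketchHashClusteredDistribution(Seq(Expr("customer_id"))) gives a SketchHashPartitioning over the same single expression with 200 partitions.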