Repartition Logical Operator
Repartition is a concrete RepartitionOperation.
Creating Instance¶
Repartition takes the following to be created:
- Number of partitions
- shuffle flag
- Child LogicalPlan
Repartition is created for the following:
- Dataset.coalesce and Dataset.repartition operators (with shuffle disabled and enabled, respectively)
- COALESCE and REPARTITION hints (via ResolveCoalesceHints logical analysis rule, with shuffle disabled and enabled, respectively)
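
The two Dataset operators can be compared with a minimal sketch like the one below. It assumes Spark SQL is on the classpath and that a local session is acceptable. The object name RepartitionCreationDemo and the describe helper are illustrative and not part of Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Repartition}

// Illustrative helper object (not part of Spark).
object RepartitionCreationDemo {
  // Returns (numPartitions, shuffle) when the top logical operator is a Repartition.
  def describe(plan: LogicalPlan): Option[(Int, Boolean)] = plan match {
    case Repartition(numPartitions, shuffle, _) => Some((numPartitions, shuffle))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for inspecting plans.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("repartition-creation-demo")
      .getOrCreate()
    try {
      val ds = spark.range(100)
      // Dataset.coalesce: shuffle flag expected to be disabled.
      println(describe(ds.coalesce(2).queryExecution.logical))
      // Dataset.repartition: shuffle flag expected to be enabled.
      println(describe(ds.repartition(4).queryExecution.logical))
    } finally {
      spark.stop()
    }
  }
}
```

Pattern matching on the plan avoids depending on the exact text that a particular Spark version prints for the operator.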
Query Planning¶
Repartition is planned to the following physical operators based on shuffle flag:
- ShuffleExchangeExec with shuffle enabled
- CoalesceExec with shuffle disabled
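
One way to observe this choice is to look for the physical operator in the planned query. The following is a sketch under assumptions: Spark SQL is on the classpath, a local session is used, and the object RepartitionPlanningDemo with its physicalOperatorName helper is illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.{CoalesceExec, SparkPlan}
import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

// Illustrative helper object (not part of Spark).
object RepartitionPlanningDemo {
  // Finds the first shuffle or coalesce operator in a physical plan, if any.
  def physicalOperatorName(plan: SparkPlan): Option[String] =
    plan.collectFirst {
      case _: ShuffleExchangeExec => "ShuffleExchangeExec"
      case _: CoalesceExec        => "CoalesceExec"
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("repartition-planning-demo")
      .getOrCreate()
    try {
      val ds = spark.range(100)
      // sparkPlan is the physical plan before execution preparations are applied.
      println(physicalOperatorName(ds.repartition(4).queryExecution.sparkPlan))
      println(physicalOperatorName(ds.coalesce(2).queryExecution.sparkPlan))
    } finally {
      spark.stop()
    }
  }
}
```

Calling explain() on either Dataset is a quicker alternative when reading the plan by eye is enough.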
Catalyst DSL¶
Catalyst DSL defines the following operators to create a Repartition logical operator:
- coalesce (with shuffle disabled)
- repartition (with shuffle enabled)
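
The DSL operators can be tried without a SparkSession. This sketch assumes the spark-catalyst module is on the classpath; the table name t1 and the object CatalystDslRepartitionDemo are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Repartition}

// Illustrative helper object (not part of Spark).
object CatalystDslRepartitionDemo {
  // An unresolved relation is enough here, since the plans are never analyzed or executed.
  val t1: LogicalPlan = table("t1")

  val coalesced: LogicalPlan = t1.coalesce(2)        // shuffle disabled
  val repartitioned: LogicalPlan = t1.repartition(4) // shuffle enabled

  // Extracts the shuffle flag when the plan is a Repartition.
  def shuffleFlag(plan: LogicalPlan): Option[Boolean] = plan match {
    case r: Repartition => Some(r.shuffle)
    case _              => None
  }

  def main(args: Array[String]): Unit = {
    println(shuffleFlag(coalesced))
    println(shuffleFlag(repartitioned))
  }
}
```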
Partitioning¶
partitioning: Partitioning
partitioning uses the number of partitions (numPartitions) to determine the Partitioning:
- SinglePartition for 1
- RoundRobinPartitioning otherwise
partitioning requires that the shuffle flag is enabled or throws an exception:
Partitioning can only be used in shuffle.
partitioning is part of the RepartitionOperation abstraction.
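
The rules above can be modelled in a few lines of dependency-free Scala. This is a simplified sketch, not Spark's source: the types PartitioningModel, SinglePartitionModel and RoundRobinModel are stand-ins for Spark's Partitioning classes, and the use of require to raise the exception is an assumption.

```scala
// Stand-in types for illustration; these are NOT Spark's Partitioning classes.
object PartitioningSketch {
  sealed trait PartitioningModel
  case object SinglePartitionModel extends PartitioningModel
  final case class RoundRobinModel(numPartitions: Int) extends PartitioningModel

  // Mirrors the described behaviour: only valid with shuffle enabled,
  // a single partition for 1, round-robin for any other count.
  def partitioningFor(numPartitions: Int, shuffle: Boolean): PartitioningModel = {
    // Assumption: the check is a precondition that throws IllegalArgumentException.
    require(shuffle, "Partitioning can only be used in shuffle.")
    numPartitions match {
      case 1 => SinglePartitionModel
      case n => RoundRobinModel(n)
    }
  }

  def main(args: Array[String]): Unit = {
    println(partitioningFor(1, shuffle = true))
    println(partitioningFor(8, shuffle = true))
    // With shuffle disabled the precondition fails.
    println(scala.util.Try(partitioningFor(8, shuffle = false)).isFailure)
  }
}
```

The model makes the coalesce case explicit: a Repartition created with shuffle disabled has no Partitioning to report, so asking for one is treated as an error.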