Repartition Logical Operator¶
Repartition
is a concrete RepartitionOperation.
Creating Instance¶
Repartition
takes the following to be created:
- Number of partitions
- shuffle flag
- Child LogicalPlan
Repartition
is created for the following:
- Dataset.coalesce and Dataset.repartition operators (with shuffle disabled and enabled, respectively)
COALESCE
andREPARTITION
hints (via ResolveCoalesceHints logical analysis rule, with shuffle disabled and enabled, respectively)
Query Planning¶
Repartition
is planned to the following physical operators based on shuffle flag:
- ShuffleExchangeExec with
shuffle
enabled - CoalesceExec with
shuffle
disabled
Catalyst DSL¶
Catalyst DSL defines the following operators to create a Repartition
logical operator:
- coalesce (with shuffle disabled)
- repartition (with shuffle enabled)
Partitioning¶
partitioning: Partitioning
partitioning
uses the numPartitions to determine the Partitioning:
- SinglePartition for
1
- RoundRobinPartitioning otherwise
partitioning
requires that the shuffle flag is enabled or throws an exception:
Partitioning can only be used in shuffle.
partitioning
is part of the RepartitionOperation abstraction.