Skip to content

CoalesceExec Unary Physical Operator

CoalesceExec is a unary physical operator to...FIXME...with numPartitions number of partitions and a child spark plan.

CoalesceExec represents Repartition logical operator at execution (when shuffle was disabled -- see BasicOperators execution planning strategy). When executed, it executes the input child and calls spark-rdd-partitions.md#coalesce[coalesce] on the result RDD (with shuffle disabled).

Please note that since physical operators present themselves without the suffix Exec, CoalesceExec is the Coalesce in the Physical Plan section in the following example:

[source, scala]

scala> df.rdd.getNumPartitions res6: Int = 8

scala> df.coalesce(1).rdd.getNumPartitions res7: Int = 1

scala> df.coalesce(1).explain(extended = true) == Parsed Logical Plan == Repartition 1, false +- LocalRelation [value#1]

== Analyzed Logical Plan == value: int Repartition 1, false +- LocalRelation [value#1]

== Optimized Logical Plan == Repartition 1, false +- LocalRelation [value#1]

== Physical Plan == Coalesce 1 +- LocalTableScan [value#1]


output collection of spark-sql-Expression-Attribute.md[Attribute] matches the child's (since CoalesceExec is about changing the number of partitions not the internal representation).

outputPartitioning returns a SinglePartition when the input numPartitions is 1 while a UnknownPartitioning partitioning scheme for the other cases.