AQEUtils
Getting Required Distribution
getRequiredDistribution(
p: SparkPlan): Option[Distribution]
getRequiredDistribution determines the required Distribution for the given SparkPlan (if there are any user-specified repartition hints):
- For ShuffleExchangeExec physical operators with HashPartitioning and REPARTITION_BY_COL or REPARTITION_BY_NUM shuffle origins, getRequiredDistribution returns a HashClusteredDistribution
- For FilterExec, (non-global) SortExec and CollectMetricsExec physical operators, getRequiredDistribution skips them and determines the required distribution using their child operator
- For ProjectExec physical operators, getRequiredDistribution finds a HashClusteredDistribution using the child operator
- For all other operators, getRequiredDistribution returns UnspecifiedDistribution
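The cases above can be sketched as a recursive pattern match. The following is a minimal, Spark-free model: every class in it (Scan, HashShuffle, Filter, Sort, CollectMetrics, Project, and the simplified Distribution and ShuffleOrigin types) is a hypothetical stand-in for the real physical operators, not Spark's API. Two details are assumptions that the description above does not state: that only REPARTITION_BY_NUM pins the number of partitions, and that a projection keeps the distribution only when it retains all clustering expressions.

```scala
// A toy model of the decision logic. None of these types are Spark classes.
object RequiredDistributionSketch {
  sealed trait Distribution
  case object UnspecifiedDistribution extends Distribution
  final case class HashClusteredDistribution(
      expressions: Seq[String],
      requiredNumPartitions: Option[Int]) extends Distribution

  sealed trait ShuffleOrigin
  case object OtherOrigin extends ShuffleOrigin      // e.g. a shuffle added by the planner
  case object RepartitionByCol extends ShuffleOrigin // stands in for REPARTITION_BY_COL
  case object RepartitionByNum extends ShuffleOrigin // stands in for REPARTITION_BY_NUM

  sealed trait Plan
  final case class Scan(table: String) extends Plan
  final case class HashShuffle(
      keys: Seq[String],
      numPartitions: Int,
      origin: ShuffleOrigin,
      child: Plan) extends Plan
  final case class Filter(child: Plan) extends Plan
  final case class Sort(global: Boolean, child: Plan) extends Plan
  final case class CollectMetrics(child: Plan) extends Plan
  final case class Project(columns: Seq[String], child: Plan) extends Plan

  def getRequiredDistribution(p: Plan): Option[Distribution] = p match {
    // A user-specified hash repartition yields a hash-clustered distribution.
    case HashShuffle(keys, n, origin, _)
        if origin == RepartitionByCol || origin == RepartitionByNum =>
      // Assumption: only a repartition by number fixes the partition count.
      val required = if (origin == RepartitionByNum) Some(n) else None
      Some(HashClusteredDistribution(keys, required))

    // These operators are skipped: the answer comes from their child.
    case Filter(child)         => getRequiredDistribution(child)
    case Sort(false, child)    => getRequiredDistribution(child) // non-global sort only
    case CollectMetrics(child) => getRequiredDistribution(child)

    // Assumption: a projection keeps the child's hash-clustered distribution
    // only if it still outputs every clustering expression.
    case Project(columns, child) =>
      getRequiredDistribution(child).flatMap {
        case h: HashClusteredDistribution =>
          if (h.expressions.forall(e => columns.contains(e))) Some(h) else None
        case other => Some(other)
      }

    // Everything else (including a global sort) has no specific requirement.
    case _ => Some(UnspecifiedDistribution)
  }
}
```

In this model, a Filter over a non-global Sort over a HashShuffle with RepartitionByNum resolves to the shuffle's HashClusteredDistribution, while a global Sort or a plain Scan falls through to UnspecifiedDistribution.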
getRequiredDistribution is used when:
- AdaptiveSparkPlanExec physical operator is requested for the required distribution