PartitionPruning Logical Optimization¶
PartitionPruning
is a logical optimization for Dynamic Partition Pruning.
PartitionPruning
is a Rule[LogicalPlan]
(a rule for logical operators).
PartitionPruning
is part of the PartitionPruning batch of the SparkOptimizer.
Executing Rule¶
apply(
plan: LogicalPlan): LogicalPlan
For Subquery
operators that are correlated
, apply
simply does nothing and gives it back unmodified.
apply
does nothing when the spark.sql.optimizer.dynamicPartitionPruning.enabled configuration property is disabled (false
).
For all other cases, apply
applies prune optimization.
apply
is part of the Rule abstraction.
prune Internal Method¶
prune(
plan: LogicalPlan): LogicalPlan
prune
transforms up all logical operators in the given logical query plan.
prune
leaves Join operators unmodified when either operators are Filter
s with DynamicPruningSubquery condition.
prune
transforms Join operators of the following "shape":
-
EqualTo join conditions
-
Any expressions are attributes of a LogicalRelation over a HadoopFsRelation
-
The join type is one of
Inner
,LeftSemi
,RightOuter
,LeftOuter
More Work Needed
prune
needs more love and would benefit from more insight on how it works.
prune
is used when PartitionPruning
is executed.