PushPredicateThroughJoin Logical Optimization¶
PushPredicateThroughJoin
is a catalyst/Rule.md[Catalyst rule] for transforming spark-sql-LogicalPlan.md[logical plans] (i.e. Rule[LogicalPlan]
).
When <PushPredicateThroughJoin
...FIXME
PushPredicateThroughJoin
is a part of the Operator Optimization before Inferring Filters and Operator Optimization after Inferring Filters fixed-point rule batches of the base Logical Optimizer.
[[demo]] .Demo: PushPredicateThroughJoin
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
// Using hacks to disable two Catalyst DSL implicits
implicit def symbolToColumn(ack: ThatWasABadIdea) = ack
implicit class StringToColumn(val sc: StringContext) {}
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
val t1 = LocalRelation('a.int, 'b.int)
val t2 = LocalRelation('C.int, 'D.int).where('C > 10)
val plan = t1.join(t2)...FIXME
import org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin
val optimizedPlan = PushPredicateThroughJoin(plan)
scala> println(optimizedPlan.numberedTreeString)
...FIXME
=== [[apply]] Executing Rule -- apply
Method
[source, scala]¶
apply( plan: LogicalPlan): LogicalPlan
apply
...FIXME
apply
is part of the Rule abstraction.
=== [[split]] split
Internal Method
[source, scala]¶
split( condition: Seq[Expression], left: LogicalPlan, right: LogicalPlan): (Seq[Expression], Seq[Expression], Seq[Expression])
split
splits (partitions) the given condition expressions into expressions/Expression.md#deterministic[deterministic] or not.
split
further splits (partitions) the deterministic expressions (pushDownCandidates) into expressions that reference the catalyst/QueryPlan.md#outputSet[output expressions] of the left logical operator (leftEvaluateCondition) or not (rest).
split
further splits (partitions) the expressions that do not reference left output expressions into expressions that reference the catalyst/QueryPlan.md#outputSet[output expressions] of the right logical operator (rightEvaluateCondition) or not (commonCondition).
In the end, split
returns the leftEvaluateCondition, rightEvaluateCondition, and commonCondition with the non-deterministic condition expressions.
NOTE: split
is used when PushPredicateThroughJoin
is <