Intersect Logical Operator¶
Intersect
is a SetOperation
binary logical operator that represents the following high-level operators in a logical plan:
- INTERSECT SQL statement
- Dataset.intersect and Dataset.intersectAll operators
Intersect
is replaced at logical optimization phase (based on isAll flag):
Logical Operators | Logical Optimization | isAll |
---|---|---|
Left Semi Join | ReplaceIntersectWithSemiJoin | disabled |
Generate over Aggregate over Union | RewriteIntersectAll | enabled |
Spark Structured Streaming Unsupported
Intersect
is not supported on streaming DataFrames/Datasets (that is enforced by UnsupportedOperationChecker
at QueryExecution).
Creating Instance¶
Intersect
takes the following to be created:
- Left LogicalPlan
- Right LogicalPlan
-
isAll
flag
Intersect
is created when:
AstBuilder
is requested to parse INTERSECT statement- Dataset.intersect operator is used (isAll is
false
) - Dataset.intersectAll operator is used (isAll is
true
)
Catalyst DSL¶
Catalyst DSL comes with intersect operator to create an Intersect
operator.
intersect(
otherPlan: LogicalPlan,
isAll: Boolean): LogicalPlan