Intersect Logical Operator¶
Intersect is a SetOperation binary logical operator that represents the following high-level operators in a logical plan:
- INTERSECT SQL statement
- Dataset.intersect and Dataset.intersectAll operators
Intersect is replaced at logical optimization phase (based on isAll flag):
| Logical Operators | Logical Optimization | isAll |
|---|---|---|
| Left Semi Join | ReplaceIntersectWithSemiJoin | disabled |
| Generate over Aggregate over Union | RewriteIntersectAll | enabled |
Spark Structured Streaming Unsupported
Intersect is not supported on streaming DataFrames/Datasets (that is enforced by UnsupportedOperationChecker at QueryExecution).
Creating Instance¶
Intersect takes the following to be created:
- Left LogicalPlan
- Right LogicalPlan
-
isAllflag
Intersect is created when:
AstBuilderis requested to parse INTERSECT statement- Dataset.intersect operator is used (isAll is
false) - Dataset.intersectAll operator is used (isAll is
true)
Catalyst DSL¶
Catalyst DSL comes with intersect operator to create an Intersect operator.
intersect(
otherPlan: LogicalPlan,
isAll: Boolean): LogicalPlan