Union Logical Operator¶
Union is a logical operator that represents the following high-level operators in a logical plan:
- UNION SQL statement
- Dataset.union, Dataset.unionAll and Dataset.unionByName operators
Union is resolved using the ResolveUnion logical analysis rule.
Union is planned as the UnionExec physical operator by the BasicOperators execution planning strategy.
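The following is a minimal sketch (not taken from the Spark sources) of how Dataset.union surfaces as a Union logical operator and a UnionExec physical operator. It assumes a local SparkSession; in spark-shell the `spark` value already exists and the builder line can be skipped. The application name and the sample ranges are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Union

// Assumption: a local session is enough for illustration
val spark = SparkSession.builder().master("local[*]").appName("union-demo").getOrCreate()

val left = spark.range(3)
val right = spark.range(3, 6)
val q = left.union(right)

// Analyzed logical plan: Union is expected at the root
val analyzed = q.queryExecution.analyzed
println(analyzed.numberedTreeString)

// Physical plan (before preparation rules): UnionExec is expected here
println(q.queryExecution.sparkPlan.numberedTreeString)
```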
Creating Instance¶
Union takes the following to be created:
- Child LogicalPlans
- byName flag (default: false)
- allowMissingCol flag (default: false)
Note
allowMissingCol can be true only with byName being true.
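As a hedged illustration of the two flags working together, the sketch below uses the public Dataset.unionByName operator with allowMissingColumns enabled (available in Spark 3.1 and later, to the best of my knowledge), which should produce a Union with both byName and allowMissingCol turned on. The session setup, column names and sample rows are assumptions made for the example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell `spark` is already defined
val spark = SparkSession.builder().master("local[*]").appName("union-by-name-demo").getOrCreate()
import spark.implicits._

val people = Seq((1L, "a")).toDF("id", "name")
val ages = Seq((2L, 30)).toDF("id", "age")

// Columns are matched by name; columns missing on either side are filled with nulls
val q = people.unionByName(ages, allowMissingColumns = true)

// Logical plan before analysis, that is before ResolveUnion rewrites it
println(q.queryExecution.logical.numberedTreeString)
q.show()
```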
Union is created (possibly using apply utility) when:
- AstBuilder is requested to visitFromStatement and visitMultiInsertQuery
- Dataset is requested to flattenUnion (for Dataset.union and Dataset.unionByName operators)
- Dataset.unionByName operator is used
Creating Union¶
apply(
left: LogicalPlan,
right: LogicalPlan): Union
apply creates a Union logical operator (with the left and right plans as the children operators).
apply is used when:
- ResolveUnion logical resolution rule is executed
- RewriteUpdateTable is requested to buildReplaceDataWithUnionPlan
- RewriteExceptAll logical optimization is executed
- RewriteIntersectAll logical optimization is executed
- AstBuilder is requested to parse UNION SQL statement
- Dataset.union operator is used
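The two-argument apply can be tried directly on Catalyst plans. This is a sketch only: the two single-column LocalRelation plans are made up for the example and stand in for any pair of logical plans.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Union}
import org.apache.spark.sql.types.LongType

// Hypothetical input plans: two empty relations with a single `id` column each
val left = LocalRelation(AttributeReference("id", LongType)())
val right = LocalRelation(AttributeReference("id", LongType)())

// Union.apply(left, right) wraps both plans as the children of a new Union
val union = Union(left, right)
println(union.numberedTreeString)
```

Both flags keep their defaults here, so the operator is a positional (not by-name) union.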
Maximum Number of Records¶
maxRows is the total of the maxRows of all the children.
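A small sketch of the summing behaviour, assuming a local SparkSession and using limit so that each child reports a known upper bound on its row count. The ranges and limits are arbitrary sample values.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell `spark` is already defined
val spark = SparkSession.builder().master("local[*]").appName("union-maxrows-demo").getOrCreate()

// limit gives each side of the union a bounded number of rows
val q = spark.range(10).limit(3).union(spark.range(10).limit(4))

val unionPlan = q.queryExecution.analyzed
val childMaxRows = unionPlan.children.map(_.maxRows)

println(s"children maxRows: $childMaxRows")
println(s"union maxRows: ${unionPlan.maxRows}")
```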
Node Patterns¶
nodePatterns is UNION.
Metadata Output Attributes¶
metadataOutput is empty.
Catalyst DSL¶
Catalyst DSL comes with a union operator to create a Union operator.
union(
otherPlan: LogicalPlan): LogicalPlan
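A brief usage sketch of the DSL operator. The table names t1 and t2 are placeholders, so the resulting plan contains unresolved relations and is meant for inspection rather than execution.

```scala
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.logical.Union

// Hypothetical tables; `table` creates unresolved relations
val t1 = table("t1")
val t2 = table("t2")

// The implicit DSL wrapper adds `union` to any LogicalPlan
val plan = t1.union(t2)
println(plan.numberedTreeString)
```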
Logical Optimizations¶
- EliminateUnions
- CombineUnions
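As an illustration of one of these optimizations, the sketch below applies the CombineUnions rule directly to a nested Union. It assumes the rule flattens adjacent Union operators that share the same flags into a single Union; the three LocalRelation plans are invented for the example.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.optimizer.CombineUnions
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Union}
import org.apache.spark.sql.types.LongType

// Hypothetical input plans with a single `id` column each
val a = LocalRelation(AttributeReference("id", LongType)())
val b = LocalRelation(AttributeReference("id", LongType)())
val c = LocalRelation(AttributeReference("id", LongType)())

// A Union nested inside another Union
val nested = Union(Union(a, b), c)

// Apply the rule on its own, outside the full optimizer
val combined = CombineUnions(nested)
println(combined.numberedTreeString)
```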