InSubquery Expression¶
InSubquery is a Predicate that represents the following IN SQL predicate in a logical query plan:
NOT? IN '(' query ')'
InSubquery can also be used internally for other use cases (e.g., Runtime Filtering, Dynamic Partition Pruning).
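As a minimal sketch of how the SQL predicate above ends up as an InSubquery, the following parses two queries with CatalystSqlParser and collects the InSubquery expressions from the unresolved plans. The table names t1 and t2 and the column id are made up for illustration, and the snippet assumes a spark-shell session (or any Scala REPL with Spark SQL on the classpath).

```scala
import org.apache.spark.sql.catalyst.expressions.{InSubquery, Not}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// t1, t2 and id are hypothetical names; parsing alone does not need them to exist
val plan = CatalystSqlParser.parsePlan(
  "SELECT * FROM t1 WHERE id IN (SELECT id FROM t2)")

// Walk the operators of the parsed plan and pick up every InSubquery expression
val inSubqueries = plan.flatMap { operator =>
  operator.expressions.flatMap(_.collect { case e: InSubquery => e })
}

// NOT IN is parsed as a Not expression wrapping an InSubquery
val notInPlan = CatalystSqlParser.parsePlan(
  "SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2)")
val negated = notInPlan.flatMap { operator =>
  operator.expressions.flatMap(_.collect { case n @ Not(_: InSubquery) => n })
}
```

Printing plan.numberedTreeString in the same session is a quick way to see where the expression sits in the parsed plan.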
Creating Instance¶
InSubquery takes the following to be created:
- Values (Expressions)
- ListQuery
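The two arguments can be put together by hand, which is sometimes handy for experimenting with optimizer rules. This is only a sketch: the table t2 and the column id are hypothetical, and it assumes that ListQuery can be built from a logical plan alone, relying on default values for its remaining parameters (which differ between Spark versions).

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.{Expression, InSubquery, ListQuery}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// The subquery side: a logical plan wrapped in a ListQuery (t2 is a hypothetical table)
val subquery = CatalystSqlParser.parsePlan("SELECT id FROM t2")
val listQuery = ListQuery(subquery)

// The value side: one or more expressions compared against the subquery output
val values: Seq[Expression] = Seq(UnresolvedAttribute("id"))

val inSubquery = InSubquery(values, listQuery)
```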
InSubquery is created when:
- InjectRuntimeFilter logical optimization is executed (and injectInSubqueryFilter)
- AstBuilder is requested to withPredicate (for NOT? IN '(' query ')' SQL predicate)
- PlanDynamicPruningFilters physical optimization is executed (with spark.sql.optimizer.dynamicPartitionPruning.enabled enabled)
- RowLevelOperationRuntimeGroupFiltering logical optimization is executed
Unevaluable¶
InSubquery is an Unevaluable expression.
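In practice, being Unevaluable means that calling eval on the expression fails instead of producing a value. A small sketch of that behaviour follows; the names t2 and id are hypothetical, and the exact exception type is not relied upon, since it may vary across Spark versions.

```scala
import scala.util.Try

import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.{InSubquery, ListQuery}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

val inSubquery = InSubquery(
  Seq(UnresolvedAttribute("id")),
  ListQuery(CatalystSqlParser.parsePlan("SELECT id FROM t2")))

// eval is expected to throw: the expression has to be rewritten before execution
val evalAttempt = Try(inSubquery.eval())
```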
InSubquery can be converted to a Join operator at logical optimization using RewritePredicateSubquery:
- Left-Semi Join, unless it is a NOT IN, which becomes a Left-Anti Join (among other, less important cases)
InSubquery can also be converted to InSubqueryExec expression (over a SubqueryExec) in PlanSubqueries physical optimization.
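The rewrite to a join can be observed by inspecting the optimized logical plan of a query. The sketch below assumes a Spark 3.x session (in newer releases access to queryExecution may look different) and uses two throwaway temporary views, t1 and t2, that exist only for this illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.{JoinType, LeftAnti, LeftSemi}
import org.apache.spark.sql.catalyst.plans.logical.Join

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("in-subquery-to-join")
  .getOrCreate()

// Hypothetical demo views, each with a single id column
spark.range(10).createOrReplaceTempView("t1")
spark.range(5).createOrReplaceTempView("t2")

// Collect the join types that appear in the optimized logical plan of a query
def joinTypesOf(query: String): Seq[JoinType] =
  spark.sql(query).queryExecution.optimizedPlan.collect { case j: Join => j.joinType }

val semi = joinTypesOf("SELECT * FROM t1 WHERE id IN (SELECT id FROM t2)")
val anti = joinTypesOf("SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2)")
```

With these simple views, semi should hold a LeftSemi join type and anti a LeftAnti one; more complex predicates may be rewritten differently.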
Logical Analysis¶
InSubquery is resolved using the following logical analysis rules:
- ResolveSubquery
- InConversion
Logical Optimization¶
InSubquery is optimized using the following logical optimizations:
- NullPropagation (so null values give null results)
- InjectRuntimeFilter
- RewritePredicateSubquery
Physical Optimization¶
InSubquery is optimized using the following physical optimizations:
- PlanSubqueries
Catalyst DSL¶
InSubquery can be created using the in operator of Catalyst DSL (via ImplicitOperators).
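A short sketch of the DSL in use follows. It assumes that the in operator yields an InSubquery when its only argument is a ListQuery and a regular In expression otherwise. The names id and t2 are hypothetical.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.expressions.{Expression, In, InSubquery, ListQuery, Literal}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// Typed as Expression so that the generic DSL conversion supplies the in operator
val id: Expression = UnresolvedAttribute("id")
val listQuery = ListQuery(CatalystSqlParser.parsePlan("SELECT id FROM t2"))

// A single ListQuery argument gives an InSubquery
val withSubquery = id.in(listQuery)

// Any other argument list gives a regular In expression
val withLiterals = id.in(Literal(1), Literal(2))
```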
nodePatterns¶
nodePatterns is IN_SUBQUERY.
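The tree pattern is what lets rules skip trees that cannot contain an InSubquery. A sketch of that check, using containsPattern rather than reading nodePatterns directly, could look as follows (assuming Spark 3.2 or later, where tree patterns are available; t2 and id are hypothetical names).

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.{InSubquery, ListQuery, Not}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.trees.TreePattern.IN_SUBQUERY

val inSubquery = InSubquery(
  Seq(UnresolvedAttribute("id")),
  ListQuery(CatalystSqlParser.parsePlan("SELECT id FROM t2")))

// The pattern is visible on the expression itself...
val hasPattern = inSubquery.containsPattern(IN_SUBQUERY)

// ...and on any parent expression, since pattern bits propagate up the tree
val parentHasPattern = Not(inSubquery).containsPattern(IN_SUBQUERY)

// An expression without an InSubquery underneath does not carry the pattern
val attrHasPattern = UnresolvedAttribute("id").containsPattern(IN_SUBQUERY)
```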