InSubquery Expression¶
InSubquery
is a Predicate that represents the following IN SQL predicate in a logical query plan:
NOT? IN '(' query ')'
InSubquery
can also be used internally for other use cases (e.g., Runtime Filtering, Dynamic Partition Pruning).
InSubquery
is an Unevaluable expression.
Creating Instance¶
InSubquery
takes the following to be created:
- Values (Expressions)
- ListQuery
InSubquery
is created when:
- InjectRuntimeFilter logical optimization is executed (and injectInSubqueryFilter)
AstBuilder
is requested to withPredicate (forNOT? IN '(' query ')'
SQL predicate)- PlanDynamicPruningFilters physical optimization is executed (with spark.sql.optimizer.dynamicPartitionPruning.enabled enabled)
RowLevelOperationRuntimeGroupFiltering
logical optimization is executed
Unevaluable¶
InSubquery
is an Unevaluable expression.
InSubquery
can be converted to a Join operator at logical optimization using RewritePredicateSubquery:
- Left-Semi Join unless it is a
NOT IN
that becomes a Left-Anti Join (among the other less important cases)
InSubquery
can also be converted to InSubqueryExec expression (over a SubqueryExec) in PlanSubqueries physical optimization.
Logical Analysis¶
InSubquery
is resolved using the following logical analysis rules:
- ResolveSubquery
InConversion
Logical Optimization¶
InSubquery
is optimized using the following logical optimizations:
- NullPropagation (so
null
values givenull
results) - InjectRuntimeFilter
- RewritePredicateSubquery
Physical Optimization¶
InSubquery
is optimized using the following physical optimizations:
Catalyst DSL¶
InSubquery
can be created using in operator using Catalyst DSL (via ImplicitOperators
).
nodePatterns¶
nodePatterns
is IN_SUBQUERY.