ExtractSingleColumnNullAwareAntiJoin Scala Extractor¶
ExtractSingleColumnNullAwareAntiJoin is a Scala extractor to destructure a Join logical operator into a tuple of the following elements:
- Streamed Side Keys (Catalyst expressions)
- Build Side Keys (Catalyst expressions)
ExtractSingleColumnNullAwareAntiJoin is used to support single-column NULL-aware anti joins (described in the VLDB paper) that will almost certainly be planned as a very time-consuming Broadcast Nested Loop join (
O(M*N) calculation). If it's a single column case this expensive calculation could be optimized into
O(M) using hash lookup instead of loop lookup. Refer to SPARK-32290.
ExtractSingleColumnNullAwareAntiJoin uses the spark.sql.optimizeNullAwareAntiJoin configuration property.
Destructuring Join Logical Plan¶
type ReturnType = // streamedSideKeys, buildSideKeys (Seq[Expression], Seq[Expression]) unapply( join: Join): Option[ReturnType]
unapply matches Join logical operators with LeftAnti join type and the following condition:
unapply is used when: