Generate Unary Logical Operator

`Generate` is a spark-sql-LogicalPlan.md#UnaryNode[unary logical operator] that is <<creating-instance, created>> to represent the following:

- expressions/Generator.md[Generator] or `GeneratorOuter` expressions (by `ExtractGenerator` logical evaluation rule)
- SQL's sql/AstBuilder.md#withGenerate[LATERAL VIEW] clause (in `SELECT` or `FROM` clauses)
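Both cases can be observed in an analyzed logical plan. The following is a minimal sketch for `spark-shell` or a Scala script, not a definitive recipe: the view name `generate_demo`, the column names, and the `LATERAL VIEW` alias `x` are made up for illustration, and `explode` is just one example of a generator.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Generate
import org.apache.spark.sql.functions.{col, explode}

// Local session for exploration only (getOrCreate reuses an existing session, e.g. in spark-shell)
val spark = SparkSession.builder().master("local[*]").appName("generate-demo").getOrCreate()

// Hypothetical two-row view with an array column
spark.range(2)
  .selectExpr("id AS key", "array(id, id + 1) AS items")
  .createOrReplaceTempView("generate_demo")

// Case 1: a generator expression in a projection
val viaExpression = spark.table("generate_demo").select(col("key"), explode(col("items")))

// Case 2: SQL's LATERAL VIEW clause
val viaLateralView = spark.sql(
  "SELECT key, item FROM generate_demo LATERAL VIEW explode(items) x AS item")

// Collect the Generate operators from the analyzed logical plans
val fromExpression = viaExpression.queryExecution.analyzed.collect { case g: Generate => g }
val fromLateralView = viaLateralView.queryExecution.analyzed.collect { case g: Generate => g }

println(viaExpression.queryExecution.analyzed.numberedTreeString)
println(viaLateralView.queryExecution.analyzed.numberedTreeString)
```

Both `fromExpression` and `fromLateralView` should be non-empty, which is a quick way to confirm that a query went through `Generate`.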

[[resolved]] `resolved` flag is...FIXME

NOTE: `resolved` is part of spark-sql-LogicalPlan.md#resolved[LogicalPlan Contract] to...FIXME.

[[producedAttributes]] `producedAttributes`...FIXME

[[output]] The catalyst/QueryPlan.md#output[output schema] of a `Generate` is...FIXME

NOTE: `Generate` logical operator is resolved to GenerateExec.md[GenerateExec] unary physical operator in `BasicOperators` execution planning strategy.
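One way to see the planning step is to look for `GenerateExec` in the physical plan of a query that uses a generator. This is a sketch under assumptions: the view name `generate_exec_demo` and the column names are hypothetical, and the plan is inspected through `queryExecution.sparkPlan`, that is, before any further physical preparation rules are applied.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.GenerateExec

// Local session for exploration only (getOrCreate reuses an existing session, e.g. in spark-shell)
val spark = SparkSession.builder().master("local[*]").appName("generate-exec-demo").getOrCreate()

// Hypothetical two-row view with an array column
spark.range(2)
  .selectExpr("id AS key", "array(id, id + 1) AS items")
  .createOrReplaceTempView("generate_exec_demo")

val q = spark.sql(
  "SELECT key, item FROM generate_exec_demo LATERAL VIEW explode(items) x AS item")

// Physical operators that the Generate logical operator was planned to
val generateExecs = q.queryExecution.sparkPlan.collect { case g: GenerateExec => g }

println(q.queryExecution.sparkPlan.numberedTreeString)
```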

[TIP]
====
Use `generate` operator from Catalyst DSL to create a `Generate` logical operator, e.g. for testing or Spark SQL internals exploration.

[source, scala]
----
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.types._
val lr = LocalRelation('key.int, 'values.array(StringType))

// JsonTuple generator
import org.apache.spark.sql.catalyst.expressions.JsonTuple
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.expressions.Expression
val children: Seq[Expression] = Seq("e")
val json_tuple = JsonTuple(children)

import org.apache.spark.sql.catalyst.dsl.plans._ // <-- gives generate
val plan = lr.generate(
  generator = json_tuple,
  join = true,
  outer = true,
  alias = Some("alias"),
  outputNames = Seq.empty)
scala> println(plan.numberedTreeString)
00 'Generate json_tuple(e), true, true, alias
01 +- LocalRelation
----
====
=== [[creating-instance]] Creating Generate Instance
`Generate` takes the following when created:

- [[generator]] expressions/Generator.md[Generator] expression
- [[join]] `join` flag...FIXME
- [[outer]] `outer` flag...FIXME
- [[qualifier]] Optional qualifier
- [[generatorOutput]] Output spark-sql-Expression-Attribute.md[attributes]
- [[child]] Child spark-sql-LogicalPlan.md[logical plan]
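The effect of the `join` and `outer` flags on the produced rows can be pictured with a toy model in plain Scala. This is not Spark's code: `Row`, `generate`, `generatorArity` and `explodeItems` are made-up names, and the model only assumes the usual reading of the flags, namely that `join` prepends the input row to every generated row and `outer` emits one null-padded row when the generator produces nothing for an input row.

```scala
// Toy stand-in for a row: just a sequence of column values
type Row = Seq[Any]

// Toy model of Generate: apply a generator to every input row
def generate(
    input: Seq[Row],
    generator: Row => Seq[Row],
    join: Boolean,
    outer: Boolean,
    generatorArity: Int): Seq[Row] =
  input.flatMap { row =>
    val generated = generator(row)
    // outer: keep the input row even when the generator produced no rows
    val rows: Seq[Row] =
      if (generated.isEmpty && outer) Seq(Seq.fill[Any](generatorArity)(null))
      else generated
    // join: prepend the input row's columns to each generated row
    if (join) rows.map(row ++ _) else rows
  }

// Two input rows: (key, items); the second one has no items
val input: Seq[Row] = Seq(Seq(1, Seq("a", "b")), Seq(2, Seq.empty[String]))

// explode-like generator over the second column
val explodeItems: Row => Seq[Row] =
  row => row(1).asInstanceOf[Seq[String]].map(s => Seq[Any](s))

val innerRows = generate(input, explodeItems, join = true, outer = false, generatorArity = 1)
val outerRows = generate(input, explodeItems, join = true, outer = true, generatorArity = 1)
val unjoinedRows = generate(input, explodeItems, join = false, outer = false, generatorArity = 1)
```

With `outer = false` the second input row disappears from the result, while `outer = true` keeps it with a `null` in the generated column. With `join = false` only the generated columns remain.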

`Generate` initializes the <