GenerateUnsafeProjection is a CodeGenerator to generate the bytecode for creating an UnsafeProjection from the given Expressions (i.e. CodeGenerator[Seq[Expression], UnsafeProjection]).

GenerateUnsafeProjection: Seq[Expression] => UnsafeProjection
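As a minimal sketch of the `Seq[Expression] => UnsafeProjection` contract, the following generates a projection for two expressions and applies it to a single-column row. The choice of expressions (a `BoundReference` and a `Literal`) and the input row are illustrative assumptions, and the snippet assumes the Spark Catalyst classes are on the classpath (e.g. in `spark-shell`).

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{BoundReference, Expression, Literal}
import org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection
import org.apache.spark.sql.types.IntegerType

// Two expressions: column 0 of the input row (already bound) and a constant
val expressions: Seq[Expression] = Seq(
  BoundReference(0, IntegerType, nullable = false),
  Literal(42L))

// Generates (and compiles) the code for an UnsafeProjection
val projection = GenerateUnsafeProjection.generate(expressions)

// Applying the projection to an InternalRow gives an UnsafeRow
val row = projection(InternalRow(1))
println(s"${row.getInt(0)}, ${row.getLong(1)}")
```

The resulting row has one field per input expression, in the same order.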

Binding Expressions to Schema

bind(
  in: Seq[Expression],
  inputSchema: Seq[Attribute]): Seq[Expression]

bind binds the given in expressions to the given inputSchema.

bind is part of the CodeGenerator abstraction.
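The effect of binding can be sketched with the two-argument `generate` of the `CodeGenerator` abstraction which, to the best of my knowledge, binds the expressions to the input schema before generating the projection (calling `bind` directly may not be possible from outside the generator). The `id` attribute and the `Add` expression are assumptions made for illustration only.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{Add, AttributeReference, Expression, Literal}
import org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection
import org.apache.spark.sql.types.IntegerType

// A (hypothetical) input schema with a single non-nullable integer attribute
val id = AttributeReference("id", IntegerType, nullable = false)()

// An expression that refers to the attribute by reference, not by position
val unbound: Seq[Expression] = Seq(Add(id, Literal(1)))

// Binding resolves `id` to ordinal 0 of the input schema before code generation
val boundProjection = GenerateUnsafeProjection.generate(unbound, Seq(id))

val bindResult = boundProjection(InternalRow(41))
println(bindResult.getInt(0))
```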

Creating UnsafeProjection (for Expressions)

create(
  references: Seq[Expression]): UnsafeProjection // (1)!
create(
  expressions: Seq[Expression],
  subexpressionEliminationEnabled: Boolean): UnsafeProjection

1. subexpressionEliminationEnabled flag is false

create creates a new CodegenContext.


create is part of the CodeGenerator abstraction.


Enable ALL logging level for org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection logger to see what happens inside.

Add the following line to conf/

Refer to Logging.
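As an alternative to editing a configuration file, the logging level could also be set programmatically. The following is a sketch only and assumes a Spark build that uses Log4j 2 with `log4j-core` on the classpath; `Configurator.setLevel` is part of the Log4j 2 API, not of Spark.

```scala
import org.apache.logging.log4j.{Level, LogManager}
import org.apache.logging.log4j.core.config.Configurator

// The logger named in this section
val loggerName =
  "org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection"

// Assumption: Log4j 2 is the active logging backend
Configurator.setLevel(loggerName, Level.ALL)
```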

Review Me

=== [[generate]] Generating UnsafeProjection -- generate Method

[source, scala]

generate(
  expressions: Seq[Expression],
  subexpressionEliminationEnabled: Boolean): UnsafeProjection

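generate canonicalizes the input expressions (see canonicalize) and then generates the JVM bytecode of an UnsafeProjection for the canonicalized expressions (see create).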
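A short sketch of calling `generate` with subexpression elimination enabled. The expressions are illustrative assumptions: `sum` appears three times across the two expressions, which makes it a candidate for elimination. The projected values are the same whether or not the flag is on.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{Add, BoundReference, Expression, Multiply}
import org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection
import org.apache.spark.sql.types.IntegerType

val a = BoundReference(0, IntegerType, nullable = false)
val sum = Add(a, a)

// `sum` is a common subexpression of both output expressions
val exprsWithCommonSubexpr: Seq[Expression] = Seq(Multiply(sum, sum), sum)

val projectionWithSE = GenerateUnsafeProjection.generate(
  exprsWithCommonSubexpr,
  subexpressionEliminationEnabled = true)

val seResult = projectionWithSE(InternalRow(3))
println(s"${seResult.getInt(0)}, ${seResult.getInt(1)}")
```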

generate is used when:

=== [[canonicalize]] canonicalize Method

[source, scala]

canonicalize(in: Seq[Expression]): Seq[Expression]

canonicalize removes unnecessary Alias expressions.

Internally, canonicalize uses the ExpressionCanonicalizer rule executor (which in turn uses a single expression rule, CleanExpressions).
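The alias removal can be sketched by running the `ExpressionCanonicalizer` rule executor directly on an aliased literal. This assumes `ExpressionCanonicalizer` is accessible from the `codegen` package in your Spark version; the aliased expression is made up for illustration.

```scala
import org.apache.spark.sql.catalyst.expressions.{Alias, Expression, Literal}
import org.apache.spark.sql.catalyst.expressions.codegen.ExpressionCanonicalizer

// A literal wrapped in an Alias, i.e. "1 AS one"
val aliased: Expression = Alias(Literal(1), "one")()

// CleanExpressions strips the Alias and leaves the child expression
val cleaned: Expression = ExpressionCanonicalizer.execute(aliased)
println(cleaned)
```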

=== [[create]] Generating JVM Bytecode For UnsafeProjection For Given Expressions (With Optional Subexpression Elimination) -- create Method

[source, scala]

create(
  expressions: Seq[Expression],
  subexpressionEliminationEnabled: Boolean): UnsafeProjection
create(references: Seq[Expression]): UnsafeProjection // <1>

<1> Calls the former create with subexpressionEliminationEnabled flag off

create first creates a CodegenContext and an ExprCode (using createCode) for the input expressions.

create creates a code body with public java.lang.Object generate(Object[] references) method that creates a SpecificUnsafeProjection.

[source, java]

public java.lang.Object generate(Object[] references) {
  return new SpecificUnsafeProjection(references);
}

class SpecificUnsafeProjection extends UnsafeProjection {
  ...
}

create creates a CodeAndComment with the code body and comment placeholders.

You should see the following DEBUG message in the logs:

DEBUG GenerateUnsafeProjection: code for [expressions]:


Enable DEBUG logging level for org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator logger to see the message above.

See CodeGenerator.

create requests CodeGenerator to compile the Java source code to JVM bytecode (using Janino).

create requests the CodegenContext for the references and requests the compiled class to create a SpecificUnsafeProjection for them. That SpecificUnsafeProjection is the final UnsafeProjection.

(Single-argument) create is part of the CodeGenerator abstraction.

=== [[createCode]] Creating ExprCode for Expressions (With Optional Subexpression Elimination) -- createCode Method

[source, scala]

createCode( ctx: CodegenContext, expressions: Seq[Expression], useSubexprElimination: Boolean = false): ExprCode

createCode requests the input CodegenContext to generate Java source code for code-generated evaluation of every expression in the input expressions.


[source, scala]

import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
val ctx = new CodegenContext

// Use Catalyst DSL
import org.apache.spark.sql.catalyst.dsl.expressions._
val expressions = "hello".expr.as("world") :: "hello".expr.as("world") :: Nil

import org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection
val eval = GenerateUnsafeProjection.createCode(ctx, expressions, useSubexprElimination = true)

scala> println(eval.code)


mutableStateArray2[0].write(0, ((UTF8String) references[0] /* literal */));

mutableStateArray2[0].write(1, ((UTF8String) references[1] /* literal */));

scala> println(eval.value)
mutableStateArray[0]

createCode is used when: