Skip to content

ApplyColumnarRulesAndInsertTransitions Physical Optimization

ApplyColumnarRulesAndInsertTransitions is a physical query plan optimization.

ApplyColumnarRulesAndInsertTransitions is a Catalyst rule for transforming physical plans (Rule[SparkPlan]).

ApplyColumnarRulesAndInsertTransitions is very similar (in how it optimizes physical query plans) to CollapseCodegenStages physical optimization for Whole-Stage Java Code Generation.

Creating Instance

ApplyColumnarRulesAndInsertTransitions takes the following to be created:

ApplyColumnarRulesAndInsertTransitions is created when:

Executing Rule

Signature
apply(
  plan: SparkPlan): SparkPlan

apply is part of the Rule abstraction.

apply...FIXME

Inserting ColumnarToRowExec Transitions

insertTransitions(
  plan: SparkPlan): SparkPlan

insertTransitions creates a ColumnarToRowExec physical operator for the given SparkPlan that supportsColumnar. The child of the ColumnarToRowExec operator is created using insertRowToColumnar.

Inserting RowToColumnarExec Transitions

insertRowToColumnar(
  plan: SparkPlan): SparkPlan

insertRowToColumnar does nothing (and returns the given SparkPlan) when the following all happen:

  1. The given physical operator supportsColumnar
  2. The given physical operator is RowToColumnarTransition (e.g., RowToColumnarExec)

If the given physical operator does not supportsColumnar, insertRowToColumnar creates a RowToColumnarExec physical operator for the given SparkPlan. The child of the RowToColumnarExec operator is created using insertTransitions (with outputsColumnar flag disabled).

If the given physical operator does supportsColumnar but it is not a RowToColumnarTransition, insertRowToColumnar replaces the child operators (of the physical operator) to insertRowToColumnar recursively.