SparkSessionExtensions

SparkSessionExtensions is an Injection API for Spark SQL developers to extend the capabilities of a SparkSession.

Spark SQL developers register custom extensions using either the Builder.withExtensions method or the spark.sql.extensions configuration property.
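The two registration routes can be sketched as follows. This is a minimal illustration that assumes Spark SQL (3.x Scala API) is on the classpath. NoopRule, MyExtensions and RegistrationDemo are hypothetical names introduced here, and the rule does nothing except count how often the optimizer applies it.

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule: leaves the plan untouched and only counts invocations
object NoopRule extends Rule[LogicalPlan] {
  val invocations = new AtomicInteger(0)
  override def apply(plan: LogicalPlan): LogicalPlan = {
    invocations.incrementAndGet()
    plan
  }
}

// Hypothetical extensions class. The spark.sql.extensions route expects
// a SparkSessionExtensions => Unit class with a no-argument constructor
class MyExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit =
    extensions.injectOptimizerRule(_ => NoopRule)
}

object RegistrationDemo {
  // Route 1: programmatic registration with Builder.withExtensions
  def viaBuilder(): SparkSession =
    SparkSession.builder()
      .master("local[1]")
      .withExtensions(new MyExtensions)
      .getOrCreate()

  // Route 2: registration by class name with spark.sql.extensions
  def viaConfig(): SparkSession =
    SparkSession.builder()
      .master("local[1]")
      .config("spark.sql.extensions", classOf[MyExtensions].getName)
      .getOrCreate()
}
```

Either route should be used before the first SparkSession is created, since getOrCreate may hand back an existing session that was built without the extensions.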

SparkSessionExtensions is an integral part of SparkSession.

injectedFunctions

SparkSessionExtensions uses a collection of 3-element tuples with the following:

  1. FunctionIdentifier
  2. ExpressionInfo
  3. Seq[Expression] => Expression

Injection API

injectCheckRule

type CheckRuleBuilder = SparkSession => LogicalPlan => Unit
injectCheckRule(
  builder: CheckRuleBuilder): Unit

injectCheckRule injects a check analysis rule builder into a SparkSession.

The injected rules will be executed after the analysis phase. A check analysis rule is used to detect problems with a LogicalPlan and should throw an exception when a problem is found.
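As an illustration of a check analysis rule, the sketch below rejects any query whose analyzed plan contains a Sort operator. It assumes Spark SQL (3.x Scala API) on the classpath. CheckRuleDemo, noSortCheck and the "no sorting" policy are hypothetical and exist only for this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Sort}

object CheckRuleDemo {
  // Hypothetical policy: queries must not contain a Sort operator.
  // The builder ignores the session and returns a LogicalPlan => Unit check
  val noSortCheck: SparkSession => LogicalPlan => Unit =
    _ => plan => plan.foreach {
      case _: Sort => throw new IllegalStateException("Sorting is not allowed")
      case _ => ()
    }

  def newSession(): SparkSession =
    SparkSession.builder()
      .master("local[1]")
      .withExtensions(_.injectCheckRule(noSortCheck))
      .getOrCreate()
}
```

With such a session, spark.range(5).collect() is expected to run normally, while spark.range(5).orderBy("id").collect() should fail with the exception thrown by the check.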

injectColumnar

type ColumnarRuleBuilder = SparkSession => ColumnarRule
injectColumnar(
  builder: ColumnarRuleBuilder): Unit

injectColumnar injects a ColumnarRule builder into a SparkSession.

injectFunction

type FunctionDescription = (FunctionIdentifier, ExpressionInfo, FunctionBuilder)
injectFunction(
  functionDescription: FunctionDescription): Unit

injectFunction adds the given FunctionDescription to the injectedFunctions internal registry.
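A FunctionDescription can be put together by hand, as in the following minimal sketch. It assumes Spark SQL (3.x Scala API) on the classpath. The function name shout and the InjectFunctionDemo object are hypothetical, and the function simply reuses the built-in Upper expression.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo, Upper}

object InjectFunctionDemo {
  // Hypothetical SQL function name
  val name = "shout"

  // FunctionBuilder: turns the call arguments into a Catalyst expression
  val builder: Seq[Expression] => Expression =
    children => Upper(children.head)

  // The 3-element tuple expected by injectFunction
  val shout: (FunctionIdentifier, ExpressionInfo, Seq[Expression] => Expression) =
    (FunctionIdentifier(name), new ExpressionInfo(classOf[Upper].getName, name), builder)

  def newSession(): SparkSession =
    SparkSession.builder()
      .master("local[1]")
      .withExtensions(_.injectFunction(shout))
      .getOrCreate()
}
```

In a session created this way, a query such as SELECT shout('hi') should resolve the injected function like any built-in one.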

injectOptimizerRule

type RuleBuilder = SparkSession => Rule[LogicalPlan]
injectOptimizerRule(
  builder: RuleBuilder): Unit

injectOptimizerRule registers a builder of a custom logical optimization rule.

injectParser

type ParserBuilder = (SparkSession, ParserInterface) => ParserInterface
injectParser(
  builder: ParserBuilder): Unit

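injectParser registers a custom parser builder. The builder is given the SparkSession and the current ParserInterface, so a custom parser can delegate to it for anything it does not handle itself.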

injectPlannerStrategy

type StrategyBuilder = SparkSession => Strategy
injectPlannerStrategy(
  builder: StrategyBuilder): Unit

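injectPlannerStrategy registers a builder of a custom planner Strategy (that converts a LogicalPlan into an executable SparkPlan).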

injectPostHocResolutionRule

type RuleBuilder = SparkSession => Rule[LogicalPlan]
injectPostHocResolutionRule(
  builder: RuleBuilder): Unit

injectPostHocResolutionRule registers a builder of a custom analyzer rule to be executed after the resolution phase of analysis.

injectQueryPostPlannerStrategyRule

injectQueryPostPlannerStrategyRule(
  builder: QueryPostPlannerStrategyBuilder): Unit

injectQueryPostPlannerStrategyRule adds a new SparkSession => Rule[SparkPlan] builder to the queryPostPlannerStrategyRuleBuilders internal registry.

injectQueryStagePrepRule

type QueryStagePrepRuleBuilder = SparkSession => Rule[SparkPlan]
injectQueryStagePrepRule(
  builder: QueryStagePrepRuleBuilder): Unit

injectQueryStagePrepRule registers a QueryStagePrepRuleBuilder (that can build a query stage preparation rule).

injectResolutionRule

type RuleBuilder = SparkSession => Rule[LogicalPlan]
injectResolutionRule(
  builder: RuleBuilder): Unit

injectResolutionRule registers a builder of a custom analyzer rule to be executed as part of the resolution phase of analysis.

injectTableFunction

type TableFunctionBuilder = Seq[Expression] => LogicalPlan
type TableFunctionDescription = (FunctionIdentifier, ExpressionInfo, TableFunctionBuilder)
injectTableFunction(
  functionDescription: TableFunctionDescription): Unit

injectTableFunction registers a new table-valued function by adding the given TableFunctionDescription to the injectedTableFunctions internal registry.
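A table-valued function can be injected along the following lines. This is a sketch only, assuming Spark SQL (3.2 or later, Scala API) on the classpath. The first_n function and the InjectTableFunctionDemo object are hypothetical, and the builder returns the Catalyst Range logical operator for a single constant argument.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo}
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Range => RangePlan}

object InjectTableFunctionDemo {
  // Hypothetical table-valued function name
  val name = "first_n"

  // TableFunctionBuilder: turns the call arguments into a logical plan
  val builder: Seq[Expression] => LogicalPlan = args => {
    require(args.length == 1 && args.head.foldable,
      s"$name expects exactly one constant argument")
    // Evaluate the constant argument (an integer or long literal)
    val n = args.head.eval().asInstanceOf[Number].longValue()
    // Rows 0 until n with step 1 and the default number of slices
    RangePlan(0L, n, 1L, None)
  }

  // The 3-element tuple expected by injectTableFunction
  val firstN: (FunctionIdentifier, ExpressionInfo, Seq[Expression] => LogicalPlan) =
    (FunctionIdentifier(name), new ExpressionInfo(classOf[RangePlan].getName, name), builder)

  def newSession(): SparkSession =
    SparkSession.builder()
      .master("local[1]")
      .withExtensions(_.injectTableFunction(firstN))
      .getOrCreate()
}
```

The function can then be used in a FROM clause, for example SELECT * FROM first_n(3).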

Registering Custom Logical Optimization Rules

buildOptimizerRules(
  session: SparkSession): Seq[Rule[LogicalPlan]]

buildOptimizerRules executes the optimizerRules builders with the given SparkSession (to build the custom logical optimization rules).

buildOptimizerRules is used when:

Logical Optimizer Rules (Builder)

optimizerRules: Buffer[SparkSession => Rule[LogicalPlan]]

optimizerRules are functions (builders) that take a SparkSession and return logical optimizer rules (Rule[LogicalPlan]).

A new rule builder is added to optimizerRules when SparkSessionExtensions is requested to injectOptimizerRule.
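The split between injecting builders and building rules can be modelled in plain Scala. The sketch below is a simplified model and not Spark's own code: Session and Rule are hypothetical stand-ins for SparkSession and Rule[LogicalPlan], and Extensions mimics only the optimizerRules registry.

```scala
import scala.collection.mutable

object BuilderRegistryModel {
  // Hypothetical stand-ins for SparkSession and Rule[LogicalPlan]
  final case class Session(name: String)
  final case class Rule(description: String)

  type RuleBuilder = Session => Rule

  class Extensions {
    // Plays the role of optimizerRules: builders are stored, not rules
    private val optimizerRules = mutable.Buffer.empty[RuleBuilder]

    // Plays the role of injectOptimizerRule: remember the builder
    def injectOptimizerRule(builder: RuleBuilder): Unit = {
      optimizerRules += builder
    }

    // Plays the role of buildOptimizerRules: run every builder for one session
    def buildOptimizerRules(session: Session): Seq[Rule] =
      optimizerRules.map(builder => builder(session)).toSeq
  }
}
```

Storing builders rather than rules means that each session gets its own rule instances, created only when the rules are requested for that session.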

buildColumnarRules Internal Method

buildColumnarRules(
  session: SparkSession): Seq[ColumnarRule]

buildColumnarRules executes the registered ColumnarRuleBuilders with the given SparkSession (to build columnar rules).

buildColumnarRules is used when:

buildCheckRules

buildCheckRules(
  session: SparkSession): Seq[LogicalPlan => Unit]

buildCheckRules executes the registered CheckRuleBuilders with the given SparkSession (to build check analysis rules).

buildCheckRules is used when:

Building Query Stage Preparation Rules

buildQueryStagePrepRules(
  session: SparkSession): Seq[Rule[SparkPlan]]

buildQueryStagePrepRules executes the queryStagePrepRuleBuilders (to build query stage preparation rules).

buildQueryStagePrepRules is used when:

registerTableFunctions

registerTableFunctions(
  tableFunctionRegistry: TableFunctionRegistry): TableFunctionRegistry

registerTableFunctions requests the given TableFunctionRegistry to register all the injected table functions.

registerTableFunctions is used when:

injectedTableFunctions

injectedTableFunctions: Buffer[TableFunctionDescription]

SparkSessionExtensions creates an empty injectedTableFunctions mutable collection of TableFunctionDescriptions:

type TableFunctionBuilder = Seq[Expression] => LogicalPlan
type TableFunctionDescription = (FunctionIdentifier, ExpressionInfo, TableFunctionBuilder)

A new TableFunctionDescription tuple is added using the injectTableFunction injector.

TableFunctionDescriptions are registered when SparkSessionExtensions is requested to registerTableFunctions.

Adaptive Query Post Planner Strategy Rules Builders

SparkSessionExtensions uses the queryPostPlannerStrategyRuleBuilders internal registry of builder functions for Adaptive Query Post Planner Strategy Rules.

The rule builders are registered using injectQueryPostPlannerStrategyRule.

The rule builders are executed using buildQueryPostPlannerStrategyRules.

buildQueryPostPlannerStrategyRules

buildQueryPostPlannerStrategyRules(
  session: SparkSession): Seq[Rule[SparkPlan]]

buildQueryPostPlannerStrategyRules executes the Adaptive Query Post Planner Strategy Rules builders (with the given SparkSession).

buildQueryPostPlannerStrategyRules is used when: