
BaseSessionStateBuilder — Generic Builder of SessionState

BaseSessionStateBuilder is an abstraction of builders that create a SessionState and can produce a new BaseSessionStateBuilder of their own kind (using newBuilder).

spark.sql.catalogImplementation Configuration Property

BaseSessionStateBuilder and the spark.sql.catalogImplementation configuration property allow for Hive and non-Hive Spark deployments.

assert(spark.sessionState.isInstanceOf[org.apache.spark.sql.internal.SessionState])
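As a rough illustration of the property at work, the following sketch reads spark.sql.catalogImplementation from a running session. It assumes Spark 3.x on the classpath and a local master; the object and method names (CatalogImplementationDemo, catalogImplementation) are made up for this example. The two values checked for, hive and in-memory, are the ones the property accepts.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, not part of Spark.
object CatalogImplementationDemo {
  // Reads the (static) configuration property from the session.
  def catalogImplementation(spark: SparkSession): String =
    spark.conf.get("spark.sql.catalogImplementation")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("catalog-implementation-demo")
      .getOrCreate()

    val impl = catalogImplementation(spark)
    // "hive" when Hive support is enabled, "in-memory" otherwise.
    println(s"spark.sql.catalogImplementation = $impl")
    // The concrete catalog class depends on the builder that created the SessionState.
    println(spark.sessionState.catalog.getClass.getName)

    spark.stop()
  }
}
```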

BaseSessionStateBuilder holds properties that (together with newBuilder) are used to create a SessionState.

Contract

newBuilder

newBuilder: (SparkSession, Option[SessionState]) => BaseSessionStateBuilder

Produces a new BaseSessionStateBuilder for a given SparkSession and an optional SessionState.

Used when BaseSessionStateBuilder is requested to createClone

Implementations

Creating Instance

BaseSessionStateBuilder takes the following to be created:

BaseSessionStateBuilder is created when SparkSession is requested to instantiateSessionState.

Session-Specific Registries

The following registries are Scala lazy values which are created once and on demand (when accessed for the first time).
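The "created once and on demand" behaviour is plain Scala lazy val semantics. A minimal sketch in plain Scala (no Spark needed) follows; Registry and LazyRegistriesDemo are hypothetical stand-ins, not Spark classes.

```scala
// Hypothetical stand-in for a session-specific registry (not a Spark class).
final class Registry(val name: String)

// Hypothetical builder that mimics how lazy values behave.
class LazyRegistriesDemo {
  // Counts how many times the initializer body runs.
  var initCount: Int = 0

  // Initialized on first access only; later accesses reuse the same instance.
  lazy val functionRegistry: Registry = {
    initCount += 1
    new Registry("functions")
  }
}

object LazyRegistriesDemoApp {
  def main(args: Array[String]): Unit = {
    val builder = new LazyRegistriesDemo
    println(s"initializations before access: ${builder.initCount}")
    val first = builder.functionRegistry
    val second = builder.functionRegistry
    println(s"same instance: ${first eq second}")
    println(s"initializations after two accesses: ${builder.initCount}")
  }
}
```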

Analyzer

analyzer: Analyzer

Logical Analyzer

SessionCatalog

catalog: SessionCatalog

SessionCatalog

Note

HiveSessionStateBuilder manages its own Hive-aware HiveSessionCatalog.

CatalogManager

catalogManager: CatalogManager

CatalogManager that is created for the session-specific SQLConf, V2SessionCatalog and SessionCatalog.

catalogManager is used when:

SQLConf

SQLConf

ExperimentalMethods

ExperimentalMethods

FunctionRegistry

FunctionRegistry

SessionResourceLoader

resourceLoader: SessionResourceLoader

SessionResourceLoader

ParserInterface

sqlParser: ParserInterface

ParserInterface

TableFunctionRegistry

tableFunctionRegistry: TableFunctionRegistry

TableFunctionRegistry


When requested for the first time (as a lazy val), tableFunctionRegistry requests the parent SessionState (if available) to clone the tableFunctionRegistry or requests the SparkSessionExtensions to register the built-in function expressions.
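The clone-from-parent-or-start-fresh decision can be modelled in plain Scala. This is only a sketch of the pattern: SimpleRegistry, SimpleState and SimpleBuilder are hypothetical simplifications, and the real TableFunctionRegistry API differs.

```scala
import scala.collection.mutable

// Hypothetical, simplified registry (not Spark's TableFunctionRegistry).
final class SimpleRegistry(private val functions: mutable.Map[String, String]) {
  def register(name: String, description: String): Unit =
    functions.update(name, description)

  def names: Set[String] = functions.keySet.toSet

  // Independent copy, so changes in a child do not leak into the parent.
  def copy(): SimpleRegistry = new SimpleRegistry(mutable.Map(functions.toSeq: _*))
}

object SimpleRegistry {
  // Hypothetical set of built-in functions; a fresh registry on every call.
  def builtin: SimpleRegistry = new SimpleRegistry(mutable.Map("range" -> "built-in"))
}

// Hypothetical stand-in for a session state that owns a registry.
final class SimpleState(val tableFunctionRegistry: SimpleRegistry)

class SimpleBuilder(parentState: Option[SimpleState]) {
  // Lazy: resolved the first time it is requested.
  lazy val tableFunctionRegistry: SimpleRegistry =
    parentState
      .map(_.tableFunctionRegistry.copy()) // parent available: clone its registry
      .getOrElse(SimpleRegistry.builtin)   // no parent: start from the built-ins
}
```

Copying rather than sharing the parent's registry keeps functions registered in a cloned session from showing up in the session it was cloned from.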

tableFunctionRegistry is used when:

V2SessionCatalog

v2SessionCatalog: V2SessionCatalog

V2SessionCatalog that is created for the session-specific SessionCatalog and SQLConf.

v2SessionCatalog is used when BaseSessionStateBuilder is requested for the CatalogManager.

Custom Operator Optimization Rules

customOperatorOptimizationRules: Seq[Rule[LogicalPlan]]

Custom operator optimization rules to add to the base Operator Optimization batch.

When requested for the custom rules, customOperatorOptimizationRules simply requests the SparkSessionExtensions to buildOptimizerRules.

customOperatorOptimizationRules is used when BaseSessionStateBuilder is requested for an Optimizer.
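To make the flow from SparkSessionExtensions to the optimizer more concrete, here is a sketch that injects a custom optimizer rule through SparkSession.Builder.withExtensions and SparkSessionExtensions.injectOptimizerRule. It assumes Spark 3.x on the classpath; CountingRule and CustomOptimizerRuleDemo are hypothetical names, and the rule deliberately does nothing but count its invocations.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical no-op rule that only counts how often the optimizer invokes it.
object CountingRule extends Rule[LogicalPlan] {
  @volatile var invocations: Int = 0

  override def apply(plan: LogicalPlan): LogicalPlan = {
    invocations += 1
    plan // leave the plan unchanged
  }
}

object CustomOptimizerRuleDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("custom-optimizer-rule")
      // Registers a rule builder (SparkSession => Rule[LogicalPlan]) with the extensions.
      .withExtensions { extensions =>
        extensions.injectOptimizerRule(_ => CountingRule)
      }
      .getOrCreate()

    // Any query that goes through the optimizer should trigger the rule.
    spark.range(5).filter("id > 1").collect()
    println(s"CountingRule was invoked: ${CountingRule.invocations > 0}")

    spark.stop()
  }
}
```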

SparkSessionExtensions

extensions: SparkSessionExtensions

SparkSessionExtensions

ExecutionListenerManager

listenerManager: ExecutionListenerManager

ExecutionListenerManager

Optimizer

optimizer: Optimizer

optimizer creates a SparkOptimizer for the CatalogManager, SessionCatalog and ExperimentalMethods.

The SparkOptimizer uses the following extension methods:

optimizer is used when BaseSessionStateBuilder is requested to build a SessionState (as the optimizerBuilder function to build a logical query plan optimizer on demand).

SparkPlanner

planner: SparkPlanner

SparkPlanner

StreamingQueryManager

streamingQueryManager: StreamingQueryManager

Spark Structured Streaming's StreamingQueryManager

UDFRegistration

udfRegistration: UDFRegistration

UDFRegistration

Creating Clone of SessionState

createClone: (SparkSession, SessionState) => SessionState

createClone creates a SessionState using newBuilder followed by build.

createClone is used when BaseSessionStateBuilder is requested for a SessionState.
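The "newBuilder followed by build" step can be sketched with a simplified model in plain Scala. Session, State, StateBuilder and DefaultStateBuilder are hypothetical stand-ins for SparkSession, SessionState and the builders; only the shape of newBuilder and createClone mirrors the signatures shown on this page.

```scala
// Hypothetical, simplified stand-ins (not Spark classes).
final case class Session(name: String)
final case class State(session: Session, settings: Map[String, String])

abstract class StateBuilder(session: Session, parentState: Option[State]) {
  // Contract: concrete builders say how to create a builder of their own kind.
  protected def newBuilder: (Session, Option[State]) => StateBuilder

  // In this toy model a state simply inherits the parent's settings, if any.
  def build(): State =
    State(session, parentState.map(_.settings).getOrElse(Map.empty))

  // newBuilder followed by build, with the state to clone passed as the parent.
  def createClone: (Session, State) => State =
    (newSession, state) => newBuilder(newSession, Some(state)).build()
}

class DefaultStateBuilder(session: Session, parentState: Option[State])
    extends StateBuilder(session, parentState) {
  override protected def newBuilder: (Session, Option[State]) => StateBuilder =
    new DefaultStateBuilder(_, _)
}
```

Because newBuilder is abstract, a subclass (the Hive-aware builder, for instance) gets clones produced by a builder of its own type without overriding createClone.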

Building SessionState

build(): SessionState

build creates a SessionState with the following:

  • SharedState of the SparkSession
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>
  • <>

build is used when:

Getting Function to Create QueryExecution For LogicalPlan

createQueryExecution: LogicalPlan => QueryExecution

createQueryExecution simply returns a function that takes a LogicalPlan and creates a QueryExecution with the SparkSession and the logical plan.

createQueryExecution is used when BaseSessionStateBuilder is requested to create a SessionState instance.
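The point of returning a function rather than a QueryExecution is that nothing is created until a logical plan is supplied. A toy model in plain Scala, with hypothetical Plan, Session, Execution and ExecutionFactory types standing in for the Spark classes:

```scala
// Hypothetical stand-ins (not Spark classes).
final case class Plan(description: String)
final case class Session(name: String)
final case class Execution(session: Session, plan: Plan)

class ExecutionFactory(session: Session) {
  // Returns a function that closes over the session;
  // an Execution is created only when the function is applied to a plan.
  def createQueryExecution: Plan => Execution =
    plan => Execution(session, plan)
}

object ExecutionFactoryDemo {
  def main(args: Array[String]): Unit = {
    val create = new ExecutionFactory(Session("demo")).createQueryExecution
    val execution = create(Plan("Range(0, 5)"))
    println(execution)
  }
}
```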

ColumnarRules

columnarRules: Seq[ColumnarRule]

columnarRules requests the SparkSessionExtensions to buildColumnarRules.


columnarRules is used when:

customCheckRules

customCheckRules: Seq[LogicalPlan => Unit]

customCheckRules requests the SparkSessionExtensions to buildCheckRules on the SparkSession.


customCheckRules is used when:

  • BaseSessionStateBuilder is requested for an Analyzer
  • HiveSessionStateBuilder is requested for an Analyzer
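
A check rule is a function of type LogicalPlan => Unit that the analyzer runs during analysis checks. The sketch below registers one with SparkSessionExtensions.injectCheckRule; it assumes Spark 3.x on the classpath, and CustomCheckRuleDemo with its checkedPlans counter is hypothetical. A real check rule would typically throw an exception for plans it rejects; this one only counts.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Hypothetical demo object, not part of Spark.
object CustomCheckRuleDemo {
  // Number of plans the custom check rule has seen.
  @volatile var checkedPlans: Int = 0

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("custom-check-rule")
      .withExtensions { extensions =>
        // Builder of a check rule: SparkSession => (LogicalPlan => Unit).
        extensions.injectCheckRule { (_: SparkSession) => (_: LogicalPlan) =>
          checkedPlans += 1
        }
      }
      .getOrCreate()

    // Creating and running a Dataset triggers analysis, and with it the check rules.
    spark.range(3).collect()
    println(s"custom check rule ran: ${checkedPlans > 0}")

    spark.stop()
  }
}
```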

Adaptive Rules

adaptiveRulesHolder: AdaptiveRulesHolder

adaptiveRulesHolder creates a new AdaptiveRulesHolder with the user-defined AQE rules built using the SparkSessionExtensions: