SessionState — State Separation Layer Between SparkSessions¶
SessionState is a state separation layer between Spark SQL sessions that holds the session-specific state: SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.
Attributes¶
Adaptive Rules¶
adaptiveRulesHolder: AdaptiveRulesHolder
User-Defined Adaptive Query Rules
adaptiveRulesHolder is given when SessionState is created.
adaptiveRulesHolder is used when AdaptiveSparkPlanExec physical operator is requested for the following:
- Executing AQE Query Post Planner Strategy Rules
- Adaptive Logical Optimizer
- Adaptive Query Stage Physical Optimizations
- Adaptive Query Stage Physical Preparation Rules
ColumnarRules¶
columnarRules: Seq[ColumnarRule]
ExecutionListenerManager¶
listenerManager: ExecutionListenerManager
ExperimentalMethods¶
experimentalMethods: ExperimentalMethods
FunctionRegistry¶
functionRegistry: FunctionRegistry
Logical Analyzer¶
analyzer: Analyzer
Logical Analyzer that is initialized lazily (only when requested for the first time) using the analyzerBuilder factory function.
Logical Optimizer¶
optimizer: Optimizer
Logical Optimizer that is created using the optimizerBuilder function (and cached for later usage)
Used when:
- QueryExecution is requested to create an optimized logical plan
- (Structured Streaming) IncrementalExecution is requested to create an optimized logical plan
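The Optimizer (like the Analyzer and the SessionCatalog) is built from a factory function and cached, so the builder runs at most once per session. A minimal pure-Scala sketch of this lazy-caching pattern, where `CachedComponent` is an illustrative stand-in and not a Spark class:

```scala
// Sketch: SessionState builds components such as the Optimizer from
// factory functions and caches the result via a lazy val, so each
// builder function runs at most once per session.
class CachedComponent[A](builder: () => A) {
  var buildCount = 0                       // demo-only counter
  lazy val get: A = { buildCount += 1; builder() }
}

val optimizer = new CachedComponent(() => "Optimizer")
optimizer.get
optimizer.get // the builder is not invoked again
```

The same builder-plus-cache shape applies to the catalogBuilder, analyzerBuilder, and optimizerBuilder functions passed in when SessionState is created.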
ParserInterface¶
sqlParser: ParserInterface
SessionCatalog¶
catalog: SessionCatalog
SessionCatalog that is created using the catalogBuilder function (and cached for later usage).
SessionResourceLoader¶
resourceLoader: SessionResourceLoader
Spark Query Planner¶
planner: SparkPlanner
SQLConf¶
conf: SQLConf
StreamingQueryManager¶
streamingQueryManager: StreamingQueryManager
UDFRegistration¶
udfRegistration: UDFRegistration
SessionState is given a UDFRegistration when created.
AQE QueryStage Physical Preparation Rules¶
queryStagePrepRules: Seq[Rule[SparkPlan]]
SessionState can be given a collection of physical optimizations (Rule[SparkPlan]s) when created.
queryStagePrepRules is given when BaseSessionStateBuilder is requested to build a SessionState based on queryStagePrepRules (from a SparkSessionExtensions).
queryStagePrepRules is used to extend the built-in QueryStage Physical Preparation Rules in Adaptive Query Execution.
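Extending means the user-defined rules are applied together with the built-in preparation rules. A pure-Scala sketch of that pattern, where `Rule`, `prepareQueryStage`, and the `String` "plan" are illustrative stand-ins rather than Spark's actual types:

```scala
// Sketch: user rules (from SparkSessionExtensions) extend the built-in
// QueryStage preparation rules; all rules are applied in order to a plan.
trait Rule[T] { def apply(plan: T): T }

def prepareQueryStage[T](plan: T, builtIn: Seq[Rule[T]], user: Seq[Rule[T]]): T =
  (builtIn ++ user).foldLeft(plan)((p, rule) => rule(p))
```

The built-in rules run first; user rules see the plan only after the built-in preparations have been applied.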
Creating Instance¶
SessionState takes the following to be created:
- SQLConf
- ExperimentalMethods
- FunctionRegistry
- UDFRegistration
- Function to build a SessionCatalog (() => SessionCatalog)
- ParserInterface
- Function to build an Analyzer (() => Analyzer)
- Function to build a Logical Optimizer (() => Optimizer)
- SparkPlanner
- Function to build a StreamingQueryManager (() => StreamingQueryManager)
- ExecutionListenerManager
- Function to build a SessionResourceLoader (() => SessionResourceLoader)
- Function to build a QueryExecution (LogicalPlan => QueryExecution)
- SessionState Clone Function ((SparkSession, SessionState) => SessionState)
- ColumnarRules
- AQE Rules
- planNormalizationRules
SessionState is created when:
- SparkSession is requested to instantiateSessionState (when requested for the SessionState per the spark.sql.catalogImplementation configuration property)

When requested for the SessionState, SparkSession uses spark.sql.catalogImplementation configuration property to load and create a BaseSessionStateBuilder that is then requested to create a SessionState instance.
There are two BaseSessionStateBuilders available:
- (default) SessionStateBuilder for in-memory catalog
- HiveSessionStateBuilder for hive catalog
The hive catalog is set when the SparkSession is created with Hive support enabled (using Builder.enableHiveSupport).
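The dispatch from the catalog-implementation setting to a builder class can be sketched as follows. This is a simplified stand-in for SparkSession's internal lookup, not its actual code; the two fully-qualified class names are the real builders:

```scala
// Sketch: SparkSession resolves the BaseSessionStateBuilder class name
// from the spark.sql.catalogImplementation configuration property.
def sessionStateBuilderClass(catalogImplementation: String): String =
  catalogImplementation match {
    case "hive"      => "org.apache.spark.sql.hive.HiveSessionStateBuilder"
    case "in-memory" => "org.apache.spark.sql.internal.SessionStateBuilder"
    case other       => sys.error(s"Unrecognized catalog implementation: $other")
  }
```

The resolved class is then instantiated reflectively and requested to build the SessionState.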
Creating QueryExecution For LogicalPlan¶
executePlan(
plan: LogicalPlan): QueryExecution
executePlan uses the createQueryExecution function to create a QueryExecution for the given LogicalPlan.
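In other words, executePlan is a thin wrapper over the createQueryExecution function given to SessionState at construction time. A minimal sketch of that delegation, using illustrative stand-in types (not Spark's LogicalPlan or QueryExecution):

```scala
// Sketch: executePlan delegates to the createQueryExecution function
// (LogicalPlan => QueryExecution) supplied when SessionState is created.
final case class LogicalPlan(name: String)
final case class QueryExecution(logical: LogicalPlan)

class SessionStateSketch(createQueryExecution: LogicalPlan => QueryExecution) {
  def executePlan(plan: LogicalPlan): QueryExecution = createQueryExecution(plan)
}

val state = new SessionStateSketch(plan => QueryExecution(plan))
```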
Creating New Hadoop Configuration¶
newHadoopConf(): Configuration
newHadoopConf returns a new Hadoop Configuration (with the SparkContext.hadoopConfiguration and all the configuration properties of the SQLConf).
Creating New Hadoop Configuration With Extra Options¶
newHadoopConfWithOptions(
options: Map[String, String]): Configuration
newHadoopConfWithOptions creates a new Hadoop Configuration with the input options set (except path and paths options that are skipped).
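The key-skipping behavior can be sketched with plain Scala maps standing in for Hadoop's Configuration (names below are illustrative, not Spark's code):

```scala
// Sketch: copy every given option onto a fresh configuration, except the
// "path" and "paths" keys, which are skipped. A Map stands in for
// Hadoop's Configuration here.
def newConfWithOptions(
    base: Map[String, String],
    options: Map[String, String]): Map[String, String] =
  base ++ options.filter { case (key, _) => key != "path" && key != "paths" }
```

Skipping path and paths keeps data-source locations out of the Hadoop configuration, where they would be meaningless as configuration properties.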
newHadoopConfWithOptions is used when:
- TextBasedFileFormat is requested to isSplitable
- FileSourceScanExec physical operator is requested for the input RDD
- InsertIntoHadoopFsRelationCommand logical command is executed
- PartitioningAwareFileIndex is requested for the Hadoop Configuration
Accessing SessionState¶
SessionState is available using SparkSession.sessionState.
import org.apache.spark.sql.SparkSession
assert(spark.isInstanceOf[SparkSession])
// object SessionState in package org.apache.spark.sql.internal cannot be accessed directly
scala> :type spark.sessionState
org.apache.spark.sql.internal.SessionState