StaticSQLConf — Static Configuration Properties

StaticSQLConf holds cross-session, immutable and static SQL configuration properties.

// sc is a SparkContext (e.g. the one available in spark-shell)
assert(sc.isInstanceOf[org.apache.spark.SparkContext])

import org.apache.spark.sql.internal.StaticSQLConf
// Static properties live on the SparkConf; getOption returns None unless set
sc.getConf.getOption(StaticSQLConf.SPARK_SESSION_EXTENSIONS.key)

StaticSQLConf configuration properties can only be queried and can never be changed once the first SparkSession is created (unlike regular configuration properties).

import org.apache.spark.sql.internal.StaticSQLConf
scala> val metastoreName = spark.conf.get(StaticSQLConf.CATALOG_IMPLEMENTATION.key)
metastoreName: String = hive

scala> spark.conf.set(StaticSQLConf.CATALOG_IMPLEMENTATION.key, "hive")
org.apache.spark.sql.AnalysisException: Cannot modify the value of a static config: spark.sql.catalogImplementation;
  at org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:144)
  at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:41)
  ... 50 elided
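
Static properties therefore have to be set before the first SparkSession is created, e.g. on SparkSession.Builder. A minimal sketch (the property and value below are arbitrary examples):

import org.apache.spark.sql.SparkSession

// Static properties are picked up only when the first session is built
val spark = SparkSession.builder()
  .config("spark.sql.ui.retainedExecutions", "500")
  .getOrCreate()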

spark.sql.cache.serializer

The name of a class that implements org.apache.spark.sql.columnar.CachedBatchSerializer, used to translate SQL data into a format that can more efficiently be cached. The class must have a no-arg constructor.

Default: org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer

spark.sql.codegen.cache.maxEntries

(internal) When non-zero, enables caching of generated classes for operators and expressions. All jobs share the cache, which can hold up to the specified number of generated classes.

Default: 100

Use SQLConf.codegenCacheMaxEntries to access the current value

Used when:

  • CodeGenerator is loaded (and creates the cache)

spark.sql.broadcastExchange.maxThreadThreshold

(internal) The maximum degree of parallelism used to fetch and broadcast a table. If memory issues (e.g. frequent full GCs or OOMs) occur while broadcasting a table, decrease this number to reduce memory usage. Choose the value carefully: decreasing parallelism can make other broadcasts wait longer, while increasing it can cause memory problems.

The value must be in the range (0, 128].

Default: 128

spark.sql.catalogImplementation

(internal) Selects the in-memory (default) or hive-related BaseSessionStateBuilder and ExternalCatalog implementations

SparkSession.Builder.enableHiveSupport is used to enable Hive support for a SparkSession (it sets this property to hive).
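
A minimal sketch (assumes Hive classes are on the classpath):

import org.apache.spark.sql.SparkSession

// enableHiveSupport sets spark.sql.catalogImplementation to hive
val spark = SparkSession.builder()
  .enableHiveSupport()
  .getOrCreate()

assert(spark.conf.get("spark.sql.catalogImplementation") == "hive")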

spark.sql.debug

(internal) Only used for internal debugging when HiveExternalCatalog is requested to restoreTableMetadata.

Default: false

Not all functions are supported when enabled.

spark.sql.defaultUrlStreamHandlerFactory.enabled

(internal) When true, register Hadoop's FsUrlStreamHandlerFactory to support ADD JAR against HDFS locations. It should be disabled when a different stream protocol handler should be registered to support a particular protocol type, or if Hadoop's FsUrlStreamHandlerFactory conflicts with other protocol types such as http or https. See also SPARK-25694 and HADOOP-14598.

Default: true

spark.sql.event.truncate.length

Threshold of SQL statement length beyond which a statement is truncated before being added to an event. Defaults to no truncation. If set to 0, the callsite is logged instead.

Must be greater than or equal to zero.

Default: Int.MaxValue

spark.sql.extensions

A comma-separated list of SQL extension configuration classes to configure SparkSessionExtensions:

  1. The classes must implement SparkSessionExtensions => Unit.
  2. The classes must have a no-args constructor.
  3. If multiple extensions are specified, they are applied in the specified order.
  4. Rules and planner strategies are applied in the specified order.
  5. For parsers, the last parser is used and each parser can delegate to its predecessor.
  6. For function name conflicts, the last registered function name is used.

Default: (empty)
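
A minimal sketch of such an extension class (the class name MyExtensions and the injected no-op check rule are arbitrary examples):

import org.apache.spark.sql.SparkSessionExtensions

// A no-args class that implements SparkSessionExtensions => Unit
class MyExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // inject a check rule that does nothing, just to show the extension point
    extensions.injectCheckRule(_ => _ => ())
  }
}

The class is then registered with spark.sql.extensions=MyExtensions before the first SparkSession is created.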


spark.sql.filesourceTableRelationCacheSize

(internal) The maximum size of the cache that maps qualified table names to table relation plans. Must not be negative.

Default: 1000

spark.sql.globalTempDatabase

(internal) Name of the Spark-owned internal database of global temporary views

Default: global_temp

The name of the internal database cannot conflict with the names of any database that is already available in ExternalCatalog.

Used to create a GlobalTempViewManager when SharedState is first requested for one.
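
For example (the view name demo is an arbitrary example):

// Global temporary views are registered in the global temp database
spark.range(1).createGlobalTempView("demo")
spark.sql("SELECT * FROM global_temp.demo").show()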

spark.sql.hive.thriftServer.singleSession

When enabled (true), the Hive Thrift server runs in single-session mode: all JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database.

Default: false

spark.sql.legacy.sessionInitWithConfigDefaults

Flag to revert to legacy behavior where a cloned SparkSession receives SparkConf defaults, dropping any overrides in its parent SparkSession.

Default: false

spark.sql.queryExecutionListeners

Class names of QueryExecutionListeners that will be automatically registered with new SparkSessions

Default: (empty)

The classes should have either a no-arg constructor, or a constructor that expects a SparkConf argument.
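
A minimal sketch of such a listener (the class name MyQueryListener and the println bodies are arbitrary examples):

import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

class MyQueryListener extends QueryExecutionListener {
  // called when a query (triggered by funcName) completes successfully
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
    println(s"$funcName took ${durationNs / 1e6} ms")

  // called when a query fails
  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
    println(s"$funcName failed: ${exception.getMessage}")
}

It would then be registered with spark.sql.queryExecutionListeners=MyQueryListener (e.g. in spark-defaults.conf).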

spark.sql.sources.schemaStringLengthThreshold

(internal) The maximum length allowed in a single cell when storing additional schema information in Hive's metastore

Default: 4000

spark.sql.streaming.ui.enabled

Whether to run the Structured Streaming Web UI for the Spark application when the Spark Web UI is enabled.

Default: true

spark.sql.streaming.ui.retainedProgressUpdates

The number of progress updates to retain for a streaming query in the Structured Streaming UI.

Default: 100

spark.sql.streaming.ui.retainedQueries

The number of inactive queries to retain in the Structured Streaming UI.

Default: 100

spark.sql.ui.retainedExecutions

Number of executions to retain in the Spark UI.

Default: 1000

spark.sql.warehouse.dir

The directory of the Spark warehouse (the default location of managed databases and tables)

Default: spark-warehouse
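
Being static, the property has to be set before the first SparkSession is created. A minimal sketch (/tmp/my-warehouse is an arbitrary example path):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.sql.warehouse.dir", "/tmp/my-warehouse")
  .getOrCreate()

// query the effective warehouse location
println(spark.conf.get("spark.sql.warehouse.dir"))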