SQLConf¶
SQLConf is an internal configuration store of the configuration properties and hints used in Spark SQL.
Important
SQLConf is an internal part of Spark SQL and is not supposed to be used directly. Spark SQL configuration is available through the developer-facing RuntimeConfig.
SQLConf offers methods to get, set, unset or clear values of the configuration properties and hints as well as to read the current values.
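Since applications are expected to go through RuntimeConfig rather than SQLConf, the following is a minimal sketch of that developer-facing route (spark.conf). It assumes Spark SQL is on the classpath, for example in spark-shell, where getOrCreate simply returns the running session. The local master, the application name and the value 8 are arbitrary choices for illustration.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell this returns the already-running session
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("runtime-config-sketch") // hypothetical name
  .getOrCreate()

// Set a property through the public RuntimeConfig
spark.conf.set("spark.sql.shuffle.partitions", "8")

// Read the current value back (RuntimeConfig hands out strings)
val current = spark.conf.get("spark.sql.shuffle.partitions")

// The same setting as seen through the session's (internal) SQLConf
val viaSqlConf = spark.sessionState.conf.numShufflePartitions

// Unset the property so it falls back to its default value
spark.conf.unset("spark.sql.shuffle.partitions")
val afterUnset = spark.conf.get("spark.sql.shuffle.partitions")
```

RuntimeConfig and the session's SQLConf are two views of the same session-scoped settings, which is why direct SQLConf access is rarely needed outside Spark SQL itself.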
Accessing SQLConf¶
You can access a SQLConf using:
- SQLConf.get (preferred) - the SQLConf of the current active SparkSession
- SessionState - direct access through SessionState of the SparkSession of your choice (which gives more flexibility over which SparkSession is used, as it can differ from the current active SparkSession)
import org.apache.spark.sql.internal.SQLConf
// Use type-safe access to configuration properties
// using SQLConf.get.getConf
val parallelFileListingInStatsComputation = SQLConf.get.getConf(SQLConf.PARALLEL_FILE_LISTING_IN_STATS_COMPUTATION)
// or even simpler
SQLConf.get.parallelFileListingInStatsComputation
scala> :type spark
org.apache.spark.sql.SparkSession
// Direct access to the session SQLConf
val sqlConf = spark.sessionState.conf
scala> :type sqlConf
org.apache.spark.sql.internal.SQLConf
scala> println(sqlConf.offHeapColumnVectorEnabled)
false
// Or simply import the conf value
import spark.sessionState.conf
// accessing properties through accessor methods
scala> conf.numShufflePartitions
res1: Int = 200
// Prefer SQLConf.get (over direct access)
import org.apache.spark.sql.internal.SQLConf
val cc = SQLConf.get
scala> cc == conf
res4: Boolean = true
// setting properties using aliases
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.setConf(SHUFFLE_PARTITIONS, 2)
scala> conf.numShufflePartitions
res2: Int = 2
// unset aka reset properties to the default value
conf.unsetConf(SHUFFLE_PARTITIONS)
scala> conf.numShufflePartitions
res3: Int = 200
ADAPTIVE_AUTO_BROADCASTJOIN_THRESHOLD¶
spark.sql.adaptive.autoBroadcastJoinThreshold
Used when:
- JoinSelectionHelper is requested to canBroadcastBySize
ADAPTIVE_EXECUTION_FORCE_APPLY¶
spark.sql.adaptive.forceApply configuration property
Used when:
- InsertAdaptiveSparkPlan physical optimization is executed
adaptiveExecutionEnabled¶
The value of spark.sql.adaptive.enabled configuration property
Used when:
- InsertAdaptiveSparkPlan physical optimization is executed
- SQLConf is requested for the numShufflePartitions
adaptiveExecutionLogLevel¶
The value of spark.sql.adaptive.logLevel configuration property
Used when AdaptiveSparkPlanExec physical operator is executed
ADAPTIVE_MAX_SHUFFLE_HASH_JOIN_LOCAL_MAP_THRESHOLD¶
spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold configuration property
Used when:
- DynamicJoinSelection is requested to preferShuffledHashJoin
ADAPTIVE_OPTIMIZER_EXCLUDED_RULES¶
spark.sql.adaptive.optimizer.excludedRules
ADVISORY_PARTITION_SIZE_IN_BYTES¶
spark.sql.adaptive.advisoryPartitionSizeInBytes configuration property
Used when:
- CoalesceShufflePartitions and OptimizeSkewedJoin physical optimizations are executed
autoBroadcastJoinThreshold¶
The value of spark.sql.autoBroadcastJoinThreshold configuration property
Used when:
- JoinSelection execution planning strategy is executed
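As a rough illustration of how this threshold feeds join planning, the sketch below sets the property to -1 (which disables size-based broadcast joins), reads it back through the accessor, and inspects the physical plan of a small join. The session setup, dataset sizes and names are assumptions made for the example only.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("broadcast-threshold-sketch") // hypothetical name
  .getOrCreate()

// -1 turns off broadcast joins that are selected by size
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

// The accessor returns the threshold in bytes
val threshold = spark.sessionState.conf.autoBroadcastJoinThreshold

// Two small datasets that would otherwise qualify for a broadcast join
val left = spark.range(100).toDF("id")
val right = spark.range(100).toDF("id")
val joined = left.join(right, "id")

// Look at the physical plan produced by the planner
val plan = joined.queryExecution.executedPlan.toString
val usesBroadcast = plan.contains("BroadcastHashJoin")

// Restore the default for the rest of the session
spark.conf.unset("spark.sql.autoBroadcastJoinThreshold")
```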
autoBucketedScanEnabled¶
The value of spark.sql.sources.bucketing.autoBucketedScan.enabled configuration property
Used when:
- DisableUnnecessaryBucketedScan physical optimization is executed
allowStarWithSingleTableIdentifierInCount¶
spark.sql.legacy.allowStarWithSingleTableIdentifierInCount
Used when:
- ResolveReferences logical resolution rule is executed
arrowPySparkSelfDestructEnabled¶
spark.sql.execution.arrow.pyspark.selfDestruct.enabled
Used when:
- PandasConversionMixin is requested to toPandas
allowAutoGeneratedAliasForView¶
spark.sql.legacy.allowAutoGeneratedAliasForView
Used when:
- ViewHelper utility is used to verifyAutoGeneratedAliasesNotExists
allowNonEmptyLocationInCTAS¶
spark.sql.legacy.allowNonEmptyLocationInCTAS
Used when:
- DataWritingCommand utility is used to assertEmptyRootPath
ADAPTIVE_OPTIMIZE_SKEWS_IN_REBALANCE_PARTITIONS_ENABLED¶
spark.sql.adaptive.optimizeSkewsInRebalancePartitions.enabled
Used when:
- OptimizeSkewInRebalancePartitions physical optimization is executed
ADAPTIVE_CUSTOM_COST_EVALUATOR_CLASS¶
spark.sql.adaptive.customCostEvaluatorClass
autoSizeUpdateEnabled¶
The value of spark.sql.statistics.size.autoUpdate.enabled configuration property
Used when:
- CommandUtils is requested for updating existing table statistics
- AlterTableAddPartitionCommand logical command is executed
avroCompressionCodec¶
The value of spark.sql.avro.compression.codec configuration property
Used when AvroOptions is requested for the compression configuration property (and it was not set explicitly)
broadcastTimeout¶
The value of spark.sql.broadcastTimeout configuration property
Used in BroadcastExchangeExec (for broadcasting a table to executors)
bucketingEnabled¶
The value of spark.sql.sources.bucketing.enabled configuration property
Used when FileSourceScanExec physical operator is requested for the input RDD and to determine output partitioning and ordering
cacheVectorizedReaderEnabled¶
The value of spark.sql.inMemoryColumnarStorage.enableVectorizedReader configuration property
Used when InMemoryTableScanExec physical operator is requested for supportsBatch flag.
CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING¶
spark.sql.optimizer.canChangeCachedPlanOutputPartitioning
Used when:
- CacheManager is requested to getOrCloneSessionWithConfigsOff
caseSensitiveAnalysis¶
The value of spark.sql.caseSensitive configuration property
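To make the effect of this property concrete, here is a small sketch that resolves the same column reference under both settings. It assumes a local session; the DataFrame and the column name are made up for the example, and Try is used only to capture whether analysis succeeds.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("case-sensitive-sketch") // hypothetical name
  .getOrCreate()

val df = spark.range(3).toDF("id")

// Case-insensitive analysis: "ID" resolves to the column "id"
spark.conf.set("spark.sql.caseSensitive", "false")
val insensitive = Try(df.select("ID"))

// Case-sensitive analysis: "ID" no longer matches "id"
spark.conf.set("spark.sql.caseSensitive", "true")
val sensitiveFlag = spark.sessionState.conf.caseSensitiveAnalysis
val sensitive = Try(df.select("ID"))

// Restore the default for the rest of the session
spark.conf.unset("spark.sql.caseSensitive")
```

Under the case-sensitive setting the second select fails during analysis, since no column named exactly "ID" exists.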
cboEnabled¶
The value of spark.sql.cbo.enabled configuration property
Used in:
- ReorderJoin logical plan optimization (and indirectly in StarSchemaDetection for reorderStarJoins)
- CostBasedJoinReorder logical plan optimization
cliPrintHeader¶
Used when:
- SparkSQLCLIDriver is requested to processCmd
coalesceBucketsInJoinEnabled¶
The value of spark.sql.bucketing.coalesceBucketsInJoin.enabled configuration property
Used when:
- CoalesceBucketsInJoin physical optimization is executed
COALESCE_PARTITIONS_MIN_PARTITION_SIZE¶
spark.sql.adaptive.coalescePartitions.minPartitionSize configuration property
Used when:
- CoalesceShufflePartitions physical optimization is executed
COALESCE_PARTITIONS_PARALLELISM_FIRST¶
spark.sql.adaptive.coalescePartitions.parallelismFirst configuration property
Used when:
- CoalesceShufflePartitions physical optimization is executed
coalesceShufflePartitionsEnabled¶
The value of spark.sql.adaptive.coalescePartitions.enabled configuration property
Used when:
- CoalesceShufflePartitions and EnsureRequirements physical optimizations are executed
codegenCacheMaxEntries¶
spark.sql.codegen.cache.maxEntries
columnBatchSize¶
The value of spark.sql.inMemoryColumnarStorage.batchSize configuration property
Used when:
- CacheManager is requested to cache a structured query
- RowToColumnarExec physical operator is requested to doExecuteColumnar
constraintPropagationEnabled¶
The value of spark.sql.constraintPropagation.enabled configuration property
Used when:
- InferFiltersFromConstraints logical optimization is executed
- QueryPlanConstraints is requested for the constraints
CONVERT_METASTORE_ORC¶
The value of spark.sql.hive.convertMetastoreOrc configuration property
Used when RelationConversions logical post-hoc evaluation rule is executed (and requested to isConvertible)
CONVERT_METASTORE_PARQUET¶
The value of spark.sql.hive.convertMetastoreParquet configuration property
Used when RelationConversions logical post-hoc evaluation rule is executed (and requested to isConvertible)
csvExpressionOptimization¶
spark.sql.optimizer.enableCsvExpressionOptimization
Used when:
- OptimizeCsvJsonExprs logical optimization is executed
dataFramePivotMaxValues¶
The value of spark.sql.pivotMaxValues configuration property
Used in pivot operator.
dataFrameRetainGroupColumns¶
decorrelateInnerQueryEnabled¶
spark.sql.optimizer.decorrelateInnerQuery.enabled
Used when:
- CheckAnalysis is requested to checkCorrelationsInSubquery (with a Project unary logical operator)
- PullupCorrelatedPredicates logical optimization is executed
DEFAULT_CATALOG¶
The value of spark.sql.defaultCatalog configuration property
Used when CatalogManager is requested for the current CatalogPlugin
defaultDataSourceName¶
defaultSizeInBytes¶
Used when:
- DetermineTableStats logical resolution rule could not compute the table size or spark.sql.statistics.fallBackToHdfs is disabled
- ExternalRDD, LogicalRDD and DataSourceV2Relation are requested to compute stats
- (Spark Structured Streaming) StreamingRelation, StreamingExecutionRelation, StreamingRelationV2 and ContinuousExecutionRelation are requested for statistics (i.e. computeStats)
- DataSource creates a HadoopFsRelation for FileFormat data source (and builds a CatalogFileIndex when no table statistics are available)
- BaseRelation is requested for an estimated size of this relation (in bytes)
dynamicPartitionPruningEnabled¶
spark.sql.optimizer.dynamicPartitionPruning.enabled
dynamicPartitionPruningFallbackFilterRatio¶
The value of spark.sql.optimizer.dynamicPartitionPruning.fallbackFilterRatio configuration property
Used when:
- PartitionPruning logical optimization rule is executed
dynamicPartitionPruningPruningSideExtraFilterRatio¶
The value of spark.sql.optimizer.dynamicPartitionPruning.pruningSideExtraFilterRatio configuration property
Used when:
- PartitionPruning logical optimization rule is executed
dynamicPartitionPruningReuseBroadcastOnly¶
spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly
dynamicPartitionPruningUseStats¶
spark.sql.optimizer.dynamicPartitionPruning.useStats
ENABLE_FULL_OUTER_SHUFFLED_HASH_JOIN_CODEGEN¶
spark.sql.codegen.join.fullOuterShuffledHashJoin.enabled
enableDefaultColumns¶
spark.sql.defaultColumn.enabled
enableRadixSort¶
spark.sql.sort.enableRadixSort
Used when:
- SortExec physical operator is requested to create an UnsafeExternalRowSorter.
enableTwoLevelAggMap¶
spark.sql.codegen.aggregate.map.twolevel.enabled
enableVectorizedHashMap¶
spark.sql.codegen.aggregate.map.vectorized.enable
exchangeReuseEnabled¶
Used when:
- AdaptiveSparkPlanExec physical operator is requested to createQueryStages
- PartitionPruning logical optimization rule is executed
- PlanDynamicPruningFilters and ReuseExchange physical optimizations are executed
fallBackToHdfsForStatsEnabled¶
spark.sql.statistics.fallBackToHdfs
Used when DetermineTableStats logical resolution rule is executed.
fastHashAggregateRowMaxCapacityBit¶
spark.sql.codegen.aggregate.fastHashMap.capacityBit
fetchShuffleBlocksInBatch¶
The value of spark.sql.adaptive.fetchShuffleBlocksInBatch configuration property
Used when ShuffledRowRDD is created
fileCommitProtocolClass¶
spark.sql.sources.commitProtocolClass
fileCompressionFactor¶
The value of spark.sql.sources.fileCompressionFactor configuration property
Used when:
- HadoopFsRelation is requested for a size
- FileScan is requested to estimate statistics
filesMaxPartitionBytes¶
spark.sql.files.maxPartitionBytes
filesMinPartitionNum¶
spark.sql.files.minPartitionNum
filesOpenCostInBytes¶
spark.sql.files.openCostInBytes
filesourcePartitionFileCacheSize¶
spark.sql.hive.filesourcePartitionFileCacheSize
histogramEnabled¶
The value of spark.sql.statistics.histogram.enabled configuration property
Used when AnalyzeColumnCommand logical command is executed.
histogramNumBins¶
spark.sql.statistics.histogram.numBins
Used when AnalyzeColumnCommand is executed with spark.sql.statistics.histogram.enabled turned on (and calculates percentiles).
HIVE_TABLE_PROPERTY_LENGTH_THRESHOLD¶
spark.sql.hive.tablePropertyLengthThreshold
Used when:
- CatalogTable is requested to splitLargeTableProp
hugeMethodLimit¶
spark.sql.codegen.hugeMethodLimit
ignoreCorruptFiles¶
The value of spark.sql.files.ignoreCorruptFiles configuration property
Used when:
- AvroUtils utility is requested to inferSchema
- OrcFileFormat is requested to inferSchema and buildReader
- FileScanRDD is created (and then to compute a partition)
- SchemaMergeUtils utility is requested to mergeSchemasInParallel
- OrcUtils utility is requested to readSchema
- FilePartitionReader is requested to ignoreCorruptFiles
ignoreMissingFiles¶
The value of spark.sql.files.ignoreMissingFiles configuration property
Used when:
- FileScanRDD is created (and then to compute a partition)
- InMemoryFileIndex utility is requested to bulkListLeafFiles
- FilePartitionReader is requested to ignoreMissingFiles
inMemoryPartitionPruning¶
spark.sql.inMemoryColumnarStorage.partitionPruning
isParquetBinaryAsString¶
spark.sql.parquet.binaryAsString
isParquetINT96AsTimestamp¶
spark.sql.parquet.int96AsTimestamp
isParquetINT96TimestampConversion¶
spark.sql.parquet.int96TimestampConversion
Used when ParquetFileFormat is requested to build a data reader with partition column values appended.
isParquetSchemaMergingEnabled¶
isParquetSchemaRespectSummaries¶
spark.sql.parquet.respectSummaryFiles
Used when:
- ParquetUtils is used to inferSchema
joinReorderEnabled¶
spark.sql.cbo.joinReorder.enabled
Used in CostBasedJoinReorder logical plan optimization
legacyIntervalEnabled¶
spark.sql.legacy.interval.enabled
Used when:
- SubtractTimestamps expression is created
- SubtractDates expression is created
- AstBuilder is requested to visitTypeConstructor and visitInterval
limitScaleUpFactor¶
Used when a physical operator is requested for the first n rows as an array.
LOCAL_SHUFFLE_READER_ENABLED¶
spark.sql.adaptive.localShuffleReader.enabled
Used when:
- OptimizeShuffleWithLocalRead adaptive physical optimization is executed
manageFilesourcePartitions¶
spark.sql.hive.manageFilesourcePartitions
maxConcurrentOutputFileWriters¶
The value of spark.sql.maxConcurrentOutputFileWriters configuration property
Used when:
- FileFormatWriter is requested to write out a query result
maxMetadataStringLength¶
spark.sql.maxMetadataStringLength
Used when:
- DataSourceScanExec is requested for simpleString
- FileScan is requested for description and metadata
- HiveTableRelation is requested for simpleString
maxRecordsPerFile¶
spark.sql.files.maxRecordsPerFile
Used when:
- FileFormatWriter utility is used to write out a query result
- FileWrite is requested for a BatchWrite
maxToStringFields¶
The value of spark.sql.debug.maxToStringFields configuration property
metastorePartitionPruning¶
spark.sql.hive.metastorePartitionPruning
Used when HiveTableScanExec physical operator is executed with a partitioned table (and requested for rawPartitions)
methodSplitThreshold¶
spark.sql.codegen.methodSplitThreshold
Used when:
- Expression is requested to reduceCodeSize
- CodegenContext is requested to buildCodeBlocks and subexpressionEliminationForWholeStageCodegen
- ExpandExec physical operator is requested to doConsume
- HashAggregateExec physical operator is requested to generateEvalCodeForAggFuncs
minNumPostShufflePartitions¶
spark.sql.adaptive.minNumPostShufflePartitions
Used when EnsureRequirements physical optimization is executed (for Adaptive Query Execution).
nestedSchemaPruningEnabled¶
The value of spark.sql.optimizer.nestedSchemaPruning.enabled configuration property
Used when SchemaPruning, ColumnPruning and V2ScanRelationPushDown logical optimizations are executed
nonEmptyPartitionRatioForBroadcastJoin¶
The value of spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin configuration property
Used when:
- DynamicJoinSelection adaptive logical optimization is executed (and shouldDemoteBroadcastHashJoin)
numShufflePartitions¶
rangeExchangeSampleSizePerPartition¶
The value of spark.sql.execution.rangeExchange.sampleSizePerPartition configuration property
Used when:
- ShuffleExchangeExec physical operator is executed
REMOVE_REDUNDANT_SORTS_ENABLED¶
The value of spark.sql.execution.removeRedundantSorts configuration property
Used when:
- RemoveRedundantSorts physical optimization is executed
REPLACE_HASH_WITH_SORT_AGG_ENABLED¶
spark.sql.execution.replaceHashWithSortAgg
runtimeFilterBloomFilterEnabled¶
spark.sql.optimizer.runtime.bloomFilter.enabled
RUNTIME_BLOOM_FILTER_MAX_NUM_BITS¶
spark.sql.optimizer.runtime.bloomFilter.maxNumBits
RUNTIME_FILTER_NUMBER_THRESHOLD¶
spark.sql.optimizer.runtimeFilter.number.threshold
runtimeFilterSemiJoinReductionEnabled¶
spark.sql.optimizer.runtimeFilter.semiJoinReduction.enabled
SKEW_JOIN_SKEWED_PARTITION_FACTOR¶
spark.sql.adaptive.skewJoin.skewedPartitionFactor configuration property
Used when:
- OptimizeSkewedJoin physical optimization is executed
SKEW_JOIN_SKEWED_PARTITION_THRESHOLD¶
spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes configuration property
Used when:
- OptimizeSkewedJoin physical optimization is executed
SKEW_JOIN_ENABLED¶
spark.sql.adaptive.skewJoin.enabled configuration property
Used when:
- OptimizeSkewedJoin physical optimization is executed
objectAggSortBasedFallbackThreshold¶
spark.sql.objectHashAggregate.sortBased.fallbackThreshold
offHeapColumnVectorEnabled¶
spark.sql.columnVector.offheap.enabled
Used when:
- InMemoryTableScanExec is requested for the vectorTypes and the input RDD
- OrcFileFormat is requested to buildReaderWithPartitionValues
- ParquetFileFormat is requested for vectorTypes and build a data reader with partition column values appended
OPTIMIZE_ONE_ROW_RELATION_SUBQUERY¶
spark.sql.optimizer.optimizeOneRowRelationSubquery
Used when:
- OptimizeOneRowRelationSubquery logical optimization is executed
optimizeNullAwareAntiJoin¶
spark.sql.optimizeNullAwareAntiJoin configuration property
Used when:
- ExtractSingleColumnNullAwareAntiJoin Scala extractor is executed
optimizerExcludedRules¶
The value of spark.sql.optimizer.excludedRules configuration property
Used when Optimizer is requested for the batches
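The sketch below shows one way this property could be exercised: a rule is excluded by its fully-qualified class name, the accessor is read back, and the optimized plan is rendered for inspection. ConstantFolding is picked purely as an example rule, and the session setup and query are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("excluded-rules-sketch") // hypothetical name
  .getOrCreate()

// Rules are referenced by fully-qualified class name (example choice)
val rule = "org.apache.spark.sql.catalyst.optimizer.ConstantFolding"
spark.conf.set("spark.sql.optimizer.excludedRules", rule)

// The accessor exposes the raw, optional setting
val excluded: Option[String] = spark.sessionState.conf.optimizerExcludedRules

// Render the optimized plan of a query with a constant expression
// so the effect of the exclusion can be inspected
val optimized = spark.range(1)
  .selectExpr("1 + 2 AS three")
  .queryExecution
  .optimizedPlan
  .toString
println(optimized)

// Restore the default for the rest of the session
spark.conf.unset("spark.sql.optimizer.excludedRules")
```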
optimizerInSetConversionThreshold¶
spark.sql.optimizer.inSetConversionThreshold
Used when OptimizeIn logical query optimization is executed
orcVectorizedReaderNestedColumnEnabled¶
spark.sql.orc.enableNestedColumnVectorizedReader
Used when:
- OrcFileFormat is requested to supportBatchForNestedColumn
OUTPUT_COMMITTER_CLASS¶
spark.sql.sources.outputCommitterClass
Used when:
- SQLHadoopMapReduceCommitProtocol is requested to setupCommitter
- ParquetFileFormat is requested to prepareWrite
- ParquetWrite is requested to prepareWrite
parallelFileListingInStatsComputation¶
spark.sql.statistics.parallelFileListingInStatsComputation.enabled
Used when CommandUtils helper object is requested to calculate the total size of a table (with partitions) (for AnalyzeColumnCommand and AnalyzeTableCommand commands)
parquetAggregatePushDown¶
spark.sql.parquet.aggregatePushdown
parquetCompressionCodec¶
spark.sql.parquet.compression.codec
Used when:
- ParquetOptions is requested for compressionCodecClassName
parquetFilterPushDown¶
spark.sql.parquet.filterPushdown
parquetFilterPushDownDate¶
spark.sql.parquet.filterPushdown.date
Used when:
- ParquetFileFormat is requested to build a data reader (with partition column values appended)
parquetFilterPushDownDecimal¶
spark.sql.parquet.filterPushdown.decimal
Used when:
- ParquetFileFormat is requested to build a data reader (with partition column values appended)
- ParquetPartitionReaderFactory is requested to buildReaderBase
- ParquetScanBuilder is requested for pushedParquetFilters
parquetFilterPushDownInFilterThreshold¶
spark.sql.parquet.pushdown.inFilterThreshold
Used when:
- ParquetFileFormat is requested to build a data reader (with partition column values appended)
- ParquetPartitionReaderFactory is requested to buildReaderBase
- ParquetScanBuilder is requested for pushedParquetFilters
parquetFilterPushDownStringPredicate¶
spark.sql.parquet.filterPushdown.stringPredicate
parquetFilterPushDownStringStartWith¶
spark.sql.parquet.filterPushdown.string.startsWith
parquetFilterPushDownTimestamp¶
spark.sql.parquet.filterPushdown.timestamp
Used when:
- ParquetFileFormat is requested to build a data reader (with partition column values appended)
- ParquetPartitionReaderFactory is requested to buildReaderBase
- ParquetScanBuilder is requested for pushedParquetFilters
parquetOutputCommitterClass¶
spark.sql.parquet.output.committer.class
Used when:
- ParquetFileFormat is requested to prepareWrite
- ParquetWrite is requested to prepareWrite
parquetOutputTimestampType¶
spark.sql.parquet.outputTimestampType
Used when:
- ParquetFileFormat is requested to prepareWrite
- SparkToParquetSchemaConverter is created
- ParquetWriteSupport is requested to init
- ParquetWrite is requested to prepareWrite
parquetRecordFilterEnabled¶
spark.sql.parquet.recordLevelFilter.enabled
Used when ParquetFileFormat is requested to build a data reader (with partition column values appended).
parquetVectorizedReaderBatchSize¶
spark.sql.parquet.columnarReaderBatchSize
parquetVectorizedReaderEnabled¶
spark.sql.parquet.enableVectorizedReader
Used when:
- FileSourceScanExec is requested for needsUnsafeRowConversion flag
- ParquetFileFormat is requested for supportBatch flag and build a data reader with partition column values appended
parquetVectorizedReaderNestedColumnEnabled¶
spark.sql.parquet.enableNestedColumnVectorizedReader
partitionOverwriteMode¶
The value of spark.sql.sources.partitionOverwriteMode configuration property
Used when InsertIntoHadoopFsRelationCommand logical command is executed
planChangeLogLevel¶
The value of spark.sql.planChangeLog.level configuration property
Used when:
- PlanChangeLogger is created
planChangeBatches¶
The value of spark.sql.planChangeLog.batches configuration property
Used when:
- PlanChangeLogger is requested to logBatch
planChangeRules¶
The value of spark.sql.planChangeLog.rules configuration property
Used when:
- PlanChangeLogger is requested to logRule
preferSortMergeJoin¶
spark.sql.join.preferSortMergeJoin
Used in JoinSelection execution planning strategy to prefer sort merge join over shuffle hash join.
LEAF_NODE_DEFAULT_PARALLELISM¶
spark.sql.leafNodeDefaultParallelism
Used when:
- SparkSession is requested for the leafNodeDefaultParallelism
LEGACY_CTE_PRECEDENCE_POLICY¶
spark.sql.legacy.ctePrecedencePolicy
PROPAGATE_DISTINCT_KEYS_ENABLED¶
spark.sql.optimizer.propagateDistinctKeys.enabled
replaceDatabricksSparkAvroEnabled¶
spark.sql.legacy.replaceDatabricksSparkAvro.enabled
replaceExceptWithFilter¶
spark.sql.optimizer.replaceExceptWithFilter
Used when ReplaceExceptWithFilter logical optimization is executed
runSQLonFile¶
Used when:
- ResolveSQLOnFile is requested to maybeSQLFile
RUNTIME_BLOOM_FILTER_EXPECTED_NUM_ITEMS¶
spark.sql.optimizer.runtime.bloomFilter.expectedNumItems
runtimeRowLevelOperationGroupFilterEnabled¶
spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled
sessionLocalTimeZone¶
sessionWindowBufferInMemoryThreshold¶
spark.sql.sessionWindow.buffer.in.memory.threshold
Used when:
- UpdatingSessionsExec unary physical operator is executed
sessionWindowBufferSpillThreshold¶
spark.sql.sessionWindow.buffer.spill.threshold
Used when:
- UpdatingSessionsExec unary physical operator is executed
sortBeforeRepartition¶
The value of spark.sql.execution.sortBeforeRepartition configuration property
Used when ShuffleExchangeExec physical operator is executed
starSchemaDetection¶
spark.sql.cbo.starSchemaDetection
Used in ReorderJoin logical optimization (and indirectly in StarSchemaDetection)
stringRedactionPattern¶
spark.sql.redaction.string.regex
Used when:
- DataSourceScanExec is requested to redact sensitive information (in text representations)
- QueryExecution is requested to redact sensitive information (in text representations)
subexpressionEliminationEnabled¶
spark.sql.subexpressionElimination.enabled
Used when SparkPlan is requested for subexpressionEliminationEnabled flag.
subqueryReuseEnabled¶
spark.sql.execution.reuseSubquery
Used when:
- ReuseAdaptiveSubquery adaptive physical optimization is executed
- ReuseExchangeAndSubquery physical optimization is executed
supportQuotedRegexColumnName¶
spark.sql.parser.quotedRegexColumnNames
Used when:
- Dataset.col operator is used
- AstBuilder is requested to parse a dereference and column reference in a SQL statement
targetPostShuffleInputSize¶
spark.sql.adaptive.shuffle.targetPostShuffleInputSize
Used when EnsureRequirements physical optimization is executed (for Adaptive Query Execution)
THRIFTSERVER_FORCE_CANCEL¶
spark.sql.thriftServer.interruptOnCancel
Used when:
- SparkExecuteStatementOperation is created (forceCancel)
truncateTableIgnorePermissionAcl¶
spark.sql.truncateTable.ignorePermissionAcl.enabled
Used when TruncateTableCommand logical command is executed
useCompression¶
The value of spark.sql.inMemoryColumnarStorage.compressed configuration property
Used when CacheManager is requested to cache a structured query
useObjectHashAggregation¶
spark.sql.execution.useObjectHashAggregateExec
Used when Aggregation execution planning strategy is executed (and uses AggUtils to create an aggregation physical operator).
v2BucketingPartiallyClusteredDistributionEnabled¶
spark.sql.sources.v2.bucketing.partiallyClusteredDistribution.enabled
v2BucketingPushPartValuesEnabled¶
spark.sql.sources.v2.bucketing.pushPartValues.enabled
variableSubstituteEnabled¶
Used when:
- VariableSubstitution is requested to substitute variables in a SQL command
wholeStageEnabled¶
Used in:
- CollapseCodegenStages to control codegen
- ParquetFileFormat to control row batch reading
wholeStageFallback¶
wholeStageMaxNumFields¶
Used in:
- CollapseCodegenStages to control codegen
- ParquetFileFormat to control row batch reading
wholeStageSplitConsumeFuncByOperator¶
spark.sql.codegen.splitConsumeFuncByOperator
Used when CodegenSupport is requested to consume
wholeStageUseIdInClassName¶
spark.sql.codegen.useIdInClassName
Used when WholeStageCodegenExec is requested to generate the Java source code for the child physical plan subtree (when created)
windowExecBufferInMemoryThreshold¶
spark.sql.windowExec.buffer.in.memory.threshold
Used when:
- WindowExec unary physical operator is executed
windowExecBufferSpillThreshold¶
spark.sql.windowExec.buffer.spill.threshold
Used when:
- WindowExec unary physical operator is executed