SQLConf — Internal Configuration Store¶
SQLConf
is an internal configuration store for parameters and hints used to configure a Spark Structured Streaming application (and Spark SQL applications in general).
Tip
Find out more on SQLConf in The Internals of Spark SQL
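Although `SQLConf` itself is internal, the configuration properties it backs can be set and read on a `SparkSession` through the public runtime configuration interface. A minimal sketch (the property value is illustrative):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("sqlconf-demo")
  .master("local[*]")
  .getOrCreate()

// Set a SQLConf-backed property at runtime...
spark.conf.set("spark.sql.shuffle.partitions", "8")

// ...and read it back (the second argument is a default)
println(spark.conf.get("spark.sql.shuffle.partitions", "200"))
```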
streamingFileCommitProtocolClass¶
spark.sql.streaming.commitProtocolClass configuration property
Used when FileStreamSink
is requested to "add" a batch of data
streamingMetricsEnabled¶
spark.sql.streaming.metricsEnabled configuration property
Used when StreamExecution
is requested to runStream
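When the property is enabled, `StreamExecution` registers a metrics reporter for the query with Spark's metrics system. A sketch of turning it on, assuming an active `SparkSession` named `spark` (the rate source and console sink are just for illustration):

```scala
// Must be set before the streaming query starts
spark.conf.set("spark.sql.streaming.metricsEnabled", "true")

val query = spark.readStream
  .format("rate")
  .load()
  .writeStream
  .format("console")
  .start()
```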
fileSinkLogCleanupDelay¶
spark.sql.streaming.fileSink.log.cleanupDelay configuration property
Used when FileStreamSinkLog is created
fileSinkLogDeletion¶
spark.sql.streaming.fileSink.log.deletion configuration property
Used when FileStreamSinkLog is created
fileSinkLogCompactInterval¶
spark.sql.streaming.fileSink.log.compactInterval configuration property
Used when FileStreamSinkLog is created
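The three `fileSink.log.*` properties above tune the compacted metadata log that `FileStreamSink` maintains under the `_spark_metadata` directory of the output path. A sketch of adjusting them before starting a file-sink query, assuming an active `SparkSession` named `spark` and an input streaming `DataFrame` named `df` (all values are illustrative):

```scala
// Delete expired log entries, compact every 10 batches,
// and keep expired entries around for 10 minutes before cleanup
spark.conf.set("spark.sql.streaming.fileSink.log.deletion", "true")
spark.conf.set("spark.sql.streaming.fileSink.log.compactInterval", "10")
spark.conf.set("spark.sql.streaming.fileSink.log.cleanupDelay", "10m")

val query = df.writeStream
  .format("parquet")
  .option("path", "/tmp/out")
  .option("checkpointLocation", "/tmp/ckpt")
  .start()
```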
minBatchesToRetain¶
spark.sql.streaming.minBatchesToRetain configuration property
Used when:

- CompactibleFileStreamLog is created
- StreamExecution is created
- StateStoreConf is created
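`minBatchesToRetain` caps how many micro-batches of metadata and state the logs above keep around. A sketch, assuming an active `SparkSession` named `spark` (the value is illustrative; the default is 100):

```scala
// Retain metadata and state for only the last 50 micro-batches
spark.conf.set("spark.sql.streaming.minBatchesToRetain", "50")
```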
[[accessor-methods]]
.SQLConf's Property Accessor Methods
[cols="1,1",options="header",width="100%"]
|===
| Method Name / Property | Description
| continuousStreamingExecutorQueueSize
spark.sql.streaming.continuous.executorQueueSize
a| [[continuousStreamingExecutorQueueSize]] Used when:

- DataSourceV2ScanExec leaf physical operator is requested for the input RDDs (and creates a ContinuousDataSourceRDD)
- ContinuousCoalesceExec unary physical operator is requested to execute
| continuousStreamingExecutorPollIntervalMs
spark.sql.streaming.continuous.executorPollIntervalMs
a| [[continuousStreamingExecutorPollIntervalMs]] Used exclusively when DataSourceV2ScanExec leaf physical operator is requested for the input RDDs (and creates a ContinuousDataSourceRDD)
| disabledV2StreamingMicroBatchReaders
spark.sql.streaming.disabledV2MicroBatchReaders
a| [[disabledV2StreamingMicroBatchReaders]] Used exclusively when MicroBatchExecution is requested for the analyzed logical plan (of a streaming query)
| fileSourceLogDeletion
spark.sql.streaming.fileSource.log.deletion
a| [[fileSourceLogDeletion]][[FILE_SOURCE_LOG_DELETION]] Used exclusively when FileStreamSourceLog
is requested for the isDeletingExpiredLog
| fileSourceLogCleanupDelay
spark.sql.streaming.fileSource.log.cleanupDelay
a| [[fileSourceLogCleanupDelay]][[FILE_SOURCE_LOG_CLEANUP_DELAY]] Used exclusively when FileStreamSourceLog
is requested for the fileCleanupDelayMs
| fileSourceLogCompactInterval
spark.sql.streaming.fileSource.log.compactInterval
a| [[fileSourceLogCompactInterval]][[FILE_SOURCE_LOG_COMPACT_INTERVAL]] Used exclusively when FileStreamSourceLog
is requested for the default compaction interval
| FLATMAPGROUPSWITHSTATE_STATE_FORMAT_VERSION
spark.sql.streaming.flatMapGroupsWithState.stateFormatVersion
a| [[FLATMAPGROUPSWITHSTATE_STATE_FORMAT_VERSION]] Used when:

- FlatMapGroupsWithStateStrategy execution planning strategy is requested to plan a streaming query (and creates a FlatMapGroupsWithStateExec physical operator for every FlatMapGroupsWithState logical operator)
- Among the checkpointed properties
| SHUFFLE_PARTITIONS
spark.sql.shuffle.partitions
a| [[SHUFFLE_PARTITIONS]] See https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-properties.html#spark.sql.shuffle.partitions[spark.sql.shuffle.partitions] in The Internals of Spark SQL.
| stateStoreMinDeltasForSnapshot
spark.sql.streaming.stateStore.minDeltasForSnapshot
a| [[stateStoreMinDeltasForSnapshot]] Used (as StateStoreConf.minDeltasForSnapshot) exclusively when HDFSBackedStateStoreProvider
is requested to doSnapshot
| stateStoreProviderClass
spark.sql.streaming.stateStore.providerClass
a| [[stateStoreProviderClass]] Used when:

- StateStoreWriter is requested to stateStoreCustomMetrics (when StateStoreWriter is requested for the metrics and getProgress)
- StateStoreConf is created
| STREAMING_AGGREGATION_STATE_FORMAT_VERSION
spark.sql.streaming.aggregation.stateFormatVersion
a| [[STREAMING_AGGREGATION_STATE_FORMAT_VERSION]] Used when:

- StatefulAggregationStrategy execution planning strategy is executed
- OffsetSeqMetadata is requested for the relevantSQLConfs and the relevantSQLConfDefaultValues
| STREAMING_CHECKPOINT_FILE_MANAGER_CLASS
spark.sql.streaming.checkpointFileManagerClass

a| [[STREAMING_CHECKPOINT_FILE_MANAGER_CLASS]] Used exclusively when CheckpointFileManager
helper object is requested to create a CheckpointFileManager
| streamingMetricsEnabled
spark.sql.streaming.metricsEnabled
a| [[streamingMetricsEnabled]] Used exclusively when StreamExecution
is requested for runStream (to control whether to register a metrics reporter for a streaming query)
| STREAMING_MULTIPLE_WATERMARK_POLICY
spark.sql.streaming.multipleWatermarkPolicy
a| [[STREAMING_MULTIPLE_WATERMARK_POLICY]] Used when WatermarkTracker is created (to choose how the global watermark is computed for a streaming query with multiple watermarks)
| streamingNoDataMicroBatchesEnabled
spark.sql.streaming.noDataMicroBatches.enabled
a| [[streamingNoDataMicroBatchesEnabled]][[STREAMING_NO_DATA_MICRO_BATCHES_ENABLED]] Used exclusively when MicroBatchExecution stream execution engine is requested to construct the next streaming micro-batch (constructNextBatch)
| streamingNoDataProgressEventInterval
spark.sql.streaming.noDataProgressEventInterval
a| [[streamingNoDataProgressEventInterval]] Used exclusively for ProgressReporter
| streamingPollingDelay
spark.sql.streaming.pollingDelay
a| [[streamingPollingDelay]][[STREAMING_POLLING_DELAY]] Used exclusively when StreamExecution is created
| streamingProgressRetention
spark.sql.streaming.numRecentProgressUpdates
a| [[streamingProgressRetention]][[STREAMING_PROGRESS_RETENTION]] Used exclusively when ProgressReporter
is requested to update progress of streaming query (and possibly remove excess progress updates)
|===
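Several of the properties in the table (among them spark.sql.shuffle.partitions and spark.sql.streaming.stateStore.providerClass) are recorded in the offset log of a query's checkpoint and restored on restart, so they only take effect if set before the query's first run. A hedged sketch, assuming an active `SparkSession` named `spark`:

```scala
// Checkpointed properties: recorded on the first micro-batch and
// restored from the checkpoint on restart, so set them up front.
spark.conf.set("spark.sql.shuffle.partitions", "16")
spark.conf.set(
  "spark.sql.streaming.stateStore.providerClass",
  "org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider")
```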