Configuration Properties
spark.sql.pipelines is the prefix of the configuration properties of Spark Declarative Pipelines.
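These properties can be set like any other SQL configuration property. A minimal sketch, assuming a local SparkSession; the property value used here is only an example:

```scala
import org.apache.spark.sql.SparkSession

object PipelinesConfDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pipelines-conf-demo")
      .master("local[*]")
      // Internal pipeline properties can be set at session build time...
      .config("spark.sql.pipelines.execution.maxConcurrentFlows", "8")
      .getOrCreate()

    // ...and read back at runtime (with a fallback when unset).
    val maxFlows = spark.conf.get(
      "spark.sql.pipelines.execution.maxConcurrentFlows", "16")
    println(s"maxConcurrentFlows = $maxFlows")

    spark.stop()
  }
}
```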
execution.streamstate.pollingInterval
spark.sql.pipelines.execution.streamstate.pollingInterval
(internal) How often (in seconds) the stream state is polled for changes. This is used to check if the stream has failed and needs to be restarted.
Default: 1 (second)
Use SQLConf.PIPELINES_STREAM_STATE_POLLING_INTERVAL to reference the name.
Use SQLConf.streamStatePollingInterval method to access the current value.
Used when:
TriggeredGraphExecution is requested to topologicalExecution
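As a rough illustration of what the polling interval controls, the sketch below checks a stream's state once per interval and restarts it on failure. This is not the actual TriggeredGraphExecution logic; streamIsActive, streamHasFailed and restartStream are hypothetical stand-ins.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical polling loop; the real logic lives in
// TriggeredGraphExecution.topologicalExecution.
def watchStream(
    pollingIntervalSeconds: Long,
    streamIsActive: () => Boolean,
    streamHasFailed: () => Boolean,
    restartStream: () => Unit): Unit = {
  while (streamIsActive()) {
    if (streamHasFailed()) {
      restartStream()
    }
    // Wait for the configured polling interval before the next check.
    TimeUnit.SECONDS.sleep(pollingIntervalSeconds)
  }
}
```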
execution.watchdog.minRetryTime
spark.sql.pipelines.execution.watchdog.minRetryTime
(internal) Initial delay (in seconds) between noticing that a flow has failed and attempting to restart it. The interval between flow restarts doubles with every stream failure, up to the maximum set by spark.sql.pipelines.execution.watchdog.maxRetryTime.
Default: 5 (seconds)
Must be at least 1 second
Use SQLConf.PIPELINES_WATCHDOG_MIN_RETRY_TIME_IN_SECONDS to reference the name.
Use SQLConf.watchdogMinRetryTimeInSeconds method to access the current value.
Used when:
TriggeredGraphExecution is requested to backoffStrategy
execution.watchdog.maxRetryTime
spark.sql.pipelines.execution.watchdog.maxRetryTime
(internal) Maximum time interval (in seconds) between flow restarts.
Default: 3600 (seconds)
Must be greater than or equal to spark.sql.pipelines.execution.watchdog.minRetryTime
Use SQLConf.PIPELINES_WATCHDOG_MAX_RETRY_TIME_IN_SECONDS to reference the name.
Use SQLConf.watchdogMaxRetryTimeInSeconds method to access the current value.
Used when:
TriggeredGraphExecution is requested to backoffStrategy
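The sketch below illustrates the documented behaviour of minRetryTime and maxRetryTime together: the delay before a flow restart starts at minRetryTime and doubles with every failure until capped at maxRetryTime. It is only an illustration of the description above, not the actual backoffStrategy implementation.

```scala
import scala.concurrent.duration._

// Illustration only: the delay starts at minRetryTime, doubles per failure,
// and is capped at maxRetryTime.
def retryDelay(
    failures: Int,
    minRetryTime: FiniteDuration = 5.seconds,
    maxRetryTime: FiniteDuration = 3600.seconds): FiniteDuration = {
  val factor = 1L << math.min(failures, 30)  // cap the exponent to avoid overflow
  val doubled = minRetryTime * factor
  if (doubled > maxRetryTime) maxRetryTime else doubled
}

// failures = 0 -> 5s, 1 -> 10s, 2 -> 20s, ..., capped at 1 hour
```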
execution.maxConcurrentFlows
spark.sql.pipelines.execution.maxConcurrentFlows
(internal) Maximum number of flows to execute at once. Used to tune performance for triggered pipelines. Has no effect on continuous pipelines.
Default: 16
Use SQLConf.PIPELINES_MAX_CONCURRENT_FLOWS to reference the name.
Use SQLConf.maxConcurrentFlows method to access the current value.
Used when:
TriggeredGraphExecution is requested for the concurrencyLimit and to topologicalExecution
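A simplified way to picture the cap on concurrently executing flows: at most maxConcurrentFlows tasks run at once, and the rest wait. This is a generic semaphore-based sketch, not Spark's flow scheduler; the flows themselves are stand-in functions.

```scala
import java.util.concurrent.{Executors, Semaphore, TimeUnit}

// Generic sketch of a concurrency cap: at most maxConcurrentFlows
// "flows" execute at the same time.
def runFlows(flows: Seq[() => Unit], maxConcurrentFlows: Int = 16): Unit = {
  val permits = new Semaphore(maxConcurrentFlows)
  val pool = Executors.newCachedThreadPool()
  flows.foreach { flow =>
    permits.acquire()  // blocks once the cap is reached
    pool.submit(new Runnable {
      override def run(): Unit =
        try flow() finally permits.release()
    })
  }
  pool.shutdown()
  pool.awaitTermination(1, TimeUnit.HOURS)
}
```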
timeoutMsForTerminationJoinAndLock
spark.sql.pipelines.timeoutMsForTerminationJoinAndLock
(internal) Timeout (in milliseconds) to acquire a lock when stopping an update.
Default: 60 * 60 * 1000 (1 hour)
Must be at least 1 millisecond
Use SQLConf.PIPELINES_TIMEOUT_MS_FOR_TERMINATION_JOIN_AND_LOCK to reference the name.
Use SQLConf.timeoutMsForTerminationJoinAndLock method to access the current value.
Used when:
GraphExecution is requested to stopThread
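A minimal sketch of a join-with-timeout while stopping an update; the executionThread and the return convention are assumptions, not GraphExecution internals.

```scala
// Wait at most timeoutMs for the execution thread to finish; report whether
// it actually terminated within the timeout.
def stopWithTimeout(
    executionThread: Thread,
    timeoutMs: Long = 60L * 60 * 1000): Boolean = {
  executionThread.join(timeoutMs)
  !executionThread.isAlive
}
```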
maxFlowRetryAttempts
spark.sql.pipelines.maxFlowRetryAttempts
Maximum number of times a flow can be retried. Can be set at the pipeline or flow level.
Default: 2
Use SQLConf.PIPELINES_MAX_FLOW_RETRY_ATTEMPTS to reference the name.
Use SQLConf.maxFlowRetryAttempts method to access the current value.
Used when:
GraphExecution is requested to maxRetryAttemptsForFlow
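A simplified illustration of the retry budget: with the default of 2, a flow gets its initial run plus up to two retries, then gives up. The failure counter below is hypothetical, not a Spark API.

```scala
// A flow is retried only while its failure count does not exceed the budget:
// with maxFlowRetryAttempts = 2, failures 1 and 2 are retried, failure 3 is not.
def shouldRetryFlow(failuresSoFar: Int, maxFlowRetryAttempts: Int = 2): Boolean =
  failuresSoFar <= maxFlowRetryAttempts
```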
event.queue.capacity
spark.sql.pipelines.event.queue.capacity
(internal) Capacity of the event queue used in pipeline execution. When the queue is full, non-terminal FlowProgressEvents are dropped.
Default: 1000
Must be positive
Use SQLConf.PIPELINES_EVENT_QUEUE_CAPACITY to reference the name.
Used when:
PipelineEventSender is created
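A simplified model of the drop-when-full behaviour described above, using a bounded queue: non-terminal events are dropped when the queue is at capacity, while terminal events are always enqueued. PipelineEvent and BoundedEventQueue here are stand-in types, not Spark's event classes.

```scala
import java.util.concurrent.LinkedBlockingQueue

final case class PipelineEvent(message: String, terminal: Boolean)

final class BoundedEventQueue(capacity: Int = 1000) {
  private val queue = new LinkedBlockingQueue[PipelineEvent](capacity)

  def send(event: PipelineEvent): Boolean =
    if (event.terminal) {
      queue.put(event)   // terminal events are always enqueued (may block)
      true
    } else {
      queue.offer(event) // non-terminal events are dropped when the queue is full
    }
}
```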