Spark Configuration Properties of External Shuffle Service
The following are configuration properties of External Shuffle Service.
spark.shuffle.service.db.enabled
Whether ExternalShuffleService uses a local database (db) to persist its state. Note that this only affects Spark Standalone mode.
Default: true
Used when:
ExternalShuffleService is requested for an ExternalBlockHandler
Worker (Spark Standalone) is requested to handle a WorkDirCleanup message
spark.shuffle.service.enabled
Controls whether to use the External Shuffle Service
Default: false
Note
LocalSparkCluster turns this property off explicitly when started.
Used when:
BlacklistTracker is requested to updateBlacklistForFetchFailure
ExecutorMonitor is created
ExecutorAllocationManager is requested to validateSettings
SparkEnv utility is requested to create a "base" SparkEnv
ExternalShuffleService is created and started
Worker (Spark Standalone) is requested to handle a WorkDirCleanup message or started
ExecutorRunnable (Spark on YARN) is requested to startContainer
spark.shuffle.service.fetch.rdd.enabled
Enables ExternalShuffleService for fetching disk persisted RDD blocks.
When enabled together with Dynamic Resource Allocation, executors that hold only disk-persisted blocks are considered idle after spark.dynamicAllocation.executorIdleTimeout and are released accordingly.
Default: false
Used when:
ExternalShuffleBlockResolver is created
SparkEnv utility is requested to create a "base" SparkEnv
ExecutorMonitor is created
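The interaction with Dynamic Resource Allocation described for this property can be sketched as a small decision function. This is a simplified model for illustration only, not Spark's ExecutorMonitor logic. The function and parameter names are hypothetical, and the sketch assumes that, when the property is disabled, an executor holding cached blocks is not released by the plain idle timeout.

```python
def released_by_idle_timeout(idle_seconds, idle_timeout_seconds,
                             holds_cached_blocks, only_disk_persisted,
                             fetch_rdd_enabled):
    """Hypothetical model: may the idle timeout release this executor?"""
    timed_out = idle_seconds > idle_timeout_seconds
    if not holds_cached_blocks:
        # Nothing cached: the plain idle timeout applies.
        return timed_out
    if fetch_rdd_enabled and only_disk_persisted:
        # The shuffle service can serve the disk-persisted blocks,
        # so the executor is treated like any other idle executor.
        return timed_out
    # Assumption of this sketch: otherwise the executor is kept.
    return False


# An executor idle for 120s with only disk-persisted blocks, 60s timeout.
with_flag = released_by_idle_timeout(
    idle_seconds=120, idle_timeout_seconds=60,
    holds_cached_blocks=True, only_disk_persisted=True,
    fetch_rdd_enabled=True)
without_flag = released_by_idle_timeout(
    idle_seconds=120, idle_timeout_seconds=60,
    holds_cached_blocks=True, only_disk_persisted=True,
    fetch_rdd_enabled=False)
```

In this model the same executor is released only when spark.shuffle.service.fetch.rdd.enabled is on, which is the effect the description above attributes to the property.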
spark.shuffle.service.port
Port of the external shuffle service
Default: 7337
Used when:
ExternalShuffleService is created
StorageUtils utility is requested for the port of an external shuffle service
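As a rough illustration of how these properties might be passed to an application, the sketch below collects the defaults listed on this page in a plain Python dict and renders overrides as `--conf key=value` arguments for `spark-submit`. The names `DEFAULTS`, `effective_settings` and `to_submit_args` are hypothetical helpers, not part of Spark. Only the property names and default values come from the text above.

```python
# Defaults as listed on this page (kept as strings, as in Spark configs).
DEFAULTS = {
    "spark.shuffle.service.db.enabled": "true",
    "spark.shuffle.service.enabled": "false",
    "spark.shuffle.service.fetch.rdd.enabled": "false",
    "spark.shuffle.service.port": "7337",
}


def effective_settings(overrides):
    """Return the defaults with user overrides applied (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"not listed on this page: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def to_submit_args(overrides):
    """Render overrides as `--conf key=value` arguments for spark-submit."""
    args = []
    for key in sorted(overrides):
        args += ["--conf", f"{key}={overrides[key]}"]
    return args


# Turn the External Shuffle Service on and keep every other default.
overrides = {"spark.shuffle.service.enabled": "true"}
settings = effective_settings(overrides)
submit_args = to_submit_args(overrides)
```

Only the overridden properties are rendered as arguments, since anything left out falls back to the defaults shown on this page.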