FairSchedulableBuilder
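FairSchedulableBuilder is a SchedulableBuilder for the FAIR scheduling mode (i.e. when the spark.scheduler.mode configuration property is FAIR).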
[[creating-instance]] FairSchedulableBuilder takes the following to be created:
- [[rootPool]] Root Pool
- [[conf]] SparkConf.md[SparkConf]
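Once <<creating-instance, created>>, TaskSchedulerImpl requests the FairSchedulableBuilder to <<buildPools, build the pools>>.
[[DEFAULT_SCHEDULER_FILE]] FairSchedulableBuilder uses the pools defined in an <<allocations-file, allocation pools configuration file>>, which is either the file given by the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property or the default fairscheduler.xml (expected on the class path of a Spark application).
TIP: Use conf/fairscheduler.xml.template as a template for the <<allocations-file, allocation pools configuration file>>.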
[[DEFAULT_POOL_NAME]] FairSchedulableBuilder always has the default pool defined (and <<buildDefaultPool, registered with the root pool>>).
[[FAIR_SCHEDULER_PROPERTIES]] [[spark.scheduler.pool]] FairSchedulableBuilder uses the spark.scheduler.pool local property for the name of the pool to use when requested to <<addTaskSetManager, addTaskSetManager>> (default: default).
NOTE: SparkContext.setLocalProperty lets you set local properties per thread to group jobs in logical groups, e.g. to allow FairSchedulableBuilder to use the spark.scheduler.pool property and to group jobs from different threads to be submitted for execution on a non-<<DEFAULT_POOL_NAME, default>> pool.
[source, scala]
scala> :type sc
org.apache.spark.SparkContext
sc.setLocalProperty("spark.scheduler.pool", "production")
// whatever is executed afterwards is submitted to production pool
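The per-thread nature of local properties can be illustrated without Spark. The following is a minimal sketch that assumes the properties live in an inheritable thread-local whose child threads get a copy of the parent's values. `LocalProps` and `LocalPropsDemo` are hypothetical names for illustration and are not part of Spark's API.

```scala
import java.util.Properties

// Hypothetical stand-in for per-thread local properties (not SparkContext itself).
object LocalProps {
  private val props = new InheritableThreadLocal[Properties] {
    override def initialValue(): Properties = new Properties()
    // A child thread starts with a copy, so its changes do not leak to the parent.
    override def childValue(parent: Properties): Properties =
      parent.clone().asInstanceOf[Properties]
  }

  def set(key: String, value: String): Unit = props.get().setProperty(key, value)
  def get(key: String): Option[String] = Option(props.get().getProperty(key))
}

object LocalPropsDemo {
  private val PoolKey = "spark.scheduler.pool"

  // Runs `body` on a new thread and returns its result.
  private def onThread[A](body: => A): A = {
    var result: Option[A] = None
    val t = new Thread(() => { result = Some(body) })
    t.start()
    t.join()
    result.get
  }

  // Returns the pool name seen by the calling thread and by a child thread
  // that switched to another pool.
  def run(): (Option[String], Option[String]) = {
    LocalProps.set(PoolKey, "production")
    val child = onThread {
      LocalProps.set(PoolKey, "test")
      LocalProps.get(PoolKey)
    }
    (LocalProps.get(PoolKey), child)
  }
}
```

In this sketch the calling thread keeps `production` while the child thread works with `test`, which is the behaviour that lets jobs from different threads land in different pools.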
[[logging]]
[TIP]
====
Enable ALL logging level for org.apache.spark.scheduler.FairSchedulableBuilder logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.FairSchedulableBuilder=ALL

Refer to <>.
====
=== [[allocations-file]] Allocation Pools Configuration File
The allocation pools configuration file is an XML file.
The default conf/fairscheduler.xml.template is as follows:
[source, xml]
TIP: The top-level element's name allocations can be anything. Spark does not insist on allocations and accepts any name.
=== [[buildPools]] Building (Tree of) Pools of Schedulables -- buildPools Method
[source, scala]
buildPools(): Unit
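NOTE: buildPools is part of the SchedulableBuilder contract to build a tree of pools (of schedulables).
buildPools <<buildFairSchedulerPool, builds the pools from the allocation pools configuration file>> (if one is available) and then <<buildDefaultPool, registers the default pool>>.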
buildPools prints out the following INFO message to the logs when the configuration file (per the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property) could be read:
Creating Fair Scheduler pools from [file]
buildPools prints out the following INFO message to the logs when the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property was not used to define the configuration file and the <<DEFAULT_SCHEDULER_FILE, default configuration file>> is used instead:
Creating Fair Scheduler pools from default file: [DEFAULT_SCHEDULER_FILE]
When neither the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property nor the <<DEFAULT_SCHEDULER_FILE, default configuration file>> could be used, buildPools prints out the following WARN message to the logs:
Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in [DEFAULT_SCHEDULER_FILE] or set spark.scheduler.allocation.file to a file that contains the configuration.
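The three outcomes above can be sketched as a small decision function. This is an illustrative model only: `AllocationSource`, `AllocationFileResolver` and the `onClasspath` check are hypothetical names, and the WARN text is abbreviated.

```scala
// Outcome of looking for the allocation file (hypothetical types, not Spark's).
sealed trait AllocationSource
final case class ConfiguredFile(path: String) extends AllocationSource
final case class DefaultFile(name: String) extends AllocationSource
case object NotFound extends AllocationSource

object AllocationFileResolver {
  val DefaultSchedulerFile = "fairscheduler.xml"

  // `configured` models spark.scheduler.allocation.file;
  // `onClasspath` models the class path lookup of the default file.
  def resolve(configured: Option[String], onClasspath: String => Boolean): AllocationSource =
    configured match {
      case Some(path)                                => ConfiguredFile(path)
      case None if onClasspath(DefaultSchedulerFile) => DefaultFile(DefaultSchedulerFile)
      case None                                      => NotFound
    }

  // Log line (level and message) that corresponds to each outcome.
  def logLine(source: AllocationSource): String = source match {
    case ConfiguredFile(path) => s"INFO Creating Fair Scheduler pools from $path"
    case DefaultFile(name)    => s"INFO Creating Fair Scheduler pools from default file: $name"
    case NotFound             => "WARN Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order."
  }
}
```

For example, `AllocationFileResolver.resolve(None, _ => false)` gives `NotFound`, which maps to the WARN message.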
=== [[addTaskSetManager]] addTaskSetManager Method
[source, scala]
addTaskSetManager(manager: Schedulable, properties: Properties): Unit
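NOTE: addTaskSetManager is part of the SchedulableBuilder contract.
addTaskSetManager finds the pool by name (in the given Properties) under the <<FAIR_SCHEDULER_PROPERTIES, spark.scheduler.pool>> property or defaults to the <<DEFAULT_POOL_NAME, default>> pool if undefined.
addTaskSetManager then requests the <<rootPool, root pool>> to find the Schedulable by that name.
Unless found, addTaskSetManager creates a new Pool with the default configuration (as for the <<buildDefaultPool, default pool>>) and requests the <<rootPool, root pool>> to register it. addTaskSetManager then prints out the following WARN message to the logs: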
A job was submitted with scheduler pool [poolName], which has not been configured. This can happen when the file that pools are read from isn't set, or when that file doesn't contain [poolName]. Created [poolName] with default configuration (schedulingMode: [mode], minShare: [minShare], weight: [weight])
addTaskSetManager then requests the pool (found or newly-created) to register the given Schedulable.
In the end, addTaskSetManager prints out the following INFO message to the logs:
Added task set [name] tasks to pool [poolName]
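The lookup-or-create flow above can be sketched with plain collections. This is a simplified model under stated assumptions: `SimplePool` and `FairBuilderSketch` are hypothetical stand-ins (not Spark's Pool or FairSchedulableBuilder), task sets are represented by their names only, and the WARN text is abbreviated.

```scala
import java.util.Properties
import scala.collection.mutable

// Hypothetical, simplified stand-in for a scheduler pool.
final class SimplePool(
    val name: String,
    val mode: String = "FIFO",
    val minShare: Int = 0,
    val weight: Int = 1) {
  private val children = mutable.LinkedHashMap.empty[String, SimplePool]
  val taskSets = mutable.ArrayBuffer.empty[String]

  def find(poolName: String): Option[SimplePool] = children.get(poolName)
  def register(pool: SimplePool): Unit = children.update(pool.name, pool)
}

object FairBuilderSketch {
  val PoolProperty = "spark.scheduler.pool"
  val DefaultPoolName = "default"

  // Finds (or creates with defaults) the pool named in the properties
  // and adds the task set to it. Returns the pool that was used.
  def addTaskSet(root: SimplePool, taskSet: String, properties: Properties): SimplePool = {
    val poolName =
      if (properties != null) properties.getProperty(PoolProperty, DefaultPoolName)
      else DefaultPoolName
    val pool = root.find(poolName).getOrElse {
      val created = new SimplePool(poolName)
      root.register(created)
      println(s"WARN A job was submitted with scheduler pool $poolName, which has not been configured.")
      created
    }
    pool.taskSets += taskSet
    println(s"INFO Added task set $taskSet tasks to pool $poolName")
    pool
  }
}
```

Passing `null` properties, or properties without spark.scheduler.pool, selects the pool named default in this sketch.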
=== [[buildDefaultPool]] Registering Default Pool -- buildDefaultPool Method
[source, scala]
buildDefaultPool(): Unit
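buildDefaultPool requests the <<rootPool, root pool>> to find the <<DEFAULT_POOL_NAME, default>> pool by name.
Unless already available, buildDefaultPool creates a Pool with the following:
- <<DEFAULT_POOL_NAME, default>> pool name
- FIFO scheduling mode
- 0 for the initial minimum share
- 1 for the initial weight
In the end, buildDefaultPool requests the <<rootPool, root pool>> to register the pool and prints out the following INFO message to the logs: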
Created default pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
NOTE: buildDefaultPool is used exclusively when FairSchedulableBuilder is requested to <<buildPools, build the pools>>.
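The "unless already available" step makes the registration idempotent. A minimal sketch of that idea, assuming a plain mutable map in place of the root pool; `PoolSpec` and `DefaultPoolSketch` are hypothetical names.

```scala
import scala.collection.mutable

// Hypothetical pool description; the fields mirror the log message only.
final case class PoolSpec(name: String, schedulingMode: String, minShare: Int, weight: Int)

object DefaultPoolSketch {
  val DefaultPoolName = "default"

  // Registers the default pool unless one with that name already exists.
  // Returns true when a new pool was created.
  def buildDefaultPool(registry: mutable.Map[String, PoolSpec]): Boolean =
    if (registry.contains(DefaultPoolName)) false
    else {
      val pool = PoolSpec(DefaultPoolName, "FIFO", minShare = 0, weight = 1)
      registry.update(pool.name, pool)
      println(
        s"INFO Created default pool: ${pool.name}, schedulingMode: ${pool.schedulingMode}, " +
          s"minShare: ${pool.minShare}, weight: ${pool.weight}")
      true
    }
}
```

Calling it twice on the same registry creates the pool only once, so a default pool that was already defined in the allocation file is left untouched.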
=== [[buildFairSchedulerPool]] Building Pools from XML Allocations File -- buildFairSchedulerPool Internal Method
[source, scala]
buildFairSchedulerPool(
  is: InputStream,
  fileName: String): Unit
buildFairSchedulerPool starts by loading the XML file from the given InputStream.
For every pool element, buildFairSchedulerPool creates a Pool with the following:
- Pool name per name attribute
- Scheduling mode per schedulingMode element (case-insensitive with FIFO as the default)
- Initial minimum share per minShare element (default: 0)
- Initial weight per weight element (default: 1)
In the end, buildFairSchedulerPool requests the <<rootPool, root pool>> to register the pool and prints out the following INFO message to the logs:
Created pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
NOTE: buildFairSchedulerPool is used exclusively when FairSchedulableBuilder is requested to <<buildPools, build the pools>>.
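The per-element defaults described above can be sketched with the JDK's DOM parser. This is a simplified illustration, not Spark's implementation: `ParsedPool`, `AllocationsParser` and the sample document are hypothetical, and the sketch does not validate values (a non-numeric minShare or weight throws here).

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical result type; the fields mirror the "Created pool" log message.
final case class ParsedPool(name: String, schedulingMode: String, minShare: Int, weight: Int)

object AllocationsParser {
  // Sample input for illustration; the second pool relies on the defaults.
  val Sample: String =
    """<allocations>
      |  <pool name="production">
      |    <schedulingMode>fair</schedulingMode>
      |    <weight>1</weight>
      |    <minShare>2</minShare>
      |  </pool>
      |  <pool name="test"/>
      |</allocations>""".stripMargin

  // Trimmed text of the first nested element with the given tag, if any.
  private def childText(pool: Element, tag: String): Option[String] = {
    val nodes = pool.getElementsByTagName(tag)
    if (nodes.getLength == 0) None
    else Option(nodes.item(0).getTextContent).map(_.trim).filter(_.nonEmpty)
  }

  def parse(xml: String): List[ParsedPool] = {
    val in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in)
    // Only pool elements are looked up, so the root element's name does not matter.
    val pools = doc.getElementsByTagName("pool")
    (0 until pools.getLength).toList.map { i =>
      val pool = pools.item(i).asInstanceOf[Element]
      ParsedPool(
        name = pool.getAttribute("name"),
        schedulingMode = childText(pool, "schedulingMode").map(_.toUpperCase).getOrElse("FIFO"),
        minShare = childText(pool, "minShare").map(_.toInt).getOrElse(0),
        weight = childText(pool, "weight").map(_.toInt).getOrElse(1))
    }
  }
}
```

`AllocationsParser.parse(AllocationsParser.Sample)` yields one pool with the values from the file and one pool that falls back to FIFO, 0 and 1.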