FairSchedulableBuilder
FairSchedulableBuilder is a SchedulableBuilder for the FAIR scheduling mode (i.e. when spark.scheduler.mode is FAIR).
[[creating-instance]] FairSchedulableBuilder takes the following to be created:

- [[rootPool]] Root pool (of schedulables)
- [[conf]] SparkConf.md[]
Once created, TaskSchedulerImpl requests the FairSchedulableBuilder to build the pools (buildPools).
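[[DEFAULT_SCHEDULER_FILE]] FairSchedulableBuilder uses the pools defined in an allocation pools configuration file: the file given by the spark.scheduler.allocation.file configuration property or, when the property is not set, the default fairscheduler.xml on the class path.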
TIP: Use conf/fairscheduler.xml.template as a template for the allocation pools configuration file.
[[DEFAULT_POOL_NAME]] FairSchedulableBuilder always has the default pool defined (and registers it unless it is already defined in the allocation pools configuration file).
[[FAIR_SCHEDULER_PROPERTIES]] [[spark.scheduler.pool]] FairSchedulableBuilder uses the spark.scheduler.pool local property for the name of the pool to use when requested to add a TaskSetManager (addTaskSetManager).
NOTE: SparkContext.setLocalProperty lets you set local properties per thread to group jobs in logical groups, e.g. to let FairSchedulableBuilder use the spark.scheduler.pool property so that jobs from different threads are submitted for execution to a non-default pool.
[source, scala]
scala> :type sc
org.apache.spark.SparkContext
sc.setLocalProperty("spark.scheduler.pool", "production")
// whatever is executed afterwards is submitted to production pool
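The per-thread nature of the property can be illustrated without a running Spark application. The following is a minimal sketch, not Spark code: `LocalPropertyDemo` and its methods are hypothetical stand-ins that keep a `java.util.Properties` per thread and fall back to the `default` pool name when spark.scheduler.pool is not set.

```scala
import java.util.Properties

// Simplified model of per-thread local properties (hypothetical, not SparkContext itself).
object LocalPropertyDemo {
  // Every thread gets its own Properties instance.
  private val localProperties = new ThreadLocal[Properties] {
    override protected def initialValue(): Properties = new Properties()
  }

  def setLocalProperty(key: String, value: String): Unit =
    localProperties.get().setProperty(key, value)

  // Pool name the calling thread would use: the local property or "default".
  def currentPool(): String =
    localProperties.get().getProperty("spark.scheduler.pool", "default")

  // Runs a new thread that optionally sets the pool and reports what it sees.
  def poolSeenByNewThread(pool: Option[String]): String = {
    var seen = ""
    val thread = new Thread(() => {
      pool.foreach(name => setLocalProperty("spark.scheduler.pool", name))
      seen = currentPool()
    })
    thread.start()
    thread.join()
    seen
  }
}
```

A thread that sets the property sees its own pool name, while the other threads in this model keep using `default`.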
[[logging]]
[TIP]
====
Enable ALL logging level for org.apache.spark.scheduler.FairSchedulableBuilder logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.FairSchedulableBuilder=ALL

Refer to the Spark logging documentation for details.
====
=== [[allocations-file]] Allocation Pools Configuration File
The allocation pools configuration file is an XML file.
Spark ships conf/fairscheduler.xml.template as the default template of the file.
TIP: The top-level element's name allocations can be anything. Spark does not insist on allocations and accepts any name.
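To make the file layout concrete, here is a minimal sketch that parses an illustrative allocations file with the JDK's XML parser. The XML content (pool names and values) and the `AllocationsFileDemo` helper are assumptions made for the example, not the contents of the template or Spark's own parsing code. It only shows that pool elements can be found whatever the top-level element is called.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory

object AllocationsFileDemo {
  // Illustrative allocations file; pool names and values are made up.
  val xml: String =
    """<allocations>
      |  <pool name="production">
      |    <schedulingMode>FAIR</schedulingMode>
      |    <weight>2</weight>
      |    <minShare>4</minShare>
      |  </pool>
      |  <pool name="adhoc">
      |    <schedulingMode>FIFO</schedulingMode>
      |  </pool>
      |</allocations>""".stripMargin

  // Collects the name attribute of every pool element,
  // regardless of the name of the top-level element.
  def poolNames(content: String): Seq[String] = {
    val in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8))
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in)
    val pools = doc.getElementsByTagName("pool")
    (0 until pools.getLength).map { i =>
      pools.item(i).getAttributes.getNamedItem("name").getNodeValue
    }
  }
}
```

Renaming the top-level element (e.g. `AllocationsFileDemo.xml.replace("allocations", "anything")`) gives the same pool names in this sketch.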
=== [[buildPools]] Building (Tree of) Pools of Schedulables -- buildPools Method
[source, scala]
buildPools(): Unit
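NOTE: buildPools is part of the SchedulableBuilder contract.

buildPools builds the pools from the allocation pools configuration file (buildFairSchedulerPool) and then registers the default pool (buildDefaultPool).

buildPools prints out the following INFO message to the logs when the configuration file (per the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property) could be read: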
Creating Fair Scheduler pools from [file]
buildPools prints out the following INFO message to the logs when the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property was not used to define the configuration file and the default configuration file is used instead:
Creating Fair Scheduler pools from default file: [DEFAULT_SCHEDULER_FILE]
When neither the configuration-properties.md#spark.scheduler.allocation.file[spark.scheduler.allocation.file] configuration property nor the default configuration file could be used, buildPools prints out the following WARN message to the logs:
Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in [DEFAULT_SCHEDULER_FILE] or set spark.scheduler.allocation.file to a file that contains the configuration.
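The three cases above can be summarized as a small decision function. This is a sketch under stated assumptions, not Spark's implementation: `BuildPoolsDemo`, `chooseSource` and the `Source` types are hypothetical names, and whether the default file is available on the class path is passed in as a flag rather than looked up.

```scala
object BuildPoolsDemo {
  // Where the pool definitions come from (hypothetical model types).
  sealed trait Source
  final case class ConfiguredFile(path: String) extends Source
  case object DefaultFile extends Source
  case object NoFile extends Source

  // allocationFile models the spark.scheduler.allocation.file property (if set).
  def chooseSource(allocationFile: Option[String], defaultFileAvailable: Boolean): Source =
    allocationFile match {
      case Some(path)                   => ConfiguredFile(path)
      case None if defaultFileAvailable => DefaultFile
      case None                         => NoFile
    }

  // Log level of the message that goes with each case.
  def logLevel(source: Source): String = source match {
    case ConfiguredFile(_) => "INFO"
    case DefaultFile       => "INFO"
    case NoFile            => "WARN"
  }
}
```

In this model the property takes precedence over the default file, and the WARN case is reached only when neither source is available.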
=== [[addTaskSetManager]] addTaskSetManager Method
[source, scala]
addTaskSetManager(manager: Schedulable, properties: Properties): Unit
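NOTE: addTaskSetManager is part of the SchedulableBuilder contract.

addTaskSetManager finds the pool by name (in the given Properties) under the spark.scheduler.pool property, and uses the default pool name when the property is not defined.

addTaskSetManager then requests the root pool to find the Schedulable by that name.

Unless found, addTaskSetManager creates a new Pool with the default configuration and registers it with the root pool. addTaskSetManager prints out the following WARN message to the logs: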
A job was submitted with scheduler pool [poolName], which has not been configured. This can happen when the file that pools are read from isn't set, or when that file doesn't contain [poolName]. Created [poolName] with default configuration (schedulingMode: [mode], minShare: [minShare], weight: [weight])
addTaskSetManager then requests the pool (found or newly-created) to register the given Schedulable.

In the end, addTaskSetManager prints out the following INFO message to the logs:
Added task set [name] tasks to pool [poolName]
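The find-or-create flow can be sketched with plain collections. The following is a simplified model, not Spark's classes: `Pool`, `rootPool` and this `addTaskSetManager` are hypothetical stand-ins, task sets are represented by their names only, and treating a `null` Properties as "use the default pool" is an assumption of the sketch.

```scala
import java.util.Properties
import scala.collection.mutable

object AddTaskSetManagerDemo {
  // Simplified stand-in for a pool: a name, a configuration and its registered task sets.
  final case class Pool(name: String, schedulingMode: String, minShare: Int, weight: Int) {
    val schedulables: mutable.ListBuffer[String] = mutable.ListBuffer.empty
  }

  val PoolProperty = "spark.scheduler.pool"
  val DefaultPoolName = "default"

  // Stand-in for the root pool: pools registered by name.
  val rootPool: mutable.LinkedHashMap[String, Pool] = mutable.LinkedHashMap.empty

  // Registers a task set in the pool named in the properties and returns that pool.
  def addTaskSetManager(taskSet: String, properties: Properties): Pool = {
    val poolName =
      if (properties != null) properties.getProperty(PoolProperty, DefaultPoolName)
      else DefaultPoolName
    // Unless found, create the pool with the default configuration and register it.
    val pool = rootPool.getOrElseUpdate(poolName, Pool(poolName, "FIFO", 0, 1))
    pool.schedulables += taskSet
    pool
  }
}
```

A pool that was never configured is created on first use, and later task sets with the same pool name are added to it.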
=== [[buildDefaultPool]] Registering Default Pool -- buildDefaultPool Method
[source, scala]
buildDefaultPool(): Unit
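buildDefaultPool requests the root pool to find the default pool (by name).

Unless already available, buildDefaultPool creates a Pool with the following:

- default pool name
- FIFO scheduling mode
- 0 for the initial minimum share
- 1 for the initial weight

In the end, buildDefaultPool requests the root pool to register the pool and prints out the following INFO message to the logs: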
Created default pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
NOTE: buildDefaultPool is used exclusively when FairSchedulableBuilder is requested to build the pools (buildPools).
=== [[buildFairSchedulerPool]] Building Pools from XML Allocations File -- buildFairSchedulerPool Internal Method
[source, scala]
buildFairSchedulerPool(is: InputStream, fileName: String): Unit
buildFairSchedulerPool starts by loading the XML file from the given InputStream.

For every pool element, buildFairSchedulerPool creates a Pool with the following:

- Pool name per name attribute
- Scheduling mode per schedulingMode element (case-insensitive with FIFO as the default)
- Initial minimum share per minShare element (default: 0)
- Initial weight per weight element (default: 1)

In the end, buildFairSchedulerPool requests the root pool to register the created pool and prints out the following INFO message to the logs:
Created pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
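How the defaults apply to a single pool element can be sketched as follows. This is an illustration under assumptions rather than Spark's code: `PoolSettingsDemo` and `resolve` are hypothetical, the optional element values are passed in as `Option[String]`, and falling back to the default for an unknown mode or a non-numeric value is an assumption of the sketch.

```scala
import java.util.Locale
import scala.util.Try

object PoolSettingsDemo {
  final case class PoolSettings(schedulingMode: String, minShare: Int, weight: Int)

  private val KnownModes = Set("FIFO", "FAIR")

  // Resolves the settings of one pool element from its (optional) child element values.
  def resolve(
      schedulingMode: Option[String],
      minShare: Option[String],
      weight: Option[String]): PoolSettings = {
    // Case-insensitive scheduling mode with FIFO as the default.
    val mode = schedulingMode
      .map(_.trim.toUpperCase(Locale.ROOT))
      .filter(KnownModes.contains)
      .getOrElse("FIFO")
    // Integer value or the given default when missing or not a number.
    def intOr(raw: Option[String], default: Int): Int =
      raw.flatMap(s => Try(s.trim.toInt).toOption).getOrElse(default)
    PoolSettings(mode, intOr(minShare, 0), intOr(weight, 1))
  }
}
```

For example, `PoolSettingsDemo.resolve(Some("fair"), Some("2"), None)` gives a FAIR pool with minimum share 2 and the default weight 1.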
NOTE: buildFairSchedulerPool is used exclusively when FairSchedulableBuilder is requested to build the pools (buildPools).