FairSchedulableBuilder takes the following to be created:
FairSchedulableBuilder uses the pools defined in an allocation pools configuration file. The file is given by the spark.scheduler.allocation.file configuration property or, when the property is not set, defaults to fairscheduler.xml (which is expected to be available on a Spark application’s class path).
Use conf/fairscheduler.xml.template as a template for the allocation pools configuration file.
Use SparkContext.setLocalProperty to set properties per thread (aka local properties) to group jobs in logical groups, e.g. to allow
scala> :type sc
org.apache.spark.SparkContext

scala> sc.setLocalProperty("spark.scheduler.pool", "production")

// whatever is executed afterwards is submitted to production pool
Add the following line to
Refer to Logging.
The allocation pools configuration file is an XML file.
conf/fairscheduler.xml.template is as follows:
<?xml version="1.0"?>
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="test">
    <schedulingMode>FIFO</schedulingMode>
    <weight>2</weight>
    <minShare>3</minShare>
  </pool>
</allocations>
The top-level element’s name
buildPools creates Fair Scheduler pools from a configuration file if available and then builds the default pool.
buildPools prints out the following INFO message to the logs when the configuration file (per the spark.scheduler.allocation.file configuration property) could be read:
Creating Fair Scheduler pools from [file]
buildPools prints out the following INFO message to the logs when the spark.scheduler.allocation.file configuration property was not used to define the configuration file and the default configuration file is used instead:
Creating Fair Scheduler pools from default file: [DEFAULT_SCHEDULER_FILE]
Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order. To use fair scheduling, configure pools in [DEFAULT_SCHEDULER_FILE] or set spark.scheduler.allocation.file to a file that contains the configuration.
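The three outcomes described above (explicit file, default file, no file) can be sketched as a small decision function. This is a minimal model and not Spark's code: BuildPoolsSketch, PoolSource, resolve and logMessage are hypothetical names, the defaultOnClasspath flag stands in for the class path lookup, and DefaultSchedulerFile assumes that DEFAULT_SCHEDULER_FILE is fairscheduler.xml.

```scala
object BuildPoolsSketch {
  // Assumption: stands in for DEFAULT_SCHEDULER_FILE in the messages above.
  val DefaultSchedulerFile = "fairscheduler.xml"

  // Where the pool definitions come from (hypothetical model types).
  sealed trait PoolSource
  final case class ConfiguredFile(path: String) extends PoolSource
  final case class DefaultFile(name: String) extends PoolSource
  case object NoFile extends PoolSource

  // configured: value of spark.scheduler.allocation.file, if set.
  // defaultOnClasspath: whether the default file could be found.
  def resolve(configured: Option[String], defaultOnClasspath: Boolean): PoolSource =
    configured match {
      case Some(path)                 => ConfiguredFile(path)
      case None if defaultOnClasspath => DefaultFile(DefaultSchedulerFile)
      case None                       => NoFile
    }

  // Message text that corresponds to each outcome.
  def logMessage(source: PoolSource): String = source match {
    case ConfiguredFile(path) =>
      s"Creating Fair Scheduler pools from $path"
    case DefaultFile(name) =>
      s"Creating Fair Scheduler pools from default file: $name"
    case NoFile =>
      "Fair Scheduler configuration file not found so jobs will be scheduled in FIFO order."
  }
}
```

In every case the default pool is still built afterwards, so NoFile only means that no additional pools are defined.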
addTaskSetManager(manager: Schedulable, properties: Properties): Unit
addTaskSetManager creates a new Pool with the default configuration (as if the default pool were used) and requests the Pool to register it. In the end, addTaskSetManager prints out the following WARN message to the logs:
A job was submitted with scheduler pool [poolName], which has not been configured. This can happen when the file that pools are read from isn't set, or when that file doesn't contain [poolName]. Created [poolName] with default configuration (schedulingMode: [mode], minShare: [minShare], weight: [weight])
In the end, addTaskSetManager prints out the following INFO message to the logs:
Added task set [name] tasks to pool [poolName]
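The lookup-or-create behaviour of addTaskSetManager can be modelled without Spark as follows. This is a sketch under stated assumptions: PoolSketch and BuilderSketch are hypothetical stand-ins for Spark's Pool and the builder, task sets are represented by their names only, the pool name is read from the spark.scheduler.pool local property, and the fallback pool name "default" together with the defaults FIFO, 0 and 1 are assumed values.

```scala
import java.util.Properties
import scala.collection.mutable

object AddTaskSetManagerSketch {
  val PoolProperty = "spark.scheduler.pool"
  // Assumption: name of the default pool.
  val DefaultPoolName = "default"

  // Hypothetical, simplified pool that only records task set names.
  final class PoolSketch(val name: String, val mode: String, val minShare: Int, val weight: Int) {
    val taskSets = mutable.ListBuffer.empty[String]
  }

  final class BuilderSketch {
    val pools = mutable.LinkedHashMap.empty[String, PoolSketch]
    val log = mutable.ListBuffer.empty[String]

    def addTaskSetManager(taskSetName: String, properties: Properties): Unit = {
      // No local properties means the default pool is used.
      val poolName =
        if (properties == null) DefaultPoolName
        else properties.getProperty(PoolProperty, DefaultPoolName)

      // Create and register the pool on first use (assumed defaults: FIFO, 0, 1).
      val pool = pools.getOrElseUpdate(poolName, {
        log += s"WARN A job was submitted with scheduler pool $poolName, which has not been configured."
        new PoolSketch(poolName, "FIFO", 0, 1)
      })

      pool.taskSets += taskSetName
      log += s"INFO Added task set $taskSetName tasks to pool $poolName"
    }
  }
}
```

A pool that was already created, whether from the configuration file or by an earlier call, is reused, so the WARN message appears at most once per pool name in this model.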
Unless already available, buildDefaultPool creates a schedulable pool with the following:
default pool name
0 for the initial minimum share
1 for the initial weight
Created default pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
buildFairSchedulerPool(is: InputStream, fileName: String): Unit
buildFairSchedulerPool starts by loading the XML file from the given
For every pool element, buildFairSchedulerPool creates a schedulable pool with the following:
Pool name per name attribute
Scheduling mode per schedulingMode element (case-insensitive with FIFO as the default)
Initial minimum share per minShare element (default:
Initial weight per weight element (default:
Created pool: [name], schedulingMode: [mode], minShare: [minShare], weight: [weight]
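The per-pool parsing described above can be approximated with the JDK's DOM parser. This is a minimal sketch, not Spark's implementation: PoolXmlSketch, PoolConfig and parsePools are hypothetical names, the XML is passed as a string rather than an InputStream, and the fallback values 0 (minimum share) and 1 (weight) are assumed to match the default pool. Unlike a production parser, the sketch throws on non-numeric values instead of falling back to a default.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import java.util.Locale
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

object PoolXmlSketch {
  final case class PoolConfig(name: String, schedulingMode: String, minShare: Int, weight: Int)

  // Trimmed text of the first nested element with the given tag, if present and non-empty.
  private def childText(pool: Element, tag: String): Option[String] = {
    val nodes = pool.getElementsByTagName(tag)
    if (nodes.getLength == 0) None
    else Option(nodes.item(0).getTextContent).map(_.trim).filter(_.nonEmpty)
  }

  def parsePools(xml: String): List[PoolConfig] = {
    val in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in)
    val pools = doc.getElementsByTagName("pool")
    (0 until pools.getLength).toList.map { i =>
      val pool = pools.item(i).asInstanceOf[Element]
      PoolConfig(
        name = pool.getAttribute("name"),
        // Case-insensitive scheduling mode, FIFO when the element is missing.
        schedulingMode = childText(pool, "schedulingMode")
          .map(_.toUpperCase(Locale.ROOT))
          .getOrElse("FIFO"),
        // Assumed defaults, mirroring the default pool: 0 and 1.
        minShare = childText(pool, "minShare").map(_.toInt).getOrElse(0),
        weight = childText(pool, "weight").map(_.toInt).getOrElse(1)
      )
    }
  }
}
```

Calling PoolXmlSketch.parsePools on the template shown earlier would yield one PoolConfig for production and one for test, each of which corresponds to a "Created pool" log line.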