ResourceProfileManager¶
ResourceProfileManager manages ResourceProfiles.
Creating Instance¶
ResourceProfileManager takes the following to be created:
ResourceProfileManager is created when:
SparkContextis created
Accessing ResourceProfileManager¶
ResourceProfileManager is available to other Spark services using SparkContext.
Registered ResourceProfiles¶
resourceProfileIdToResourceProfile: HashMap[Int, ResourceProfile]
ResourceProfileManager creates resourceProfileIdToResourceProfile registry of ResourceProfiles by their ID.
A new ResourceProfile is added when addResourceProfile.
ResourceProfiles are resolved (looked up) using resourceProfileFromId.
ResourceProfiles can be equivalent when they specify the same resources.
resourceProfileIdToResourceProfile is used when:
Default ResourceProfile¶
ResourceProfileManager gets or creates the default ResourceProfile when created and registers it immediately.
The default profile is available as defaultResourceProfile.
Accessing Default ResourceProfile¶
defaultResourceProfile: ResourceProfile
defaultResourceProfile returns the default ResourceProfile.
defaultResourceProfile is used when:
ExecutorAllocationManageris createdSparkContextis requested to requestTotalExecutors and createTaskSchedulerDAGScheduleris requested to mergeResourceProfilesForStageCoarseGrainedSchedulerBackendis requested to requestExecutorsStandaloneSchedulerBackend(Spark Standalone) is createdKubernetesClusterSchedulerBackend(Spark on Kubernetes) is createdMesosCoarseGrainedSchedulerBackend(Spark on Mesos) is created
Registering ResourceProfile¶
addResourceProfile(
rp: ResourceProfile): Unit
addResourceProfile checks if the given ResourceProfile is supported.
addResourceProfile registers the given ResourceProfile (in the resourceProfileIdToResourceProfile registry) unless done earlier (by ResourceProfile ID).
With a new ResourceProfile, addResourceProfile requests the given ResourceProfile for the limiting resource (for no reason but to calculate it upfront) and prints out the following INFO message to the logs:
Added ResourceProfile id: [id]
In the end (for a new ResourceProfile), addResourceProfile requests the LiveListenerBus to post a SparkListenerResourceProfileAdded.
addResourceProfile is used when:
- RDD.withResources operator is used
ResourceProfileManageris created (and registers the default profile)DAGScheduleris requested to mergeResourceProfilesForStage
Dynamic Allocation¶
ResourceProfileManager initializes dynamicEnabled flag to be isDynamicAllocationEnabled when created.
dynamicEnabled flag is used when:
isSupported¶
isSupported(
rp: ResourceProfile): Boolean
isSupported...FIXME
canBeScheduled¶
canBeScheduled(
taskRpId: Int,
executorRpId: Int): Boolean
canBeScheduled asserts that the given taskRpId and executorRpId are valid ResourceProfile IDs or throws an AssertionError:
Tasks and executors must have valid resource profile id
canBeScheduled finds the ResourceProfile.
canBeScheduled holds positive (true) when either holds:
- The given
taskRpIdandexecutorRpIdare the same - Dynamic Allocation is disabled and the
ResourceProfileis a TaskResourceProfile
canBeScheduled is used when:
TaskSchedulerImplis requested to resourceOfferSingleTaskSet and calculateAvailableSlots
Logging¶
Enable ALL logging level for org.apache.spark.resource.ResourceProfileManager logger to see what happens inside.
Add the following line to conf/log4j2.properties:
logger.ResourceProfileManager.name = org.apache.spark.resource.ResourceProfileManager
logger.ResourceProfileManager.level = all
Refer to Logging.