ResourceUtils

Registering Task Resource Requests (from SparkConf)

addTaskResourceRequests(
  sparkConf: SparkConf,
  treqs: TaskResourceRequests): Unit

addTaskResourceRequests registers all task resource requests in the given SparkConf with the given TaskResourceRequests.


addTaskResourceRequests lists the resource IDs (using listResourceIds) with the spark.task component name in the given SparkConf.

For every ResourceID discovered, addTaskResourceRequests does the following:

  1. Finds all the settings with the confPrefix
  2. Looks up the amount setting (or throws a SparkException)
  3. Registers the resourceName with the amount in the given TaskResourceRequests
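
The steps above can be sketched in plain Scala, with an immutable Map standing in for SparkConf and a mutable map standing in for TaskResourceRequests. The names TaskResourceSketch, registerTaskResources, conf and registered are hypothetical, and IllegalArgumentException is used where Spark throws a SparkException. This is a minimal illustration of the flow under those assumptions, not Spark's implementation.

```scala
import scala.collection.mutable

object TaskResourceSketch {
  // Hypothetical stand-in for addTaskResourceRequests:
  // `conf` plays the role of SparkConf, `registered` the role of TaskResourceRequests.
  def registerTaskResources(
      conf: Map[String, String],
      registered: mutable.Map[String, Double]): Unit = {
    val componentPrefix = "spark.task.resource."
    // Discover resource names: the part of the key between the prefix and the next dot
    val resourceNames = conf.keys
      .filter(_.startsWith(componentPrefix))
      .map(_.stripPrefix(componentPrefix).takeWhile(_ != '.'))
      .toSeq
      .distinct
    resourceNames.foreach { name =>
      // 1. Find all the settings that share the resource's prefix
      val confPrefix = s"$componentPrefix$name."
      val settings = conf.collect {
        case (k, v) if k.startsWith(confPrefix) => k.stripPrefix(confPrefix) -> v
      }
      // 2. Look up the amount setting (Spark throws a SparkException here)
      val amount = settings.getOrElse(
        "amount",
        throw new IllegalArgumentException(s"No amount config for resource: $name"))
      // 3. Register the resource name with the amount
      registered(name) = amount.toDouble
    }
  }
}
```

Calling TaskResourceSketch.registerTaskResources with Map("spark.task.resource.gpu.amount" -> "2") and an empty mutable map leaves a single gpu entry in that map.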

addTaskResourceRequests is used when:

Listing All Configured Resources

listResourceIds(
  sparkConf: SparkConf,
  componentName: String): Seq[ResourceID]

listResourceIds requests the given SparkConf to find all Spark settings whose keys start with the following prefix:

[componentName].resource.
Internals

listResourceIds gets resource-related settings (from SparkConf) with the prefix removed (e.g., spark.my_component.resource.gpu.amount becomes just gpu.amount).

Example
// Use the following to start spark-shell
// ./bin/spark-shell -c spark.my_component.resource.gpu.amount=5

val sparkConf = sc.getConf

// Component names must start with `spark.` prefix
// Spark assumes valid Spark settings start with `spark.` prefix
val componentName = "spark.my_component"

// this is copied verbatim from ResourceUtils.listResourceIds
// Note that `resource` is hardcoded
sparkConf.getAllWithPrefix(s"$componentName.resource.").foreach(println)

// (gpu.amount,5)

listResourceIds asserts that every resource setting includes a . (dot) that separates the resource name from its config. Otherwise, listResourceIds throws the following SparkException:

You must specify an amount config for resource: [key] config: [componentName].resource.[key]
SPARK-43947

Although the exception says You must specify an amount config for resource, only the dot is checked.

// Use the following to start spark-shell
// 1. No amount config specified
// 2. spark.driver is a built-in Spark component name
// ./bin/spark-shell -c spark.driver.resource.gpu=5

Reported as SPARK-43947.

In the end, listResourceIds creates a ResourceID for every resource (with the given componentName and the resource names discovered).
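
The whole flow (prefix lookup, dot check, one ID per distinct resource name) can be sketched in plain Scala. In this sketch a Map stands in for SparkConf, the ResourceId case class is a hypothetical stand-in for Spark's ResourceID, and IllegalArgumentException replaces SparkException. It is an approximation for illustration, not the actual source.

```scala
object ListResourceIdsSketch {
  // Hypothetical stand-in for Spark's ResourceID
  final case class ResourceId(componentName: String, resourceName: String)

  // `conf` is a plain Map standing in for SparkConf
  def listResourceIds(conf: Map[String, String], componentName: String): Seq[ResourceId] = {
    val prefix = s"$componentName.resource."
    conf.keys.toSeq
      .filter(_.startsWith(prefix))
      .map(_.stripPrefix(prefix)) // e.g. gpu.amount
      .map { key =>
        val dot = key.indexOf('.')
        // Only the presence of a dot is checked, not the amount config itself
        if (dot < 0) {
          throw new IllegalArgumentException(
            s"You must specify an amount config for resource: $key " +
              s"config: $prefix$key")
        }
        key.substring(0, dot) // the resource name, e.g. gpu
      }
      .distinct
      .map(name => ResourceId(componentName, name))
  }
}
```

In this sketch, a setting such as spark.driver.resource.gpu=5 fails the check, while any key with a dot after the resource name passes it, even if that key is not an amount config.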


listResourceIds is used when:

parseAllResourceRequests

parseAllResourceRequests(
  sparkConf: SparkConf,
  componentName: String): Seq[ResourceRequest]

parseAllResourceRequests...FIXME

| When | componentName |
|---|---|
| ResourceProfile | spark.executor |
| ResourceUtils | |
| KubernetesUtils (Spark on Kubernetes) | |

parseAllResourceRequests is used when:

getOrDiscoverAllResources

getOrDiscoverAllResources(
  sparkConf: SparkConf,
  componentName: String,
  resourcesFileOpt: Option[String]): Map[String, ResourceInformation]

getOrDiscoverAllResources...FIXME

| When | componentName | resourcesFileOpt |
|---|---|---|
| SparkContext | spark.driver | spark.driver.resourcesFile |
| Worker (Spark Standalone) | spark.worker | spark.worker.resourcesFile |

getOrDiscoverAllResources is used when:

parseAllocatedOrDiscoverResources

parseAllocatedOrDiscoverResources(
  sparkConf: SparkConf,
  componentName: String,
  resourcesFileOpt: Option[String]): Seq[ResourceAllocation]

parseAllocatedOrDiscoverResources...FIXME

parseResourceRequirements (Spark Standalone)

parseResourceRequirements(
  sparkConf: SparkConf,
  componentName: String): Seq[ResourceRequirement]

parseResourceRequirements...FIXME

componentName

componentName seems to always be spark.driver, and the use cases seem to be Spark Standalone only.


parseResourceRequirements is used when: