TaskMetrics

TaskMetrics is a collection of metrics (as AccumulatorV2s) tracked during execution of a Task.

TaskMetrics is created when:

TaskMetrics takes no arguments to be created.

TaskMetrics is available using TaskContext.taskMetrics.
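As a minimal sketch of reading TaskMetrics through TaskContext from inside a running task, the following assumes spark-core is on the classpath and uses a local master. The object name TaskContextMetricsDemo and the helper spilledPerPartition are hypothetical names chosen for this illustration; only TaskContext.get, taskMetrics and memoryBytesSpilled come from Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

// Hypothetical demo object, not part of Spark.
object TaskContextMetricsDemo {
  // Returns, per partition, the memoryBytesSpilled value seen by the running task.
  def spilledPerPartition(sc: SparkContext, partitions: Int): Array[Long] =
    sc.parallelize(1 to 100, numSlices = partitions).mapPartitions { _ =>
      // TaskContext.get() is only non-null inside a running task.
      val metrics = TaskContext.get().taskMetrics()
      Iterator.single(metrics.memoryBytesSpilled)
    }.collect()

  def main(args: Array[String]): Unit = {
    // Assumption: a local master with two threads is enough for the demo.
    val conf = new SparkConf().setMaster("local[2]").setAppName("task-metrics-demo")
    val sc = new SparkContext(conf)
    try println(spilledPerPartition(sc, partitions = 2).mkString(", "))
    finally sc.stop()
  }
}
```

The metrics are read while the task is still running, so they reflect only what has been accumulated up to that point.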

Use SparkListener.onTaskEnd to intercept SparkListenerTaskEnd events to access the TaskMetrics of a task that has finished successfully.
Use StatsReportListener for summary statistics at runtime (after a stage completes).
Use EventLoggingListener for post-execution (history) statistics.
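The SparkListener.onTaskEnd approach could look roughly like the sketch below, assuming spark-core is on the classpath. TaskMetricsReport and TaskMetricsPrinter are hypothetical names introduced for the example; SparkListener, SparkListenerTaskEnd and the executorRunTime and memoryBytesSpilled getters are Spark's own.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// TaskMetricsReport and TaskMetricsPrinter are hypothetical names for this sketch.
object TaskMetricsReport {
  // Pure formatting helper, kept separate so it can be checked without Spark running.
  def format(stageId: Int, taskId: Long, runTimeMs: Long, spilledBytes: Long): String =
    s"stage=$stageId task=$taskId runTimeMs=$runTimeMs memoryBytesSpilled=$spilledBytes"
}

class TaskMetricsPrinter extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics can be null (for example for some failed tasks), so guard the access.
    Option(taskEnd.taskMetrics).foreach { metrics =>
      println(TaskMetricsReport.format(
        taskEnd.stageId,
        taskEnd.taskInfo.taskId,
        metrics.executorRunTime,
        metrics.memoryBytesSpilled))
    }
  }
}
```

Such a listener can be registered with sc.addSparkListener(new TaskMetricsPrinter) on an active SparkContext, or through the spark.extraListeners configuration property.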

TaskMetrics uses accumulators to represent the metrics and offers "increment" methods to increment them.

The local values of the accumulators for a task (as accumulated while the task runs) are sent from the executor to the driver when the task completes (and DAGScheduler re-creates TaskMetrics).
Table 1. Metrics

Property | Name | Type | Description
_memoryBytesSpilled | internal.metrics.memoryBytesSpilled | LongAccumulator | Used in memoryBytesSpilled, incMemoryBytesSpilled
_updatedBlockStatuses | internal.metrics.updatedBlockStatuses | CollectionAccumulator[(BlockId, BlockStatus)] | Used in updatedBlockStatuses, incUpdatedBlockStatuses (recording updated BlockStatus for a Block), setUpdatedBlockStatuses

Table 2. TaskMetrics’s Internal Registries and Counters

Name | Description
nameToAccums | Internal accumulators indexed by their names. Used when TaskMetrics re-creates TaskMetrics from AccumulatorV2s, …FIXME. NOTE: nameToAccums is a transient and lazy value.
internalAccums | Collection of internal AccumulatorV2 objects. Used when…FIXME. NOTE: internalAccums is a transient and lazy value.
externalAccums | Collection of external AccumulatorV2 objects. Used when TaskMetrics re-creates TaskMetrics from AccumulatorV2s, …FIXME. NOTE: externalAccums is a transient and lazy value.

accumulators Method

FIXME

mergeShuffleReadMetrics Method

FIXME

memoryBytesSpilled Method

FIXME

updatedBlockStatuses Method

FIXME

setExecutorCpuTime Method

FIXME

setResultSerializationTime Method

FIXME

setJvmGCTime Method

FIXME

setExecutorRunTime Method

FIXME

setExecutorDeserializeCpuTime Method

FIXME

setExecutorDeserializeTime Method

FIXME

setUpdatedBlockStatuses Method

FIXME

Re-Creating TaskMetrics From AccumulatorV2s — fromAccumulators Method

fromAccumulators(accums: Seq[AccumulatorV2[_, _]]): TaskMetrics

fromAccumulators creates a new TaskMetrics and registers accums as internal and external task metrics (using nameToAccums internal registry).

Internally, fromAccumulators creates a new TaskMetrics. It then splits accums into internal and external task metrics collections (using nameToAccums internal registry).

For every internal task metric, fromAccumulators finds the matching accumulator in the nameToAccums internal registry (of the new TaskMetrics instance), copies the metadata, and merges the state. Accumulators whose names are not in the registry are treated as external.

fromAccumulators is used exclusively when DAGScheduler gets notified that a task has finished (and re-creates TaskMetrics).
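The splitting and merging described above can be illustrated with a simplified, Spark-free model. ToyAcc, ToyTaskMetrics and ToyFactory are hypothetical stand-ins written for this sketch and are not Spark's AccumulatorV2 or TaskMetrics; the metadata copy is left out to keep the example short, and "my.counter" is an invented external accumulator name.

```scala
import scala.collection.mutable

// Simplified stand-in for a named long accumulator; NOT Spark's AccumulatorV2.
final class ToyAcc(val name: Option[String]) {
  private var sum = 0L
  def add(v: Long): Unit = sum += v
  def merge(other: ToyAcc): Unit = sum += other.value
  def value: Long = sum
}

// Simplified stand-in for TaskMetrics with a single internal metric.
final class ToyTaskMetrics {
  val memoryBytesSpilled = new ToyAcc(Some("internal.metrics.memoryBytesSpilled"))
  // Internal accumulators indexed by name, in the spirit of nameToAccums.
  val nameToAccums: Map[String, ToyAcc] =
    Map("internal.metrics.memoryBytesSpilled" -> memoryBytesSpilled)
  val externalAccums = mutable.ArrayBuffer.empty[ToyAcc]
}

object ToyFactory {
  def fromAccumulators(accums: Seq[ToyAcc]): ToyTaskMetrics = {
    val tm = new ToyTaskMetrics
    accums.foreach { acc =>
      acc.name.flatMap(tm.nameToAccums.get) match {
        case Some(internal) => internal.merge(acc)      // known internal name: merge state
        case None           => tm.externalAccums += acc // anything else is external
      }
    }
    tm
  }
}
```

Passing one accumulator named internal.metrics.memoryBytesSpilled and one named my.counter yields a ToyTaskMetrics whose internal metric carries the merged value and whose externalAccums holds the second accumulator.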

Increasing Memory Bytes Spilled — incMemoryBytesSpilled Method

incMemoryBytesSpilled(v: Long): Unit

incMemoryBytesSpilled adds v to the _memoryBytesSpilled task metric.
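The getter-plus-increment pattern can be sketched with Spark's public LongAccumulator, assuming spark-core is on the classpath. SpillMetrics is a hypothetical wrapper written for this illustration; it is not TaskMetrics itself, whose increment methods are not meant to be called from user code.

```scala
import org.apache.spark.util.LongAccumulator

// Hypothetical wrapper that mimics the pattern; NOT Spark's TaskMetrics.
class SpillMetrics {
  // The metric is backed by an accumulator, as in Table 1.
  private val _memoryBytesSpilled = new LongAccumulator

  // Read side: the current local value of the accumulator.
  def memoryBytesSpilled: Long = _memoryBytesSpilled.sum

  // Write side: the "increment" method simply adds to the accumulator.
  def incMemoryBytesSpilled(v: Long): Unit = _memoryBytesSpilled.add(v)
}
```

Calling incMemoryBytesSpilled(1024L) and then incMemoryBytesSpilled(512L) on a fresh SpillMetrics leaves memoryBytesSpilled at 1536.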

Recording Updated BlockStatus For Block — incUpdatedBlockStatuses Method

incUpdatedBlockStatuses(v: (BlockId, BlockStatus)): Unit

incUpdatedBlockStatuses adds v to the _updatedBlockStatuses task metric.

incUpdatedBlockStatuses is used exclusively when BlockManager is requested to addUpdatedBlockStatusToTaskMetrics.

Registering Internal Accumulators — register Method

register(sc: SparkContext): Unit

register registers the internal accumulators (from nameToAccums internal registry) with countFailedValues enabled (true).

register is used exclusively when Stage is requested for its new attempt.

empty Factory Method

empty: TaskMetrics

empty…FIXME

empty is used when:

registered Factory Method

registered: TaskMetrics

registered…FIXME

registered is used exclusively when Task is created.

fromAccumulatorInfos Factory Method

fromAccumulatorInfos(infos: Seq[AccumulableInfo]): TaskMetrics

fromAccumulatorInfos…FIXME

fromAccumulatorInfos is used exclusively when AppStatusListener is requested to onExecutorMetricsUpdate (for Spark History Server only).

fromAccumulators Factory Method

fromAccumulators(accums: Seq[AccumulatorV2[_, _]]): TaskMetrics

fromAccumulators…FIXME

fromAccumulators is used exclusively when DAGScheduler is requested to postTaskEnd.