ShuffleWriteMetrics

ShuffleWriteMetrics is a collection of accumulators that represents task metrics about writing shuffle data.

ShuffleWriteMetrics tracks the following task metrics: shuffle bytes written, shuffle write time, and shuffle records written.

NOTE: Accumulators allow tasks (running on executors) to communicate with the driver.
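The task-to-driver flow can be sketched as follows. This is a simplified model under stated assumptions, not Spark's implementation: SketchLongAccumulator and AccumulatorMergeSketch are hypothetical names, and each inner sequence stands for the byte counts one simulated task adds to its own copy before the driver merges it.

```scala
// Hypothetical stand-in for a long-valued accumulator; not Spark's own class.
final class SketchLongAccumulator {
  private var total: Long = 0L

  // Called on the executor side while a task runs.
  def add(v: Long): Unit = total += v

  // Called on the driver side to fold in the copy reported by a task.
  def merge(other: SketchLongAccumulator): Unit = total += other.sum

  def sum: Long = total
}

object AccumulatorMergeSketch {
  // Simulates tasks that each update a private copy, then a driver that merges them.
  def driverTotal(bytesPerTask: Seq[Seq[Long]]): Long = {
    val driverSide = new SketchLongAccumulator
    bytesPerTask.foreach { writes =>
      val taskSide = new SketchLongAccumulator // one copy per simulated task
      writes.foreach(taskSide.add)
      driverSide.merge(taskSide)               // the task's result reaches the driver
    }
    driverSide.sum
  }
}
```

Giving every task its own copy means tasks never contend on shared state; only the merge step runs on the driver.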
Table 1. ShuffleWriteMetrics’s Accumulators
Name Description

_bytesWritten

Accumulator to track how many shuffle bytes were written in a shuffle task.

Used when ShuffleWriteMetrics is asked for the shuffle bytes written, and when that value is incremented or decremented.

NOTE: _bytesWritten is available as internal.metrics.shuffle.write.bytesWritten (internally shuffleWrite.BYTES_WRITTEN) in TaskMetrics.

_writeTime

Accumulator to track the shuffle write time (as a 64-bit integer) of a shuffle task.

Used when ShuffleWriteMetrics is asked for the shuffle write time, and when that value is incremented.

NOTE: _writeTime is available as internal.metrics.shuffle.write.writeTime (internally shuffleWrite.WRITE_TIME) in TaskMetrics.

_recordsWritten

Accumulator to track how many shuffle records were written in a shuffle task.

Used when ShuffleWriteMetrics is asked for the shuffle records written, and when that value is incremented or decremented.

NOTE: _recordsWritten is available as internal.metrics.shuffle.write.recordsWritten (internally shuffleWrite.RECORDS_WRITTEN) in TaskMetrics.
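The three accumulators and the operations on them can be modeled as below. This is a minimal sketch rather than Spark's source: LongCounter and ShuffleWriteMetricsSketch are hypothetical names, and implementing a decrement by resetting the value to the current sum minus v is an assumption made for illustration.

```scala
// Hypothetical stand-in for a long accumulator (not Spark's class).
final class LongCounter {
  private var current: Long = 0L
  def add(v: Long): Unit = current += v
  def setValue(v: Long): Unit = current = v
  def sum: Long = current
}

// Hypothetical model of the three write-side metrics described in Table 1.
final class ShuffleWriteMetricsSketch {
  private val _bytesWritten = new LongCounter
  private val _recordsWritten = new LongCounter
  private val _writeTime = new LongCounter

  // Readers return the sum of the matching internal counter.
  def bytesWritten: Long = _bytesWritten.sum
  def recordsWritten: Long = _recordsWritten.sum
  def writeTime: Long = _writeTime.sum

  // Increments add v to the matching internal counter.
  def incBytesWritten(v: Long): Unit = _bytesWritten.add(v)
  def incRecordsWritten(v: Long): Unit = _recordsWritten.add(v)
  def incWriteTime(v: Long): Unit = _writeTime.add(v)

  // Assumption: a decrement resets the counter to (current sum - v).
  def decBytesWritten(v: Long): Unit = _bytesWritten.setValue(bytesWritten - v)
  def decRecordsWritten(v: Long): Unit = _recordsWritten.setValue(recordsWritten - v)
}
```

Note that the sketch offers no decrement for the write time, matching the table, which mentions decrements only for bytes and records.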

decRecordsWritten Method

FIXME

decBytesWritten Method

FIXME

writeTime Method

FIXME

recordsWritten Method

FIXME

Returning Number of Shuffle Bytes Written — bytesWritten Method

bytesWritten: Long

bytesWritten represents the shuffle bytes written metric of a shuffle task.

Internally, bytesWritten returns the sum of the _bytesWritten internal accumulator.

Incrementing Shuffle Bytes Written Metrics — incBytesWritten Method

incBytesWritten(v: Long): Unit

incBytesWritten simply adds v to the _bytesWritten internal accumulator.

Incrementing Shuffle Write Time Metrics — incWriteTime Method

incWriteTime(v: Long): Unit

incWriteTime simply adds v to the _writeTime internal accumulator.

incWriteTime is used when:

  1. SortShuffleWriter stops.

  2. BypassMergeSortShuffleWriter writes records (i.e. when it initializes DiskBlockObjectWriter partition writers) and later when it concatenates the per-partition files into a single file.

  3. UnsafeShuffleWriter merges spills in mergeSpillsWithTransferTo.

  4. DiskBlockObjectWriter does commitAndGet (but only when the syncWrites flag is enabled, which forces outstanding writes to disk).

  5. JsonProtocol creates TaskMetrics from JSON.

  6. TimeTrackingOutputStream performs a write, flush, or close (it is an output stream whose purpose is to track shuffle write time).
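The last case in the list above, an output stream that tracks write time, can be sketched as a thin wrapper. This is an illustration under stated assumptions and not Spark's TimeTrackingOutputStream: TimedOutputStream is a hypothetical name, the incWriteTime parameter is any Long => Unit callback standing in for the metrics object, and measuring with System.nanoTime is a choice made for the sketch.

```scala
import java.io.OutputStream

// Hypothetical wrapper: forwards every call to `underlying` and reports the
// elapsed nanoseconds of each call through the `incWriteTime` callback.
final class TimedOutputStream(underlying: OutputStream, incWriteTime: Long => Unit)
    extends OutputStream {

  // Runs `op` and reports its duration, even if `op` throws.
  private def timed(op: => Unit): Unit = {
    val start = System.nanoTime()
    try op
    finally incWriteTime(System.nanoTime() - start)
  }

  override def write(b: Int): Unit = timed(underlying.write(b))

  override def write(b: Array[Byte], off: Int, len: Int): Unit =
    timed(underlying.write(b, off, len))

  override def flush(): Unit = timed(underlying.flush())

  override def close(): Unit = timed(underlying.close())
}
```

Wrapping the stream keeps the timing logic out of the writers themselves: any code that writes through the wrapper contributes to the write-time total without calling the callback directly.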

Incrementing Shuffle Records Written Metrics — incRecordsWritten Method

incRecordsWritten(v: Long): Unit

incRecordsWritten simply adds v to the _recordsWritten internal accumulator.