MapOutputStatistics¶
MapOutputStatistics holds statistics about the output partition sizes in a map stage.
MapOutputStatistics is the result of executing the following (currently internal APIs):
- SparkContext is requested to submitMapStage
- DAGScheduler is requested to submitMapStage
Creating Instance¶
MapOutputStatistics takes the following to be created:
- Shuffle Id (of a ShuffleDependency)
- Output Partition Sizes (Array[Long])
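The two constructor arguments above can be pictured with a small stand-in class. This is an illustrative sketch under stated assumptions, not Spark's source: the class name MapOutputStatisticsSketch, the field names shuffleId and bytesByPartitionId, the Int type of the shuffle id, and both helper functions are hypothetical. Only the Array[Long] type of the partition sizes comes from the text above.

```scala
// Hypothetical stand-in that mirrors the two constructor arguments.
// Field names and the Int type of the shuffle id are assumptions.
final class MapOutputStatisticsSketch(
    val shuffleId: Int,
    val bytesByPartitionId: Array[Long])

object MapOutputStatisticsDemo {
  // Sum of the sizes of all output partitions.
  def totalBytes(stats: MapOutputStatisticsSketch): Long =
    stats.bytesByPartitionId.sum

  // Index of the largest output partition, or None when there are no partitions.
  def largestPartition(stats: MapOutputStatisticsSketch): Option[Int] =
    if (stats.bytesByPartitionId.isEmpty) None
    else Some(stats.bytesByPartitionId.indices.maxBy(i => stats.bytesByPartitionId(i)))

  def main(args: Array[String]): Unit = {
    // One shuffle (id 0) with three output partitions of different sizes.
    val stats = new MapOutputStatisticsSketch(0, Array(10L, 250L, 40L))
    println(s"total bytes: ${totalBytes(stats)}")
    println(s"largest partition: ${largestPartition(stats)}")
  }
}
```

Indexing the array by partition id keeps lookups such as "which partition is the largest" a single pass over the sizes, which is the kind of question a consumer of these statistics might ask.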
MapOutputStatistics is created when:
- MapOutputTrackerMaster is requested for the statistics (of a ShuffleDependency)