MapOutputStatistics
MapOutputStatistics holds statistics about the output partition sizes in a map stage.
MapOutputStatistics is the result of executing the following (currently internal APIs):
- SparkContext is requested to submitMapStage
- DAGScheduler is requested to submitMapStage
Creating Instance
MapOutputStatistics takes the following to be created:
- Shuffle Id (of a ShuffleDependency)
- Output Partition Sizes (Array[Long])
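The two constructor arguments above can be pictured with a small stand-in class. This is only a sketch: `MapOutputStatisticsSketch`, `MapOutputStatisticsDemo`, `totalBytes` and `largestPartition` are hypothetical names introduced for illustration (the real class is internal to Spark), and the sizes are made-up sample values.

```scala
// Hypothetical stand-in that mirrors the two constructor arguments listed above.
final class MapOutputStatisticsSketch(
    val shuffleId: Int,                 // shuffle id of the ShuffleDependency
    val bytesByPartitionId: Array[Long] // output size in bytes, indexed by partition id
)

object MapOutputStatisticsDemo {
  // Total number of bytes across all output partitions.
  def totalBytes(stats: MapOutputStatisticsSketch): Long =
    stats.bytesByPartitionId.sum

  // Index of a largest output partition, or None when there are no partitions.
  def largestPartition(stats: MapOutputStatisticsSketch): Option[Int] =
    if (stats.bytesByPartitionId.isEmpty) None
    else Some(stats.bytesByPartitionId.indices.maxBy(i => stats.bytesByPartitionId(i)))

  def main(args: Array[String]): Unit = {
    // Sample values only; real sizes come from a finished map stage.
    val stats = new MapOutputStatisticsSketch(0, Array(100L, 2500L, 40L))
    println(s"shuffle ${stats.shuffleId}: total=${totalBytes(stats)}, largest=${largestPartition(stats)}")
  }
}
```

Keeping the sizes in a plain array indexed by partition id makes questions such as "how large is the whole shuffle output?" or "which partition is skewed?" simple array operations.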
MapOutputStatistics is created when:
- MapOutputTrackerMaster is requested for the statistics (of a ShuffleDependency)
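A rough sketch of how per-partition sizes could be derived from the outputs of individual map tasks follows. It assumes the sizes are already available as plain arrays (one array per map task); `AggregateSizesSketch` and `aggregate` are hypothetical names, and the real MapOutputTrackerMaster works with its own registered map statuses rather than raw arrays.

```scala
object AggregateSizesSketch {
  // sizesPerMapTask(m)(p) = bytes that map task m wrote for output partition p
  // (an assumed input shape used for illustration only).
  def aggregate(sizesPerMapTask: Seq[Array[Long]], numPartitions: Int): Array[Long] = {
    val totals = new Array[Long](numPartitions) // initialised to zeros
    for (sizes <- sizesPerMapTask; p <- 0 until numPartitions) {
      totals(p) += sizes(p) // sum the contribution of every map task per partition
    }
    totals
  }
}
```

The resulting array has one entry per output partition and has the same shape as the Output Partition Sizes argument described above.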