Skip to content


ShuffleStatus is a registry of MapStatuses per Partition of a ShuffleMapStage.

ShuffleStatus is used by MapOutputTrackerMaster.

Creating Instance

ShuffleStatus takes the following to be created:

ShuffleStatus is created when:

MapStatuses per Partition

ShuffleStatus creates a mapStatuses internal registry of MapStatuses per partition (using the numPartitions) when created.

A missing partition is when there is no MapStatus for a partition (null at the index of the partition ID) and can be requested using findMissingPartitions.

mapStatuses is all null (for every partition) initially (and so all partitions are missing / uncomputed).

A new MapStatus is added in addMapOutput and updateMapOutput.

A MapStatus is removed (nulled) in removeMapOutput and removeOutputsByFilter.

The number of available MapStatuses is tracked by _numAvailableMapOutputs internal counter.

Used when:

Registering Shuffle Map Output

  mapIndex: Int,
  status: MapStatus): Unit

addMapOutput adds the MapStatus to the mapStatuses internal registry.

In case the mapStatuses internal registry had no MapStatus for the mapIndex already available, addMapOutput increments the _numAvailableMapOutputs internal counter and invalidateSerializedMapOutputStatusCache.

addMapOutput is used when:

Deregistering Shuffle Map Output

  mapIndex: Int,
  bmAddress: BlockManagerId): Unit


removeMapOutput is used when:

Missing Partitions

findMissingPartitions(): Seq[Int]


findMissingPartitions is used when:

Serializing Shuffle Map Output Statuses

  broadcastManager: BroadcastManager,
  isLocal: Boolean,
  minBroadcastSize: Int,
  conf: SparkConf): Array[Byte]


serializedMapStatus is used when:


Enable ALL logging level for org.apache.spark.ShuffleStatus logger to see what happens inside.

Add the following line to conf/

Refer to Logging.