ShuffleStatus

ShuffleStatus is a registry of shuffle map outputs (of a shuffle stage).

ShuffleStatus is managed by MapOutputTrackerMaster to keep track of shuffle map outputs across shuffle stages.

Creating Instance

ShuffleStatus takes a single number of partitions to be created.

Registering Shuffle Map Output

addMapOutput(
  mapId: Int,
  status: MapStatus): Unit

addMapOutput…​FIXME

addMapOutput is used when MapOutputTrackerMaster is requested to register a shuffle map output.

Deregistering Shuffle Map Output

removeMapOutput(
  mapId: Int,
  bmAddress: BlockManagerId): Unit

removeMapOutput…​FIXME

removeMapOutput is used when MapOutputTrackerMaster is requested to unregister a shuffle map output.

Serializing Shuffle Map Output Statuses

serializedMapStatus(
  broadcastManager: BroadcastManager,
  isLocal: Boolean,
  minBroadcastSize: Int): Array[Byte]

serializedMapStatus…​FIXME

serializedMapStatus is used when MapOutputTrackerMaster is requested to send the map output locations of a shuffle (on the MessageLoop dispatcher thread).

Finding Missing Partitions

findMissingPartitions(): Seq[Int]

findMissingPartitions…​FIXME

findMissingPartitions is used when MapOutputTrackerMaster is requested for missing partitions (that need to be computed).

Invalidating Serialized Map Output Status Cache

invalidateSerializedMapOutputStatusCache(): Unit

invalidateSerializedMapOutputStatusCache…​FIXME

invalidateSerializedMapOutputStatusCache is used when:

Deregistering Shuffle Map Outputs by Filter

removeOutputsByFilter(
  f: (BlockManagerId) => Boolean): Unit

removeOutputsByFilter…​FIXME

removeOutputsByFilter is used when:

Deregistering Shuffle Map Outputs Associated with Executor

removeOutputsOnExecutor(
  execId: String): Unit

removeOutputsOnExecutor…​FIXME

removeOutputsOnExecutor is used when MapOutputTrackerMaster is requested to delete shuffle outputs associated with an executor.

Deregistering Shuffle Map Outputs Associated with Host

removeOutputsOnHost(
  host: String): Unit

removeOutputsOnHost…​FIXME

removeOutputsOnHost is used when MapOutputTrackerMaster is requested to delete shuffle outputs associated with a host.