BroadcastManager

BroadcastManager is a Spark service that manages broadcast variables in a Spark application.

Figure 1. BroadcastManager, SparkEnv and BroadcastFactory

BroadcastManager assigns unique identifiers to broadcast variables.

BroadcastManager is used to create a MapOutputTrackerMaster.

Creating Instance

BroadcastManager takes the following to be created:

When created, BroadcastManager initializes itself.

BroadcastManager is created when SparkEnv is created, both on the driver and on executors, which is why it needs the isDriver flag.

isDriver Flag

BroadcastManager is given the isDriver flag when created.

The isDriver flag indicates whether the initialization happens on the driver (true) or executors (false).

When requested to initialize, BroadcastManager passes the flag on to the TorrentBroadcastFactory as it initializes the factory.

TorrentBroadcastFactory

BroadcastManager manages a BroadcastFactory:

  • It is created and initialized in initialize

  • It is stopped in stop (and that is all stop does)

BroadcastManager uses the BroadcastFactory when requested to newBroadcast and unbroadcast.

cachedValues Registry

cachedValues: ReferenceMap

Unique Identifiers of Broadcast Variables

BroadcastManager tracks broadcast variables and controls their identifiers.

Every newBroadcast is given a new and unique identifier.
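One common way to hand out identifiers that are unique even under concurrent calls is an atomic counter. The following is a minimal sketch of that idea, not BroadcastManager's actual code: the object name BroadcastIdSketch, the method nextId and the starting value 0 are assumptions made for illustration.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the identifier bookkeeping (not a Spark class).
object BroadcastIdSketch {
  // Assumption: identifiers start at 0 and grow by one per broadcast variable.
  private val nextBroadcastId = new AtomicLong(0)

  // Returns the current counter value and advances it atomically,
  // so two concurrent callers never receive the same identifier.
  def nextId(): Long = nextBroadcastId.getAndIncrement()
}
```

Each call to BroadcastIdSketch.nextId() returns a value one greater than the previous call, which is enough to tell broadcast variables apart within a single application.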

Initializing BroadcastManager

initialize(): Unit

initialize creates a TorrentBroadcastFactory and requests it to initialize.

initialize turns the initialized internal flag on to guard against multiple initializations. With the initialized flag already enabled, initialize does nothing.

initialize is used once when BroadcastManager is created.
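The guard described in this section can be sketched as follows. This is a simplified illustration under stated assumptions rather than the real implementation: CountingFactorySketch and GuardedManagerSketch are hypothetical names, the factory only counts how often it is initialized, and the use of synchronized is an assumption about how the flag could be protected.

```scala
// Hypothetical factory stand-in that only records how often it is initialized.
final class CountingFactorySketch {
  var initCalls: Int = 0
  def initialize(isDriver: Boolean): Unit = initCalls += 1
}

// Hypothetical manager stand-in showing the one-time initialization guard.
final class GuardedManagerSketch(isDriver: Boolean) {
  private var initialized = false
  private var factory: Option[CountingFactorySketch] = None

  def initialize(): Unit = synchronized {
    if (!initialized) {
      // Create the factory and pass the isDriver flag on to it.
      val f = new CountingFactorySketch
      f.initialize(isDriver)
      factory = Some(f)
      // Turn the flag on so that later calls do nothing.
      initialized = true
    }
  }

  // Exposed for illustration only: how many times the factory was initialized.
  def factoryInitCalls: Int = factory.map(_.initCalls).getOrElse(0)

  // Initialize once as part of construction.
  initialize()
}
```

With this shape, calling initialize again on an already constructed GuardedManagerSketch leaves the factory untouched, because the flag was turned on during construction.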

Stopping BroadcastManager

stop(): Unit

stop requests the BroadcastFactory to stop.

Creating Broadcast Variable

newBroadcast[T](
  value_ : T,
  isLocal: Boolean): Broadcast[T]

newBroadcast requests the BroadcastFactory (created when BroadcastManager is initialized) for a new broadcast variable with a new unique identifier.
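The delegation from newBroadcast to the BroadcastFactory might look roughly like the sketch below. All types here are simplified, hypothetical stand-ins (BroadcastSketch, BroadcastFactorySketch, InMemoryFactorySketch, DelegatingManagerSketch), and the three-argument factory method is an assumption that follows from the factory needing the identifier. The real classes carry more parameters and behavior.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical, simplified stand-in for a broadcast variable.
final case class BroadcastSketch[T](id: Long, value: T)

// Hypothetical, simplified stand-in for a broadcast factory.
trait BroadcastFactorySketch {
  def newBroadcast[T](value: T, isLocal: Boolean, id: Long): BroadcastSketch[T]
}

// Trivial factory that keeps the value in memory; for illustration only.
object InMemoryFactorySketch extends BroadcastFactorySketch {
  def newBroadcast[T](value: T, isLocal: Boolean, id: Long): BroadcastSketch[T] =
    BroadcastSketch(id, value)
}

// Hypothetical manager stand-in: assigns the next identifier and delegates.
final class DelegatingManagerSketch(factory: BroadcastFactorySketch) {
  private val nextBroadcastId = new AtomicLong(0)

  def newBroadcast[T](value_ : T, isLocal: Boolean): BroadcastSketch[T] =
    factory.newBroadcast(value_, isLocal, nextBroadcastId.getAndIncrement())
}
```

In this sketch the manager owns only the identifier counter, while everything about how the value is stored or distributed stays behind the factory interface.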

newBroadcast is used when: