ExternalAppendOnlyMap

ExternalAppendOnlyMap is a Spillable of SizeTrackers.

ExternalAppendOnlyMap[K, V, C] is a parameterized type of K keys, V values, and C combiner (partial) values.
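To make the three type parameters concrete, here is a minimal sketch in plain Scala (no Spark dependency) for a per-key average. The object name CombinerTypesSketch and the choice of String, Int, and (Long, Int) for K, V, and C are assumptions made for illustration only; createCombiner and mergeValue are meant to mirror the functions of the same names used by insertAll.

```scala
object CombinerTypesSketch {
  // Hypothetical choice of types for a per-key average.
  type K = String        // key
  type V = Int           // value
  type C = (Long, Int)   // combiner (partial) value: (running sum, count)

  // Turns the first value seen for a key into a combiner.
  val createCombiner: V => C = v => (v.toLong, 1)

  // Folds one more value into an existing combiner.
  val mergeValue: (C, V) => C = (c, v) => (c._1 + v, c._2 + 1)

  // Combine the values 3, 5, and 10 of a single key.
  val values: Seq[V] = Seq(3, 5, 10)
  val combined: C = values.tail.foldLeft(createCombiner(values.head))(mergeValue)
}
```

The point of C being a separate type is that the partial result (here a sum and a count) does not have to look like a single value V.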

Creating Instance

ExternalAppendOnlyMap takes the following to be created:

ExternalAppendOnlyMap is created when:

SizeTrackingAppendOnlyMap

ExternalAppendOnlyMap manages a SizeTrackingAppendOnlyMap.

A SizeTrackingAppendOnlyMap is created when ExternalAppendOnlyMap is created, and again every time insertAll or forceSpill spills the current map to disk.

The SizeTrackingAppendOnlyMap is dereferenced (set to null), so that its memory can be garbage-collected, in forceSpill and freeCurrentMap.

SizeTrackingAppendOnlyMap is used in insertAll, spill, forceSpill, and iterator.

Inserting All Key-Value Pairs (from Iterator)

insertAll(
  entries: Iterator[Product2[K, V]]): Unit

insertAll creates an update function that uses the mergeValue function for an existing value or the createCombiner function for a new value.
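The update function described above can be sketched as follows. This is not the actual implementation: a plain mutable.HashMap stands in for the SizeTrackingAppendOnlyMap, the (hadValue, oldValue) shape of the update function is an assumption of this sketch, and the example createCombiner and mergeValue (collecting values into a list) are hypothetical.

```scala
import scala.collection.mutable

object InsertAllUpdateSketch {
  // Example combiner functions (assumptions): collect the values of a key into a list.
  val createCombiner: Int => List[Int] = v => List(v)
  val mergeValue: (List[Int], Int) => List[Int] = (c, v) => v :: c

  // Stand-in for the in-memory map.
  val currentMap = mutable.HashMap.empty[String, List[Int]]

  def insertAll(entries: Iterator[Product2[String, Int]]): Unit = {
    entries.foreach { entry =>
      // The update function: merge into an existing combiner or create a new one.
      val update: (Boolean, List[Int]) => List[Int] = (hadValue, oldValue) =>
        if (hadValue) mergeValue(oldValue, entry._2) else createCombiner(entry._2)

      val old = currentMap.get(entry._1)
      currentMap(entry._1) = update(old.isDefined, old.getOrElse(Nil))
    }
  }

  // Key "a" appears twice, so its second value goes through mergeValue.
  insertAll(Iterator(("a", 1), ("b", 2), ("a", 3)))
}
```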

For every key-value pair (from the input iterator), insertAll does the following:

Usage

insertAll is used when:

Requirements

insertAll throws an IllegalStateException when the currentMap internal registry is null:

Cannot insert new elements into a map after calling iterator
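The requirement can be illustrated with a small stand-in class. GuardedMap is a hypothetical name, the value type is fixed to Int for brevity, and nulling the map as soon as iterator is called is a simplification assumed here to keep the sketch short.

```scala
import scala.collection.mutable

object InsertGuardSketch {
  final class GuardedMap {
    // Stand-in for the currentMap internal registry; null means "no longer usable".
    private var currentMap: mutable.Map[String, Int] = mutable.Map.empty

    def insertAll(entries: Iterator[(String, Int)]): Unit = {
      if (currentMap == null) {
        throw new IllegalStateException(
          "Cannot insert new elements into a map after calling iterator")
      }
      // Sum the values per key (an arbitrary example combiner).
      entries.foreach { case (k, v) =>
        currentMap(k) = currentMap.getOrElse(k, 0) + v
      }
    }

    // Simplification (assumption): hand out the entries and drop the map right away.
    def iterator: Iterator[(String, Int)] = {
      val entries = currentMap.toList
      currentMap = null
      entries.iterator
    }
  }
}
```

Calling insertAll on a GuardedMap after iterator has been called fails with the IllegalStateException above.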

Iterator of "Combined" Pairs

iterator: Iterator[(K, C)]

iterator…​FIXME

iterator is used when…​FIXME

Spilling to Disk if Necessary

spill(
  collection: SizeTracker): Unit

spill…​FIXME

spill is used when…​FIXME

Forcing Disk Spilling

forceSpill(): Boolean

forceSpill returns a flag to indicate whether spilling to disk has really happened (true) or not (false).

forceSpill branches on the state it is currently in (a state-aware implementation would arguably be cleaner).

When a SpillableIterator is in use, forceSpill requests it to spill and, if it did, dereferences (nullifies) the SizeTrackingAppendOnlyMap. forceSpill returns whatever the SpillableIterator's spill returned.

When there is at least one element in the SizeTrackingAppendOnlyMap, forceSpill spills it. forceSpill then creates a new SizeTrackingAppendOnlyMap and always returns true.

In other cases, forceSpill simply returns false.

forceSpill is part of the Spillable abstraction.
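The three branches of forceSpill can be sketched with simple stand-ins. Everything named here (MapSketch, FakeSpillableIterator, readingIterator, spillToDisk, spillCount) is hypothetical and exists only to make the control flow runnable; a mutable.Map replaces the SizeTrackingAppendOnlyMap and nothing is written to disk.

```scala
import scala.collection.mutable

object ForceSpillSketch {
  // Hypothetical stand-in for SpillableIterator: only the spill() result matters here.
  final class FakeSpillableIterator(willSpill: Boolean) {
    def spill(): Boolean = willSpill
  }

  final class MapSketch {
    var currentMap: mutable.Map[String, Int] = mutable.Map.empty
    var readingIterator: FakeSpillableIterator = null // set once iteration has started
    var spillCount = 0

    // Stand-in for writing the in-memory map to disk.
    private def spillToDisk(): Unit = spillCount += 1

    def forceSpill(): Boolean = {
      if (readingIterator != null) {
        // Branch 1: an iterator is in use; delegate and drop the map only if it spilled.
        val isSpilled = readingIterator.spill()
        if (isSpilled) currentMap = null
        isSpilled
      } else if (currentMap.size > 0) {
        // Branch 2: there is data in memory; spill it and start with a fresh map.
        spillToDisk()
        currentMap = mutable.Map.empty
        true
      } else {
        // Branch 3: nothing to spill.
        false
      }
    }
  }
}
```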

Freeing Up SizeTrackingAppendOnlyMap and Releasing Memory

freeCurrentMap(): Unit

freeCurrentMap dereferences (nullifies) the SizeTrackingAppendOnlyMap (if there still is one) and then releases all memory.

freeCurrentMap is used when SpillableIterator is requested to destroy itself.
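A minimal sketch of freeCurrentMap, under the assumption that memory is released only when there was still a map to drop (the description above does not say what happens otherwise). MapSketch, reservedBytes, releases, and releaseMemory are hypothetical stand-ins for the real bookkeeping.

```scala
import scala.collection.mutable

object FreeCurrentMapSketch {
  final class MapSketch {
    var currentMap: mutable.Map[String, Int] = mutable.Map("a" -> 1)
    var reservedBytes: Long = 1024L // hypothetical bookkeeping of acquired memory
    var releases = 0

    // Stand-in for releasing all memory acquired so far.
    private def releaseMemory(): Unit = {
      reservedBytes = 0L
      releases += 1
    }

    def freeCurrentMap(): Unit = {
      if (currentMap != null) {
        currentMap = null // lets the map be garbage-collected
        releaseMemory()
      }
    }
  }
}
```

With the null check in place, calling freeCurrentMap a second time is a no-op in this sketch.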

spillMemoryIteratorToDisk Method

spillMemoryIteratorToDisk(
  inMemoryIterator: Iterator[(K, C)]): DiskMapIterator

spillMemoryIteratorToDisk…​FIXME

spillMemoryIteratorToDisk is used when…​FIXME