Spillable¶
Spillable is an extension of the MemoryConsumer abstraction for collections that can spill their in-memory contents to disk.
Spillable[C] is a parameterized type, where C is the type of the in-memory collection to spill.
Contract¶
forceSpill¶
forceSpill(): Boolean
Force spilling the current in-memory collection to disk to release memory.
Used when Spillable is requested to spill.
spill¶
spill(
collection: C): Unit
Spills the current in-memory collection to disk, and releases the memory.
Used when:
ExternalAppendOnlyMap is requested to forceSpill
Spillable is requested to spill to disk if necessary
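The two-method contract above can be illustrated with a minimal sketch. It is not Spark's class: `MiniSpillable`, `BufferSpiller`, and `requestSpill` are hypothetical names, and "disk" is just an in-memory list of snapshots so the example stays self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified stand-in for the Spillable contract (not Spark's class).
abstract class MiniSpillable[C] {
  // Writes the given in-memory collection out and releases its memory.
  protected def spill(collection: C): Unit
  // Spills whatever is currently held in memory; true if anything was spilled.
  protected def forceSpill(): Boolean
}

// Toy implementation: "disk" is a list of immutable snapshots.
class BufferSpiller extends MiniSpillable[ArrayBuffer[Int]] {
  private val current = ArrayBuffer.empty[Int]
  private val spilled = ArrayBuffer.empty[Vector[Int]]

  def insert(x: Int): Unit = { current += x }
  def inMemory: Int = current.size
  def spillCount: Int = spilled.size

  override protected def spill(collection: ArrayBuffer[Int]): Unit = {
    spilled += collection.toVector // pretend this is a file write
    collection.clear()             // release the memory
  }

  override protected def forceSpill(): Boolean =
    if (current.isEmpty) false
    else { spill(current); true }

  // Public hook so the sketch can be driven from outside (assumption of this example).
  def requestSpill(): Boolean = forceSpill()
}
```

In this sketch forceSpill delegates to spill with the current collection, which mirrors the split in the contract: spill knows how to write a given collection, while forceSpill decides what to write.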
Implementations¶
ExternalAppendOnlyMap
ExternalSorter
Memory Threshold¶
Spillable uses a threshold for the memory size (in bytes) to know when to spill to disk.
When the size of the in-memory collection reaches the threshold, Spillable tries to acquire more memory. If the memory granted is not enough to keep the collection below the (increased) threshold, Spillable spills to disk.
The memory threshold starts at the value of the spark.shuffle.spill.initialMemoryThreshold configuration property. It is increased by the amount of memory granted every time Spillable is requested to spill to disk if necessary and manages to acquire more memory. The threshold goes back to the initial value when Spillable is requested to release all memory.
Used when Spillable is requested to spill and releaseMemory.
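The threshold life cycle described above can be sketched as follows. This is a simplified model under stated assumptions, not Spark's implementation: `FixedPool` and `ThresholdTracker` are hypothetical names, and the request size (enough to double the current collection size) is an assumed policy for the sketch.

```scala
// Hypothetical memory pool with a fixed capacity (stands in for a memory manager).
final class FixedPool(capacity: Long) {
  private var used = 0L
  // Grants as much of the request as still fits into the pool.
  def acquire(bytes: Long): Long = {
    val granted = math.max(0L, math.min(bytes, capacity - used))
    used += granted
    granted
  }
  def release(bytes: Long): Unit = { used = math.max(0L, used - bytes) }
  def inUse: Long = used
}

// Tracks the spill threshold for a single in-memory collection.
final class ThresholdTracker(initial: Long, pool: FixedPool) {
  private var threshold = initial
  def current: Long = threshold

  // Returns true when the collection should be spilled.
  def shouldSpill(currentMemory: Long): Boolean =
    if (currentMemory < threshold) false
    else {
      // Assumed policy: ask for enough memory to double the current size.
      val granted = pool.acquire(2 * currentMemory - threshold)
      threshold += granted
      currentMemory >= threshold
    }

  // Gives back everything acquired above the initial threshold.
  def releaseAll(): Unit = {
    pool.release(threshold - initial)
    threshold = initial
  }
}
```

With a pool of 300 bytes and an initial threshold of 100, the tracker grows the threshold to 200 and then 400 while the pool has room, reports a spill once the pool is exhausted, and returns to 100 after releaseAll.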
Creating Instance¶
Spillable takes the following to be created:
TaskMemoryManager
Abstract Class
Spillable is an abstract class and cannot be created directly. It is created indirectly for the concrete Spillables.
Configuration Properties¶
spark.shuffle.spill.numElementsForceSpillThreshold¶
Spillable uses spark.shuffle.spill.numElementsForceSpillThreshold configuration property to force spilling in-memory objects to disk when requested to maybeSpill.
spark.shuffle.spill.initialMemoryThreshold¶
Spillable uses spark.shuffle.spill.initialMemoryThreshold configuration property as the initial threshold for the size of a collection (and the minimum memory required to operate properly).
Spillable uses it when requested to spill and releaseMemory.
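As an illustration, both properties could be set on a SparkConf before the SparkContext is created. The values below are arbitrary examples, not recommendations, and `SpillTuning` is a hypothetical name; the snippet assumes Spark is on the classpath.

```scala
import org.apache.spark.SparkConf

// Hypothetical holder object; the values are examples only.
object SpillTuning {
  val conf: SparkConf = new SparkConf()
    // Force a spill after this many elements have been read.
    .set("spark.shuffle.spill.numElementsForceSpillThreshold", "1000000")
    // Initial memory threshold, given in bytes (10 MiB here).
    .set("spark.shuffle.spill.initialMemoryThreshold", (10L * 1024 * 1024).toString)
}
```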
Releasing All Memory¶
releaseMemory(): Unit
releaseMemory...FIXME
releaseMemory is used when:
ExternalAppendOnlyMap is requested to freeCurrentMap
ExternalSorter is requested to stop
Spillable is requested to maybeSpill and spill (and spilled to disk in either case)
Spilling In-Memory Collection to Disk (to Release Memory)¶
spill(
collection: C): Unit
spill spills the given in-memory collection to disk to release memory.
spill is used when:
ExternalAppendOnlyMap is requested to forceSpill
Spillable is requested to maybeSpill
forceSpill¶
forceSpill(): Boolean
forceSpill forcefully spills the Spillable to disk to release memory.
forceSpill is used when Spillable is requested to spill an in-memory collection to disk.
Spilling to Disk if Necessary¶
maybeSpill(
collection: C,
currentMemory: Long): Boolean
maybeSpill...FIXME
maybeSpill is used when:
ExternalAppendOnlyMap is requested to insertAll
ExternalSorter is requested to attempt to spill an in-memory collection to disk if needed
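How a caller might drive maybeSpill while inserting records can be sketched as below. This is an illustrative model only: `SpillingBuffer`, its constructor parameters, and the fixed per-element size estimate are assumptions of the example, and the memory acquisition step is left out so that only the two spill triggers (memory size and element count) are visible.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical collection that checks for spilling after every insert.
final class SpillingBuffer(memoryThreshold: Long, forceSpillAfter: Int, bytesPerElement: Long) {
  private var current = ArrayBuffer.empty[String]
  private var elementsRead = 0
  // Each entry stands for one spill "file".
  val spills = ArrayBuffer.empty[Vector[String]]

  // Spills when the estimated size reaches the threshold
  // or more than forceSpillAfter elements were read since the last spill.
  private def maybeSpill(collection: ArrayBuffer[String], currentMemory: Long): Boolean = {
    val shouldSpill = currentMemory >= memoryThreshold || elementsRead > forceSpillAfter
    if (shouldSpill) {
      spills += collection.toVector
      elementsRead = 0
    }
    shouldSpill
  }

  def insertAll(records: Iterator[String]): Unit =
    records.foreach { r =>
      current += r
      elementsRead += 1
      val estimatedSize = current.size * bytesPerElement
      // Start a fresh in-memory collection after a spill.
      if (maybeSpill(current, estimatedSize)) current = ArrayBuffer.empty[String]
    }

  def inMemory: Int = current.size
}
```

The Boolean result is what lets the caller decide to replace its in-memory collection with an empty one, which is the role the return value of maybeSpill plays in the signature above.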