Spillable¶
Spillable is an extension of the MemoryConsumer abstraction for collections that can spill to disk.
Spillable[C] is a parameterized type, where C is the type of the in-memory collection to spill (as in spill(collection: C)).
Contract¶
forceSpill¶
forceSpill(): Boolean
Forces spilling the current in-memory collection to disk to release memory.
Used when Spillable is requested to spill
spill¶
spill(
collection: C): Unit
Spills the current in-memory collection to disk, and releases the memory.
Used when:
ExternalAppendOnlyMap is requested to forceSpill
Spillable is requested to spill to disk if necessary
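The two-method contract can be pictured with a toy model. The sketch below is not Spark's code: ToySpillable, ToyBuffer, requestSpill, and the in-memory disk buffer are hypothetical names used only for illustration, and it assumes that the Boolean returned by forceSpill reports whether anything was actually spilled.

```scala
import scala.collection.mutable.ArrayBuffer

object SpillableContractSketch {
  // Hypothetical stand-in for the contract: two abstract methods.
  abstract class ToySpillable[C] {
    protected def spill(collection: C): Unit
    protected def forceSpill(): Boolean

    // Assumed entry point: something outside asks this consumer to free memory.
    def requestSpill(): Boolean = forceSpill()
  }

  // A toy collection that "spills" into an in-memory buffer standing in for disk.
  final class ToyBuffer extends ToySpillable[ArrayBuffer[Int]] {
    private var current = ArrayBuffer.empty[Int]
    val disk = ArrayBuffer.empty[Seq[Int]]

    def insert(value: Int): Unit = { current += value }
    def inMemorySize: Int = current.size

    override protected def spill(collection: ArrayBuffer[Int]): Unit = {
      // An immutable copy stands in for a spill file.
      disk += collection.toList
    }

    override protected def forceSpill(): Boolean =
      if (current.nonEmpty) {
        spill(current)
        current = ArrayBuffer.empty[Int] // start over with a fresh collection
        true
      } else {
        false // nothing in memory, so nothing was spilled
      }
  }
}
```

In this model forceSpill decides what to spill (the current collection) and delegates the writing to spill, which mirrors the "used when" relationship described above.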
Implementations¶
ExternalAppendOnlyMap
ExternalSorter
Memory Threshold¶
Spillable uses a threshold for the memory size (in bytes) to know when to spill to disk.
When the size of the in-memory collection is above the threshold, Spillable tries to acquire more memory. Unless it is given all the requested memory, Spillable spills to disk.
The memory threshold starts at the value of the spark.shuffle.spill.initialMemoryThreshold configuration property and increases every time Spillable is requested to spill to disk if needed but manages to acquire the required memory. The threshold goes back to the initial value when Spillable is requested to release all memory.
Used when Spillable is requested to spill and releaseMemory.
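The threshold behaviour described above can be sketched as a small state machine. This is a minimal illustration, not Spark's implementation: ThresholdTracker and the acquire callback are hypothetical, and both the amount requested (enough to double the current usage) and the use of >= in the comparisons are assumptions made for the example.

```scala
object MemoryThresholdSketch {
  // `acquire` is a hypothetical callback standing in for the memory manager:
  // it takes the number of bytes requested and returns the bytes granted.
  final class ThresholdTracker(initialThreshold: Long, acquire: Long => Long) {
    private var threshold: Long = initialThreshold

    def currentThreshold: Long = threshold

    // Returns true when the collection should spill to disk.
    def shouldSpill(currentMemory: Long): Boolean =
      if (currentMemory >= threshold) {
        // Assumed policy: ask for enough memory to double the current usage.
        val requested = 2 * currentMemory - threshold
        threshold += acquire(requested)
        // Still at or over the (possibly raised) threshold: too little was granted.
        currentMemory >= threshold
      } else {
        false
      }

    // Releasing all memory puts the threshold back to its initial value.
    def release(): Unit = { threshold = initialThreshold }
  }
}
```

With a callback that grants every request, the threshold keeps growing and nothing spills. With a callback that grants nothing, the first collection to reach the initial threshold spills.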
Creating Instance¶
Spillable takes the following to be created:
Abstract Class
Spillable is an abstract class and cannot be created directly. It is created indirectly for the concrete Spillables.
Configuration Properties¶
spark.shuffle.spill.numElementsForceSpillThreshold¶
Spillable uses the spark.shuffle.spill.numElementsForceSpillThreshold configuration property to force spilling in-memory objects to disk when requested to maybeSpill.
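One way to picture an element-count trigger is a counter that overrides the memory-based decision once enough elements have been read. The sketch below is an assumption-laden illustration rather than Spark's code: ElementCounter, addElement, and onSpilled are hypothetical names, and the strict > comparison and the counter reset after a spill are guesses made for the example.

```scala
object ForceSpillByCountSketch {
  // `forceSpillThreshold` plays the role of the configured element-count limit.
  final class ElementCounter(forceSpillThreshold: Long) {
    private var elementsRead: Long = 0L

    // Called once per element inserted into the in-memory collection.
    def addElement(): Unit = { elementsRead += 1 }

    // `memorySaysSpill` is the decision already reached from the memory threshold.
    // The element count can only turn a "no" into a "yes", never the reverse.
    def shouldSpill(memorySaysSpill: Boolean): Boolean =
      memorySaysSpill || elementsRead > forceSpillThreshold

    // Assumed: the counter starts over once the collection has been spilled.
    def onSpilled(): Unit = { elementsRead = 0L }
  }
}
```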
spark.shuffle.spill.initialMemoryThreshold¶
Spillable uses the spark.shuffle.spill.initialMemoryThreshold configuration property as the initial threshold for the size of a collection (and the minimum memory required to operate properly).
Spillable uses it when requested to spill and releaseMemory.
Releasing All Memory¶
releaseMemory(): Unit
releaseMemory ...FIXME
releaseMemory is used when:
ExternalAppendOnlyMap is requested to freeCurrentMap
ExternalSorter is requested to stop
Spillable is requested to maybeSpill and spill (and spilled to disk in either case)
Spilling In-Memory Collection to Disk (to Release Memory)¶
spill(
collection: C): Unit
spill spills the given in-memory collection to disk to release memory.
spill is used when:
ExternalAppendOnlyMap is requested to forceSpill
Spillable is requested to maybeSpill
forceSpill¶
forceSpill(): Boolean
forceSpill forcefully spills the Spillable to disk to release memory.
forceSpill is used when Spillable is requested to spill an in-memory collection to disk.
Spilling to Disk if Necessary¶
maybeSpill(
collection: C,
currentMemory: Long): Boolean
maybeSpill ...FIXME
maybeSpill is used when:
ExternalAppendOnlyMap is requested to insertAll
ExternalSorter is requested to attempt to spill an in-memory collection to disk if needed