MemoryStore manages blocks of data in memory for BlockManager.
MemoryStore is available using BlockManager.memoryStore reference to other Spark services.
import org.apache.spark.SparkEnv SparkEnv.get.blockManager.memoryStore
releaseUnrollMemoryForThisTask( memoryMode: MemoryMode, memory: Long = Long.MaxValue): Unit
releaseUnrollMemoryForThisTask is used when:
putIteratorAsBytes[T]( blockId: BlockId, values: Iterator[T], classTag: ClassTag[T], memoryMode: MemoryMode): Either[PartiallySerializedBlock[T], Long]
putIteratorAsBytes is used when BlockManager is requested to doPutIterator.
remove( blockId: BlockId): Boolean
Block [blockId] of size [size] dropped from memory (free [memory])
putBytes[T: ClassTag]( blockId: BlockId, size: Long, memoryMode: MemoryMode, _bytes: () => ChunkedByteBuffer): Boolean
Internally, putBytes first makes sure that
blockId block has not been registered already in entries internal registry.
putBytes then requests
size memory for the
blockId block in a given
memoryMode from the current
If successful, putBytes "materializes"
_bytes byte buffer and makes sure that the size is exactly
size. It then registers a
SerializedMemoryEntry (for the bytes and
blockId in the internal entries registry.
You should see the following INFO message in the logs:
Block [blockId] stored as bytes in memory (estimated size [size], free [bytes])
true only after
blockId was successfully registered in the internal entries registry.
contains( blockId: BlockId): Boolean
contains is positive (
true) when the entries internal registry contains
contains is used when…FIXME
putIteratorAsValues[T]( blockId: BlockId, values: Iterator[T], classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long]
putIteratorAsValues makes sure that the
BlockId does not exist or throws an
requirement failed: Block [blockId] is already present in the MemoryStore
putIteratorAsValues tries to put the
blockId block in memory store as
reserveUnrollMemoryForThisTask( blockId: BlockId, memory: Long, memoryMode: MemoryMode): Boolean
logUnrollFailureMessage( blockId: BlockId, finalVectorSize: Long): Unit
logUnrollFailureMessage is used when MemoryStore is requested to putIterator.
logMemoryUsage is used when MemoryStore is requested to logUnrollFailureMessage.
ALL logging level for
org.apache.spark.storage.memory.MemoryStore logger to see what happens inside.
Add the following line to
Refer to Logging.
entries: LinkedHashMap[BlockId, MemoryEntry[_]]
entries uses access-order ordering mode where the order of iteration is the order in which the entries were last accessed (from least-recently accessed to most-recently). That gives LRU cache behaviour when MemoryStore is requested to evict blocks.