DiskStore

DiskStore manages data blocks on disk for BlockManager.

DiskStore BlockManager
Figure 1. DiskStore and BlockManager

Creating Instance

DiskStore takes the following to be created:

getBytes Method

getBytes(
  blockId: BlockId): BlockData

getBytes…​FIXME

getBytes is used when BlockManager is requested to getLocalValues and doGetLocalBytes.

blockSizes Internal Registry

blockSizes: ConcurrentHashMap[BlockId, Long]

blockSizes is a Java java.util.concurrent.ConcurrentHashMap that DiskStore uses to track BlockIds by their size on disk.

Checking if Block File Exists

contains(
  blockId: BlockId): Boolean

contains requests the DiskBlockManager for the block file by (the name of) the input BlockId and check whether the file actually exists or not.

contains is used when:

Writing Block to Disk

put(
  blockId: BlockId)(
  writeFunc: WritableByteChannel => Unit): Unit

put prints out the following DEBUG message to the logs:

Attempting to put block [blockId]

put requests the DiskBlockManager for the block file for the input BlockId.

put opens the block file for writing (wrapped into a CountingWritableChannel to count the bytes written).

put executes the given writeFunc function with the WritableByteChannel of the block file and registers the bytes written to the blockSizes internal registry.

In the end, put prints out the following DEBUG message to the logs:

Block [fileName] stored as [size] file on disk in [time] ms

In case of any exception, put deletes the block file.

put throws an IllegalStateException when the BlockId is already is already present in the disk store:

Block [blockId] is already present in the disk store

put is used when:

putBytes Method

putBytes(
  blockId: BlockId,
  bytes: ChunkedByteBuffer): Unit

putBytes…​FIXME

putBytes is used when BlockManager is requested to doPutBytes and dropFromMemory.

Removing Block

remove(
  blockId: BlockId): Boolean

remove…​FIXME

remove is used when:

  • BlockManager is requested to removeBlockInternal

  • DiskStore is requested to put (when an exception was thrown)

Opening Block File For Writing

openForWrite(
  file: File): WritableByteChannel

openForWrite…​FIXME

openForWrite is used when DiskStore is requested to write a block to disk.

Logging

Enable ALL logging level for org.apache.spark.storage.DiskStore logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.storage.DiskStore=ALL

Refer to Logging.