BlockTransferService

BlockTransferService is the base for ShuffleClients that can fetch and upload blocks of data synchronously or asynchronously.

BlockTransferService is a networking service available by the name of a host and a port.

package org.apache.spark.network

abstract class BlockTransferService extends ShuffleClient {
  // only required methods that have no implementation
  // the others follow
  def init(blockDataManager: BlockDataManager): Unit
  def close(): Unit
  def port: Int
  def hostName: String
  def fetchBlocks(
    host: String,
    port: Int,
    execId: String,
    blockIds: Array[String],
    listener: BlockFetchingListener,
    tempFileManager: TempFileManager): Unit
  def uploadBlock(
    hostname: String,
    port: Int,
    execId: String,
    blockId: BlockId,
    blockData: ManagedBuffer,
    level: StorageLevel,
    classTag: ClassTag[_]): Future[Unit]
}
Table 1. (Subset of) BlockTransferService Contract
Method Description

init

Used when BlockManager is requested to initialize

close

Used when…​FIXME

port

Used when…​FIXME

hostName

Used when…​FIXME

fetchBlocks

Fetches a sequence of blocks from a remote node asynchronously

Used exclusively when BlockTransferService is requested to fetch only one block (in a blocking fashion)

fetchBlocks is part of ShuffleClient Contract to…​FIXME.

uploadBlock

Used exclusively when BlockTransferService is requested to upload a single block to a remote node (in a blocking fashion).

NettyBlockTransferService is the one and only known implementation of BlockTransferService Contract.
BlockTransferService was introduced in SPARK-3019 Pluggable block transfer interface (BlockTransferService) and is available since Spark 1.2.0.

fetchBlockSync Method

fetchBlockSync(
  host: String,
  port: Int,
  execId: String,
  blockId: String,
  tempFileManager: TempFileManager): ManagedBuffer

fetchBlockSync…​FIXME

Synchronous (and hence blocking) fetchBlockSync to fetch one block blockId (that corresponds to the ShuffleClient parent’s asynchronous fetchBlocks).

fetchBlockSync is a mere wrapper around fetchBlocks to fetch one blockId block that waits until the fetch finishes.

fetchBlockSync is used when…​FIXME

Uploading Single Block to Remote Node (Blocking Fashion) — uploadBlockSync Method

uploadBlockSync(
  hostname: String,
  port: Int,
  execId: String,
  blockId: BlockId,
  blockData: ManagedBuffer,
  level: StorageLevel,
  classTag: ClassTag[_]): Unit

uploadBlockSync…​FIXME

uploadBlockSync is a mere blocking wrapper around uploadBlock that waits until the upload finishes.

uploadBlockSync is used exclusively when BlockManager is requested to replicate (when a replication level is greater than 1).