SerializerManager

FIXME

When SparkEnv is created (either for the driver or executors), it instantiates SerializerManager that is then used to create a BlockManager.

The common idiom in Spark’s code is to access the current SerializerManager using SparkEnv.

SparkEnv.get.serializerManager
SerializerManager was introduced in SPARK-13926.

Creating SerializerManager Instance

FIXME

wrapStream Method

FIXME

dataDeserializeStream Method

FIXME

Automatic Selection of Best Serializer

FIXME

SerializerManager will automatically pick a Kryo serializer for ShuffledRDDs whose key, value, and/or combiner types are primitives, arrays of primitives, or strings.

Selecting "Best" Serializer — getSerializer Method

getSerializer(keyClassTag: ClassTag[_], valueClassTag: ClassTag[_]): Serializer

getSerializer selects the "best" Serializer given the input types for keys and values (in a RDD).

getSerializer returns KryoSerializer when the types of keys and values are compatible with Kryo or the default Serializer.

The default Serializer is defined when SerializerManager is created.
getSerializer is used when ShuffledRDD is requested for dependencies.

shouldCompress Internal Method

shouldCompress(blockId: BlockId): Boolean

shouldCompress…​FIXME

shouldCompress is used when…​FIXME

Settings

Table 1. Spark Properties
Name Default value Description

spark.shuffle.compress

true

The flag to control whether to compress shuffle output when stored

spark.rdd.compress

false

The flag to control whether to compress RDD partitions when stored serialized.

spark.shuffle.spill.compress

true

The flag to control whether to compress shuffle output temporarily spilled to disk.

spark.block.failures.beforeLocationRefresh

5

spark.io.encryption.enabled

false

The flag to enable IO encryption