Serializer¶
Serializer is an abstraction of serializers for serialization and deserialization of tasks (closures) and data blocks in a Spark application.
Contract¶
Creating New SerializerInstance¶
newInstance(): SerializerInstance
Creates a new SerializerInstance
Used when:
- Task is created (only used in tests)
- SerializerSupport (Spark SQL) utility is used to newSerializer
- RangePartitioner is requested to writeObject and readObject
- TorrentBroadcast utility is used to blockifyObject and unBlockifyObject
- TaskRunner is requested to run
- NettyBlockRpcServer is requested to deserializeMetadata
- NettyBlockTransferService is requested to uploadBlock
- PairRDDFunctions is requested to...FIXME
- ParallelCollectionPartition is requested to...FIXME
- RDD is requested to...FIXME
- ReliableCheckpointRDD utility is used to...FIXME
- NettyRpcEnvFactory is requested to create a RpcEnv
- DAGScheduler is created
- others
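The contract above boils down to a factory: a serializer hands out instances, and an instance does the actual byte-level work. The following is a minimal sketch of that shape using only the JDK, not Spark's own classes. SketchSerializer, SketchSerializerInstance, SketchJavaSerializer and the roundTrip helper are hypothetical names introduced for illustration, and the serialize and deserialize signatures are simplified assumptions.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for a serializer instance (not Spark's class).
trait SketchSerializerInstance {
  def serialize(value: AnyRef): ByteBuffer
  def deserialize(bytes: ByteBuffer): AnyRef
}

// Hypothetical stand-in for the contract: a single factory method.
trait SketchSerializer {
  def newInstance(): SketchSerializerInstance
}

// An implementation backed by plain Java serialization (an assumption of this sketch).
class SketchJavaSerializer extends SketchSerializer {
  override def newInstance(): SketchSerializerInstance = new SketchSerializerInstance {
    override def serialize(value: AnyRef): ByteBuffer = {
      val buffer = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(buffer)
      try out.writeObject(value) finally out.close()
      ByteBuffer.wrap(buffer.toByteArray)
    }

    override def deserialize(bytes: ByteBuffer): AnyRef = {
      // Copy the remaining bytes without moving the caller's buffer position.
      val array = new Array[Byte](bytes.remaining())
      bytes.duplicate().get(array)
      val in = new ObjectInputStream(new ByteArrayInputStream(array))
      try in.readObject() finally in.close()
    }
  }
}

object SerializerSketch {
  // Serialize a value with a fresh instance and read it back.
  def roundTrip(value: AnyRef): AnyRef = {
    val instance = new SketchJavaSerializer().newInstance()
    instance.deserialize(instance.serialize(value))
  }
}
```

Callers in this sketch never touch the serializer directly for byte-level work; they ask for a new instance first, which is the pattern the usages listed above follow.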
Implementations¶
- JavaSerializer
- KryoSerializer
- UnsafeRowSerializer (Spark SQL)
Accessing Serializer¶
Serializer is available using SparkEnv as the closureSerializer and serializer.
closureSerializer¶
SparkEnv.get.closureSerializer
serializer¶
SparkEnv.get.serializer
Serialized Objects Relocation Requirements¶
supportsRelocationOfSerializedObjects: Boolean
supportsRelocationOfSerializedObjects is disabled (false) by default.
supportsRelocationOfSerializedObjects is used when:
- BlockStoreShuffleReader is requested to fetchContinuousBlocksInBatch
- SortShuffleManager is requested to create a ShuffleHandle for a given ShuffleDependency (and checks out SerializedShuffleHandle requirements)
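Relocation here is understood as follows: reordering the bytes of serialized objects in the output should be equivalent to reordering the objects before serializing them. The sketch below illustrates that property with a deliberately simple, self-contained record encoding (length-prefixed strings written with the JDK's DataOutputStream). RelocationSketch and its methods are hypothetical names for illustration only and are not part of Spark.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object RelocationSketch {
  // Encode one record as a self-contained chunk: no header, no back-references.
  def encode(record: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    out.writeUTF(record)
    out.flush()
    buffer.toByteArray
  }

  // Decode a stream that is just the concatenation of encoded chunks.
  def decodeAll(bytes: Array[Byte]): List[String] = {
    val in = new DataInputStream(new ByteArrayInputStream(bytes))
    val records = List.newBuilder[String]
    while (in.available() > 0) records += in.readUTF()
    records.result()
  }

  // Reorder the serialized chunks (never the objects) and decode the result.
  def relocated(records: List[String], order: List[Int]): List[String] = {
    val chunks = records.map(encode)
    decodeAll(Array.concat(order.map(i => chunks(i)): _*))
  }
}
```

Because every chunk stands on its own, moving chunks around gives the same result as moving the records themselves. An encoding that shares state across records (a stream header or references to earlier objects, for example) would not have this property.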