Serializer¶
Serializer is an abstraction of serializers that serialize and deserialize tasks (closures) and data blocks in a Spark application.
Contract¶
Creating New SerializerInstance¶
newInstance(): SerializerInstance
Creates a new SerializerInstance
Used when:
- Task is created (only used in tests)
- SerializerSupport (Spark SQL) utility is used to newSerializer
- RangePartitioner is requested to writeObject and readObject
- TorrentBroadcast utility is used to blockifyObject and unBlockifyObject
- TaskRunner is requested to run
- NettyBlockRpcServer is requested to deserializeMetadata
- NettyBlockTransferService is requested to uploadBlock
- PairRDDFunctions is requested to...FIXME
- ParallelCollectionPartition is requested to...FIXME
- RDD is requested to...FIXME
- ReliableCheckpointRDD utility is used to...FIXME
- NettyRpcEnvFactory is requested to create a RpcEnv
- DAGScheduler is created
- others
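As an illustration of the newInstance contract, the following minimal sketch round-trips a value through a freshly created SerializerInstance. It assumes Spark is on the classpath and uses JavaSerializer with a default SparkConf; the object name NewInstanceDemo and the helper roundTrip are hypothetical names introduced only for this example.

```scala
import java.nio.ByteBuffer

import org.apache.spark.SparkConf
import org.apache.spark.serializer.{JavaSerializer, Serializer, SerializerInstance}

// Hypothetical demo object, not part of Spark.
object NewInstanceDemo {

  // Serializes a value and reads it back using a new SerializerInstance.
  def roundTrip(serializer: Serializer, value: String): String = {
    // A SerializerInstance is meant for use by one thread at a time,
    // so a new one is requested here instead of sharing a cached instance.
    val instance: SerializerInstance = serializer.newInstance()
    val bytes: ByteBuffer = instance.serialize(value)
    instance.deserialize[String](bytes)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: JavaSerializer with default configuration.
    val serializer: Serializer = new JavaSerializer(new SparkConf())
    println(roundTrip(serializer, "hello"))
  }
}
```

Requesting a new instance per use mirrors how the callers listed above obtain one whenever they need to serialize or deserialize something.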
Implementations¶
- JavaSerializer
- KryoSerializer
- UnsafeRowSerializer (Spark SQL)
Accessing Serializer¶
Serializer is available using SparkEnv as the closureSerializer and serializer.
closureSerializer¶
SparkEnv.get.closureSerializer
serializer¶
SparkEnv.get.serializer
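The two accessors above can be tried out with a small local application. This is a sketch under stated assumptions: it starts a SparkContext in local mode (SparkEnv.get is only populated once a driver or executor environment exists), and the object name SparkEnvSerializers, the helper serializerClassNames and the application name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

// Hypothetical demo object, not part of Spark.
object SparkEnvSerializers {

  // Returns the class names of (serializer, closureSerializer) for a local app.
  def serializerClassNames(): (String, String) = {
    val conf = new SparkConf()
      .setMaster("local[1]")          // assumption: local mode for the demo
      .setAppName("serializer-demo")  // hypothetical application name
    val sc = new SparkContext(conf)
    try {
      // SparkEnv.get returns the environment of the running SparkContext.
      val env = SparkEnv.get
      (env.serializer.getClass.getName, env.closureSerializer.getClass.getName)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (data, closure) = serializerClassNames()
    println(s"serializer:        $data")
    println(s"closureSerializer: $closure")
  }
}
```

The data serializer is selected with the spark.serializer configuration property, so the first name printed depends on how the application is configured.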
Serialized Objects Relocation Requirements¶
supportsRelocationOfSerializedObjects: Boolean
supportsRelocationOfSerializedObjects is disabled (false) by default.
supportsRelocationOfSerializedObjects is used when:
- BlockStoreShuffleReader is requested to fetchContinuousBlocksInBatch
- SortShuffleManager is requested to create a ShuffleHandle for a given ShuffleDependency (and checks out SerializedShuffleHandle requirements)
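The flag can be inspected for a given serializer with a sketch like the one below. One caveat is an assumption of this example: in the Spark sources the method is declared private[spark], so the code is placed in a hypothetical package under org.apache.spark purely to make it compile. This is for exploration only and is not a pattern for application code. The names RelocationCheck and supportsRelocation are likewise hypothetical.

```scala
// Hypothetical package: chosen only because the method is private[spark]
// and therefore not reachable from ordinary user packages.
package org.apache.spark.demo

import org.apache.spark.SparkConf
import org.apache.spark.serializer.{JavaSerializer, KryoSerializer, Serializer}

// Hypothetical demo object, not part of Spark.
object RelocationCheck {

  // Reads the relocation flag of the given serializer.
  def supportsRelocation(serializer: Serializer): Boolean =
    serializer.supportsRelocationOfSerializedObjects

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    // JavaSerializer keeps the default value of the flag.
    println(s"JavaSerializer: ${supportsRelocation(new JavaSerializer(conf))}")
    // KryoSerializer overrides the flag; its value depends on Kryo settings.
    println(s"KryoSerializer: ${supportsRelocation(new KryoSerializer(conf))}")
  }
}
```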