CompressionCodec is the contract (a Scala trait) of IO compression codecs that Spark uses to compress internal data, e.g. broadcast blocks.

With spark.broadcast.compress enabled (which is the default), TorrentBroadcast uses compression for broadcast blocks.

FIXME What’s compressed?
Table 1. Built-in Compression Codecs

Codec Alias | Fully-Qualified Class Name | Notes
lz4 | | The default implementation.
| | The fallback when the default codec is not available.

An implementation of the CompressionCodec trait has to offer a constructor that accepts a single SparkConf argument, because createCodec creates codec instances through that constructor. See the createCodec factory method section in this document.

You can control the default compression codec in a Spark application using the Spark property listed in Table 2 (Settings).

Creating CompressionCodec — createCodec Factory Method

createCodec(conf: SparkConf): CompressionCodec
createCodec(conf: SparkConf, codecName: String): CompressionCodec

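The one-argument createCodec uses getCodecName to find the codec name and then calls the two-argument variant.

createCodec looks up the input codecName (case-insensitively) in the internal shortCompressionCodecNames lookup table to find the fully-qualified class name of the codec. A codecName that is not in the table is used as a fully-qualified class name as-is.

createCodec then finds the constructor of the codec implementation that accepts a single SparkConf argument and uses it to create the codec instance.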

If the compression codec could not be found, createCodec throws an IllegalArgumentException:

Codec [<codecName>] is not available. Consider setting
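The lookup-then-reflect flow of createCodec can be sketched without a Spark dependency. This is a minimal model under stated assumptions, not Spark's actual code: Conf, Codec, Lz4Codec, SnappyCodec and CodecFactory are hypothetical stand-ins for SparkConf, CompressionCodec and its implementations, and the exception message is shortened to the part shown above.

```scala
import java.util.Locale

// Hypothetical stand-in for SparkConf so the sketch runs without Spark.
final class Conf(settings: Map[String, String]) {
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}

// Hypothetical stand-in for the CompressionCodec trait.
trait Codec { def name: String }

// Every implementation offers a constructor that takes a single Conf argument.
class Lz4Codec(conf: Conf) extends Codec { val name = "lz4" }
class SnappyCodec(conf: Conf) extends Codec { val name = "snappy" }

object CodecFactory {
  // Alias -> fully-qualified class name (models shortCompressionCodecNames).
  val shortNames: Map[String, String] = Map(
    "lz4" -> classOf[Lz4Codec].getName,
    "snappy" -> classOf[SnappyCodec].getName)

  def createCodec(conf: Conf, codecName: String): Codec = {
    // Case-insensitive alias lookup; unknown names are tried as class names.
    val className = shortNames.getOrElse(codecName.toLowerCase(Locale.ROOT), codecName)
    try {
      // Find the single-argument constructor and instantiate the codec with it.
      val ctor = Class.forName(className).getConstructor(classOf[Conf])
      ctor.newInstance(conf).asInstanceOf[Codec]
    } catch {
      case _: ClassNotFoundException | _: NoSuchMethodException =>
        // Shortened message; only the part quoted in this document is used.
        throw new IllegalArgumentException(s"Codec [$codecName] is not available.")
    }
  }
}
```

In this model, CodecFactory.createCodec(new Conf(Map.empty), "LZ4") resolves the alias regardless of case, while an unknown name such as "nope" ends in the IllegalArgumentException.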

getCodecName Method

getCodecName(conf: SparkConf): String

getCodecName reads the compression codec Spark property (see Table 2) from the input SparkConf and falls back to lz4 when the property is not set.
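The read-or-default behaviour can be illustrated with a small model. It is a sketch only: a plain Map stands in for SparkConf, CodecName is a hypothetical object, and the property key spark.io.compression.codec is an assumption, since this document does not name the property.

```scala
object CodecName {
  // Assumed property key; this document does not spell it out.
  val ConfigKey = "spark.io.compression.codec"
  // Default codec alias, as stated in the text (lz4).
  val DefaultCodec = "lz4"

  // A Map models the key/value settings that a SparkConf would hold.
  def getCodecName(settings: Map[String, String]): String =
    settings.getOrElse(ConfigKey, DefaultCodec)
}
```

With an empty Map the model returns lz4; once the key is set, the configured value is returned instead.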

supportsConcatenationOfSerializedStreams Method

supportsConcatenationOfSerializedStreams(
  codec: CompressionCodec): Boolean

supportsConcatenationOfSerializedStreams is used when… FIXME


Table 2. Settings

Name | Default value | Description
| lz4 | The compression codec to use. Used when getCodecName is called to find the current compression codec.