CompressionCodec

CompressionCodec is an IO compression codec.

With spark.broadcast.compress enabled (which is the default), TorrentBroadcast uses compression for broadcast blocks.

FIXME What’s compressed?

Table 1. Built-in Compression Codecs
Codec Alias | Fully-Qualified Class Name | Notes
lz4 | org.apache.spark.io.LZ4CompressionCodec | The default implementation
lzf | org.apache.spark.io.LZFCompressionCodec |
snappy | org.apache.spark.io.SnappyCompressionCodec | The fallback when the default codec is not available.

An implementation of the CompressionCodec trait has to offer a constructor that accepts a single SparkConf argument. See the createCodec factory method section in this document.

You can control the default compression codec in a Spark application using the spark.io.compression.codec Spark property.
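
As an illustration, the property can be set on a SparkConf before the application starts. This is a minimal sketch that assumes spark-core is on the classpath (for example in spark-shell); the CodecConfigExample object and the choice of snappy are made up for the example.

```scala
import org.apache.spark.SparkConf

object CodecConfigExample {
  // loadDefaults = false keeps the example independent of spark.* system properties.
  val conf: SparkConf = new SparkConf(false)
    .set("spark.io.compression.codec", "snappy") // a codec alias from Table 1

  def main(args: Array[String]): Unit =
    println(conf.get("spark.io.compression.codec"))
}
```

The same property can also be passed on the command line, e.g. spark-submit --conf spark.io.compression.codec=snappy.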

Creating CompressionCodec — createCodec Factory Method

createCodec(conf: SparkConf): CompressionCodec
createCodec(conf: SparkConf, codecName: String): CompressionCodec

createCodec looks up the input codecName in the internal shortCompressionCodecNames lookup table (ignoring case) to resolve the class name of the codec.

createCodec then finds the constructor of the codec implementation that accepts a single SparkConf argument and uses it to create the codec.

If the compression codec cannot be found, createCodec throws an IllegalArgumentException:

Codec [<codecName>] is not available. Consider setting spark.io.compression.codec=snappy
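
The lookup described above can be sketched as follows. This is not Spark's own source: CreateCodecSketch and its aliases map are hypothetical names that mirror Table 1. The sketch assumes that a name missing from the lookup table is treated as a fully-qualified class name, and that spark-core is on the classpath.

```scala
import java.util.Locale
import scala.util.control.NonFatal

import org.apache.spark.SparkConf
import org.apache.spark.io.CompressionCodec

object CreateCodecSketch {
  // Hypothetical stand-in for the internal shortCompressionCodecNames table.
  private val aliases: Map[String, String] = Map(
    "lz4" -> "org.apache.spark.io.LZ4CompressionCodec",
    "lzf" -> "org.apache.spark.io.LZFCompressionCodec",
    "snappy" -> "org.apache.spark.io.SnappyCompressionCodec")

  def createCodec(conf: SparkConf, codecName: String): CompressionCodec = {
    // Case-insensitive alias lookup; assumption: any other name is a class name.
    val className = aliases.getOrElse(codecName.toLowerCase(Locale.ROOT), codecName)
    try {
      // Find the single-argument SparkConf constructor and invoke it.
      Class.forName(className)
        .getConstructor(classOf[SparkConf])
        .newInstance(conf)
        .asInstanceOf[CompressionCodec]
    } catch {
      case NonFatal(_) =>
        throw new IllegalArgumentException(
          s"Codec [$codecName] is not available. " +
            "Consider setting spark.io.compression.codec=snappy")
    }
  }
}
```

With this sketch, a call such as CreateCodecSketch.createCodec(new SparkConf(false), "LZ4") would resolve the lz4 alias regardless of case, while an unknown name such as "no-such-codec" would end in the IllegalArgumentException shown above.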

getCodecName Method

getCodecName(conf: SparkConf): String

getCodecName reads the spark.io.compression.codec Spark property from the input SparkConf and falls back to lz4 when the property is not set.
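
A minimal sketch of that behavior, assuming spark-core is on the classpath. The codecNameOf helper is a hypothetical name for this example and relies only on SparkConf.get with a default value; it is not the Spark method itself.

```scala
import org.apache.spark.SparkConf

object GetCodecNameSketch {
  // Hypothetical helper: read the property, falling back to lz4 when it is unset.
  def codecNameOf(conf: SparkConf): String =
    conf.get("spark.io.compression.codec", "lz4")

  def main(args: Array[String]): Unit = {
    val unset = new SparkConf(false) // no spark.* defaults loaded
    val custom = new SparkConf(false).set("spark.io.compression.codec", "lzf")
    println(codecNameOf(unset))  // lz4
    println(codecNameOf(custom)) // lzf
  }
}
```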

supportsConcatenationOfSerializedStreams Method

supportsConcatenationOfSerializedStreams(
  codec: CompressionCodec): Boolean

supportsConcatenationOfSerializedStreams…​FIXME

supportsConcatenationOfSerializedStreams is used when…​FIXME

Settings

Table 2. Settings
Name | Default value | Description
spark.io.compression.codec | lz4 | The compression codec to use. Used when getCodecName is called to find the current compression codec.