CompressionCodecs
CompressionCodecs is a utility that sets Hadoop compression-related configuration properties when the CSV, JSON and Text file formats are requested to prepare a write.
Compression Codecs
| Alias | Class Name |
|---|---|
| none | |
| uncompressed | |
| bzip2 | org.apache.hadoop.io.compress.BZip2Codec |
| deflate | org.apache.hadoop.io.compress.DeflateCodec |
| gzip | org.apache.hadoop.io.compress.GzipCodec |
| lz4 | org.apache.hadoop.io.compress.Lz4Codec |
| snappy | org.apache.hadoop.io.compress.SnappyCodec |
getCodecClassName
getCodecClassName(
name: String): String
getCodecClassName looks up a codec by name in the known codecs and makes sure that it's available on the classpath.
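The lookup described above can be sketched as follows. This is a minimal illustration, not Spark's actual source: the object name `CompressionCodecsSketch` is hypothetical, and three behaviours are assumptions: aliases are matched case-insensitively, a name that is not a known alias is treated as a fully-qualified class name, and the empty table cells for none and uncompressed are represented as `null`.

```scala
import java.util.Locale

// Hypothetical stand-in for the utility; names and details are illustrative only.
object CompressionCodecsSketch {
  // Alias table copied from the section above.
  // Assumption: "none" and "uncompressed" map to no codec at all (null).
  private val shortNames: Map[String, String] = Map(
    "none" -> null,
    "uncompressed" -> null,
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "deflate" -> "org.apache.hadoop.io.compress.DeflateCodec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  def getCodecClassName(name: String): String = {
    // Assumption: an unknown alias is taken to be a fully-qualified class name.
    val codecName = shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)
    try {
      // Make sure the codec class can be loaded from the classpath.
      if (codecName != null) Class.forName(codecName)
      codecName
    } catch {
      case _: ClassNotFoundException =>
        throw new IllegalArgumentException(
          s"Codec [$codecName] is not available. " +
            s"Known codecs are ${shortNames.keys.mkString(", ")}.")
    }
  }
}
```

With this sketch, `getCodecClassName("gzip")` returns the Hadoop class name only when the Hadoop libraries are on the classpath; otherwise it fails with an `IllegalArgumentException`, which is the availability check the description refers to.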
getCodecClassName is used when:
- CSVOptions is requested for compressionCodec
- JSONOptions is requested for compressionCodec
- TextOptions is requested for compressionCodec
setCodecConfiguration
setCodecConfiguration(
conf: Configuration,
codec: String): Unit
setCodecConfiguration sets mapreduce compression-related configuration properties in the given Configuration (Apache Hadoop) (based on whether codec is defined or not).
| codec | Configuration Property | Value |
|---|---|---|
| defined | mapreduce.output.fileoutputformat.compress | true |
| defined | mapreduce.output.fileoutputformat.compress.type | BLOCK |
| defined | mapreduce.output.fileoutputformat.compress.codec | codec |
| defined | mapreduce.map.output.compress | true |
| defined | mapreduce.map.output.compress.codec | codec |
| undefined | mapreduce.output.fileoutputformat.compress | false |
| undefined | mapreduce.map.output.compress | false |
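The table can be read as a simple two-branch procedure. The sketch below mirrors it under stated assumptions: a `scala.collection.mutable.Map` stands in for the Hadoop `Configuration` so that the example runs without Hadoop on the classpath, "defined" is assumed to mean a non-null `codec`, and the object name `CodecConfigurationSketch` is hypothetical.

```scala
import scala.collection.mutable

// Hypothetical stand-in; a mutable Map plays the role of a Hadoop Configuration.
object CodecConfigurationSketch {
  def setCodecConfiguration(conf: mutable.Map[String, String], codec: String): Unit = {
    if (codec != null) {
      // Codec defined: enable compression for both final and map outputs.
      conf("mapreduce.output.fileoutputformat.compress") = "true"
      conf("mapreduce.output.fileoutputformat.compress.type") = "BLOCK"
      conf("mapreduce.output.fileoutputformat.compress.codec") = codec
      conf("mapreduce.map.output.compress") = "true"
      conf("mapreduce.map.output.compress.codec") = codec
    } else {
      // Codec undefined: switch compression off explicitly.
      conf("mapreduce.output.fileoutputformat.compress") = "false"
      conf("mapreduce.map.output.compress") = "false"
    }
  }
}
```

With a real Hadoop `Configuration`, each assignment would correspond to a `conf.set(key, value)` call.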
setCodecConfiguration is used when:
- CSVFileFormat is requested to prepareWrite (based on compression or codec options)
- JsonFileFormat is requested to prepareWrite (based on compression option)
- TextFileFormat is requested to prepareWrite (based on compression option)
- CSVWrite is requested to prepareWrite (based on compression option)
- JsonWrite is requested to prepareWrite (based on compression option)
- TextWrite is requested to prepareWrite (based on compression option)