HadoopWriteConfigUtil

HadoopWriteConfigUtil[K, V] is an abstraction of writer configurers.

HadoopWriteConfigUtil is used by the SparkHadoopWriter utility when it is requested to write an RDD of key-value pairs (for the RDD.saveAsNewAPIHadoopDataset and RDD.saveAsHadoopDataset actions).

Table 1. HadoopWriteConfigUtil Contract
Method Description

assertConf

assertConf(
  jobContext: JobContext,
  conf: SparkConf): Unit

closeWriter

closeWriter(
  taskContext: TaskAttemptContext): Unit

createCommitter

createCommitter(
  jobId: Int): HadoopMapReduceCommitProtocol

createJobContext

createJobContext(
  jobTrackerId: String,
  jobId: Int): JobContext

createTaskAttemptContext

createTaskAttemptContext(
  jobTrackerId: String,
  jobId: Int,
  splitId: Int,
  taskAttemptId: Int): TaskAttemptContext

Creates a Hadoop TaskAttemptContext

initOutputFormat

initOutputFormat(
  jobContext: JobContext): Unit

initWriter

initWriter(
  taskContext: TaskAttemptContext,
  splitId: Int): Unit

write

write(
  pair: (K, V)): Unit

Writes out the key-value pair

Used when SparkHadoopWriter is requested to executeTask (while writing out key-value pairs of a partition)
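
The contract methods are meant to be called in a particular order: job-level setup first, then a writer lifecycle per partition. The following is a minimal, dependency-free sketch of that order, not Spark's actual code. It assumes the sequence createJobContext, initOutputFormat, assertConf at the job level, followed by createTaskAttemptContext, initWriter, write (once per pair) and closeWriter for every partition. DemoJobContext, DemoTaskAttemptContext, DemoWriteConfigUtil, InMemoryWriteConfigUtil and LifecycleDemo are hypothetical names that stand in for the Hadoop and Spark types. createCommitter and the SparkConf argument of assertConf are left out so the example runs without Spark or Hadoop on the classpath.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for Hadoop's JobContext and TaskAttemptContext.
final case class DemoJobContext(jobTrackerId: String, jobId: Int)
final case class DemoTaskAttemptContext(
    jobTrackerId: String, jobId: Int, splitId: Int, taskAttemptId: Int)

// Trimmed-down model of the contract: same method names, simplified types.
// createCommitter and the SparkConf parameter of assertConf are omitted (assumption).
trait DemoWriteConfigUtil[K, V] {
  def createJobContext(jobTrackerId: String, jobId: Int): DemoJobContext
  def createTaskAttemptContext(
      jobTrackerId: String, jobId: Int, splitId: Int, taskAttemptId: Int): DemoTaskAttemptContext
  def initOutputFormat(jobContext: DemoJobContext): Unit
  def assertConf(jobContext: DemoJobContext): Unit
  def initWriter(taskContext: DemoTaskAttemptContext, splitId: Int): Unit
  def write(pair: (K, V)): Unit
  def closeWriter(taskContext: DemoTaskAttemptContext): Unit
}

// In-memory implementation that records the calls and the "written" lines.
final class InMemoryWriteConfigUtil[K, V] extends DemoWriteConfigUtil[K, V] {
  val calls = ArrayBuffer.empty[String]
  val written = ArrayBuffer.empty[String]
  private var formatReady = false
  private var writerOpen = false

  def createJobContext(jobTrackerId: String, jobId: Int): DemoJobContext = {
    calls += "createJobContext"
    DemoJobContext(jobTrackerId, jobId)
  }
  def createTaskAttemptContext(
      jobTrackerId: String, jobId: Int, splitId: Int, taskAttemptId: Int): DemoTaskAttemptContext = {
    calls += "createTaskAttemptContext"
    DemoTaskAttemptContext(jobTrackerId, jobId, splitId, taskAttemptId)
  }
  def initOutputFormat(jobContext: DemoJobContext): Unit = {
    calls += "initOutputFormat"
    formatReady = true
  }
  def assertConf(jobContext: DemoJobContext): Unit = {
    calls += "assertConf"
    require(formatReady, "output format not initialised")
  }
  def initWriter(taskContext: DemoTaskAttemptContext, splitId: Int): Unit = {
    calls += "initWriter"
    writerOpen = true
  }
  def write(pair: (K, V)): Unit = {
    require(writerOpen, "writer not initialised")
    written += s"${pair._1}\t${pair._2}"
  }
  def closeWriter(taskContext: DemoTaskAttemptContext): Unit = {
    calls += "closeWriter"
    writerOpen = false
  }
}

object LifecycleDemo {
  // Job-level setup, then one writer lifecycle per partition (assumed order).
  def run[K, V](config: DemoWriteConfigUtil[K, V], partitions: Seq[Seq[(K, V)]]): Unit = {
    val jobTrackerId = "demo-tracker"
    val jobId = 0
    val jobContext = config.createJobContext(jobTrackerId, jobId)
    config.initOutputFormat(jobContext)
    config.assertConf(jobContext)

    partitions.zipWithIndex.foreach { case (pairs, splitId) =>
      val taskContext = config.createTaskAttemptContext(jobTrackerId, jobId, splitId, 0)
      config.initWriter(taskContext, splitId)
      // Close the writer even if writing a pair fails.
      try pairs.foreach(config.write)
      finally config.closeWriter(taskContext)
    }
  }
}
```

Calling LifecycleDemo.run(new InMemoryWriteConfigUtil[Int, String], Seq(Seq(1 -> "one"), Seq(2 -> "two"))) runs the job-level setup once and the writer lifecycle twice. In Spark itself the per-partition part runs inside tasks on executors and is wrapped by the commit protocol returned from createCommitter, which this sketch does not model.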

Table 2. HadoopWriteConfigUtils
HadoopWriteConfigUtil Description

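HadoopMapReduceWriteConfigUtil

Writer configurer for the new Hadoop API (org.apache.hadoop.mapreduce), used for RDD.saveAsNewAPIHadoopDataset

HadoopMapRedWriteConfigUtil

Writer configurer for the old Hadoop API (org.apache.hadoop.mapred), used for RDD.saveAsHadoopDataset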