FileWriterFactory¶
FileWriterFactory is a DataWriterFactory of FileBatchWrites.
Creating Instance¶
FileWriterFactory takes the following to be created:
- WriteJobDescription
- FileCommitProtocol (Spark Core)
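
A minimal sketch of the factory's shape under these assumptions (the real class is internal to Spark SQL and not part of the public API; the argument names below simply follow the list above):

```scala
import org.apache.spark.internal.io.FileCommitProtocol
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.write.{DataWriter, DataWriterFactory}
import org.apache.spark.sql.execution.datasources.WriteJobDescription

// Sketch only: the two constructor arguments mirror the list above.
case class FileWriterFactory(
    description: WriteJobDescription,
    committer: FileCommitProtocol) extends DataWriterFactory {

  // Covered in "Creating DataWriter" below
  override def createWriter(
      partitionId: Int,
      realTaskId: Long): DataWriter[InternalRow] = ???
}
```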
FileWriterFactory is created when:
FileBatchWrite is requested for a DataWriterFactory
Creating DataWriter¶
```scala
createWriter(
  partitionId: Int,
  realTaskId: Long): DataWriter[InternalRow]
```
createWriter creates a Hadoop TaskAttemptContext (for the given partitionId).
createWriter requests the FileCommitProtocol to setupTask (with the TaskAttemptContext).
For a non-partitioned write job (i.e., no partition columns in the WriteJobDescription), createWriter creates a SingleDirectoryDataWriter. Otherwise, createWriter creates a DynamicPartitionDataSingleWriter.
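
A sketch of that dispatch, assuming the `description` and `committer` constructor arguments above and the `createTaskAttemptContext` helper described in the next section (the writer class names are those this page mentions; their constructor parameters are simplified assumptions):

```scala
override def createWriter(
    partitionId: Int,
    realTaskId: Long): DataWriter[InternalRow] = {
  // Hadoop task context for this partition (see "Creating Hadoop TaskAttemptContext")
  val taskAttemptContext = createTaskAttemptContext(partitionId)
  // Let the commit protocol prepare this task attempt
  committer.setupTask(taskAttemptContext)
  if (description.partitionColumns.isEmpty) {
    // Non-partitioned write job
    new SingleDirectoryDataWriter(description, taskAttemptContext, committer)
  } else {
    // Partitioned write job
    new DynamicPartitionDataSingleWriter(description, taskAttemptContext, committer)
  }
}
```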
createWriter is part of the DataWriterFactory abstraction.
Creating Hadoop TaskAttemptContext¶
```scala
createTaskAttemptContext(
  partitionId: Int): TaskAttemptContextImpl
```
createTaskAttemptContext creates a Hadoop JobID.
createTaskAttemptContext creates a Hadoop TaskID (for the JobID, the given partitionId and the TaskType.MAP task type).
createTaskAttemptContext creates a Hadoop TaskAttemptID (for the TaskID).
createTaskAttemptContext uses the Hadoop Configuration (from the WriteJobDescription) to set the following properties:
| Name | Value |
|---|---|
| mapreduce.job.id | the JobID |
| mapreduce.task.id | the TaskID |
| mapreduce.task.attempt.id | the TaskAttemptID |
| mapreduce.task.ismap | true |
| mapreduce.task.partition | 0 |
In the end, createTaskAttemptContext creates a new Hadoop TaskAttemptContextImpl (with the Configuration and the TaskAttemptID).
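
Putting the steps together, a hedged sketch (field and helper names such as `serializableHadoopConf` and the `jobTrackerID` string are assumptions; the actual implementation relies on Spark's internal Hadoop-writer utilities to build the JobID):

```scala
import org.apache.hadoop.mapreduce.{JobID, TaskAttemptID, TaskID, TaskType}
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

private def createTaskAttemptContext(partitionId: Int): TaskAttemptContextImpl = {
  val hadoopConf = description.serializableHadoopConf.value
  // Job, task and task-attempt identifiers for this partition
  val jobId = new JobID(jobTrackerID, partitionId)  // jobTrackerID: assumed identifier string
  val taskId = new TaskID(jobId, TaskType.MAP, partitionId)
  val taskAttemptId = new TaskAttemptID(taskId, 0)
  // Expose the identifiers to Hadoop output committers via the Configuration
  hadoopConf.set("mapreduce.job.id", jobId.toString)
  hadoopConf.set("mapreduce.task.id", taskAttemptId.getTaskID.toString)
  hadoopConf.set("mapreduce.task.attempt.id", taskAttemptId.toString)
  hadoopConf.setBoolean("mapreduce.task.ismap", true)
  hadoopConf.setInt("mapreduce.task.partition", 0)
  new TaskAttemptContextImpl(hadoopConf, taskAttemptId)
}
```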