ShuffleWriteProcessor
ShuffleWriteProcessor controls write behavior in ShuffleMapTasks while writing partition records out to the shuffle system.
ShuffleWriteProcessor is used to create a ShuffleDependency.
Creating Instance
ShuffleWriteProcessor takes no arguments to be created.
ShuffleWriteProcessor is created when:
* ShuffleDependency is created
* ShuffleExchangeExec (Spark SQL) physical operator is requested to createShuffleWriteProcessor
Writing Partition Records to Shuffle System
write(
rdd: RDD[_],
dep: ShuffleDependency[_, _, _],
mapId: Long,
context: TaskContext,
partition: Partition): MapStatus
write requests the ShuffleManager for the ShuffleWriter for the ShuffleHandle (of the given ShuffleDependency).
write requests the ShuffleWriter to write out records (of the given Partition and RDD).
In the end, write requests the ShuffleWriter to stop (with the success flag enabled) and returns the resulting MapStatus.
If an exception is thrown, write requests the ShuffleWriter to stop (with the success flag disabled) and rethrows the exception.
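The control flow described above can be sketched in plain Scala. This is a minimal illustration, not Spark's code: MapStatus, ShuffleWriter and RecordingWriter below are simplified stand-ins defined only for this example, and the ShuffleManager lookup is left out, so the writer is passed in directly.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object WriteFlowSketch {
  // Stand-in for MapStatus: only remembers which map task produced it.
  final case class MapStatus(mapId: Long)

  // Stand-in for ShuffleWriter, reduced to the two calls described above.
  trait ShuffleWriter[K, V] {
    def write(records: Iterator[(K, V)]): Unit
    def stop(success: Boolean): Option[MapStatus]
  }

  // Hypothetical in-memory writer that records how it was stopped.
  final class RecordingWriter(mapId: Long) extends ShuffleWriter[String, Int] {
    val written = scala.collection.mutable.ArrayBuffer.empty[(String, Int)]
    var stoppedWith: Option[Boolean] = None

    def write(records: Iterator[(String, Int)]): Unit = written ++= records

    def stop(success: Boolean): Option[MapStatus] = {
      stoppedWith = Some(success)
      if (success) Some(MapStatus(mapId)) else None
    }
  }

  // Write the records, then stop with success = true on the happy path;
  // on any exception, stop with success = false and rethrow.
  def write[K, V](writer: ShuffleWriter[K, V], records: Iterator[(K, V)]): MapStatus =
    try {
      writer.write(records)
      writer.stop(success = true).get
    } catch {
      case e: Exception =>
        // A failure while stopping must not hide the original error.
        try writer.stop(success = false)
        catch { case _: Exception => () }
        throw e
    }
}
```

Calling `WriteFlowSketch.write(new WriteFlowSketch.RecordingWriter(7L), Iterator("a" -> 1))` exercises the happy path. Passing an iterator that throws while being consumed exercises the failure path, after which the writer's `stoppedWith` field holds `Some(false)`.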
write is used when ShuffleMapTask is requested to run.
Creating MetricsReporter
createMetricsReporter(
context: TaskContext): ShuffleWriteMetricsReporter
createMetricsReporter creates a ShuffleWriteMetricsReporter from the given TaskContext.
createMetricsReporter requests the given TaskContext for TaskMetrics and then for the ShuffleWriteMetrics.
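The lookup chain can be modelled as follows. This is only a sketch under stated assumptions: every type below is a simplified stand-in defined for this example (the reporter is reduced to a single counter), and CountingProcessor is a hypothetical subclass showing how a custom processor might wrap the default reporter.

```scala
import java.util.concurrent.atomic.AtomicLong

// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object MetricsReporterSketch {
  // Stand-in for ShuffleWriteMetricsReporter, reduced to one counter.
  trait ShuffleWriteMetricsReporter {
    def incRecordsWritten(n: Long): Unit
  }

  // Stand-in for the task-level ShuffleWriteMetrics.
  final class ShuffleWriteMetrics extends ShuffleWriteMetricsReporter {
    private var _recordsWritten = 0L
    def recordsWritten: Long = _recordsWritten
    def incRecordsWritten(n: Long): Unit = _recordsWritten += n
  }

  final class TaskMetrics { val shuffleWriteMetrics = new ShuffleWriteMetrics }
  final class TaskContext { val taskMetrics = new TaskMetrics }

  class ShuffleWriteProcessor {
    // Default behaviour described above:
    // TaskContext -> TaskMetrics -> ShuffleWriteMetrics.
    def createMetricsReporter(context: TaskContext): ShuffleWriteMetricsReporter =
      context.taskMetrics.shuffleWriteMetrics
  }

  // Hypothetical subclass: forwards to the default reporter and also
  // updates a second, caller-supplied counter.
  final class CountingProcessor(extra: AtomicLong) extends ShuffleWriteProcessor {
    override def createMetricsReporter(context: TaskContext): ShuffleWriteMetricsReporter = {
      val underlying = super.createMetricsReporter(context)
      new ShuffleWriteMetricsReporter {
        def incRecordsWritten(n: Long): Unit = {
          underlying.incRecordsWritten(n)
          extra.addAndGet(n)
          ()
        }
      }
    }
  }
}
```

Because the reporter is created per TaskContext, anything written through it lands in that task's own metrics. A subclass that overrides createMetricsReporter can add bookkeeping without changing how records are written.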