ShuffleWriteProcessor¶
ShuffleWriteProcessor controls write behavior in ShuffleMapTasks while writing partition records out to the shuffle system.
ShuffleWriteProcessor is used to create a ShuffleDependency.
Creating Instance¶
ShuffleWriteProcessor takes no arguments to be created.
ShuffleWriteProcessor is created when:
ShuffleDependency is created
ShuffleExchangeExec (Spark SQL) physical operator is requested to createShuffleWriteProcessor
Writing Partition Records to Shuffle System¶
write(
  rdd: RDD[_],
  dep: ShuffleDependency[_, _, _],
  mapId: Long,
  context: TaskContext,
  partition: Partition): MapStatus
write requests the ShuffleManager for the ShuffleWriter for the ShuffleHandle (of the given ShuffleDependency).
write requests the ShuffleWriter to write out records (of the given Partition and RDD).
In the end, write requests the ShuffleWriter to stop (with the success flag enabled).
In case of any Exceptions, write requests the ShuffleWriter to stop (with the success flag disabled).
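The control flow described above can be sketched in a few lines. This is a minimal illustration only, not Spark's implementation: MapStatus, ShuffleWriter, WriteFlowSketch and RecordingWriter below are simplified, hypothetical stand-ins defined in the snippet itself, and the step that obtains the writer from the ShuffleManager is reduced to a plain function argument. The sketch also assumes the exception is rethrown after the writer is stopped.

```scala
// Hypothetical stand-ins for illustration; these are NOT Spark's real classes.
final case class MapStatus(mapId: Long)

trait ShuffleWriter[K, V] {
  def write(records: Iterator[Product2[K, V]]): Unit
  def stop(success: Boolean): Option[MapStatus]
}

object WriteFlowSketch {
  // Obtain a writer, write the records, stop with success = true.
  // On any Exception, stop with success = false and rethrow (assumption).
  def write[K, V](
      getWriter: () => ShuffleWriter[K, V],
      records: Iterator[Product2[K, V]]): MapStatus = {
    var writer: ShuffleWriter[K, V] = null
    try {
      writer = getWriter()
      writer.write(records)
      writer.stop(success = true).get
    } catch {
      case e: Exception =>
        // The writer may be null if obtaining it failed.
        if (writer != null) writer.stop(success = false)
        throw e
    }
  }
}

// A toy writer that records how it was used, so the flow can be observed.
final class RecordingWriter(mapId: Long, failOnWrite: Boolean)
    extends ShuffleWriter[String, Int] {
  var written: Int = 0
  var stoppedWith: Option[Boolean] = None

  def write(records: Iterator[Product2[String, Int]]): Unit = {
    if (failOnWrite) throw new RuntimeException("simulated write failure")
    written = records.size
  }

  def stop(success: Boolean): Option[MapStatus] = {
    stoppedWith = Some(success)
    if (success) Some(MapStatus(mapId)) else None
  }
}
```

Calling WriteFlowSketch.write with a RecordingWriter created with failOnWrite = true leaves stoppedWith set to Some(false) and propagates the exception, which matches the failure path described above.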
write is used when ShuffleMapTask is requested to run.
Creating MetricsReporter¶
createMetricsReporter(
  context: TaskContext): ShuffleWriteMetricsReporter
createMetricsReporter creates a ShuffleWriteMetricsReporter from the given TaskContext.
createMetricsReporter requests the given TaskContext for TaskMetrics and then for the ShuffleWriteMetrics.
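The lookup chain (TaskContext, then TaskMetrics, then ShuffleWriteMetrics) can be illustrated with a small self-contained sketch. All types below are simplified, hypothetical stand-ins that only borrow Spark's names; ProcessorSketch and CountingProcessor are invented for this example. The second class shows, as an assumption about how such a hook might be used, a subclass overriding createMetricsReporter to wrap the default reporter.

```scala
// Hypothetical stand-ins for illustration; these are NOT Spark's real classes.
trait ShuffleWriteMetricsReporter {
  def incRecordsWritten(n: Long): Unit
  def incBytesWritten(n: Long): Unit
}

final class ShuffleWriteMetrics extends ShuffleWriteMetricsReporter {
  private var _records = 0L
  private var _bytes = 0L
  def recordsWritten: Long = _records
  def bytesWritten: Long = _bytes
  def incRecordsWritten(n: Long): Unit = _records += n
  def incBytesWritten(n: Long): Unit = _bytes += n
}

final class TaskMetrics {
  val shuffleWriteMetrics = new ShuffleWriteMetrics
}

final class TaskContext {
  private val metrics = new TaskMetrics
  def taskMetrics(): TaskMetrics = metrics
}

class ProcessorSketch {
  // Default behavior: TaskContext -> TaskMetrics -> ShuffleWriteMetrics.
  def createMetricsReporter(context: TaskContext): ShuffleWriteMetricsReporter =
    context.taskMetrics().shuffleWriteMetrics
}

// Assumed customization: wrap the default reporter and keep an extra counter.
final class CountingProcessor extends ProcessorSketch {
  var extraRecords = 0L

  override def createMetricsReporter(context: TaskContext): ShuffleWriteMetricsReporter = {
    val underlying = super.createMetricsReporter(context)
    new ShuffleWriteMetricsReporter {
      def incRecordsWritten(n: Long): Unit = {
        extraRecords += n
        underlying.incRecordsWritten(n)
      }
      def incBytesWritten(n: Long): Unit = underlying.incBytesWritten(n)
    }
  }
}
```

Because the default reporter is the task's own ShuffleWriteMetrics object, anything reported through it is visible via the TaskContext afterwards; the wrapping variant preserves that while also updating its own counter.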