BasicWriteJobStatsTracker¶
BasicWriteJobStatsTracker is a WriteJobStatsTracker.
Creating Instance¶
BasicWriteJobStatsTracker takes the following to be created:
- Serializable Hadoop Configuration (Hadoop)
- Driver-side metrics (Map[String, SQLMetric])
- Task commit time SQLMetric
BasicWriteJobStatsTracker is created when:
- DataWritingCommand is requested for a BasicWriteJobStatsTracker
- FileWrite is requested to createWriteJobDescription
- FileStreamSink (Spark Structured Streaming) is requested for a BasicWriteJobStatsTracker
Creating WriteTaskStatsTracker¶
WriteJobStatsTracker
newTaskInstance(): WriteTaskStatsTracker
newTaskInstance is part of the WriteJobStatsTracker abstraction.
newTaskInstance creates a new BasicWriteTaskStatsTracker (with the serializable Hadoop Configuration and the taskCommitTimeMetric).
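The factory relationship between the job-level tracker and its per-task trackers can be sketched without Spark. The following is a simplified model, not the Spark implementation: `JobTrackerSketch`, `TaskTrackerSketch`, `TaskStatsSketch`, and the `commitTimeSink` callback (standing in for the task commit time metric) are hypothetical names introduced only for illustration.

```scala
// Simplified, Spark-free model. All names ending in "Sketch" are hypothetical.
final case class TaskStatsSketch(numFiles: Int, numRows: Long)

// Per-task tracker: mutable counters that live for a single write task.
final class TaskTrackerSketch(commitTimeSink: Long => Unit) {
  private var numFiles = 0
  private var numRows = 0L

  def newFile(): Unit = numFiles += 1
  def newRow(): Unit = numRows += 1

  // Reports the task commit time to the shared sink and returns the task totals.
  def getFinalStats(taskCommitTimeMs: Long): TaskStatsSketch = {
    commitTimeSink(taskCommitTimeMs)
    TaskStatsSketch(numFiles, numRows)
  }
}

// Job-level tracker: holds only what every task needs and acts as a factory.
final class JobTrackerSketch(commitTimeSink: Long => Unit) {
  // Every call returns a fresh tracker, so tasks never share counters.
  def newTaskInstance(): TaskTrackerSketch = new TaskTrackerSketch(commitTimeSink)
}
```

In this model each task gets its own counters, while the commit-time sink is the one piece of state handed from the job-level tracker to every task tracker, which mirrors the role described for taskCommitTimeMetric above.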
Processing Write Job Statistics¶
WriteJobStatsTracker
processStats(
  stats: Seq[WriteTaskStats],
  jobCommitTime: Long): Unit
processStats is part of the WriteJobStatsTracker abstraction.
processStats uses the given BasicWriteTaskStats instances to set the following driverSideMetrics:
- jobCommitTime
- numFiles
- numOutputBytes
- numOutputRows
- numParts
processStats requests the active SparkContext for the spark.sql.execution.id local property.
In the end, processStats posts the metric updates.
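The aggregation step of processStats can be illustrated with a small Spark-free sketch. It is a model under stated assumptions, not the Spark source: `TaskStatsModel` is a hypothetical stand-in for BasicWriteTaskStats (with partition values simplified to strings), `ProcessStatsSketch.aggregate` is a hypothetical helper, and the sketch assumes that numParts counts distinct partitions across all tasks. Looking up the execution ID and posting the metric updates are left out.

```scala
import scala.collection.mutable

// Hypothetical stand-in for BasicWriteTaskStats (partition values simplified to strings).
final case class TaskStatsModel(
    partitions: Seq[String],
    numFiles: Int,
    numBytes: Long,
    numRows: Long)

object ProcessStatsSketch {
  // Folds per-task statistics into job-level totals keyed by the metric names listed above.
  def aggregate(stats: Seq[TaskStatsModel], jobCommitTime: Long): Map[String, Long] = {
    val partitions = mutable.HashSet.empty[String]
    var numFiles = 0L
    var numBytes = 0L
    var numRows = 0L
    stats.foreach { s =>
      partitions ++= s.partitions // a partition written by several tasks is counted once
      numFiles += s.numFiles
      numBytes += s.numBytes
      numRows += s.numRows
    }
    Map(
      "jobCommitTime" -> jobCommitTime,
      "numFiles" -> numFiles,
      "numOutputBytes" -> numBytes,
      "numOutputRows" -> numRows,
      "numParts" -> partitions.size.toLong)
  }
}
```

Calling `ProcessStatsSketch.aggregate` with the statistics of two tasks that both wrote to the same partition yields a numParts of 1, while numFiles, numOutputBytes, and numOutputRows are plain sums over the tasks.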