BasicWriteJobStatsTracker¶
BasicWriteJobStatsTracker is a WriteJobStatsTracker.
Creating Instance¶
BasicWriteJobStatsTracker takes the following to be created:
- Serializable Hadoop Configuration (Hadoop)
- Driver-side metrics (Map[String, SQLMetric])
- Task commit time SQLMetric
BasicWriteJobStatsTracker is created when:
- DataWritingCommand is requested for a BasicWriteJobStatsTracker
- FileWrite is requested to createWriteJobDescription
- FileStreamSink (Spark Structured Streaming) is requested for a BasicWriteJobStatsTracker
Creating WriteTaskStatsTracker¶
WriteJobStatsTracker
newTaskInstance(): WriteTaskStatsTracker
newTaskInstance is part of the WriteJobStatsTracker abstraction.
newTaskInstance creates a new BasicWriteTaskStatsTracker (with the serializable Hadoop Configuration and the taskCommitTimeMetric).
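The relationship between the job-level tracker and the task-level trackers it creates can be illustrated with a Spark-free sketch. All names below (TaskTrackerSketch, SketchJobTracker, SketchTaskTracker, SketchTaskStats) are hypothetical stand-ins, not Spark classes, and the task tracker is deliberately simplified: it takes no Hadoop Configuration or task commit time metric and counts only files and rows.

```scala
object TaskTrackerSketch {
  // Hypothetical stand-in for the statistics a single task reports back.
  final case class SketchTaskStats(numFiles: Int, numRows: Long)

  // Hypothetical stand-in for a task-level tracker: mutable state owned by one task.
  final class SketchTaskTracker {
    private var files = 0
    private var rows = 0L

    def newFile(): Unit = files += 1
    def newRow(): Unit = rows += 1

    // Snapshot of what this task has written so far.
    def getFinalStats(): SketchTaskStats = SketchTaskStats(files, rows)
  }

  // Hypothetical stand-in for a job-level tracker: hands out a fresh task tracker per call.
  final class SketchJobTracker {
    def newTaskInstance(): SketchTaskTracker = new SketchTaskTracker
  }
}
```

Because every call to newTaskInstance returns a new object, counters updated by one task never leak into another; the job-level tracker only sees the final statistics each task reports.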
Processing Write Job Statistics¶
WriteJobStatsTracker
processStats(
stats: Seq[WriteTaskStats],
jobCommitTime: Long): Unit
processStats is part of the WriteJobStatsTracker abstraction.
processStats uses the given BasicWriteTaskStats instances to set the following driverSideMetrics:
- jobCommitTime
- numFiles
- numOutputBytes
- numOutputRows
- numParts
processStats requests the active SparkContext for the value of the spark.sql.execution.id local property.
In the end, processStats posts the metric updates.
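The aggregation step can be sketched without Spark as a plain fold over per-task statistics. This is only an illustration under stated assumptions: ProcessStatsSketch, SketchTaskStats and aggregate are hypothetical names, partitions are modeled as strings, the result is a plain Map rather than SQLMetric updates, and the sketch assumes that a partition written by several tasks counts once towards numParts.

```scala
object ProcessStatsSketch {
  // Hypothetical, simplified stand-in for the per-task write statistics
  // (partitions are plain strings here).
  final case class SketchTaskStats(
      partitions: Seq[String],
      numFiles: Int,
      numBytes: Long,
      numRows: Long)

  // Combines per-task statistics into one value per driver-side metric name.
  def aggregate(stats: Seq[SketchTaskStats], jobCommitTime: Long): Map[String, Long] = {
    // Assumption: a partition touched by several tasks is counted once.
    val distinctPartitions = stats.flatMap(_.partitions).toSet
    Map(
      "jobCommitTime" -> jobCommitTime,
      "numFiles" -> stats.map(_.numFiles.toLong).sum,
      "numOutputBytes" -> stats.map(_.numBytes).sum,
      "numOutputRows" -> stats.map(_.numRows).sum,
      "numParts" -> distinctPartitions.size.toLong
    )
  }
}
```

For two tasks that write 1 and 2 files over the partitions p=1 and p=1, p=2, the sketch reports numFiles as 3 and numParts as 2, since p=1 is shared by both tasks.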