BasicWriteJobStatsTracker

BasicWriteJobStatsTracker is a WriteJobStatsTracker.

Creating Instance

BasicWriteJobStatsTracker takes the following to be created:

  • Serializable Hadoop Configuration (Hadoop)
  • Driver-side metrics (Map[String, SQLMetric])
  • Task commit time SQLMetric

BasicWriteJobStatsTracker is created when:

Creating WriteTaskStatsTracker

WriteJobStatsTracker
newTaskInstance(): WriteTaskStatsTracker

newTaskInstance is part of the WriteJobStatsTracker abstraction.

newTaskInstance creates a new BasicWriteTaskStatsTracker (with the serializable Hadoop Configuration and the task commit time SQLMetric, taskCommitTimeMetric).
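
The relationship between the job-level tracker and the per-task trackers it creates can be illustrated with a minimal sketch in plain Scala. The names TaskStats, TaskTracker and JobTracker are hypothetical, simplified stand-ins and not Spark's actual classes (the real WriteTaskStatsTracker callbacks take file paths and rows, which are left out here). The sketch only assumes that one fresh task tracker is created per write task.

```scala
object TrackerLifecycleSketch {
  // Hypothetical, simplified stand-in for the statistics one write task reports.
  final case class TaskStats(numFiles: Int, numRows: Long)

  // Task-side tracker: mutable state that lives for a single write task.
  final class TaskTracker {
    private var files = 0
    private var rows = 0L
    def newFile(): Unit = files += 1
    def newRow(): Unit = rows += 1
    def getFinalStats(): TaskStats = TaskStats(files, rows)
  }

  // Job-side tracker: acts as a factory of fresh task trackers.
  final class JobTracker extends Serializable {
    def newTaskInstance(): TaskTracker = new TaskTracker
  }

  // Simulates `numTasks` write tasks; task i writes one file with i * 10 rows.
  def runTasks(numTasks: Int): Seq[TaskStats] = {
    val jobTracker = new JobTracker
    (1 to numTasks).map { taskId =>
      val taskTracker = jobTracker.newTaskInstance() // one tracker per task
      taskTracker.newFile()
      (1 to taskId * 10).foreach(_ => taskTracker.newRow())
      taskTracker.getFinalStats()
    }
  }

  def main(args: Array[String]): Unit =
    runTasks(2).foreach(println)
}
```

Because every task gets its own tracker, no mutable state is shared between tasks; only the final per-task statistics travel back to the job-level tracker.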

Processing Write Job Statistics

WriteJobStatsTracker
processStats(
  stats: Seq[WriteTaskStats],
  jobCommitTime: Long): Unit

processStats is part of the WriteJobStatsTracker abstraction.

processStats aggregates the given WriteTaskStats (expected to be BasicWriteTaskStats instances) and uses the totals to set the following driverSideMetrics:

  • jobCommitTime
  • numFiles
  • numOutputBytes
  • numOutputRows
  • numParts
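
The aggregation behind the metrics listed above can be sketched in plain Scala. TaskStats is a hypothetical stand-in for BasicWriteTaskStats (partitions are modelled as strings for simplicity), and the result is returned as a plain Map rather than written to SQLMetric instances. The sketch assumes that numParts counts the distinct partitions seen across all tasks, while the other task-level values are summed.

```scala
object ProcessStatsSketch {
  // Hypothetical stand-in for BasicWriteTaskStats; partitions are plain strings here.
  final case class TaskStats(
      partitions: Seq[String],
      numFiles: Int,
      numBytes: Long,
      numRows: Long)

  // Folds per-task statistics into the five driver-side metric values.
  def aggregate(stats: Seq[TaskStats], jobCommitTime: Long): Map[String, Long] = {
    // The same partition may be written by several tasks, so count it once.
    val distinctPartitions = stats.flatMap(_.partitions).toSet
    Map(
      "jobCommitTime" -> jobCommitTime,
      "numFiles" -> stats.map(_.numFiles.toLong).sum,
      "numOutputBytes" -> stats.map(_.numBytes).sum,
      "numOutputRows" -> stats.map(_.numRows).sum,
      "numParts" -> distinctPartitions.size.toLong
    )
  }

  def main(args: Array[String]): Unit = {
    val stats = Seq(
      TaskStats(Seq("date=2024-01-01"), numFiles = 1, numBytes = 100L, numRows = 10L),
      TaskStats(Seq("date=2024-01-01", "date=2024-01-02"), numFiles = 2, numBytes = 250L, numRows = 25L)
    )
    aggregate(stats, jobCommitTime = 42L).foreach(println)
  }
}
```

In this sample input two tasks touch the same partition, which is why a set is used for the partition count instead of a sum.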

processStats requests the active SparkContext for the value of the spark.sql.execution.id local property (the ID of the current SQL execution).

In the end, processStats posts the updates of the driverSideMetrics for that execution ID.