DeltaJobStatisticsTracker

DeltaJobStatisticsTracker is a WriteJobStatsTracker (Spark SQL) that collects per-file statistics (when the spark.databricks.delta.stats.collect configuration property is enabled).

Creating Instance

DeltaJobStatisticsTracker takes the following to be created:

DeltaJobStatisticsTracker is created when:

Recorded Per-File Statistics

recordedStats: Map[String, String]

recordedStats is a lookup table of the recorded per-file statistics. It is populated when processStats processes the per-job write task statistics.

recordedStats is used when:

Processing Per-Job Write Task Statistics

processStats(
  stats: Seq[WriteTaskStats],
  jobCommitTime: Long): Unit

processStats extracts a DeltaFileStatistics from each of the given WriteTaskStats to access the collected per-file statistics, and records them as recordedStats.

processStats is part of the WriteJobStatsTracker (Spark SQL) abstraction.
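
The flow described above can be illustrated with a minimal sketch. It is not the Delta Lake implementation: WriteTaskStats and DeltaFileStatistics below are hypothetical stand-ins so that the code compiles without Spark or Delta Lake on the classpath, TrackerSketch is an invented name, and the shape of the per-task statistics (a map from a file to its statistics string) is an assumption.

```scala
object ProcessStatsSketch {
  // Hypothetical stand-ins for the Spark SQL and Delta Lake types
  // (not the real classes).
  trait WriteTaskStats
  final case class DeltaFileStatistics(stats: Map[String, String]) extends WriteTaskStats

  // Hypothetical tracker that mirrors only the two members discussed above.
  final class TrackerSketch {
    // Assumption: keyed by file, with the statistics as an opaque string.
    var recordedStats: Map[String, String] = Map.empty

    // jobCommitTime is accepted to match the signature but unused in this sketch.
    def processStats(stats: Seq[WriteTaskStats], jobCommitTime: Long): Unit = {
      recordedStats = stats
        // Keep only Delta-specific task statistics (the sketch ignores other kinds).
        .collect { case s: DeltaFileStatistics => s.stats }
        // Merge the per-task maps into one per-job map.
        .flatten
        .toMap
    }
  }
}
```

With two tasks that each report statistics for one file, recordedStats ends up with two entries, one per file. If two tasks reported the same key, the later entry would win in this sketch.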