
ColumnStat

Creating Instance

ColumnStat takes the following to be created:

  • Distinct Count (number of distinct values)
  • Minimum Value
  • Maximum Value
  • Null Count (number of null values)
  • Average length of the values (for fixed-length types, this should be a constant)
  • Maximum length of the values (for fixed-length types, this should be a constant)
  • Histogram
  • Version (default: 2)
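
As an illustration of the arguments listed above, here is a minimal sketch of creating a ColumnStat by hand. It assumes a Spark 3.x classpath, where ColumnStat lives in org.apache.spark.sql.catalyst.plans.logical and every statistic is an optional constructor parameter. The column and its values (integers from 1 to 100 with 5 nulls) are hypothetical.

```scala
import org.apache.spark.sql.catalyst.plans.logical.ColumnStat

// Hypothetical statistics for an integer column (values 1..100, 5 nulls).
// Assumption: Spark 3.x, where each statistic is wrapped in an Option.
val stat = ColumnStat(
  distinctCount = Some(BigInt(100)),
  min = Some(1),
  max = Some(100),
  nullCount = Some(BigInt(5)),
  avgLen = Some(4L), // IntegerType is fixed-length (4 bytes), so this is a constant
  maxLen = Some(4L)) // same constant for a fixed-length type

// histogram and version are left at their defaults
println(stat.histogram)
println(stat.version)
```

Named arguments keep the sketch readable and let the unspecified statistics fall back to their defaults.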

ColumnStat is created when:

  • CatalogColumnStat is requested to toPlanStat
  • Range logical operator is requested to computeStats
  • EstimationUtils is requested to nullColumnStat
  • JoinEstimation is requested to computeByNdv and computeByHistogram
  • UnionEstimation is requested to computeMinMaxStats and computeNullCountStats
  • CommandUtils is requested to rowToColumnStat

Converting to CatalogColumnStat

toCatalogColumnStat(
  colName: String,
  dataType: DataType): CatalogColumnStat

toCatalogColumnStat converts this ColumnStat to a CatalogColumnStat.
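
The conversion can be sketched as follows, assuming a Spark 3.x classpath. The column name "id" and the statistics are hypothetical. The sketch also converts back with CatalogColumnStat.toPlanStat, the reverse direction mentioned earlier on this page.

```scala
import org.apache.spark.sql.catalyst.plans.logical.ColumnStat
import org.apache.spark.sql.types.IntegerType

// Hypothetical statistics for an integer column named "id"
val stat = ColumnStat(
  distinctCount = Some(BigInt(100)),
  min = Some(1),
  max = Some(100),
  nullCount = Some(BigInt(0)),
  avgLen = Some(4L),
  maxLen = Some(4L))

// Convert to the catalog representation; the data type is needed
// to serialize the min and max values for storage in the catalog
val catalogStat = stat.toCatalogColumnStat("id", IntegerType)
println(catalogStat)

// Convert back to a plan-level ColumnStat (CatalogColumnStat.toPlanStat)
val roundTripped = catalogStat.toPlanStat("id", IntegerType)
println(roundTripped)
```

The same column name and data type should be passed in both directions, since they drive how the minimum and maximum values are written and read back.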


toCatalogColumnStat is used when:

  • PruneHiveTablePartitions logical optimization is requested to updateTableMeta
  • AnalyzeColumnCommand logical command is requested to analyzeColumnInCatalog
  • PruneFileSourcePartitions logical optimization is executed