
OptimizeTableCommandBase

OptimizeTableCommandBase is a (marker) extension of the DeltaCommand abstraction for optimize commands.

OptimizeTableCommandBase is a RunnableCommand (Spark SQL).

Implementations

Output Attributes

output: Seq[Attribute]

output is part of the Command (Spark SQL) abstraction.

Name     DataType
path     StringType
metrics  OptimizeMetrics

Validating zOrderBy Columns

validateZorderByColumns(
  spark: SparkSession,
  deltaLog: DeltaLog,
  unresolvedZOrderByCols: Seq[UnresolvedAttribute]): Unit

Note

Since validateZorderByColumns returns Unit (no value to work with), its only observable effect is to throw an exception when the zOrderBy columns of an OPTIMIZE command are not as expected.

validateZorderByColumns does nothing (and returns immediately) when no unresolvedZOrderByCols columns are specified.

validateZorderByColumns makes sure that every unresolvedZOrderByCols column meets the following requirements (or throws a DeltaIllegalArgumentException or a DeltaAnalysisException):

  1. It is part of the data schema
  2. Column statistics are available for the column (when spark.databricks.delta.optimize.zorder.checkStatsCollection.enabled is enabled)
  3. It is not a partition column (as Z-Ordering can only be performed on data columns)
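
The three requirements above can be sketched in plain Scala. This is a simplified model and not the Delta Lake implementation: TableInfo, its fields, the checkStatsCollection parameter (standing in for the configuration property) and the use of IllegalArgumentException (instead of DeltaIllegalArgumentException or DeltaAnalysisException) are all assumptions made for illustration, and so is the order of the checks.

```scala
// A simplified, hypothetical model of the validation described above.
// It uses plain column names instead of UnresolvedAttribute and DeltaLog.
object ZOrderValidationSketch {

  // Stand-in (assumption) for the table metadata the real method reads from a DeltaLog.
  final case class TableInfo(
      schemaColumns: Set[String],    // all column names of the table
      partitionColumns: Set[String], // columns the table is partitioned by
      statsColumns: Set[String])     // columns with statistics collected

  def validateZorderByColumns(
      table: TableInfo,
      zOrderByCols: Seq[String],
      checkStatsCollection: Boolean = true): Unit = {
    // An empty zOrderByCols makes the loop a no-op, so nothing is validated.
    zOrderByCols.foreach { col =>
      // 1. The column has to be part of the schema.
      if (!table.schemaColumns.contains(col)) {
        throw new IllegalArgumentException(s"Column $col does not exist in the schema")
      }
      // 2. Statistics have to be available (only when the check is enabled).
      if (checkStatsCollection && !table.statsColumns.contains(col)) {
        throw new IllegalArgumentException(s"No statistics are collected for column $col")
      }
      // 3. Z-Ordering can only be performed on data columns.
      if (table.partitionColumns.contains(col)) {
        throw new IllegalArgumentException(s"Column $col is a partition column")
      }
    }
  }
}
```

With a table described by TableInfo(Set("id", "date", "payload"), Set("date"), Set("id")), calling validateZorderByColumns with Seq("id") returns normally, while Seq("date") and Seq("missing") each end with an exception.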

validateZorderByColumns is used when: