OptimizeTableCommandBase

OptimizeTableCommandBase is a (marker) extension of the DeltaCommand abstraction for optimize commands.

OptimizeTableCommandBase is a RunnableCommand (Spark SQL).

Implementations

Output Attributes

output: Seq[Attribute]

output is part of the Command (Spark SQL) abstraction.

Name     DataType
path     StringType
metrics  OptimizeMetrics

Validating zOrderBy Columns

validateZorderByColumns(
  spark: SparkSession,
  deltaLog: DeltaLog,
  unresolvedZOrderByCols: Seq[UnresolvedAttribute]): Unit
Procedure

validateZorderByColumns is a procedure (returns Unit) so what happens inside stays inside (paraphrasing the former advertising slogan of Las Vegas, Nevada).

Its main purpose is to throw an exception when the Z-Order columns of an OPTIMIZE command are not as expected.

validateZorderByColumns does nothing (and returns) when there are no unresolvedZOrderByCols columns specified.

validateZorderByColumns checks every unresolvedZOrderByCols column against the following requirements, and throws a DeltaIllegalArgumentException or a DeltaAnalysisException when any of them is violated:

  1. It is part of data schema
  2. Column statistics are available for the column (when spark.databricks.delta.optimize.zorder.checkStatsCollection.enabled is enabled)
  3. It is not a partition column (as Z-Ordering can only be performed on data columns)
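
The three requirements above can be modelled with a minimal, Spark-free sketch. It is not the Delta Lake implementation: TableInfo, validate and the checkStats flag are hypothetical stand-ins for the table metadata, validateZorderByColumns and the checkStatsCollection configuration property, respectively. Column names are matched case-sensitively and failures are reported with a plain IllegalArgumentException, both of which are simplifications. The order of the checks is an assumption, too.

```scala
// A simplified model of the Z-Order column checks (all names are hypothetical).
object ZOrderValidationSketch {

  // Stand-in for the table metadata that the real method reads from a DeltaLog.
  final case class TableInfo(
      schemaColumns: Set[String],    // all the columns of the table
      partitionColumns: Set[String], // the partition columns
      statsColumns: Set[String])     // the columns with statistics collected

  def validate(
      table: TableInfo,
      zOrderByCols: Seq[String],
      checkStats: Boolean = true): Unit = {
    // Nothing to validate when no ZORDER BY columns are specified
    if (zOrderByCols.isEmpty) return

    zOrderByCols.foreach { col =>
      // Requirement 3: Z-Ordering can only be performed on data columns
      if (table.partitionColumns.contains(col))
        throw new IllegalArgumentException(s"$col is a partition column")
      // Requirement 1: the column has to be part of the schema
      if (!table.schemaColumns.contains(col))
        throw new IllegalArgumentException(s"$col is not part of the schema")
    }

    // Requirement 2: statistics are available (only when the check is enabled)
    val withoutStats = zOrderByCols.filterNot(table.statsColumns.contains)
    if (checkStats && withoutStats.nonEmpty)
      throw new IllegalArgumentException(
        s"No statistics collected for: ${withoutStats.mkString(", ")}")
  }
}
```

With a table of id, name and date columns that is partitioned by date and has statistics for id only, validate accepts Seq("id") and rejects Seq("date"), Seq("missing") and Seq("name"). With checkStats = false, Seq("name") passes.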

validateZorderByColumns is used when: