OPTIMIZE Command

OPTIMIZE command compacts files that are smaller than spark.databricks.delta.optimize.minFileSize into files of spark.databricks.delta.optimize.maxFileSize size.
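
The compaction rule above can be illustrated with a minimal bin-packing sketch. This is a simplified model, not Delta Lake's actual implementation: the function name plan_compaction, the file names, and the sizes are hypothetical, and the two size parameters only stand in for the minFileSize and maxFileSize properties.

```python
from typing import Dict, List


def plan_compaction(
    file_sizes: Dict[str, int],
    min_file_size: int,
    max_file_size: int,
) -> List[List[str]]:
    """Group small files into bins whose total size stays within max_file_size."""
    # Only files below the minimum size are candidates for compaction.
    candidates = sorted(
        (size, path) for path, size in file_sizes.items() if size < min_file_size
    )

    bins: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for size, path in candidates:
        # Close the current bin when adding this file would exceed the target size.
        if current and current_size + size > max_file_size:
            bins.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bins.append(current)

    # A bin with a single file gains nothing from being rewritten.
    return [b for b in bins if len(b) > 1]


# Hypothetical file sizes (arbitrary units); "d" is already large enough to be skipped.
files = {"a": 10, "b": 20, "c": 30, "d": 200, "e": 45}
groups = plan_compaction(files, min_file_size=50, max_file_size=60)
```

In this sketch each returned group would be rewritten as one larger file, while files that are already large, or that end up alone in a bin, are left untouched.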

OPTIMIZE command uses spark.databricks.delta.optimize.maxThreads threads for compaction.

OPTIMIZE command can be executed using the OPTIMIZE SQL command.
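
As a rough illustration of the SQL form, the sketch below builds an OPTIMIZE statement as a string. The helper optimize_statement, the table path, and the date column are hypothetical, and the optional WHERE clause is assumed to filter on a partition column.

```python
from typing import Optional


def optimize_statement(table: str, where: Optional[str] = None) -> str:
    """Build an OPTIMIZE statement for a table, optionally limited by a predicate."""
    statement = f"OPTIMIZE {table}"
    if where:
        statement += f" WHERE {where}"
    return statement


# Hypothetical path-based table and partition predicate.
stmt = optimize_statement("delta.`/tmp/delta/events`", where="date >= '2021-01-01'")

# With a SparkSession (assumed to be named spark) configured for Delta Lake:
# spark.sql(stmt)
```

The statement is only constructed here; running it requires a Spark session with Delta Lake enabled.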

Delta Lake Documentation

From Optimize performance with file management:

To improve query speed, Delta Lake on Databricks supports the ability to optimize the layout of data stored in cloud storage. Delta Lake on Databricks supports two layout algorithms: bin-packing and Z-Ordering.

Bin-packing is exactly what this OPTIMIZE command does.

Learn more in the Optimize Demo.