
VacuumTableCommand

VacuumTableCommand is a runnable command (Spark SQL) for the VACUUM SQL command.

Creating Instance

VacuumTableCommand takes the following to be created:

  • Path
  • TableIdentifier
  • Optional Horizon Hours
  • dryRun flag

VacuumTableCommand requires that either the table or the path is defined and that it points to the root directory of a delta table. Partition directories are not supported.
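
As a rough illustration of how these four inputs relate to a VACUUM statement, the sketch below assembles the SQL text from them, following the VACUUM ... [RETAIN num HOURS] [DRY RUN] syntax. VacuumSqlSketch and vacuumSql are hypothetical names that are not part of Delta Lake. Preferring the path when both inputs are given, and the exception raised when neither is, are assumptions of this sketch.

```scala
// Hypothetical helper (not part of Delta Lake): builds a VACUUM statement
// from the same four inputs VacuumTableCommand is created with.
object VacuumSqlSketch {
  def vacuumSql(
      path: Option[String],
      table: Option[String],
      horizonHours: Option[Double],
      dryRun: Boolean): String = {
    // Either the path or the table has to be defined.
    // Assumption of this sketch: the path wins when both are given.
    val target = (path, table) match {
      case (Some(p), _)    => s"'$p'"
      case (None, Some(t)) => t
      case (None, None) =>
        throw new IllegalArgumentException("Either a path or a table is required")
    }
    // Optional retention threshold and dry-run marker
    val retain = horizonHours.map(h => s" RETAIN $h HOURS").getOrElse("")
    val dry = if (dryRun) " DRY RUN" else ""
    s"VACUUM $target$retain$dry"
  }
}
```

In a Delta-enabled Spark session, the resulting string could be passed to spark.sql, e.g. spark.sql(VacuumSqlSketch.vacuumSql(Some("/tmp/delta/events"), None, Some(168.0), dryRun = true)). The path /tmp/delta/events is a made-up example.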

VacuumTableCommand is created when:

Executing Command

run(
  sparkSession: SparkSession): Seq[Row]

run is part of the RunnableCommand (Spark SQL) abstraction.

run takes the path to vacuum (either the table or the path) and finds the root directory of the delta table.

run creates a DeltaLog instance for the delta table and runs garbage collection (gc) on it, passing in the DeltaLog instance, the dryRun flag and the horizonHours option.

run throws an AnalysisException when executed for a non-root directory of a delta table:

Please provide the base path ([baseDeltaPath]) when Vacuuming Delta tables. Vacuuming specific partitions is currently not supported.

run throws an AnalysisException when executed for a DeltaLog with the snapshot version being -1:

[deltaTableIdentifier] is not a Delta table. VACUUM is only supported for Delta tables.
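
The order of checks described above can be modeled in plain Scala without Spark. This is only a sketch under stated assumptions: TableInfo, SketchAnalysisException, sampleLookup and the directory names are hypothetical stand-ins (for the DeltaLog lookup and for Spark's AnalysisException), the table is assumed to be already resolved to a location, and the actual garbage collection is replaced by returning the root directory.

```scala
object VacuumRunSketch {
  // Hypothetical stand-in for what a DeltaLog lookup reports about a directory
  final case class TableInfo(rootDir: String, snapshotVersion: Long)

  // Hypothetical stand-in for Spark's AnalysisException
  final class SketchAnalysisException(message: String) extends Exception(message)

  // lookup is a hypothetical function that maps a directory to the delta table
  // rooted at or above it. tableLocation is the table's already-resolved location.
  def run(
      path: Option[String],
      tableLocation: Option[String],
      lookup: String => TableInfo): String = {
    // 1. Take the path to vacuum from either the path or the table
    val pathToVacuum = path.orElse(tableLocation).getOrElse(
      throw new SketchAnalysisException("Either a path or a table is required"))
    // 2. Find the root directory of the delta table
    val info = lookup(pathToVacuum)
    // 3. Reject non-root (e.g. partition) directories
    if (info.rootDir != pathToVacuum)
      throw new SketchAnalysisException(
        s"Please provide the base path (${info.rootDir}) when Vacuuming Delta tables. " +
          "Vacuuming specific partitions is currently not supported.")
    // 4. Reject directories whose snapshot version is -1
    if (info.snapshotVersion == -1L)
      throw new SketchAnalysisException(
        s"$pathToVacuum is not a Delta table. VACUUM is only supported for Delta tables.")
    // 5. The real command would garbage-collect here; the sketch returns the root
    info.rootDir
  }

  // Made-up lookup: one delta table at /data/events, nothing elsewhere
  val sampleLookup: String => TableInfo = dir =>
    if (dir.startsWith("/data/events")) TableInfo("/data/events", 3L)
    else TableInfo(dir, -1L)
}
```

With sampleLookup, VacuumRunSketch.run(Some("/data/events"), None, VacuumRunSketch.sampleLookup) returns the root directory, while a partition directory such as /data/events/date=2024-01-01 or an unrelated directory such as /data/other ends in an exception.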

Output Schema

The output schema of VacuumTableCommand is a single path column (of type StringType).