VacuumTableCommand

VacuumTableCommand is a logical command (RunnableCommand) for the VACUUM SQL command.

Read up on RunnableCommand in The Internals of Spark SQL online book.

VacuumTableCommand is created exclusively when DeltaSqlAstBuilder is requested to parse the VACUUM SQL command.
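As a rough illustration of the statements that end up as a VacuumTableCommand, the following sketch renders VACUUM statement text for a table name or a quoted path. The VacuumSql object and its vacuumSql helper are hypothetical (they are not part of Delta Lake), and the RETAIN ... HOURS and DRY RUN clauses are assumed to follow the VACUUM syntax from the Delta Lake documentation.

```scala
// Hypothetical helper (not part of Delta Lake): renders VACUUM statement text.
object VacuumSql {
  def vacuumSql(
      target: String,                   // a table name or a quoted path
      retainHours: Option[Int] = None,  // rendered as RETAIN <n> HOURS when defined
      dryRun: Boolean = false           // rendered as DRY RUN when enabled
  ): String = {
    val retain = retainHours.map(h => s" RETAIN $h HOURS").getOrElse("")
    val dry = if (dryRun) " DRY RUN" else ""
    s"VACUUM $target$retain$dry"
  }
}
```

The resulting text could be handed to spark.sql in a session with the Delta SQL extensions enabled, which is when DeltaSqlAstBuilder would get to parse it.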

VacuumTableCommand requires that either the table or the path is defined and that it points to the root directory of a delta table. Partition directories are not supported.

The output of VacuumTableCommand is a single path column (of type StringType).

Creating VacuumTableCommand Instance

VacuumTableCommand takes the following to be created:

  • Path (optional)

  • TableIdentifier (optional)

  • horizonHours (optional)

  • dryRun flag
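The arguments above can be pictured with a small stand-alone model. This is only a sketch: TableId is a simplified stand-in for Spark's TableIdentifier, VacuumTableCommandSketch and targetDefined are hypothetical names, and the Option[Double] type of horizonHours is an assumption.

```scala
// Simplified stand-in for Spark's TableIdentifier (illustration only).
final case class TableId(table: String, database: Option[String] = None)

// Hypothetical mirror of the arguments listed above; not the actual class.
final case class VacuumTableCommandSketch(
    path: Option[String],          // path of the delta table (optional)
    table: Option[TableId],        // table identifier (optional)
    horizonHours: Option[Double],  // retention horizon in hours (type assumed)
    dryRun: Boolean) {             // whether to only list the files to delete

  // Either the path or the table has to be defined.
  def targetDefined: Boolean = path.isDefined || table.isDefined
}
```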

Running Command — run Method

run(sparkSession: SparkSession): Seq[Row]

run is part of the RunnableCommand contract to…FIXME.

run takes the path to vacuum (i.e. either the table or the path) and finds the root directory of the delta table.

run creates a DeltaLog instance for the delta table and executes the VacuumCommand.gc utility (passing in the DeltaLog instance, the dryRun and the horizonHours options).

run throws an AnalysisException when executed for a non-root directory of a delta table:

Please provide the base path ([baseDeltaPath]) when Vacuuming Delta tables. Vacuuming specific partitions is currently not supported.

run throws an AnalysisException when executed for a DeltaLog with the snapshot version being -1:

[deltaTableIdentifier] is not a Delta table. VACUUM is only supported for Delta tables.
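The control flow of run described in this section can be sketched without Spark as follows. Everything here is a stand-in under stated assumptions: AnalysisError replaces AnalysisException, LogSketch replaces DeltaLog, and findRoot, logFor and gc are hypothetical function parameters that stand for finding the table root, creating the DeltaLog and VacuumCommand.gc, respectively.

```scala
object VacuumRunSketch {
  // Stand-in for Spark's AnalysisException (illustration only).
  final class AnalysisError(message: String) extends Exception(message)

  // Stand-in for DeltaLog: only the data path and the snapshot version matter here.
  final case class LogSketch(dataPath: String, snapshotVersion: Long)

  def run(
      pathToVacuum: String,
      findRoot: String => Option[String],  // hypothetical: locates the delta table root
      logFor: String => LogSketch,         // hypothetical: stands for creating a DeltaLog
      gc: (LogSketch, Boolean, Option[Double]) => Seq[String],  // stands for VacuumCommand.gc
      dryRun: Boolean,
      horizonHours: Option[Double]): Seq[String] = {
    // A root that differs from the requested path means a non-root (partition) directory.
    findRoot(pathToVacuum).filter(_ != pathToVacuum).foreach { root =>
      throw new AnalysisError(
        s"Please provide the base path ($root) when Vacuuming Delta tables. " +
          "Vacuuming specific partitions is currently not supported.")
    }
    val log = logFor(pathToVacuum)
    // A snapshot version of -1 means there is no delta table at the path.
    if (log.snapshotVersion == -1L) {
      throw new AnalysisError(
        s"$pathToVacuum is not a Delta table. VACUUM is only supported for Delta tables.")
    }
    gc(log, dryRun, horizonHours)
  }
}
```

The sketch returns the paths produced by gc, which corresponds to the single path column of the command output.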