DeltaTable

DeltaTable is the management interface of delta tables.

io.delta.tables Package

DeltaTable belongs to the io.delta.tables package.

import io.delta.tables.DeltaTable

Creating Instance

DeltaTable takes the following to be created:

  • Dataset[Row] (the data of the table)
  • DeltaTableV2

DeltaTable is created using the DeltaTable.forPath and DeltaTable.forName utilities (and indirectly using create, createIfNotExists, createOrReplace and replace).

DeltaLog

deltaLog: DeltaLog

DeltaLog of the DeltaTableV2.

Utilities (Static Methods)

columnBuilder

columnBuilder(
  colName: String): DeltaColumnBuilder
columnBuilder(
  spark: SparkSession,
  colName: String): DeltaColumnBuilder

Creates a DeltaColumnBuilder
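
A minimal sketch of how columnBuilder might be used, assuming a local SparkSession with the Delta Lake library on the classpath. The column name, type and comment are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StructField}

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("column-builder-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Describe a single column; build() gives back a regular StructField
val idField: StructField = DeltaTable.columnBuilder(spark, "id")
  .dataType("BIGINT")
  .nullable(false)
  .comment("hypothetical primary key")
  .build()
```

The resulting StructField can then be handed over to a DeltaTableBuilder (using addColumn).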

convertToDelta

convertToDelta(
  spark: SparkSession,
  identifier: String): DeltaTable
convertToDelta(
  spark: SparkSession,
  identifier: String,
  partitionSchema: String): DeltaTable
convertToDelta(
  spark: SparkSession,
  identifier: String,
  partitionSchema: StructType): DeltaTable

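convertToDelta converts an existing parquet table (given as a parquet.`path` identifier) to delta format. A partitioned parquet table requires the partition schema.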

Note

Refer to Demo: Converting Parquet Dataset Into Delta Format for a demo of DeltaTable.convertToDelta.
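
As a quick illustration, the following sketch writes a small parquet dataset to a temporary directory and converts it. It assumes a local SparkSession with Delta Lake on the classpath; the directory and data are made up for the example.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("convert-to-delta-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical location of an unpartitioned parquet table
val parquetPath = Files.createTempDirectory("parquet-demo").resolve("table").toString
spark.range(5).write.parquet(parquetPath)

// The identifier uses the parquet.`<path>` format
val converted: DeltaTable = DeltaTable.convertToDelta(spark, s"parquet.`$parquetPath`")

// For a partitioned table, the partition schema would be given as well, e.g.
// DeltaTable.convertToDelta(spark, s"parquet.`$parquetPath`", "part INT")
```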

create

create(): DeltaTableBuilder
create(
  spark: SparkSession): DeltaTableBuilder

Creates a DeltaTableBuilder
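
A sketch of the builder in action, assuming a local SparkSession with Delta Lake on the classpath. The location and column names are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("create-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical (not yet existing) directory for a path-based table
val location = Files.createTempDirectory("delta-create").resolve("table").toString

// Nothing is created until execute() is called
val table: DeltaTable = DeltaTable.create(spark)
  .location(location)
  .addColumn("id", "BIGINT")
  .addColumn("name", "STRING")
  .execute()
```

createIfNotExists, createOrReplace and replace give the same kind of builder, so the same chain of calls should apply to them.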

createIfNotExists

createIfNotExists(): DeltaTableBuilder
createIfNotExists(
  spark: SparkSession): DeltaTableBuilder

Creates a DeltaTableBuilder (with CreateTableOptions and ifNotExists flag enabled)

createOrReplace

createOrReplace(): DeltaTableBuilder
createOrReplace(
  spark: SparkSession): DeltaTableBuilder

Creates a DeltaTableBuilder (with ReplaceTableOptions and orCreate flag enabled)

forName

forName(
  sparkSession: SparkSession,
  tableName: String): DeltaTable
forName(
  tableOrViewName: String): DeltaTable

forName uses ParserInterface (of the given SparkSession) to parse the given table name.

forName checks whether the given table name is of a Delta table and, if so, creates a DeltaTable with the following:

  • Dataset that represents loading data from the specified table name (using SparkSession.table operator)
  • DeltaTableV2

forName throws an AnalysisException when the given table name is for a non-Delta table:

[deltaTableIdentifier] is not a Delta table.
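
A short sketch of forName, assuming a local SparkSession with Delta Lake on the classpath and the default in-memory catalog. The table name demo_numbers and the warehouse directory are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with a throwaway warehouse directory
val warehouse = Files.createTempDirectory("warehouse").toString
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("for-name-sketch")
  .config("spark.sql.warehouse.dir", warehouse)
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Register a delta table in the catalog under a hypothetical name
spark.range(3).write.format("delta").mode("overwrite").saveAsTable("demo_numbers")

// Look the table up by name
val byName: DeltaTable = DeltaTable.forName(spark, "demo_numbers")
```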

forPath

forPath(
  sparkSession: SparkSession,
  path: String): DeltaTable
forPath(
  path: String): DeltaTable

forPath checks whether the given path is of a Delta table and, if so, creates a DeltaTable with the following:

  • Dataset that represents loading data from the specified path using delta data source
  • DeltaTableV2

forPath throws an AnalysisException when the given path does not belong to a delta table:

[deltaTableIdentifier] is not a Delta table.
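
The two outcomes can be sketched as follows, assuming a local SparkSession with Delta Lake on the classpath. Both directories are temporary and hypothetical.

```scala
import java.nio.file.Files
import scala.util.Try
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("for-path-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// A path that holds a delta table
val deltaPath = Files.createTempDirectory("delta-demo").resolve("table").toString
spark.range(3).write.format("delta").save(deltaPath)
val table: DeltaTable = DeltaTable.forPath(spark, deltaPath)

// An empty directory is not a delta table, so forPath is expected to fail
val emptyDir = Files.createTempDirectory("not-delta").toString
val attempt = Try(DeltaTable.forPath(spark, emptyDir))
```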

isDeltaTable

isDeltaTable(
  sparkSession: SparkSession,
  identifier: String): Boolean
isDeltaTable(
  identifier: String): Boolean

isDeltaTable checks whether the given identifier (a path) is the root of a Delta table.
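
A small sketch that contrasts a delta directory with a plain parquet one, assuming a local SparkSession with Delta Lake on the classpath. The paths are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("is-delta-table-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val deltaPath = Files.createTempDirectory("delta-demo").resolve("table").toString
val parquetPath = Files.createTempDirectory("parquet-demo").resolve("table").toString

spark.range(1).write.format("delta").save(deltaPath)
spark.range(1).write.parquet(parquetPath)

// Only the directory with a transaction log qualifies
val deltaCheck = DeltaTable.isDeltaTable(spark, deltaPath)
val parquetCheck = DeltaTable.isDeltaTable(spark, parquetPath)
```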

replace

replace(): DeltaTableBuilder
replace(
  spark: SparkSession): DeltaTableBuilder

Creates a DeltaTableBuilder (with ReplaceTableOptions and orCreate flag disabled)

Operators

alias

alias(
  alias: String): DeltaTable

Applies an alias to the DeltaTable (equivalent to as)

as

as(
  alias: String): DeltaTable

Applies an alias to the DeltaTable

delete

delete(): Unit
delete(
  condition: Column): Unit
delete(
  condition: String): Unit

Executes DeleteFromTable command
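
A sketch of the String and Column variants of delete, assuming a local SparkSession with Delta Lake on the classpath. The path and the predicates are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delete-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val path = Files.createTempDirectory("delta-delete").resolve("table").toString
spark.range(10).write.format("delta").save(path)   // ids 0 to 9

val table = DeltaTable.forPath(spark, path)
table.delete("id < 3")         // condition as a SQL expression
table.delete(col("id") === 9)  // condition as a Column

// Re-read the table to count the remaining rows
val remaining = spark.read.format("delta").load(path).count()
```

delete() with no condition would remove all the rows.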

generate

generate(
  mode: String): Unit

Executes the DeltaGenerateCommand
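
A sketch of generate with the symlink_format_manifest mode, assuming a local SparkSession with Delta Lake on the classpath. The table path is hypothetical.

```scala
import java.nio.file.{Files, Paths}
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("generate-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val path = Files.createTempDirectory("delta-generate").resolve("table").toString
spark.range(3).write.format("delta").save(path)

// Writes manifest files under the table directory
DeltaTable.forPath(spark, path).generate("symlink_format_manifest")

val manifestDir = Paths.get(path, "_symlink_format_manifest")
```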

history

history(): DataFrame
history(
  limit: Int): DataFrame

Requests the DeltaHistoryManager for history.
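
A sketch that produces two table versions and reads the history back, assuming a local SparkSession with Delta Lake on the classpath. The path is hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("history-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val path = Files.createTempDirectory("delta-history").resolve("table").toString
spark.range(3).write.format("delta").save(path)                    // version 0
spark.range(3, 6).write.format("delta").mode("append").save(path)  // version 1

val table = DeltaTable.forPath(spark, path)
val fullHistory = table.history()    // one row per commit
val lastCommit = table.history(1)    // only the most recent commit

val lastVersion = lastCommit.select("version").head().getLong(0)
```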

merge

merge(
  source: DataFrame,
  condition: Column): DeltaMergeBuilder
merge(
  source: DataFrame,
  condition: String): DeltaMergeBuilder

Creates a DeltaMergeBuilder
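
A sketch of an upsert with the merge builder, assuming a local SparkSession with Delta Lake on the classpath. The aliases t and s, the path and the data are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("merge-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()
import spark.implicits._

val path = Files.createTempDirectory("delta-merge").resolve("table").toString
Seq((1, "a"), (2, "b")).toDF("id", "value").write.format("delta").save(path)

// Source rows: one existing key (2) and one new key (3)
val updates = Seq((2, "B"), (3, "c")).toDF("id", "value")

DeltaTable.forPath(spark, path).as("t")
  .merge(updates.as("s"), "t.id = s.id")
  .whenMatched().updateAll()     // overwrite matched rows with the source
  .whenNotMatched().insertAll()  // insert the rows with new keys
  .execute()

val merged = spark.read.format("delta").load(path)
  .as[(Int, String)].collect().sortBy(_._1).toSeq
```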

toDF

toDF: Dataset[Row]

Returns the DataFrame representation of the DeltaTable

update

update(
  condition: Column,
  set: Map[String, Column]): Unit
update(
  set: Map[String, Column]): Unit

Executes UpdateTable command

updateExpr

updateExpr(
  set: Map[String, String]): Unit
updateExpr(
  condition: String,
  set: Map[String, String]): Unit

Executes UpdateTable command
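
The update and updateExpr operators differ only in how conditions and new values are given (Column vs SQL text). A sketch, assuming a local SparkSession with Delta Lake on the classpath; the path and the data are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit}

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("update-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()
import spark.implicits._

val path = Files.createTempDirectory("delta-update").resolve("table").toString
spark.range(4).withColumn("value", col("id")).write.format("delta").save(path)

val table = DeltaTable.forPath(spark, path)

// SQL-text variant: add 100 to the value of even ids
table.updateExpr("id % 2 = 0", Map("value" -> "value + 100"))

// Column variant: overwrite the value of a single row
table.update(col("id") === 1, Map("value" -> lit(-1L)))

val updated = spark.read.format("delta").load(path)
  .as[(Long, Long)].collect().sortBy(_._1).toSeq
```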

upgradeTableProtocol

upgradeTableProtocol(
  readerVersion: Int,
  writerVersion: Int): Unit

Updates the protocol version of the table to leverage new features.

Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table.

Upgrading the writer version will prevent older versions of Delta Lake from writing to this table.

The reader or writer version cannot be downgraded.

Internally, upgradeTableProtocol creates a new Protocol (with the given versions) and requests the DeltaLog to upgradeProtocol.
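
A sketch of a protocol upgrade, assuming a local SparkSession with Delta Lake on the classpath and a freshly created table whose default protocol is below the requested versions. The path and the versions (1 and 3) are chosen for illustration only, and DESCRIBE DETAIL is used here just to read the protocol back.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("upgrade-protocol-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val path = Files.createTempDirectory("delta-protocol").resolve("table").toString
spark.range(1).write.format("delta").save(path)

// Reader version 1, writer version 3 (cannot be undone)
DeltaTable.forPath(spark, path).upgradeTableProtocol(1, 3)

val detail = spark.sql(s"DESCRIBE DETAIL delta.`$path`").head()
val writerVersion = detail.getAs[Int]("minWriterVersion")
```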

[SC-44271][DELTA] Introduce default protocol version for Delta tables

upgradeTableProtocol was introduced in the [SC-44271][DELTA] Introduce default protocol version for Delta tables commit.

vacuum

vacuum(): DataFrame
vacuum(
  retentionHours: Double): DataFrame

Deletes files and directories (recursively) in the DeltaTable that are not needed by the table (and maintains older versions up to the given retention threshold).
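
A sketch of both vacuum variants, assuming a local SparkSession with Delta Lake on the classpath. The path and the 240-hour retention are hypothetical.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with the Delta extensions enabled
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("vacuum-sketch")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val path = Files.createTempDirectory("delta-vacuum").resolve("table").toString
spark.range(3).write.format("delta").save(path)
// Overwriting leaves the files of the previous version behind
spark.range(3).write.format("delta").mode("overwrite").save(path)

val table = DeltaTable.forPath(spark, path)
table.vacuum()      // default retention threshold
table.vacuum(240)   // explicit retention threshold (in hours)

// The current version of the table is not affected
val rows = spark.read.format("delta").load(path).count()
```

Files written moments ago are newer than either threshold, so this run is not expected to remove anything.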


Last update: 2021-06-15