DeltaTable¶
DeltaTable is the management interface of delta tables.
io.delta.tables Package¶
DeltaTable belongs to io.delta.tables package.
import io.delta.tables.DeltaTable
Creating Instance¶
DeltaTable takes the following to be created:
- Table Data (Dataset[Row])
- DeltaTableV2
DeltaTable is created using DeltaTable.forPath and DeltaTable.forName utilities (and indirectly using create, createIfNotExists, createOrReplace and replace).
DeltaLog¶
deltaLog: DeltaLog
DeltaLog of the DeltaTableV2.
Utilities (Static Methods)¶
columnBuilder¶
columnBuilder(
colName: String): DeltaColumnBuilder
columnBuilder(
spark: SparkSession,
colName: String): DeltaColumnBuilder
Creates a DeltaColumnBuilder
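A minimal sketch of how the builder might be used to describe a single column. The column name `id` and the comment text are made up for illustration, and the `ColumnBuilderSketch` object is a hypothetical wrapper.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StructField}

object ColumnBuilderSketch {
  // Describes a non-nullable LONG column; `id` is a hypothetical column name.
  def idColumn(spark: SparkSession): StructField =
    DeltaTable.columnBuilder(spark, "id")
      .dataType(LongType)
      .nullable(false)
      .comment("hypothetical surrogate key")
      .build() // build() turns the builder into a StructField
}
```

The resulting StructField can be passed to DeltaTableBuilder.addColumn.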
convertToDelta¶
convertToDelta(
spark: SparkSession,
identifier: String): DeltaTable
convertToDelta(
spark: SparkSession,
identifier: String,
partitionSchema: String): DeltaTable
convertToDelta(
spark: SparkSession,
identifier: String,
partitionSchema: StructType): DeltaTable
convertToDelta converts a parquet table (given by the identifier) to the delta format.
Note
Refer to Demo: Converting Parquet Dataset Into Delta Format for a demo of DeltaTable.convertToDelta.
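As a rough illustration, the variant with a partition schema given as a DDL string could be called as follows. The directory `/tmp/events` and the `date` partition column are assumptions, not taken from the demo.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object ConvertToDeltaSketch {
  // Converts a partitioned parquet directory; path and partition column are hypothetical.
  def convert(spark: SparkSession): DeltaTable =
    DeltaTable.convertToDelta(
      spark,
      "parquet.`/tmp/events`", // identifier in the parquet.`<path>` form
      "date DATE")             // partition schema as a DDL-formatted string
}
```

For an unpartitioned parquet table, the two-argument variant (without partitionSchema) should be enough.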
create¶
create(): DeltaTableBuilder
create(
spark: SparkSession): DeltaTableBuilder
Creates a DeltaTableBuilder
createIfNotExists¶
createIfNotExists(): DeltaTableBuilder
createIfNotExists(
spark: SparkSession): DeltaTableBuilder
Creates a DeltaTableBuilder (with CreateTableOptions and ifNotExists flag enabled)
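The builder returned by createIfNotExists could be used along these lines. This is a sketch only: the table name `events` and its columns are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object CreateIfNotExistsSketch {
  // Defines a table only when it does not exist yet; names are hypothetical.
  def ensureEventsTable(spark: SparkSession): DeltaTable =
    DeltaTable.createIfNotExists(spark)
      .tableName("events")
      .addColumn("id", "BIGINT")
      .addColumn("date", "DATE")
      .partitionedBy("date")
      .execute() // runs the table definition and returns the DeltaTable
}
```

The same chain should work with create, createOrReplace and replace, as all of them return a DeltaTableBuilder.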
createOrReplace¶
createOrReplace(): DeltaTableBuilder
createOrReplace(
spark: SparkSession): DeltaTableBuilder
Creates a DeltaTableBuilder (with ReplaceTableOptions and orCreate flag enabled)
forName¶
forName(
sparkSession: SparkSession,
tableName: String): DeltaTable
forName(
tableOrViewName: String): DeltaTable
forName uses ParserInterface (of the given SparkSession) to parse the given table name.
forName checks whether the given table name is of a Delta table and, if so, creates a DeltaTable with the following:
- Dataset that represents loading data from the specified table name (using SparkSession.table operator)
- DeltaTableV2
forName throws an AnalysisException when the given table name refers to a non-Delta table:
[deltaTableIdentifier] is not a Delta table.
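One way a caller might deal with the AnalysisException is sketched below. The `tryForName` helper is hypothetical. Note that it also returns None for other analysis errors (e.g. a table that does not exist at all).

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.{AnalysisException, SparkSession}

object ForNameSketch {
  // Hypothetical helper: None when the name cannot be resolved to a Delta table.
  def tryForName(spark: SparkSession, tableName: String): Option[DeltaTable] =
    try Some(DeltaTable.forName(spark, tableName))
    catch { case _: AnalysisException => None }
}
```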
forPath¶
forPath(
sparkSession: SparkSession,
path: String): DeltaTable
forPath(
path: String): DeltaTable
forPath checks whether the given path is of a Delta table and, if so, creates a DeltaTable with the following:
- Dataset that represents loading data from the specified path using delta data source
- DeltaTableV2
forPath throws an AnalysisException when the given path does not belong to a delta table:
[deltaTableIdentifier] is not a Delta table.
isDeltaTable¶
isDeltaTable(
sparkSession: SparkSession,
identifier: String): Boolean
isDeltaTable(
identifier: String): Boolean
isDeltaTable checks whether the given identifier refers to a Delta table.
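A small sketch, assuming the identifier is a file-system path, of how isDeltaTable could guard a forPath call so that no AnalysisException is thrown for a non-Delta directory. The `loadIfDelta` helper is hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object IsDeltaTableSketch {
  // Hypothetical helper: loads the table only when the path holds a Delta table.
  def loadIfDelta(spark: SparkSession, path: String): Option[DeltaTable] =
    if (DeltaTable.isDeltaTable(spark, path)) Some(DeltaTable.forPath(spark, path))
    else None
}
```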
replace¶
replace(): DeltaTableBuilder
replace(
spark: SparkSession): DeltaTableBuilder
Creates a DeltaTableBuilder (with ReplaceTableOptions and orCreate flag disabled)
Operators¶
alias¶
alias(
alias: String): DeltaTable
Applies an alias to the DeltaTable (equivalent to as)
as¶
as(
alias: String): DeltaTable
Applies an alias to the DeltaTable
delete¶
delete(): Unit
delete(
condition: Column): Unit
delete(
condition: String): Unit
Executes DeleteFromTable command
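The two conditional variants might be used as in the sketch below. The `date` and `id` columns are assumptions about the table schema.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.functions.col

object DeleteSketch {
  def purge(table: DeltaTable): Unit = {
    // Condition as a SQL expression string (hypothetical `date` column)
    table.delete("date < '2020-01-01'")
    // Condition as a Column (hypothetical `id` column)
    table.delete(col("id") === 42L)
  }
}
```

Calling delete() with no condition removes all rows of the table.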
generate¶
generate(
mode: String): Unit
Executes the DeltaGenerateCommand
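For illustration, a call with the `symlink_format_manifest` mode (the mode that Delta Lake documents for manifest generation) could look as follows. The wrapper object is hypothetical.

```scala
import io.delta.tables.DeltaTable

object GenerateSketch {
  // Generates symlink manifest files for the given table.
  def manifests(table: DeltaTable): Unit =
    table.generate("symlink_format_manifest")
}
```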
history¶
history(): DataFrame
history(
limit: Int): DataFrame
Requests the DeltaHistoryManager for history
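A sketch of inspecting the most recent commits. The selected columns (`version`, `timestamp`, `operation`) are part of the history schema as far as I know. The limit of 5 is arbitrary.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

object HistorySketch {
  // Returns the last 5 commits with a few of the available columns.
  def recentOperations(table: DeltaTable): DataFrame =
    table.history(5).select("version", "timestamp", "operation")
}
```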
merge¶
merge(
source: DataFrame,
condition: Column): DeltaMergeBuilder
merge(
source: DataFrame,
condition: String): DeltaMergeBuilder
Creates a DeltaMergeBuilder
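A typical upsert built with DeltaMergeBuilder might look like the following sketch. The aliases `t` and `s` and the `id` join column are assumptions, and the `updates` DataFrame is expected to have the same schema as the target table.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

object MergeSketch {
  def upsert(target: DeltaTable, updates: DataFrame): Unit = {
    target.as("t")
      .merge(updates.as("s"), "t.id = s.id") // hypothetical join condition
      .whenMatched().updateAll()             // overwrite matching rows
      .whenNotMatched().insertAll()          // insert the remaining source rows
      .execute()
  }
}
```

Nothing is written until execute is called. The builder only collects the merge clauses.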
optimize¶
optimize(): DeltaOptimizeBuilder
Creates a DeltaOptimizeBuilder
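The builder could be used for compaction or Z-ordering as sketched below. The partition filter and the `id` column are hypothetical, and the sketch assumes a Delta Lake version that ships DeltaOptimizeBuilder.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

object OptimizeSketch {
  // Compacts small files in a single (hypothetical) partition.
  def compactPartition(table: DeltaTable): DataFrame =
    table.optimize().where("date = '2021-11-18'").executeCompaction()

  // Rewrites the data so it is clustered by the (hypothetical) `id` column.
  def zOrderById(table: DeltaTable): DataFrame =
    table.optimize().executeZOrderBy("id")
}
```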
restoreToTimestamp¶
restoreToTimestamp(
timestamp: String): DataFrame
restoreToVersion¶
restoreToVersion(
version: Long): DataFrame
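Both restore operators could be called as in the sketch below. Version `0` and the timestamp literal are arbitrary examples and have to exist in the history of the table.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

object RestoreSketch {
  // Rolls the table back to its first version.
  def toFirstVersion(table: DeltaTable): DataFrame =
    table.restoreToVersion(0L)

  // Rolls the table back to its state as of the given (hypothetical) timestamp.
  def toStartOf2021(table: DeltaTable): DataFrame =
    table.restoreToTimestamp("2021-01-01 00:00:00")
}
```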
toDF¶
toDF: Dataset[Row]
Returns the DataFrame representation of the DeltaTable
update¶
update(
condition: Column,
set: Map[String, Column]): Unit
update(
set: Map[String, Column]): Unit
updateExpr¶
updateExpr(
set: Map[String, String]): Unit
updateExpr(
condition: String,
set: Map[String, String]): Unit
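The difference between update and updateExpr is only in how conditions and assignments are given (Columns vs SQL expression strings). The sketch below does the same change both ways, assuming a hypothetical `status` column.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.functions.{col, lit}

object UpdateSketch {
  // Column-based variant
  def markSeen(table: DeltaTable): Unit =
    table.update(col("status") === "new", Map("status" -> lit("seen")))

  // SQL-string variant (note the quoted string literal in the assignment)
  def markSeenExpr(table: DeltaTable): Unit =
    table.updateExpr("status = 'new'", Map("status" -> "'seen'"))
}
```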
upgradeTableProtocol¶
upgradeTableProtocol(
readerVersion: Int,
writerVersion: Int): Unit
upgradeTableProtocol creates a new Protocol (for the given reader and writer versions) and requests the DeltaLog to upgrade the protocol.
Transactional Operation
upgradeTableProtocol is a transactional operation.
Updates the protocol version of the table to leverage new features.
Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table.
Upgrading the writer version will prevent older versions of Delta Lake from writing to this table.
The reader or writer version cannot be downgraded.
[SC-44271][DELTA] Introduce default protocol version for Delta tables
upgradeTableProtocol was introduced in the [SC-44271][DELTA] Introduce default protocol version for Delta tables commit.
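A sketch of a protocol upgrade. The versions (reader `1`, writer `3`) are example values only and should be chosen based on the features the table needs.

```scala
import io.delta.tables.DeltaTable

object UpgradeProtocolSketch {
  // Keeps the reader version at 1 and raises the writer version to 3 (example values).
  def upgrade(table: DeltaTable): Unit =
    table.upgradeTableProtocol(1, 3)
}
```

Since versions cannot be downgraded, it may be worth checking which clients read and write the table before running it.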
vacuum¶
vacuum(): DataFrame
vacuum(
retentionHours: Double): DataFrame
Deletes files and directories (recursively) in the DeltaTable that are not needed by the table (and maintains older versions up to the given retention threshold).
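For illustration, a vacuum with an explicit retention threshold of 168 hours (7 days, an example value) could be run as follows.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

object VacuumSketch {
  // Removes files not needed by table versions younger than 168 hours.
  def vacuumWeekOld(table: DeltaTable): DataFrame =
    table.vacuum(168.0)
}
```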