DeltaMergeBuilder

DeltaMergeBuilder is a builder interface to specify how to merge data from a source DataFrame into the target delta table.

Creating Instance

DeltaMergeBuilder takes the following to be created:

  • Target DeltaTable

  • Source Data (DataFrame)

  • Condition (Column)

  • When Clauses (Seq[MergeIntoClause])

DeltaMergeBuilder is created using the DeltaTable.merge operator.

Operators

whenMatched

whenMatched(): DeltaMergeMatchedActionBuilder
whenMatched(
  condition: Column): DeltaMergeMatchedActionBuilder
whenMatched(
  condition: String): DeltaMergeMatchedActionBuilder

Creates a DeltaMergeMatchedActionBuilder for this DeltaMergeBuilder and an optional match condition.
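
The following is a minimal sketch of whenMatched with an extra condition, not a definitive recipe. It assumes a spark-shell session with Delta Lake on the classpath (paste it in :paste mode). The path, column names, and sample rows are hypothetical and chosen only for illustration.

```scala
// Assumes spark-shell with Delta Lake available, so `spark` is in scope.
import io.delta.tables.DeltaTable
import spark.implicits._

// Hypothetical location and data, for illustration only
val path = "/tmp/delta/when_matched_demo"
Seq((0L, "zero"), (1L, "one")).toDF("id", "name")
  .write.format("delta").mode("overwrite").save(path)

val updates = Seq((1L, "ONE"), (2L, "TWO")).toDF("id", "name")

DeltaTable.forPath(path).as("to")
  .merge(updates.as("from"), "to.id = from.id")
  // The clause condition narrows which matched rows are updated
  .whenMatched("to.name <> from.name")
  .updateExpr(Map("name" -> "from.name"))
  .execute()

// Rows of the target table after the merge, ordered by id
val merged = DeltaTable.forPath(path).toDF.orderBy("id")
  .as[(Long, String)].collect().toList
```

Since there is no whenNotMatched clause, the source row with id 2 has no match in the target and is left out of the table.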

whenNotMatched

whenNotMatched(): DeltaMergeNotMatchedActionBuilder
whenNotMatched(
  condition: Column): DeltaMergeNotMatchedActionBuilder
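whenNotMatched(
  condition: String): DeltaMergeNotMatchedActionBuilder

Creates a DeltaMergeNotMatchedActionBuilder for this DeltaMergeBuilder and an optional not-matched condition.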
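
As a rough illustration of whenNotMatched, the sketch below inserts only those unmatched source rows that satisfy a Column condition. It assumes a spark-shell session with Delta Lake on the classpath (use :paste mode). The path and the sample data are hypothetical.

```scala
// Assumes spark-shell with Delta Lake available, so `spark` is in scope.
import io.delta.tables.DeltaTable
import spark.implicits._

// Hypothetical location and data, for illustration only
val path = "/tmp/delta/when_not_matched_demo"
Seq((0L, "zero")).toDF("id", "name")
  .write.format("delta").mode("overwrite").save(path)

val incoming = Seq((0L, "ZERO"), (1L, "one"), (-1L, "bad")).toDF("id", "name")

DeltaTable.forPath(path).as("to")
  .merge(incoming.as("from"), $"to.id" === $"from.id")
  // Only unmatched source rows with a non-negative id are inserted
  .whenNotMatched($"from.id" >= 0)
  .insertAll()
  .execute()

// Rows of the target table after the merge, ordered by id
val merged = DeltaTable.forPath(path).toDF.orderBy("id")
  .as[(Long, String)].collect().toList
```

The row with id 0 matches an existing row and stays unchanged because no whenMatched clause was given.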

Executing Merge Operation

execute(): Unit

execute resolves column references (and creates a DeltaMergeInto).

In the end, execute creates a PreprocessTableMerge to create a MergeIntoCommand that is executed right away.
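
To make the role of execute concrete, here is a small upsert sketch. It assumes a spark-shell session with Delta Lake on the classpath (use :paste mode); the path and the data are hypothetical. It only illustrates that the builder records the clauses and that the table changes once execute is called.

```scala
// Assumes spark-shell with Delta Lake available, so `spark` is in scope.
import io.delta.tables.DeltaTable
import spark.implicits._

// Hypothetical location and data, for illustration only
val path = "/tmp/delta/execute_demo"
Seq((0L, "zero"), (1L, "one")).toDF("id", "name")
  .write.format("delta").mode("overwrite").save(path)

val changes = Seq((1L, "ONE"), (2L, "two")).toDF("id", "name")

// Building the merge only records the clauses; the table is not touched yet
val upsert = DeltaTable.forPath(path).as("to")
  .merge(changes.as("from"), $"to.id" === $"from.id")
  .whenMatched().updateAll()
  .whenNotMatched().insertAll()

val rowsBefore = spark.read.format("delta").load(path).count()

// The merge runs here
upsert.execute()

val rowsAfter = spark.read.format("delta").load(path).count()
```

Comparing rowsBefore and rowsAfter shows that the new row is added only after execute has run.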

Creating Logical Query Plan for Merge

mergePlan: DeltaMergeInto

mergePlan…​FIXME

mergePlan is used when DeltaMergeBuilder is requested to execute.

Demo

// Create a delta table
val path = "/tmp/delta/demo"
val data = spark.range(5)
data.write.format("delta").save(path)

// Manage the delta table
import io.delta.tables.DeltaTable
val target = DeltaTable.forPath(path)

case class Person(id: Long, name: String)
val source = Seq(Person(0, "Zero"), Person(1, "One")).toDF

// Note the difference in schemas

scala> target.toDF.printSchema
root
 |-- id: long (nullable = true)

scala> source.printSchema
root
 |-- id: long (nullable = false)
 |-- name: string (nullable = true)

// Not only do we update the matching rows
// But also update the schema (schema evolution)

val mergeBuilder = target.as("to").merge(
  source.as("from"),
  condition = $"to.id" === $"from.id")

scala> :type mergeBuilder
io.delta.tables.DeltaMergeBuilder