Demo: User Metadata for Labelling Commits

This demo shows how to tell apart the commits of a batch write query using the userMetadata option.


A good use case is telling apart the commits of two or more separate streaming write queries.

Creating Delta Table

val tableName = "/tmp/delta-demo-userMetadata"
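
The write that creates the table is not shown here. A minimal sketch, assuming a spark-shell session with Delta Lake on the classpath and an initial five rows (the row count is an assumption; the history output in the next section only confirms a WRITE in the default ErrorIfExists mode with no partitioning):

```scala
// tableName is the path defined above.
// Assumption: five rows (ids 0 to 4); the original row count is not shown.
spark.range(5)
  .write
  .format("delta")
  .save(tableName) // no explicit mode, so the default ErrorIfExists applies
```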

Describing History

import io.delta.tables.DeltaTable
val d = DeltaTable.forPath(tableName)

We are interested in a subset of the available history metadata.

d.history
  .select('version, 'operation, 'operationParameters, 'userMetadata)
  .show(truncate = false)
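+-------+---------+------------------------------------------+------------+
|version|operation|operationParameters                       |userMetadata|
+-------+---------+------------------------------------------+------------+
|0      |WRITE    |[mode -> ErrorIfExists, partitionBy -> []]|null        |
+-------+---------+------------------------------------------+------------+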

Appending Data

In this step, you're going to append new data to the existing Delta table.

You're going to use the userMetadata option to attach a custom, user-defined marker to the commit (e.g. to know when this extra append happened in the life of the Delta table).

val userMetadata = "two more rows for demo"

Since you're appending new rows to a table that already exists, the write must use Append mode (the default ErrorIfExists mode would fail).

import org.apache.spark.sql.SaveMode.Append

The whole append write is as follows:

spark.range(start = 5, end = 7)
  .write.format("delta").mode(Append)
  .option("userMetadata", userMetadata)
  .save(tableName)

That write query creates another version of the Delta table.
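
One way to confirm the new version, sketched under the assumption that d is the DeltaTable handle created in Describing History: history(1) limits the result to the most recent commit, so its userMetadata should be the marker just written.

```scala
// d is the DeltaTable created earlier; history(1) returns only the latest commit.
val latest = d.history(1)
  .select("version", "userMetadata")
  .head

// Compare the stored marker with the value passed to the append.
assert(latest.getAs[String]("userMetadata") == userMetadata)
```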

Listing Versions with userMetadata

For the sake of the demo, you are going to show the versions of the Delta table with userMetadata defined.

d.history
  .where('userMetadata.isNotNull)
  .select('version, 'operation, 'operationParameters, 'userMetadata)
  .show(truncate = false)
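+-------+---------+-----------------------------------+----------------------+
|version|operation|operationParameters                |userMetadata          |
+-------+---------+-----------------------------------+----------------------+
|1      |WRITE    |[mode -> Append, partitionBy -> []]|two more rows for demo|
+-------+---------+-----------------------------------+----------------------+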
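
Delta Lake can also take the marker from the spark.databricks.delta.commitInfo.userMetadata session configuration, which may be handy when the same label should apply to several writes. The userMetadata write option takes precedence when both are set. A minimal sketch, assuming the same spark-shell session; the label text and the extra rows are hypothetical:

```scala
// Label every subsequent Delta commit made by this session.
// "nightly-load" is a hypothetical label used only for illustration.
spark.conf.set(
  "spark.databricks.delta.commitInfo.userMetadata",
  "nightly-load")

// No userMetadata option here, so the session-level label is used.
spark.range(start = 7, end = 9)
  .write
  .format("delta")
  .mode("append")
  .save(tableName)

// Clear the setting so later commits are not labelled by accident.
spark.conf.unset("spark.databricks.delta.commitInfo.userMetadata")
```

Note that running this adds one more version to the demo table.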