
Installation

Installation of Delta Lake boils down to using spark-submit's --packages command-line option with two configuration properties: spark.sql.extensions for DeltaSparkSessionExtension and spark.sql.catalog.spark_catalog for DeltaCatalog.

Make sure that the Scala version of Apache Spark matches Delta Lake's (e.g., delta-core_2.12 requires a Spark build with Scala 2.12).
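
For example, a Spark application could be submitted with the same options (the main class and application JAR below are placeholders):

./bin/spark-submit \
  --packages io.delta:delta-core_2.12:2.1.0rc1 \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
  --class my.app.Main \
  my-spark-app.jar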

Spark SQL Application

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  // Register Delta Lake's SQL extensions
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  // Use Delta Lake's DeltaCatalog as the session catalog (spark_catalog)
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate
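
To verify the configuration, the session can write and read a small Delta table (the /tmp/delta/demo path below is just an example location):

// Write a tiny dataset in the delta format (example path)
spark.range(5).write.format("delta").mode("overwrite").save("/tmp/delta/demo")
// Read it back to confirm Delta Lake is on the classpath and configured
spark.read.format("delta").load("/tmp/delta/demo").show()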

Spark Shell

$ ./bin/spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.0
      /_/

Using Scala version 2.12.15, OpenJDK 64-Bit Server VM, 11.0.16

$ ./bin/spark-shell \
  --packages io.delta:delta-core_2.12:2.1.0rc1 \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
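
Once in the shell, a quick way to confirm that both properties took effect is to create a Delta table with SQL (the table name below is arbitrary):

// CREATE TABLE ... USING delta requires DeltaCatalog to be configured
sql("CREATE TABLE IF NOT EXISTS delta_demo (id BIGINT) USING delta")
// DESCRIBE DETAIL is Delta-specific SQL and requires the SQL extensions
sql("DESCRIBE DETAIL delta_demo").show(truncate = false)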

Version

io.delta.VERSION can be used to show the version of Delta Lake installed.

assert(io.delta.VERSION == "2.1.0rc1")

It is also possible to run DESCRIBE HISTORY on a Delta table and check out the engineInfo column, which includes the version of Delta Lake that wrote each commit.
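
For example, using the delta_demo table created earlier (the table name is just an example):

// engineInfo reports the engine and Delta Lake versions used for each commit
sql("DESCRIBE HISTORY delta_demo").select("version", "engineInfo").show(truncate = false)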