As it was well said: "Delta is a storage format while Spark is an execution engine...to separate storage from compute."
As of 0.7.0, Delta Lake requires Spark 3. Note that Spark 3.1.1 is not yet supported; use Spark 3.0.2 instead.
Delta Lake uses OptimisticTransaction for transactional writes. A commit succeeds when the transaction can write its actions to a delta file (in the transaction log). If the delta file for the commit version already exists, another transaction has committed that version first and the transaction is retried.
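The commit-and-retry idea can be illustrated with a toy model. The sketch below is not Delta Lake's code: `try_commit` and `commit_with_retry` are hypothetical helpers, and a local directory stands in for the transaction log. It assumes only that every commit version maps to one delta file (named after the zero-padded version) and that creating a file that already exists fails. A real transaction would also check for logical conflicts with the winning commit before retrying.

```python
import json
import os
import tempfile


def try_commit(log_dir, version, actions):
    """Try to create the delta file for `version`. Return False if it already exists."""
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # Mode "x" creates the file exclusively and raises if it is already there.
        with open(path, "x") as delta_file:
            for action in actions:
                delta_file.write(json.dumps(action) + "\n")
        return True
    except FileExistsError:
        return False


def commit_with_retry(log_dir, read_version, actions, max_attempts=10):
    """Commit `actions` at the first free version after `read_version`."""
    version = read_version + 1
    for _ in range(max_attempts):
        if try_commit(log_dir, version, actions):
            return version
        # Another writer won this version: retry with the next one.
        version += 1
    raise RuntimeError(f"gave up after {max_attempts} attempts")


# Two writers that both started from an empty log (read version -1).
log_dir = tempfile.mkdtemp()
first = commit_with_retry(log_dir, -1, [{"add": {"path": "part-0.parquet"}}])
second = commit_with_retry(log_dir, -1, [{"add": {"path": "part-1.parquet"}}])
print(first, second)
```

The second writer finds that the delta file for version 0 already exists, so its commit is retried and lands at version 1.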
Structured queries can write (transactionally) to a delta table using the following interfaces:
WriteIntoDelta command for batch queries (Spark SQL)
DeltaSink for streaming queries (Spark Structured Streaming)
More importantly, multiple queries can write to the same delta table at the same time.
Delta Lake supports batch and streaming queries (Spark SQL and Structured Streaming, respectively) using the delta format.
Delta Lake supports reading and writing in batch queries:
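As a minimal sketch of batch access in PySpark, assuming a SparkSession that was started with the Delta Lake package available: the helper names `read_delta_batch` and `write_delta_batch` and the path in the usage comment are made up for illustration, while `format("delta")` is the data source name used throughout this page.

```python
def read_delta_batch(spark, path):
    """Batch read: load the delta table at `path` as a DataFrame."""
    return spark.read.format("delta").load(path)


def write_delta_batch(df, path, mode="append"):
    """Batch write: save `df` to the delta table at `path`."""
    df.write.format("delta").mode(mode).save(path)


# Usage (hypothetical path, needs a running SparkSession named `spark`):
# write_delta_batch(spark.range(5), "/tmp/delta/numbers")
# read_delta_batch(spark, "/tmp/delta/numbers").show()
```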
Delta Lake supports reading and writing in streaming queries:
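A comparable sketch for Structured Streaming, again assuming a SparkSession with the Delta Lake package available. `read_delta_stream` and `write_delta_stream` are hypothetical helper names, and the paths in the usage comment are placeholders. The checkpoint location is the standard Structured Streaming option for tracking query progress.

```python
def read_delta_stream(spark, path):
    """Streaming read: treat the delta table at `path` as a streaming source."""
    return spark.readStream.format("delta").load(path)


def write_delta_stream(df, path, checkpoint_location):
    """Streaming write: append a streaming DataFrame to the delta table at `path`."""
    return (
        df.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_location)
        .start(path)
    )


# Usage (hypothetical paths, needs a running SparkSession named `spark`):
# events = read_delta_stream(spark, "/tmp/delta/events")
# query = write_delta_stream(events, "/tmp/delta/events_copy", "/tmp/checkpoints/events_copy")
# query.awaitTermination()
```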
Delta Tables in Logical Query Plans
Put simply, delta tables are HadoopFsRelation with TahoeFileIndex in logical query plans.
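One way to look at this yourself is to print the query plans of a batch read. The sketch below assumes a SparkSession with the Delta Lake package available and an existing delta table. `show_query_plans` is a hypothetical helper and the path in the usage comment is a placeholder. The exact plan text depends on the Spark and Delta Lake versions in use.

```python
def show_query_plans(spark, path):
    """Print the logical and physical plans of a batch read of a delta table."""
    df = spark.read.format("delta").load(path)
    # extended=True prints the logical plans in addition to the physical plan.
    df.explain(True)
    return df


# Usage (hypothetical path, needs a running SparkSession named `spark`):
# show_query_plans(spark, "/tmp/delta/numbers")
```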