Append-Only Tables
Append-Only Tables is a table feature in Delta Lake that forbids deleting data files that could be a result of the following:
- Delete, Update, WriteIntoDelta (in Overwrite save mode) commands
- DeltaSink to addBatch in Complete output mode
- RemoveFiles with dataChange (at prepareCommit)
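The rule above can be pictured as a guard that runs before any operation that would remove data files. The following is a simplified, self-contained model of that idea, not Delta Lake's code: `TableMetadata`, `Operation` and this `assertRemovable` are hypothetical stand-ins (the stack trace in the demo shows the real check being made from `DeltaLog.assertRemovable`), and treating an unset property as `false` is an assumption.

```scala
import scala.util.Try

// Illustrative model only. None of these types are Delta Lake classes.
final case class TableMetadata(configuration: Map[String, String]) {
  // Assumption: the table is append-only only when the property is set to true.
  def isAppendOnly: Boolean =
    configuration.get("delta.appendOnly").exists(_.toBoolean)
}

// An operation and whether it would remove existing data files.
final case class Operation(name: String, removesDataFiles: Boolean)

val Append = Operation("Append", removesDataFiles = false)
val Delete = Operation("Delete", removesDataFiles = true)
val Update = Operation("Update", removesDataFiles = true)
val Overwrite = Operation("Overwrite", removesDataFiles = true)

// Hypothetical guard: reject file-removing operations on append-only tables.
def assertRemovable(metadata: TableMetadata, op: Operation): Unit =
  if (metadata.isAppendOnly && op.removesDataFiles) {
    throw new UnsupportedOperationException(
      s"This table is configured to only allow appends (rejected: ${op.name})")
  }

val appendOnly = TableMetadata(Map("delta.appendOnly" -> "true"))

// Appends pass the guard; a delete is rejected.
assertRemovable(appendOnly, Append)
val deleteAttempt = Try(assertRemovable(appendOnly, Delete))
println(s"delete rejected: ${deleteAttempt.isFailure}")
```

The guard depends only on the table's metadata and on whether the operation removes files, which is why very different commands can share the same check.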
Append-Only Tables is enabled on a delta table with the delta.appendOnly table property (indirectly, through AppendOnlyTableFeature, a FeatureAutomaticallyEnabledByMetadata that uses this table property).
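For a table that already exists, the property can presumably be set with ALTER TABLE ... SET TBLPROPERTIES, the same statement that the error message in the demo suggests for turning it off. A minimal sketch, assuming a spark-shell session with Delta Lake configured and an existing delta table; the name `existing_tbl` is hypothetical.

```scala
// Assumes `spark` from a spark-shell session with Delta Lake configured.
// `existing_tbl` is a hypothetical delta table that was created earlier.
spark.sql("""
ALTER TABLE existing_tbl
SET TBLPROPERTIES ('delta.appendOnly' = 'true')
""")

// List the table properties to check that delta.appendOnly is now set.
spark.sql("SHOW TBLPROPERTIES existing_tbl").show(truncate = false)
```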
Demo
Create a delta table with delta.appendOnly table property enabled.
sql("""
CREATE TABLE tbl(a int)
USING delta
TBLPROPERTIES (
'delta.appendOnly' = 'true'
)
""")
Describe the detail of the delta table using DESCRIBE DETAIL command.
sql("""
DESC DETAIL tbl
""")
.select("name", "properties", "minReaderVersion", "minWriterVersion", "tableFeatures")
.show(truncate = false)
+-------------------------+--------------------------+----------------+----------------+------------------------+
|name                     |properties                |minReaderVersion|minWriterVersion|tableFeatures           |
+-------------------------+--------------------------+----------------+----------------+------------------------+
|spark_catalog.default.tbl|{delta.appendOnly -> true}|1               |2               |[appendOnly, invariants]|
+-------------------------+--------------------------+----------------+----------------+------------------------+
Insert a record.
sql("""
INSERT INTO tbl
VALUES (1)
""")
Delete a record. It should fail.
sql("""
DELETE FROM tbl
WHERE a = 1
""")
And it did! 👍
org.apache.spark.sql.delta.DeltaUnsupportedOperationException: [DELTA_CANNOT_MODIFY_APPEND_ONLY] This table is configured to only allow appends. If you would like to permit updates or deletes, use 'ALTER TABLE null SET TBLPROPERTIES (delta.appendOnly=false)'.
at org.apache.spark.sql.delta.DeltaErrorsBase.modifyAppendOnlyTableException(DeltaErrors.scala:961)
at org.apache.spark.sql.delta.DeltaErrorsBase.modifyAppendOnlyTableException$(DeltaErrors.scala:957)
at org.apache.spark.sql.delta.DeltaErrors$.modifyAppendOnlyTableException(DeltaErrors.scala:3382)
at org.apache.spark.sql.delta.DeltaLog$.assertRemovable(DeltaLog.scala:1009)
at org.apache.spark.sql.delta.commands.DeleteCommand.$anonfun$run$2(DeleteCommand.scala:122)
at org.apache.spark.sql.delta.commands.DeleteCommand.$anonfun$run$2$adapted(DeleteCommand.scala:121)
at org.apache.spark.sql.delta.DeltaLog.withNewTransaction(DeltaLog.scala:227)
at org.apache.spark.sql.delta.commands.DeleteCommand.$anonfun$run$1(DeleteCommand.scala:121)
...
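Going by the list at the top of this page, an update and an overwrite should be rejected in the same way, and the error message itself points at how to lift the restriction. The following is a sketch under those assumptions, reusing the `tbl` table from this demo in the same spark-shell session; the outcomes are what the page leads one to expect, not recorded output.

```scala
import scala.util.Try

// Assumes `spark` from spark-shell and the `tbl` table created in this demo.
// Both statements would remove data files, so they are expected to fail.
val update = Try(spark.sql("UPDATE tbl SET a = 2 WHERE a = 1"))
val overwrite = Try(spark.sql("INSERT OVERWRITE TABLE tbl VALUES (2)"))
println(s"update failed: ${update.isFailure}, overwrite failed: ${overwrite.isFailure}")

// Turn the property off, as the error message suggests, and retry the delete.
spark.sql("""
ALTER TABLE tbl
SET TBLPROPERTIES ('delta.appendOnly' = 'false')
""")
spark.sql("""
DELETE FROM tbl
WHERE a = 1
""")
```

Wrapping the failing statements in `Try` keeps the session going so the remaining statements still run.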