Contenders
As often happens in the open source world, Delta Lake is not the only project in the area of data lakes on top of Apache Spark. The following is a list of other open source projects that compete with it or cover the same use cases.
Apache Iceberg
Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to Presto and Spark that use a high-performance format that works just like a SQL table.
Videos
- ACID ORC, Iceberg, and Delta Lake: An Overview of Table Formats for Large Scale Storage and Analytics
- Introducing Iceberg Tables designed for object stores
- Introducing Apache Iceberg: Tables Designed for Object Stores
- Iceberg: a fast table format for S3
Apache Hudi
Apache Hudi ingests and manages storage of large analytical datasets over DFS (HDFS or cloud stores) and provides three logical views for query access.