As is common in the open source world, Delta Lake is not the only project for building data lakes on top of Apache Spark. The following open source projects seem to compete with it or cover the same use cases.
Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to Presto and Spark that use a high-performance format that works just like a SQL table.
- ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scale Storage and Analytics
- Introducing Iceberg Tables designed for object stores
- Introducing Apache Iceberg: Tables Designed for Object Stores
- Iceberg: a fast table format for S3
Apache Hudi ingests and manages storage of large analytical datasets over DFS (HDFS or cloud stores) and provides three logical views for query access.
- Hoodie: An Open Source Incremental Processing Framework From Uber
- Powering Uber's global network analytics pipelines in real-time with Apache Hudi