Unity Catalog

Unity Catalog is an open-source universal catalog for data and AI assets.

Unity Catalog is made up of two runtimes, a server and a CLI. They can be started from the command line using the bin/start-uc-server and bin/uc shell scripts, respectively.
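As a rough illustration, the two scripts could be driven from Python as sketched below. Only the script names bin/start-uc-server and bin/uc come from the text above. The helper names (uc_command, start_server, run_cli), the checkout directory passed on the command line, and the "catalog list" CLI arguments are assumptions made for this sketch.

```python
import subprocess
import sys
from pathlib import Path


def uc_command(uc_home, script, *args):
    """Build the argument list for a script under <uc_home>/bin (hypothetical helper)."""
    return [str(Path(uc_home) / "bin" / script), *args]


def start_server(uc_home):
    """Launch bin/start-uc-server in the background and return the process handle."""
    return subprocess.Popen(uc_command(uc_home, "start-uc-server"), cwd=uc_home)


def run_cli(uc_home, *args):
    """Run bin/uc with the given arguments and return whatever it prints to stdout."""
    result = subprocess.run(
        uc_command(uc_home, "uc", *args),
        cwd=uc_home,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Assumption: the first argument is the directory of a Unity Catalog checkout.
    home = sys.argv[1]
    server = start_server(home)
    try:
        # Assumption: "catalog list" is accepted by the CLI; adjust to your version.
        # The server may need a few seconds before it accepts requests.
        print(run_cli(home, "catalog", "list"))
    finally:
        server.terminate()
```

Running the scripts with the checkout as the working directory keeps any relative paths they rely on intact, which is why both helpers pass cwd.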

Different Kinds of Catalogs

As Jason Reid posted on the unity-catalog Slack channel (quoted with some styling changes):

First, a bit of clarification on the term "catalog" which is unfortunately overloaded. It is common to differentiate between so-called "business catalogs" and "operational catalogs".

Operational catalogs are what query engines use to read and write data. They track table schema and state and are often involved in transaction management.

Examples: Hive Metastore, Unity Catalog, AWS Glue, Polaris

Business catalogs are designed primarily for discovery. They typically aggregate metadata from a wide variety of data systems and make it easy to search.

Examples: Atlan, DataHub, Alation

For security, that is typically enforced by a combination of an operational catalog (where policy is defined) and the query engine (where policy is enforced).

There are some solutions for policy management which can push policy into multiple systems.

Examples: Immuta, Privacera

Learning Resources

  1. Open Sourcing Unity Catalog
  2. Getting Started with X-Table and Unity Catalog | Universal Datalakes | Hands on Labs with an accompanying video on YouTube
  3. Unitycatalog: the first look