ExternalCatalog¶
ExternalCatalog is an abstraction of external system catalogs (aka metadata registry or metastore) of permanent relational entities (i.e., databases, tables, partitions, and functions).
ExternalCatalog is available as ephemeral (in-memory) or persistent (Hive-aware).
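The choice between the two flavours can be sketched as follows. This is a minimal example that assumes Spark 3.x on the classpath and the static configuration property spark.sql.catalogImplementation (with values in-memory and hive), which is not named in the text above. The application name is arbitrary.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: no SparkSession exists yet in this JVM. Static configuration
// such as spark.sql.catalogImplementation is ignored by getOrCreate()
// when a session is already running (e.g. in spark-shell).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("external-catalog-flavour")
  // "in-memory" selects the ephemeral catalog;
  // "hive" selects the persistent one and needs Hive classes on the classpath.
  .config("spark.sql.catalogImplementation", "in-memory")
  .getOrCreate()

// The static property can be read back at runtime.
val impl = spark.conf.get("spark.sql.catalogImplementation")
println(impl)
```

Calling enableHiveSupport() on the builder is the usual way to request the Hive-aware catalog instead of setting the property by hand.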
Contract¶
getPartition¶
getPartition(
  db: String,
  table: String,
  spec: TablePartitionSpec): CatalogTablePartition
CatalogTablePartition of a given table (in a database)
Used when:
- ExternalCatalogWithListener is requested to getPartition
- SessionCatalog is requested to getPartition
getPartitionOption¶
getPartitionOption(
  db: String,
  table: String,
  spec: TablePartitionSpec): Option[CatalogTablePartition]
CatalogTablePartition of a given table (in a database), or None if there is no such partition
Used when:
- ExternalCatalogWithListener is requested to getPartitionOption
- InsertIntoHiveTable is requested to processInsert
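The difference between getPartition and getPartitionOption can be illustrated with a small sketch. It assumes Spark 3.x, the in-memory catalog, and a hypothetical table demo_parts in the default database. The session settings and the temporary warehouse directory are illustration-only choices.

```scala
import java.nio.file.Files
import scala.util.Try
import org.apache.spark.sql.SparkSession

// Assumption: a fresh local session backed by the in-memory catalog and
// a throwaway warehouse directory.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("get-partition-demo")
  .config("spark.sql.catalogImplementation", "in-memory")
  .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
  .getOrCreate()

// Hypothetical demo table with a single partition column p.
spark.sql("CREATE TABLE demo_parts (id INT, p INT) USING parquet PARTITIONED BY (p)")
spark.sql("ALTER TABLE demo_parts ADD PARTITION (p = 1)")

val catalog = spark.sharedState.externalCatalog

// TablePartitionSpec is a type alias of Map[String, String].
val existing = Map("p" -> "1")
val missing = Map("p" -> "99")

// getPartition returns the partition or throws when it does not exist.
val found = catalog.getPartition("default", "demo_parts", existing)
val failed = Try(catalog.getPartition("default", "demo_parts", missing))

// getPartitionOption wraps the same lookup in an Option.
val maybeFound = catalog.getPartitionOption("default", "demo_parts", existing)
val maybeMissing = catalog.getPartitionOption("default", "demo_parts", missing)

println(found.spec)
println(maybeMissing.isEmpty)
```

getPartitionOption is the more convenient of the two when a missing partition is an expected case rather than an error.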
getTable¶
getTable(
  db: String,
  table: String): CatalogTable
CatalogTable of a given table (in a database)
Used when:
- ExternalCatalogWithListener is requested to getTable
- SessionCatalog is requested to alterTableDataSchema, getTableRawMetadata and lookupRelation
getTablesByName¶
getTablesByName(
  db: String,
  tables: Seq[String]): Seq[CatalogTable]
CatalogTables of the given tables (in a database)
Used when:
- ExternalCatalogWithListener is requested to getTablesByName
- SessionCatalog is requested to getTablesByName
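A short sketch of getTable and getTablesByName follows. It assumes Spark 3.0 or later (where getTablesByName is available), the in-memory catalog, and two hypothetical tables demo_a and demo_b in the default database.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: a fresh local session backed by the in-memory catalog and
// a throwaway warehouse directory.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("get-table-demo")
  .config("spark.sql.catalogImplementation", "in-memory")
  .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
  .getOrCreate()

// Hypothetical demo tables.
spark.sql("CREATE TABLE demo_a (id INT) USING parquet")
spark.sql("CREATE TABLE demo_b (name STRING) USING parquet")

val catalog = spark.sharedState.externalCatalog

// A single table's metadata.
val a = catalog.getTable("default", "demo_a")
println(a.identifier.table)
println(a.schema.fieldNames.mkString(", "))

// Metadata of several tables in one call.
val both = catalog.getTablesByName("default", Seq("demo_a", "demo_b"))
val names = both.map(_.identifier.table).toSet
println(names)
```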
listPartitionsByFilter¶
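listPartitionsByFilter(
  db: String,
  table: String,
  predicates: Seq[Expression],
  defaultTimeZoneId: String): Seq[CatalogTablePartition]
CatalogTablePartitions of a given table (in a database) that satisfy the given partition-pruning predicates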
Used when:
- ExternalCatalogWithListener is requested to listPartitionsByFilter
- SessionCatalog is requested to listPartitionsByFilter
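The predicates are Catalyst expressions over the partition columns. The sketch below is one way to build them by hand. It assumes Spark 3.x, the in-memory catalog, and a hypothetical table demo_filter. The time zone is read from spark.sql.session.timeZone as an illustration, not as the only valid source.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, GreaterThan, Literal}
import org.apache.spark.sql.types.IntegerType

// Assumption: a fresh local session backed by the in-memory catalog and
// a throwaway warehouse directory.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("list-partitions-by-filter-demo")
  .config("spark.sql.catalogImplementation", "in-memory")
  .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
  .getOrCreate()

// Hypothetical demo table with three partitions: p = 1, 2, 3.
spark.sql("CREATE TABLE demo_filter (id INT, p INT) USING parquet PARTITIONED BY (p)")
Seq(1, 2, 3).foreach { v =>
  spark.sql(s"ALTER TABLE demo_filter ADD PARTITION (p = $v)")
}

// A resolved reference to the partition column; the predicate is p > 1.
val p = AttributeReference("p", IntegerType)()
val predicates = Seq(GreaterThan(p, Literal(1)))

// Time zone used when partition values are converted to typed values.
val timeZoneId = spark.conf.get("spark.sql.session.timeZone")

val matching = spark.sharedState.externalCatalog
  .listPartitionsByFilter("default", "demo_filter", predicates, timeZoneId)

// Partition values are kept as strings in the partition spec.
val values = matching.map(_.spec("p")).sorted
println(values)
```

In regular queries the predicates come from the query optimizer, so building them manually like this is mostly useful for exploration and tests.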
Implementations¶
- ExternalCatalogWithListener
- HiveExternalCatalog
- InMemoryCatalog
Accessing ExternalCatalog¶
ExternalCatalog is available as externalCatalog of SharedState (in SparkSession).
scala> :type spark
org.apache.spark.sql.SparkSession
scala> :type spark.sharedState.externalCatalog
org.apache.spark.sql.catalyst.catalog.ExternalCatalog
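To find out which implementation backs a session, the catalog can be inspected at runtime. The sketch below assumes Spark 2.4 or later, where the catalog handed out by SharedState is an ExternalCatalogWithListener that delegates to the concrete catalog and exposes it as unwrapped. The session settings are illustration-only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.catalog.{ExternalCatalog, ExternalCatalogWithListener, InMemoryCatalog}

// Assumption: a fresh local session backed by the in-memory catalog.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("unwrap-external-catalog")
  .config("spark.sql.catalogImplementation", "in-memory")
  .getOrCreate()

val catalog: ExternalCatalog = spark.sharedState.externalCatalog

// Peel off the listener wrapper (if any) to reach the concrete catalog.
val concrete: ExternalCatalog = catalog match {
  case withListener: ExternalCatalogWithListener => withListener.unwrapped
  case other => other
}

println(concrete.getClass.getName)
```

With Hive support enabled, the same code would be expected to report HiveExternalCatalog instead.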