ExternalCatalog¶
ExternalCatalog is an abstraction of external system catalogs (aka metadata registry or metastore) of permanent relational entities (i.e., databases, tables, partitions, and functions).
ExternalCatalog is available as ephemeral (in-memory) or persistent (hive-aware).
Contract¶
getPartition¶
getPartition(
  db: String,
  table: String,
  spec: TablePartitionSpec): CatalogTablePartition
CatalogTablePartition of a given table (in a database)
Used when:
- ExternalCatalogWithListener is requested to getPartition
- SessionCatalog is requested to getPartition
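The following is a minimal sketch of calling getPartition, assuming a local SparkSession and a hypothetical table named demo_get_partition. A TablePartitionSpec is a Map[String, String] from partition column name to value. No partition is registered in the catalog for this table, so the lookup is expected to fail (typically with a NoSuchPartitionException), which is why it is wrapped in Try.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("getPartition-demo")
  .getOrCreate()

// Hypothetical partitioned table, created only for this illustration.
spark.sql("DROP TABLE IF EXISTS demo_get_partition")
spark.sql("CREATE TABLE demo_get_partition (id INT, p INT) USING parquet PARTITIONED BY (p)")

val catalog = spark.sharedState.externalCatalog

// Partition spec: partition column name -> partition value (as a string).
val spec = Map("p" -> "1")

// getPartition throws when the partition is not registered in the catalog.
val lookup = Try(catalog.getPartition("default", "demo_get_partition", spec))
lookup match {
  case Success(partition) => println(s"Found partition: ${partition.spec}")
  case Failure(e)         => println(s"Lookup failed: ${e.getClass.getSimpleName}")
}

spark.sql("DROP TABLE IF EXISTS demo_get_partition")
```

When a missing partition is an expected case rather than an error, getPartitionOption avoids the exception handling.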
getPartitionOption¶
getPartitionOption(
  db: String,
  table: String,
  spec: TablePartitionSpec): Option[CatalogTablePartition]
CatalogTablePartition of a given table (in a database)
Used when:
- ExternalCatalogWithListener is requested to getPartitionOption
- InsertIntoHiveTable is requested to processInsert
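As a rough illustration of the Option-returning variant, the sketch below looks up a partition that has not been registered. The session setup and the table name demo_partition_option are assumptions made for the example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("getPartitionOption-demo")
  .getOrCreate()

// Hypothetical partitioned table, created only for this illustration.
spark.sql("DROP TABLE IF EXISTS demo_partition_option")
spark.sql("CREATE TABLE demo_partition_option (id INT, p INT) USING parquet PARTITIONED BY (p)")

val catalog = spark.sharedState.externalCatalog

// No partition p=42 was ever added, so None is the expected result here.
val maybePartition =
  catalog.getPartitionOption("default", "demo_partition_option", Map("p" -> "42"))

maybePartition match {
  case Some(partition) => println(s"Partition location: ${partition.storage.locationUri}")
  case None            => println("Partition p=42 is not registered in the catalog")
}

spark.sql("DROP TABLE IF EXISTS demo_partition_option")
```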
getTable¶
getTable(
  db: String,
  table: String): CatalogTable
CatalogTable of a given table (in a database)
Used when:
- ExternalCatalogWithListener is requested to getTable
- SessionCatalog is requested to alterTableDataSchema, getTableRawMetadata and lookupRelation
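A minimal sketch of getTable, assuming a local SparkSession and a hypothetical table demo_get_table in the default database. It prints a few fields of the returned CatalogTable.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("getTable-demo")
  .getOrCreate()

// Hypothetical table, created only for this illustration.
spark.sql("DROP TABLE IF EXISTS demo_get_table")
spark.sql("CREATE TABLE demo_get_table (id INT, name STRING) USING parquet")

// Fetch the table metadata straight from the external catalog.
val table = spark.sharedState.externalCatalog.getTable("default", "demo_get_table")

println(s"Identifier: ${table.identifier}")
println(s"Table type: ${table.tableType}")
println(s"Provider:   ${table.provider}")
println(s"Schema:     ${table.schema.simpleString}")

spark.sql("DROP TABLE IF EXISTS demo_get_table")
```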
getTablesByName¶
getTablesByName(
  db: String,
  tables: Seq[String]): Seq[CatalogTable]
CatalogTables of the given tables (in a database)
Used when:
- ExternalCatalogWithListener is requested to getTablesByName
- SessionCatalog is requested to getTablesByName
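The batch lookup can be sketched as follows, assuming a local SparkSession and two hypothetical tables (demo_orders and demo_customers) in the default database.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("getTablesByName-demo")
  .getOrCreate()

// Hypothetical tables, created only for this illustration.
val names = Seq("demo_orders", "demo_customers")
names.foreach { name =>
  spark.sql(s"DROP TABLE IF EXISTS $name")
  spark.sql(s"CREATE TABLE $name (id INT) USING parquet")
}

// One call returns the metadata of all requested tables.
val tables = spark.sharedState.externalCatalog.getTablesByName("default", names)
tables.foreach(t => println(s"${t.identifier.table}: ${t.schema.simpleString}"))

names.foreach(name => spark.sql(s"DROP TABLE IF EXISTS $name"))
```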
listPartitionsByFilter¶
listPartitionsByFilter(
  db: String,
  table: String,
  predicates: Seq[Expression],
  defaultTimeZoneId: String): Seq[CatalogTablePartition]
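CatalogTablePartitions of a given table (in a database) that satisfy the given partition-pruning predicates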
Used when:
- ExternalCatalogWithListener is requested to listPartitionsByFilter
- SessionCatalog is requested to listPartitionsByFilter
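A sketch of building the predicates argument by hand, assuming a local SparkSession and a hypothetical table demo_list_partitions partitioned by p. The predicate is a Catalyst expression that refers to the partition column by name, and the time zone is taken from the session configuration. No partitions are registered for this table, so an empty result is expected in this setup.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Expression, Literal}
import org.apache.spark.sql.types.IntegerType

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("listPartitionsByFilter-demo")
  .getOrCreate()

// Hypothetical partitioned table, created only for this illustration.
spark.sql("DROP TABLE IF EXISTS demo_list_partitions")
spark.sql("CREATE TABLE demo_list_partitions (id INT, p INT) USING parquet PARTITIONED BY (p)")

// Partition-pruning predicate: p = 1 (it may only reference partition columns).
val p = AttributeReference("p", IntegerType)()
val predicates: Seq[Expression] = Seq(EqualTo(p, Literal(1)))

// Time zone used to interpret partition values; here the session-local one.
val timeZoneId = spark.sessionState.conf.sessionLocalTimeZone

val partitions = spark.sharedState.externalCatalog
  .listPartitionsByFilter("default", "demo_list_partitions", predicates, timeZoneId)

println(s"Matching partitions: ${partitions.map(_.spec)}")

spark.sql("DROP TABLE IF EXISTS demo_list_partitions")
```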
Implementations¶
- ExternalCatalogWithListener
- HiveExternalCatalog
- InMemoryCatalog
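To see which implementation a session uses, the sketch below inspects the runtime class of the catalog. It assumes Spark 2.4 or later, where ExternalCatalogWithListener wraps another ExternalCatalog and exposes it as unwrapped; the session setup is an assumption made for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.catalog.{ExternalCatalog, ExternalCatalogWithListener}

// Assumption: a local session; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("implementations-demo")
  .getOrCreate()

val catalog: ExternalCatalog = spark.sharedState.externalCatalog
println(s"Catalog class:    ${catalog.getClass.getName}")

// Unwrap the listener-aware wrapper (if any) to reach the actual catalog,
// e.g. an InMemoryCatalog or a HiveExternalCatalog.
val underlying: ExternalCatalog = catalog match {
  case withListener: ExternalCatalogWithListener => withListener.unwrapped
  case other                                     => other
}
println(s"Underlying class: ${underlying.getClass.getName}")
```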
Accessing ExternalCatalog¶
ExternalCatalog is available as externalCatalog of SharedState (in SparkSession).
scala> :type spark
org.apache.spark.sql.SparkSession
scala> :type spark.sharedState.externalCatalog
org.apache.spark.sql.catalyst.catalog.ExternalCatalog