
HiveMetastoreCatalog — Legacy SessionCatalog for Converting Hive Metastore Relations to Data Source Relations

HiveMetastoreCatalog is a session-scoped catalog of relational entities that knows how to convert Hive metastore relations to data source relations.

HiveMetastoreCatalog is used by HiveSessionCatalog for the RelationConversions logical evaluation rule.

HiveMetastoreCatalog is created when HiveSessionStateBuilder is requested for a SessionCatalog (and creates a HiveSessionCatalog).

Figure: HiveMetastoreCatalog, HiveSessionCatalog and HiveSessionStateBuilder

Creating Instance

HiveMetastoreCatalog takes a SparkSession to be created.

Converting HiveTableRelation to LogicalRelation

convert(
  relation: HiveTableRelation): LogicalRelation


convert is used when:


convertToLogicalRelation(
  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormatClass: Class[_ <: FileFormat],
  fileType: String): LogicalRelation

convertToLogicalRelation branches based on whether the input HiveTableRelation is partitioned or not partitioned.

[[convertToLogicalRelation-partitioned]] When the HiveTableRelation is partitioned, convertToLogicalRelation uses the spark.sql.hive.manageFilesourcePartitions configuration property to compute the root paths. With the property enabled, the only root path is the table location (aka locationUri). Otherwise, the root paths are the locationUri of every partition (looked up using the shared ExternalCatalog).
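The root-path choice for a partitioned table can be sketched in plain Scala. This is a simplified model, not Spark's code: `Table`, `Partition` and `rootPaths` are hypothetical stand-ins, and the `manageFilesourcePartitions` flag plays the role of the spark.sql.hive.manageFilesourcePartitions property.

```scala
object RootPathsSketch {
  // Hypothetical stand-ins for catalog metadata (not Spark classes)
  final case class Partition(locationUri: String)
  final case class Table(locationUri: String, partitions: Seq[Partition])

  // manageFilesourcePartitions models spark.sql.hive.manageFilesourcePartitions
  def rootPaths(table: Table, manageFilesourcePartitions: Boolean): Seq[String] =
    if (manageFilesourcePartitions) {
      // Property enabled: the table location is the only root path
      Seq(table.locationUri)
    } else {
      // Property disabled: one root path per partition location
      table.partitions.map(_.locationUri)
    }
}
```

With the flag enabled, the partition locations never have to be listed up front, which is presumably the point of the property for tables with many partitions.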

convertToLogicalRelation creates a new LogicalRelation with a HadoopFsRelation (with no bucketing specification, among other things) unless a LogicalRelation for the table is already in the cache (see getCached).
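The reuse-or-create behaviour can be illustrated with a small model. It is only a sketch under stated assumptions: `Relation`, the mutable map and the `created` counter are hypothetical and stand in for LogicalRelation and the session-level table cache.

```scala
object RelationCacheSketch {
  import scala.collection.mutable

  // Hypothetical stand-in for a converted relation (not Spark's LogicalRelation)
  final case class Relation(table: String, bucketSpec: Option[String])

  private val cache = mutable.Map.empty[String, Relation]

  // Counts how many relations were actually built (for illustration only)
  var created = 0

  def convert(table: String): Relation =
    cache.getOrElseUpdate(table, {
      created += 1
      // A freshly created relation carries no bucketing specification
      Relation(table, bucketSpec = None)
    })
}
```

Converting the same table twice returns the same instance and builds the relation only once.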

[[convertToLogicalRelation-not-partitioned]] When the HiveTableRelation is not partitioned, convertToLogicalRelation...FIXME

In the end, convertToLogicalRelation replaces exprIds in the table relation output (schema).
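A minimal sketch of that last step, assuming the attributes of the converted relation take over the exprIds of the original HiveTableRelation output while keeping their own names. `Attr` and `alignExprIds` are hypothetical and much simpler than Catalyst's attribute classes.

```scala
object ExprIdSketch {
  // Hypothetical, simplified attribute: name, data type and expression id
  final case class Attr(name: String, dataType: String, exprId: Long)

  // Keep the names of the converted relation, but reuse the exprIds
  // of the Hive relation output, position by position
  def alignExprIds(converted: Seq[Attr], hive: Seq[Attr]): Seq[Attr] = {
    require(converted.length == hive.length, "outputs must have the same arity")
    converted.zip(hive).map { case (c, h) =>
      require(c.dataType == h.dataType, s"type mismatch for ${c.name}")
      c.copy(exprId = h.exprId)
    }
  }
}
```

Under this reading, anything in the plan that already refers to the Hive relation's attributes by exprId would keep resolving after the relation is swapped.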

NOTE: convertToLogicalRelation is used when the RelationConversions logical evaluation rule is executed (with Hive tables in parquet as well as native and hive ORC storage formats).
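Whether a Hive table qualifies for the conversion depends on its storage format. The following is a hedged sketch of such a check, assuming it is driven by the table's SerDe name and by the spark.sql.hive.convertMetastoreParquet and spark.sql.hive.convertMetastoreOrc configuration properties. `Conf` and `isConvertible` here are hypothetical, simplified stand-ins.

```scala
object ConvertibleSketch {
  import java.util.Locale

  // Hypothetical settings holder; the flags model
  // spark.sql.hive.convertMetastoreParquet and spark.sql.hive.convertMetastoreOrc
  final case class Conf(convertParquet: Boolean, convertOrc: Boolean)

  // A table is convertible when its SerDe names a supported format
  // and the matching property is enabled
  def isConvertible(serde: Option[String], conf: Conf): Boolean = {
    val s = serde.getOrElse("").toLowerCase(Locale.ROOT)
    (s.contains("parquet") && conf.convertParquet) ||
      (s.contains("orc") && conf.convertOrc)
  }
}
```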


  relation: HiveTableRelation,
  options: Map[String, String],
  fileFormat: FileFormat,
  fileIndexOpt: Option[FileIndex] = None): CatalogTable


=== [[getCached]] getCached Internal Method

[source, scala]
----
getCached(
  tableIdentifier: QualifiedTableName,
  pathsInMetastore: Seq[Path],
  schemaInMetastore: StructType,
  expectedFileFormat: Class[_ <: FileFormat],
  partitionSchema: Option[StructType]): Option[LogicalRelation]
----


NOTE: getCached is used when HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation (convertToLogicalRelation).
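Judging by its parameters, getCached compares a cached relation against the current metastore metadata before reusing it. The sketch below models one plausible reading: the entry is returned only if paths, schema, file format and partition schema all match, and is dropped otherwise. The `Cached` type, the string-based schema and the map-backed cache are assumptions made for illustration only.

```scala
object GetCachedSketch {
  import scala.collection.mutable

  // Hypothetical, simplified cache entry (not Spark's LogicalRelation)
  final case class Cached(
      paths: Set[String],
      schema: Seq[String],
      fileFormat: String,
      partitionSchema: Option[Seq[String]])

  val cache = mutable.Map.empty[String, Cached]

  def getCached(
      table: String,
      pathsInMetastore: Seq[String],
      schemaInMetastore: Seq[String],
      expectedFileFormat: String,
      partitionSchema: Option[Seq[String]]): Option[Cached] =
    cache.get(table) match {
      case Some(c)
          if c.paths == pathsInMetastore.toSet &&
            c.schema == schemaInMetastore &&
            c.fileFormat == expectedFileFormat &&
            c.partitionSchema == partitionSchema =>
        Some(c) // still consistent with the metastore: reuse it
      case Some(_) =>
        cache.remove(table) // stale entry: drop it so the caller rebuilds
        None
      case None =>
        None
    }
}
```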

=== [[internal-properties]] Internal Properties

[cols="30m,70",options="header",width="100%"]
|===
| Name
| Description

| catalogProxy
a| [[catalogProxy]] SessionCatalog (of the SparkSession).

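Used when HiveMetastoreCatalog is requested to look up a cached relation (getCached) and to convert a HiveTableRelation to a LogicalRelation (convertToLogicalRelation).
|===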