HiveClientImpl¶

HiveClientImpl is a HiveClient that uses a Hive metastore client to communicate with a Hive metastore.

Creating Instance¶

HiveClientImpl takes the following to be created:

HiveVersion
Metastore Warehouse Directory
SparkConf (Spark Core)
Hadoop Configuration (Iterable[Map.Entry[String, String]])
Extra Configuration (Map[String, String])
Init ClassLoader
IsolatedClientLoader

When created, HiveClientImpl prints out the following INFO message to the logs:

Warehouse location for Hive client (version [fullVersion]) is [the value of hive.metastore.warehouse.dir]

HiveClientImpl is created when:

IsolatedClientLoader is requested to create a HiveClient

Metastore Warehouse Directory¶

HiveClientImpl is given the directory of the default database of a Hive warehouse.

The directory is the value of the hive.metastore.warehouse.dir configuration property (default: /user/hive/warehouse).
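The lookup-with-default behaviour described above can be sketched in plain Scala. This is only an illustration, not how HiveClientImpl obtains the directory (it is handed the value when created). The object name WarehouseDirSketch and the helpers warehouseDir and entry are hypothetical, and the sketch assumes the Hadoop configuration is available as a Scala Iterable of java.util.Map.Entry pairs.

```scala
import java.util.{AbstractMap, Map => JMap}

// Hypothetical helper object, for illustration only (not part of Spark).
object WarehouseDirSketch {
  val WarehouseDirKey = "hive.metastore.warehouse.dir"
  val DefaultWarehouseDir = "/user/hive/warehouse"

  // Returns the configured warehouse directory, or the default when the
  // property is absent from the given configuration entries.
  def warehouseDir(hadoopConf: Iterable[JMap.Entry[String, String]]): String =
    hadoopConf
      .find(_.getKey == WarehouseDirKey)
      .map(_.getValue)
      .getOrElse(DefaultWarehouseDir)

  // Convenience constructor for a single configuration entry.
  def entry(key: String, value: String): JMap.Entry[String, String] =
    new AbstractMap.SimpleEntry[String, String](key, value)
}
```

With an empty configuration, warehouseDir falls back to /user/hive/warehouse. With an entry for hive.metastore.warehouse.dir, it returns that entry's value.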

Hive Metastore Client¶

client: Hive

client is a Hive metastore client (for metadata/DDL operations using calls to the metastore).

Creating CatalogStatistics¶

readHiveStats(
  properties: Map[String, String]): Option[CatalogStatistics]

readHiveStats creates a CatalogStatistics from the input Hive properties (with table and possibly partition parameters). readHiveStats uses the following Hive properties when they are available and greater than 0:

Hive Property	Table Statistic
`totalSize` or `rawDataSize`	sizeInBytes
`numRows`	rowCount
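The mapping above can be sketched in plain Scala without Spark on the classpath. This is a simplified sketch, not the actual Spark code: Stats is a hypothetical stand-in for CatalogStatistics reduced to the two fields in the table, readStats and positive are illustrative names, and the sketch assumes that totalSize takes precedence over rawDataSize and that no statistics are produced when neither size is usable.

```scala
// Hypothetical sketch of the property-to-statistics mapping (not Spark's code).
object ReadHiveStatsSketch {
  // Stand-in for CatalogStatistics with only the two fields listed above.
  final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt])

  // Reads a property as a BigInt and keeps it only when it is present,
  // non-empty and greater than 0. A non-numeric value throws
  // NumberFormatException.
  private def positive(properties: Map[String, String], key: String): Option[BigInt] =
    properties.get(key).filter(_.nonEmpty).map(BigInt(_)).filter(_ > 0)

  def readStats(properties: Map[String, String]): Option[Stats] = {
    val rowCount = positive(properties, "numRows")
    // Assumption: totalSize is preferred, rawDataSize is the fallback.
    positive(properties, "totalSize")
      .orElse(positive(properties, "rawDataSize"))
      .map(size => Stats(sizeInBytes = size, rowCount = rowCount))
  }
}
```

Under these assumptions, properties with totalSize=1024 and numRows=10 give a size of 1024 and a row count of 10, a totalSize of 0 falls back to rawDataSize, and properties with numRows alone give no statistics at all.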

readHiveStats is used when:

HiveClientImpl is requested for the metadata of a table or partition

convertHiveTableToCatalogTable¶

convertHiveTableToCatalogTable(
  h: Table): CatalogTable

convertHiveTableToCatalogTable creates a CatalogTable based on the given Hive Table as follows:

CatalogTable	Hive Table
Table Statistics	readHiveStats
...

convertHiveTableToCatalogTable is used when:

HiveClientImpl is requested to getRawHiveTableOption (and requests RawHiveTableImpl to toCatalogTable), getTablesByName, getTableOption

fromHivePartition¶

fromHivePartition(
  hp: HivePartition): CatalogTablePartition

fromHivePartition creates a CatalogTablePartition based on the given Hive partition (HivePartition).

fromHivePartition is used when:

HiveClientImpl is requested to getPartitionOption, getPartitions, getPartitionsByFilter

Logging¶

Enable ALL logging level for org.apache.spark.sql.hive.client.HiveClientImpl logger to see what happens inside.

Add the following lines to conf/log4j2.properties:

logger.HiveClientImpl.name = org.apache.spark.sql.hive.client.HiveClientImpl
logger.HiveClientImpl.level = all

Refer to Logging.