RelationConversions PostHoc Logical Evaluation Rule
RelationConversions is a HiveSessionStateBuilder.md#postHocResolutionRules[posthoc logical resolution rule] that the HiveSessionStateBuilder.md#analyzer[Hive-specific logical analyzer] uses to <<convert, convert>> HiveTableRelation.md[HiveTableRelation] logical operators (of parquet and ORC tables) to LogicalRelations.
CAUTION: FIXME Show example of a hive table, e.g. spark.table(...)
RelationConversions is <<creating-instance, created>> when the HiveSessionStateBuilder.md#analyzer[Hive-specific logical analyzer] is created.
=== [[creating-instance]] Creating RelationConversions Instance
RelationConversions takes the following when created:
- [[conf]] SQLConf
- [[sessionCatalog]] Hive-specific session catalog
=== [[apply]] Executing Rule -- apply Method
[source, scala]
----
apply(plan: LogicalPlan): LogicalPlan
----
NOTE: apply is part of the ../catalyst/Rule.md#apply[Rule] contract to execute (apply) a rule on a ../spark-sql-LogicalPlan.md[LogicalPlan].
apply traverses the input ../spark-sql-LogicalPlan.md[logical plan] looking for ../InsertIntoTable.md[InsertIntoTables] (over a HiveTableRelation.md[HiveTableRelation]) or HiveTableRelation.md[HiveTableRelation] logical operators:
[[apply-InsertIntoTable]] * For an ../InsertIntoTable.md[InsertIntoTable] over a HiveTableRelation.md[HiveTableRelation] that is HiveTableRelation.md#isPartitioned[non-partitioned] and <<isConvertible, convertible>> (i.e. uses the parquet or ORC storage format), apply creates a new InsertIntoTable with the HiveTableRelation <<convert, converted>> to a LogicalRelation
[[apply-HiveTableRelation]] * For a HiveTableRelation logical operator alone, apply...FIXME
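The two cases above can be sketched with simplified, hypothetical stand-ins for the Catalyst operators (the real rule pattern-matches on Catalyst's LogicalPlan and richer table metadata; all types and helpers below are illustration only):

```scala
// Hypothetical, simplified stand-ins for the Catalyst operators (sketch only).
sealed trait LogicalPlan
case class HiveTableRelation(serde: String, isPartitioned: Boolean) extends LogicalPlan
case class InsertIntoTable(table: LogicalPlan) extends LogicalPlan
case class LogicalRelation(fileFormat: String) extends LogicalPlan

// Stand-ins for the rule's isConvertible and convert helpers.
def isConvertible(r: HiveTableRelation): Boolean =
  r.serde.contains("parquet") || r.serde.contains("orc")
def convert(r: HiveTableRelation): LogicalRelation =
  LogicalRelation(if (r.serde.contains("parquet")) "parquet" else "orc")

// The two cases apply looks for while traversing the plan.
def apply(plan: LogicalPlan): LogicalPlan = plan match {
  case i @ InsertIntoTable(r: HiveTableRelation)
      if !r.isPartitioned && isConvertible(r) =>
    i.copy(table = convert(r)) // new InsertIntoTable over a LogicalRelation
  case r: HiveTableRelation if isConvertible(r) =>
    convert(r)                 // HiveTableRelation alone
  case other => other          // leave everything else untouched
}
```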
=== [[isConvertible]] Does Table Use Parquet or ORC SerDe? -- isConvertible Internal Method
[source, scala]
----
isConvertible(relation: HiveTableRelation): Boolean
----
isConvertible is positive (true) when the input HiveTableRelation.md#tableMeta[HiveTableRelation] is a parquet or ORC table (and the corresponding SQL configuration properties are enabled).
Internally, isConvertible takes the Hive SerDe of the table (from HiveTableRelation.md#tableMeta[table metadata]) if available or assumes no SerDe.
isConvertible is turned on when either condition holds:

* The Hive SerDe is `parquet` (aka _parquet table_) and the spark.sql.hive.convertMetastoreParquet configuration property is enabled
* The Hive SerDe is `orc` (aka _orc table_) and the spark.sql.hive.convertMetastoreOrc configuration property is enabled
NOTE: isConvertible is used when RelationConversions is <<apply, executed>>.
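A minimal self-contained sketch of the check follows. The signature is hypothetical: the real method reads the SerDe from the table metadata and the flags from SQLConf, while here they are plain parameters.

```scala
// Hypothetical sketch of isConvertible: the SerDe comes from the table
// metadata (None means no SerDe is set) and the two flags mirror the
// spark.sql.hive.convertMetastoreParquet / convertMetastoreOrc properties.
def isConvertible(serde: Option[String],
                  convertMetastoreParquet: Boolean,
                  convertMetastoreOrc: Boolean): Boolean = {
  val s = serde.map(_.toLowerCase).getOrElse("") // assume no SerDe if unavailable
  (s.contains("parquet") && convertMetastoreParquet) ||
  (s.contains("orc") && convertMetastoreOrc)
}
```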
=== [[convert]] Converting HiveTableRelation to HadoopFsRelation -- convert Internal Method
[source, scala]
----
convert(relation: HiveTableRelation): LogicalRelation
----
convert branches based on the SerDe of (the storage format of) the input HiveTableRelation logical operator.
For Hive tables in parquet format, convert creates options with one extra mergeSchema per spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property and requests the HiveMetastoreCatalog to convert a HiveTableRelation to a LogicalRelation (with ParquetFileFormat).
For non-parquet Hive tables, convert assumes ORC format:

* When the spark.sql.orc.impl configuration property is `native` (the default), convert requests HiveMetastoreCatalog to HiveMetastoreCatalog.md#convertToLogicalRelation[convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation] (with `org.apache.spark.sql.execution.datasources.orc.OrcFileFormat` as `fileFormatClass`)
* Otherwise, convert requests HiveMetastoreCatalog to HiveMetastoreCatalog.md#convertToLogicalRelation[convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation] (with `org.apache.spark.sql.hive.orc.OrcFileFormat` as `fileFormatClass`)
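The branching above boils down to picking a `fileFormatClass`. A sketch of that choice (the helper name is hypothetical; the class names are the ones the text mentions):

```scala
// Sketch: which FileFormat class convert hands to HiveMetastoreCatalog,
// based on the table's SerDe and the spark.sql.orc.impl setting.
def fileFormatClassName(serde: String, orcImpl: String): String =
  if (serde.toLowerCase.contains("parquet"))
    "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat"
  else if (orcImpl == "native") // the default
    "org.apache.spark.sql.execution.datasources.orc.OrcFileFormat"
  else                          // e.g. orcImpl == "hive"
    "org.apache.spark.sql.hive.orc.OrcFileFormat"
```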
NOTE: convert uses the <<sessionCatalog, Hive-specific session catalog>> to access the HiveMetastoreCatalog.
[NOTE]
====
convert is used when RelationConversions logical evaluation rule is <<apply, executed>>, i.e. when it:

* Transforms an ../InsertIntoTable.md[InsertIntoTable] over a HiveTableRelation.md[HiveTableRelation] with a Hive table (i.e. with `hive` provider) that is not partitioned and uses `parquet` or `orc` data storage format
* Transforms a HiveTableRelation.md[HiveTableRelation] alone
====