
LogicalRelation Leaf Logical Operator

LogicalRelation is a leaf logical operator that represents a BaseRelation in a logical query plan.

LogicalRelation is a MultiInstanceRelation.

Creating Instance

LogicalRelation takes the following to be created:

LogicalRelation is created using the apply factory.

apply Utility

apply(
  relation: BaseRelation,
  isStreaming: Boolean = false): LogicalRelation
apply(
  relation: BaseRelation,
  table: CatalogTable): LogicalRelation

apply wraps the given BaseRelation into a LogicalRelation (so it can be used in a logical query plan).

The first variant creates a LogicalRelation with no CatalogTable and the given isStreaming flag. The second variant creates a LogicalRelation with the given CatalogTable and the isStreaming flag off.

import org.apache.spark.sql.sources.BaseRelation
val baseRelation: BaseRelation = ???

val data = spark.baseRelationToDataFrame(baseRelation)
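
To see what apply fills in by default, the following is a minimal sketch that wraps a hand-written BaseRelation directly. DemoRelation and its single-column schema are hypothetical and exist only for illustration; the sketch also assumes Spark 3.x on the classpath and a local SparkSession.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.{LongType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// DemoRelation is a hypothetical relation used for illustration only.
// It declares a schema but cannot be scanned (it is not a TableScan).
class DemoRelation(session: SparkSession) extends BaseRelation {
  override def sqlContext: SQLContext = session.sqlContext
  override def schema: StructType = StructType(Seq(StructField("id", LongType)))
}

// Uses apply(relation, isStreaming = false)
val logicalRelation = LogicalRelation(new DemoRelation(spark))

// No CatalogTable, not streaming, output attributes derived from the schema
assert(logicalRelation.catalogTable.isEmpty)
assert(!logicalRelation.isStreaming)
assert(logicalRelation.output.map(_.name) == Seq("id"))
```

The resulting operator can be placed in a logical plan, but executing a query over it would need a relation that also supports scanning (for example, one that mixes in TableScan).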

apply is used when:


refresh(): Unit

refresh is part of the LogicalPlan abstraction.

refresh requests the FileIndex (of the HadoopFsRelation) to refresh.


refresh does the work for HadoopFsRelation relations only.
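
As a rough illustration of the above, the sketch below reads a file-based dataset, finds the LogicalRelation in the analyzed plan, and calls refresh on it. The temporary directory and the generated data are assumptions made for the example, and it presumes parquet is handled by the V1 file source (the default in Spark 3.x).

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Write a small parquet dataset to a temporary directory (illustrative data only)
val dir = Files.createTempDirectory("logical-relation-demo").resolve("data").toString
spark.range(3).write.parquet(dir)

// Find the LogicalRelation in the analyzed plan of a file-based query
val relations = spark.read.parquet(dir).queryExecution.analyzed.collect {
  case lr: LogicalRelation => lr
}
val lr = relations.head

// The relation is file-based, so refresh has a FileIndex to refresh
assert(lr.relation.isInstanceOf[HadoopFsRelation])
lr.refresh()
```

For a LogicalRelation over any other kind of BaseRelation, the same call would simply do nothing.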

Simple Text Representation

simpleString(
  maxFields: Int): String

simpleString is part of the QueryPlan abstraction.

simpleString is made up of the output schema (truncated to maxFields) and the relation:

Relation[[output]] [relation]


val path: String = ???  // path to a text file (elided in the original)
val q = spark.read.text(path)
val logicalPlan = q.queryExecution.logical

scala> println(logicalPlan.simpleString(maxFields = 10))
Relation[value#2] text


computeStats(): Statistics

computeStats takes the optional CatalogTable.

If a CatalogTable is available, computeStats requests it for the CatalogStatistics. If these are available too, computeStats requests them to toPlanStats (with the planStatsEnabled flag on when either spark.sql.cbo.enabled or spark.sql.cbo.planStats.enabled is enabled).

Otherwise, computeStats creates a Statistics with only sizeInBytes set (to the sizeInBytes of the BaseRelation).

computeStats is part of the LeafNode abstraction.
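
The fallback branch can be observed with a path-based read, which carries no CatalogTable. This is a minimal sketch under the same assumptions as any local experiment: a local SparkSession, a temporary directory, and generated data, none of which come from the original text.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.LogicalRelation

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Write a small parquet dataset to a temporary directory (illustrative data only)
val statsDir = Files.createTempDirectory("logical-relation-stats").resolve("data").toString
spark.range(3).write.parquet(statsDir)

val statsRelation = spark.read.parquet(statsDir).queryExecution.analyzed.collectFirst {
  case r: LogicalRelation => r
}.get

// Path-based reads have no CatalogTable, so there are no catalog statistics to use
assert(statsRelation.catalogTable.isEmpty)

// The statistics then carry only the size reported by the BaseRelation
val stats = statsRelation.computeStats()
assert(stats.sizeInBytes == BigInt(statsRelation.relation.sizeInBytes))
assert(stats.rowCount.isEmpty)
```

A table registered in the catalog with collected statistics would take the other branch and could report a row count as well, depending on the two configuration properties.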


The following are two logically-equivalent batch queries described using different Spark APIs: Scala and SQL.

val format = "csv"
val path = "../datasets/people.csv"

// Scala API
val q = spark
  .read
  .format(format)
  .option("header", true)
  .load(path)
scala> println(q.queryExecution.logical.numberedTreeString)
00 Relation[id#16,name#17] csv

// SQL
val q = sql(s"select * from `$format`.`$path`")
scala> println(q.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#74,_c1#75] csv