TreeNode — Node in Catalyst Tree¶
TreeNode is an abstraction of named nodes in Catalyst with zero, one or more children.
Contract¶
children¶
children: Seq[BaseType]
Zero, one or more child nodes of the node
simpleStringWithNodeId¶
simpleStringWithNodeId(): String
One-line description of this node with the node identifier
Used when:
- TreeNode is requested to generateTreeString (with node ID)
verboseString¶
verboseString(
maxFields: Int): String
One-line verbose description
Used when TreeNode is requested to verboseStringWithSuffix and generateTreeString (with verbose flag enabled)
Implementations¶
- Block
- Expression
- QueryPlan
Simple Description¶
simpleString: String
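simpleString gives a simple one-line description of a TreeNode.
Internally, simpleString is the nodeName followed by the argString, separated by a single space.
simpleString is used when TreeNode is requested for the text representation of the tree (with verbose flag off).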
Numbered String Representation¶
numberedTreeString: String
numberedTreeString adds numbers to the string representation of this node tree.
numberedTreeString is used primarily for interactive debugging (using apply and p methods).
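The numbering can be pictured as prefixing every line of the tree string with its position. The following is a minimal sketch, assuming the numbers are zero-based, two-digit line indices; the helper name numbered and the sample plan text are made up for illustration and are not part of Spark.

```scala
// Hypothetical helper (not a Spark API): prefix every line of a tree string
// with its zero-based, two-digit line number.
def numbered(treeString: String): String =
  treeString
    .split("\n")
    .zipWithIndex
    .map { case (line, i) => f"$i%02d $line" }
    .mkString("\n")

// Made-up two-line plan text, only for illustration.
val plan = "Project [id]\n+- Range (0, 10)"
val out = numbered(plan)
// out is:
// 00 Project [id]
// 01 +- Range (0, 10)
```

The numbers are what apply and p take as their number argument.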
Getting n-th TreeNode in Tree (for Interactive Debugging)¶
apply(
number: Int): TreeNode[_]
apply gives the number-th tree node in a tree.
apply can be used for interactive debugging.
Internally, apply gets the node at the number position or null.
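A lookup by position can be sketched with a toy tree. This assumes positions follow a pre-order walk (the order in which the tree is printed); Node and preOrder are hypothetical names, and the real TreeNode also has to account for inner children, which this sketch ignores.

```scala
// Hypothetical minimal tree, not Catalyst's TreeNode.
case class Node(name: String, children: Seq[Node] = Nil) {
  // All nodes in pre-order: this node first, then every child subtree.
  def preOrder: Seq[Node] = this +: children.flatMap(_.preOrder)

  // The node at the given position, or null when there is no such position.
  def apply(number: Int): Node = preOrder.lift(number).orNull
}

val tree = Node("Join", Seq(
  Node("Filter", Seq(Node("ScanA"))),
  Node("ScanB")))

val third = tree(2)    // Node("ScanA")
val missing = tree(10) // null
```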
Getting n-th BaseType in Tree (for Interactive Debugging)¶
p(
number: Int): BaseType
p gives the number-th tree node in a tree as BaseType.
Note
p can be used for interactive debugging.
BaseType is the base type of a tree and in Spark SQL can be:
- LogicalPlan for logical plan trees
- SparkPlan for physical plan trees
- Expression for expression trees
String Representation¶
toString: String
toString is part of Java's java.lang.Object for the string representation of an object, e.g. TreeNode.
toString is a synonym of treeString.
String Representation of All Nodes in Tree¶
treeString: String // (1)
treeString(
verbose: Boolean,
addSuffix: Boolean = false,
maxFields: Int = SQLConf.get.maxToStringFields,
printOperatorId: Boolean = false): String
treeString(
append: String => Unit,
verbose: Boolean,
addSuffix: Boolean,
maxFields: Int,
printOperatorId: Boolean): Unit
(1) verbose flag is enabled (true)
printOperatorId
printOperatorId argument is false by default and seems turned on only when:
- ExplainUtils utility is used to processPlanSkippingSubqueries
treeString returns the string representation of all the nodes in the TreeNode.
treeString is used when:
- QueryPlan is requested to append
- TreeNode is requested for a string representation and numbered string representation
Demo¶
import org.apache.spark.sql.{functions => f}
val q = spark.range(10).withColumn("rand", f.rand())
val executedPlan = q.queryExecution.executedPlan
val output = executedPlan.treeString(verbose = true)
scala> println(output)
*(1) Project [id#0L, rand(6790207094253656854) AS rand#2]
+- *(1) Range (0, 10, step=1, splits=8)
Verbose Description with Suffix¶
verboseStringWithSuffix: String
verboseStringWithSuffix simply returns verboseString.
verboseStringWithSuffix is used when TreeNode is requested to generateTreeString (with verbose and addSuffix flags enabled).
Generating Text Representation¶
generateTreeString(
depth: Int,
lastChildren: Seq[Boolean],
append: String => Unit,
verbose: Boolean,
prefix: String = "",
addSuffix: Boolean = false,
maxFields: Int,
printNodeId: Boolean,
indent: Int = 0): Unit
generateTreeString...FIXME
generateTreeString is used when:
- TreeNode is requested for the text representation of all nodes in the tree
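To give a feel for how the depth, lastChildren and append parameters could cooperate, here is a minimal sketch on a toy tree. It is an assumption about the general technique (one prefix column per ancestor, then a connector for the node itself), not Spark's actual implementation; Node is a hypothetical class and the remaining parameters (verbose, prefix, addSuffix, maxFields, printNodeId, indent) are left out.

```scala
// Hypothetical minimal tree, not Catalyst's TreeNode.
case class Node(name: String, children: Seq[Node] = Nil) {
  def generateTreeString(
      depth: Int,
      lastChildren: Seq[Boolean],
      append: String => Unit): Unit = {
    if (depth > 0) {
      // One column per ancestor: blank if that ancestor was a last child,
      // a vertical rail otherwise.
      lastChildren.init.foreach(isLast => append(if (isLast) "   " else ":  "))
      // Connector for this node itself.
      append(if (lastChildren.last) "+- " else ":- ")
    }
    append(name)
    append("\n")
    if (children.nonEmpty) {
      // Every child but the last keeps the rail open for its siblings.
      children.init.foreach(_.generateTreeString(depth + 1, lastChildren :+ false, append))
      children.last.generateTreeString(depth + 1, lastChildren :+ true, append)
    }
  }
}

val tree = Node("Join", Seq(
  Node("Filter", Seq(Node("ScanA"))),
  Node("ScanB")))

val sb = new StringBuilder
tree.generateTreeString(0, Nil, s => sb.append(s))
val rendered = sb.toString
// rendered is:
// Join
// :- Filter
// :  +- ScanA
// +- ScanB
```

Passing an append function instead of returning a String lets the caller decide where the text goes (a buffer, a writer, a size-limited sink).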
Inner Child Nodes¶
innerChildren: Seq[TreeNode[_]]
innerChildren returns the inner nodes that should be shown as an inner nested tree of this node.
innerChildren simply returns an empty collection of TreeNodes.
innerChildren is used when TreeNode is requested to generateTreeString.
allChildren¶
allChildren: Set[TreeNode[_]]
NOTE: allChildren is a Scala lazy value which is computed once when accessed and cached afterwards.
allChildren...FIXME
allChildren is used when...FIXME
foreach¶
foreach(f: BaseType => Unit): Unit
foreach applies the input function f to itself (this) first and then (recursively) to the children.
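The traversal order described above (the node itself, then each child subtree) is a pre-order walk. A minimal sketch on a toy tree, where Node is a hypothetical stand-in and not Catalyst's TreeNode:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical minimal tree, not Catalyst's TreeNode.
case class Node(name: String, children: Seq[Node] = Nil) {
  // Apply f to this node first, then recursively to every child subtree.
  def foreach(f: Node => Unit): Unit = {
    f(this)
    children.foreach(_.foreach(f))
  }
}

val tree = Node("Join", Seq(
  Node("Filter", Seq(Node("ScanA"))),
  Node("ScanB")))

// Record the order in which nodes are visited.
val visited = ArrayBuffer.empty[String]
tree.foreach(n => visited += n.name)
// visited: Join, Filter, ScanA, ScanB
```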
Node Name¶
nodeName: String
nodeName returns the name of the class with the Exec suffix removed (the suffix is a naming convention for the class names of physical operators).
nodeName is used when:
- TreeNode is requested for simpleString and asCode
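The suffix handling can be illustrated on plain class-name strings. This is a sketch only: stripExec is a hypothetical helper, the class names are examples, and how TreeNode obtains the simple class name is not covered here.

```scala
// Hypothetical helper (not a Spark API): drop a trailing "Exec" from a
// simple class name. The regex is anchored at the end of the string, so
// "Exec" elsewhere in the name is left alone.
def stripExec(simpleClassName: String): String =
  simpleClassName.replaceAll("Exec$", "")

val physical = stripExec("FilterExec")  // "Filter"
val logical = stripExec("Project")      // "Project" (no suffix to remove)
val inside = stripExec("ExecutorNode")  // "ExecutorNode" (not a suffix)
```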
Scala Definition¶
abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
self: BaseType =>
// ...
}
TreeNode is a recursive data structure that can have one or many children that are again TreeNodes.
Tip
Read up on the <: type operator in Scala in Upper Type Bounds.
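The combination of the upper type bound and the self-type can be tried out on a reduced copy of the definition. In this sketch MiniNode, Expr, Lit and Add are hypothetical names introduced for illustration; only the shape of the class header mirrors the definition above.

```scala
// Simplified stand-in for the TreeNode definition (hypothetical name: MiniNode).
abstract class MiniNode[BaseType <: MiniNode[BaseType]] extends Product {
  self: BaseType =>

  // Children are typed as BaseType, so a tree stays within one family of nodes.
  def children: Seq[BaseType]
}

// A toy expression tree; Expr plays the role of BaseType.
sealed abstract class Expr extends MiniNode[Expr]

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
}

val tree = Add(Lit(1), Add(Lit(2), Lit(3)))
val firstChild = tree.children.head // Lit(1), statically an Expr
```

Because of the bound, children of an Expr are known to be Expr at compile time, so no casts are needed when walking the tree.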
Scala-specific, TreeNode is an abstract class that is the base class of the implementations (e.g. Expression and QueryPlan).
TreeNode therefore allows for building entire trees of TreeNodes, e.g. generic query plans with logical and physical operators that use Catalyst expressions (which are TreeNodes again).
NOTE: Spark SQL uses TreeNode for query plans and Catalyst expressions.
TreeNode can itself be a node in a tree or a collection of nodes, i.e. itself and the children nodes. TreeNode comes with methods known from the Scala Collection API (e.g. foreach).