DataSourceV2ScanExecBase Leaf Physical Operators¶
DataSourceV2ScanExecBase is an extension of the LeafExecNode abstraction for leaf physical operators that track the number of output rows when executed (with or without support for columnar reads).
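The row-tracking idea can be illustrated without Spark. The sketch below is plain Scala and is not Spark's implementation: `Batch`, `RowTracking`, `countRows` and `countBatchRows` are hypothetical names introduced here, and an `AtomicLong` merely stands in for an output-row metric. It shows one way to count rows as they pass through, for both a row-based and a columnar path.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a columnar batch that knows how many rows it holds.
final case class Batch(numRows: Int)

object RowTracking {
  // Row-based path: pass every row through unchanged and count it.
  def countRows[T](rows: Iterator[T], numOutputRows: AtomicLong): Iterator[T] =
    rows.map { row =>
      numOutputRows.incrementAndGet()
      row
    }

  // Columnar path: pass every batch through unchanged and add its row count.
  def countBatchRows(batches: Iterator[Batch], numOutputRows: AtomicLong): Iterator[Batch] =
    batches.map { batch =>
      numOutputRows.addAndGet(batch.numRows.toLong)
      batch
    }
}
```

Because iterators are lazy, the counter in this sketch only advances as the output is consumed, so the count reflects rows actually produced rather than rows available.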
Contract¶
Input RDD¶
inputRDD: RDD[InternalRow]
Used when...FIXME
InputPartitions¶
partitions: Seq[InputPartition]
Used when:
- BatchScanExec physical operator is requested for an input RDD
- ContinuousScanExec and MicroBatchScanExec physical operators (from Spark Structured Streaming) are requested for an inputRDD
- DataSourceV2ScanExecBase physical operator is requested to outputPartitioning or supportsColumnar
PartitionReaderFactory¶
readerFactory: PartitionReaderFactory
PartitionReaderFactory for partition readers
Used when:
- BatchScanExec physical operator is requested for an input RDD
- ContinuousScanExec and MicroBatchScanExec physical operators (from Spark Structured Streaming) are requested for an inputRDD
- DataSourceV2ScanExecBase physical operator is requested to outputPartitioning or supportsColumnar
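Since supportsColumnar depends on both the input partitions and the reader factory, one plausible shape for such a check is to ask the factory about every partition and insist on a uniform answer. The sketch below is plain Scala built on that assumption; it is not a quote of Spark's code. `Part`, `ReaderFactory`, `supportsColumnarReads` and `ColumnarCheck` are hypothetical stand-ins, not Spark's InputPartition or PartitionReaderFactory, and the rejection of mixed partitions is itself an assumption of the sketch.

```scala
// Hypothetical stand-ins, not Spark's InputPartition or PartitionReaderFactory.
final case class Part(id: Int, columnar: Boolean)

trait ReaderFactory {
  def supportsColumnarReads(p: Part): Boolean
}

object ColumnarCheck {
  // Assumed logic: true when every partition can be read in columnar form,
  // false when none can (or there are no partitions), and an
  // IllegalArgumentException (via require) when the two kinds are mixed.
  def supportsColumnar(parts: Seq[Part], factory: ReaderFactory): Boolean = {
    val (columnar, rowBased) = parts.partition(p => factory.supportsColumnarReads(p))
    require(columnar.isEmpty || rowBased.isEmpty,
      "Cannot mix row-based and columnar input partitions.")
    columnar.nonEmpty
  }
}
```

Failing fast on a mix keeps a single operator from having to produce both row-based and columnar output in this sketch.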
Scan¶
scan: Scan
Used when...FIXME
Implementations¶
- BatchScanExec
- others
Executing Physical Operator¶
doExecute(): RDD[InternalRow]
doExecute is part of the SparkPlan abstraction.
doExecute...FIXME
doExecuteColumnar¶
doExecuteColumnar(): RDD[ColumnarBatch]
doExecuteColumnar is part of the SparkPlan abstraction.
doExecuteColumnar...FIXME
inputRDDs¶
inputRDDs(): Seq[RDD[InternalRow]]
inputRDDs...FIXME
inputRDDs is used when...FIXME
Performance Metrics¶
metrics: Map[String, SQLMetric]
metrics is part of the SparkPlan abstraction.
metrics...FIXME
Output Data Partitioning Requirements¶
outputPartitioning: physical.Partitioning
outputPartitioning is part of the SparkPlan abstraction.
outputPartitioning...FIXME
Simple Node Description¶
simpleString(
  maxFields: Int): String
simpleString is part of the TreeNode abstraction.
simpleString...FIXME
supportsColumnar¶
supportsColumnar: Boolean
supportsColumnar is part of the SparkPlan abstraction.
supportsColumnar...FIXME