RowDataSourceScanExec Leaf Physical Operator¶
RowDataSourceScanExec is a DataSourceScanExec (and so indirectly a leaf physical operator) for scanning data from a BaseRelation.
RowDataSourceScanExec is an InputRDDCodegen.
Performance Metrics¶
| Key | Name (in web UI) | Description |
|---|---|---|
| numOutputRows | number of output rows | Number of output rows |
Creating Instance¶
RowDataSourceScanExec takes the following to be created:
- Output Schema (Attributes)
- Required Schema (StructType)
- Data Source Filter Predicates
- Handled Data Source Filter Predicates
- RDD[InternalRow]
- BaseRelation
- Optional TableIdentifier
RowDataSourceScanExec is created when:
- DataSourceStrategy execution planning strategy is executed (for LogicalRelation logical operators)
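One way to observe this planning step is to wrap a small relation in a DataFrame and look for the operator in the executed physical plan. The following is a minimal sketch, not a definitive recipe: `DemoRelation`, `RowScanDemo` and `findScans` are hypothetical names introduced for illustration, the relation simply serves the numbers 1 to 10, and the code assumes Spark SQL is on the classpath and that the plan is not wrapped by adaptive query execution (which is typically the case for a query without exchanges).

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SparkSession}
import org.apache.spark.sql.execution.RowDataSourceScanExec
import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Hypothetical relation used only for this demo: a single "id" column with values 1 to 10.
class DemoRelation(override val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType =
    StructType(Seq(StructField("id", IntegerType, nullable = false)))

  // The filters are ignored here, so Spark evaluates them again after the scan.
  override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
    val width = requiredColumns.length // copied to a local so the closure stays small
    sqlContext.sparkContext.parallelize(1 to 10).map(i => Row.fromSeq(Seq.fill(width)(i)))
  }
}

object RowScanDemo {
  // Plans a filter over the demo relation and collects the scan operators from the physical plan.
  def findScans(spark: SparkSession): Seq[RowDataSourceScanExec] = {
    val df = spark.baseRelationToDataFrame(new DemoRelation(spark.sqlContext))
    df.filter("id > 5").queryExecution.executedPlan.collect {
      case scan: RowDataSourceScanExec => scan
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("row-scan-demo").getOrCreate()
    try findScans(spark).foreach(scan => println(scan.metadata))
    finally spark.stop()
  }
}
```

`baseRelationToDataFrame` wraps the relation in a LogicalRelation, which is the logical operator the strategy plans. Because the demo relation does not override `unhandledFilters`, none of its filters should be reported as handled.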
Metadata¶
metadata: Map[String, String]
metadata is part of the DataSourceScanExec abstraction.
metadata marks every filter predicate that is also among the handled filter predicates with * (star).
Note
Filter predicates marked with * (star) denote filters that are pushed down to a relation (aka data source).
In the end, metadata creates the following mapping:
- ReadSchema with the required schema converted to catalog representation
- PushedFilters with the marked and unmarked filter predicates
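The marking logic described above can be sketched without any Spark dependency. This is an illustrative model only: `MetadataSketch` and its methods are hypothetical names, filters and the schema are represented as plain strings rather than Spark's own types, and the bracketed, comma-separated rendering of PushedFilters is an assumption that may differ between Spark versions.

```scala
// Simplified stand-in for the metadata computation (plain strings, no Spark types).
object MetadataSketch {

  // Prefix a filter with * when it is also among the handled filters.
  def markFilters(filters: Seq[String], handledFilters: Set[String]): Seq[String] =
    filters.map(f => if (handledFilters.contains(f)) s"*$f" else f)

  // Build the two-entry mapping: ReadSchema and PushedFilters.
  def metadata(
      readSchema: String,          // assumed to be the schema's catalog representation already
      filters: Seq[String],
      handledFilters: Set[String]): Map[String, String] =
    Map(
      "ReadSchema" -> readSchema,
      // Rendering as "[a, b]" is an assumption made for this sketch.
      "PushedFilters" -> markFilters(filters, handledFilters).mkString("[", ", ", "]"))
}
```

For example, calling `MetadataSketch.metadata("struct<id:int>", Seq("IsNotNull(id)", "GreaterThan(id,5)"), Set("IsNotNull(id)"))` stars only the first filter, since it is the only one listed as handled.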
createUnsafeProjection¶
createUnsafeProjection: Boolean
createUnsafeProjection is true.
createUnsafeProjection is part of the InputRDDCodegen abstraction.
Input RDD¶
inputRDD: RDD[InternalRow]
inputRDD is the RDD[InternalRow] that RowDataSourceScanExec was created with.
inputRDD is part of the InputRDDCodegen abstraction.