RowDataSourceScanExec Leaf Physical Operator
RowDataSourceScanExec is a DataSourceScanExec (and so indirectly a leaf physical operator) for scanning data from a BaseRelation.
RowDataSourceScanExec is an
| Key | Name (in web UI) | Description |
|-----|------------------|-------------|
| numOutputRows | number of output rows | Number of output rows |
RowDataSourceScanExec takes the following to be created:
- Output Schema (Attributes)
- Required Schema (StructType)
- Data Source Filter Predicates
- Handled Data Source Filter Predicates
RowDataSourceScanExec is created when:
- DataSourceStrategy execution planning strategy is executed (for LogicalRelation logical operators)
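One way to observe this planning step is to wrap a small relation in a DataFrame and look for RowDataSourceScanExec in the physical plan. The following is a minimal sketch, assuming Spark 3.x on the classpath; `DemoRelation`, `RowDataSourceScanDemo` and `scanMetadata` are hypothetical names used only for illustration, and the in-memory data is made up.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SparkSession}
import org.apache.spark.sql.execution.RowDataSourceScanExec
import org.apache.spark.sql.sources.{BaseRelation, Filter, GreaterThan, PrunedFilteredScan}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical relation over three in-memory records (illustration only).
class DemoRelation(override val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override val schema: StructType = StructType(Seq(
    StructField("id", IntegerType, nullable = false),
    StructField("name", StringType, nullable = false)))

  // Report GreaterThan filters as handled by this relation;
  // all other filters are returned as unhandled.
  override def unhandledFilters(filters: Array[Filter]): Array[Filter] =
    filters.filterNot(_.isInstanceOf[GreaterThan])

  override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
    val data = Seq((1, "a"), (2, "b"), (3, "c"))
    // Apply the GreaterThan filters on "id" locally; ignore anything else.
    val kept = data.filter { case (id, _) =>
      filters.forall {
        case GreaterThan("id", v: Int) => id > v
        case _                         => true
      }
    }
    // Emit only the requested columns, in the requested order.
    val rows = kept.map { case (id, name) =>
      Row.fromSeq(requiredColumns.toSeq.map {
        case "id"   => id
        case "name" => name
      })
    }
    sqlContext.sparkContext.parallelize(rows)
  }
}

object RowDataSourceScanDemo {
  // Returns the metadata of every RowDataSourceScanExec in the physical plan.
  def scanMetadata(spark: SparkSession): Seq[Map[String, String]] = {
    val df = spark
      .baseRelationToDataFrame(new DemoRelation(spark.sqlContext))
      .where("id > 1")
    df.queryExecution.sparkPlan.collect {
      case scan: RowDataSourceScanExec => scan.metadata
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("demo").getOrCreate()
    scanMetadata(spark).foreach(println)
    spark.stop()
  }
}
```

The sketch reads `queryExecution.sparkPlan` rather than `executedPlan` so that the result does not depend on later preparation rules rewriting the plan.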
metadata: Map[String, String]
metadata is part of the DataSourceScanExec abstraction.
metadata marks the filter predicates that are among the handled filter predicates with * (star).
Filter predicates marked with * (star) are filters that are pushed down to a relation (aka data source).
In the end, metadata creates the following mapping:
- ReadSchema with the required schema converted to catalog representation
- PushedFilters with the marked and unmarked filter predicates
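The marking and the resulting mapping can be sketched in plain Scala without a Spark dependency. This is an illustration only, not Spark's source: `MetadataSketch` and its `Filter`, `EqualTo` and `GreaterThan` types are simplified stand-ins for the classes in `org.apache.spark.sql.sources`, and the read schema is passed in as an already formatted string.

```scala
object MetadataSketch {
  // Simplified stand-ins for Spark's data source filters (hypothetical).
  sealed trait Filter
  final case class EqualTo(attribute: String, value: Any) extends Filter
  final case class GreaterThan(attribute: String, value: Any) extends Filter

  // Builds the two-entry mapping described above.
  def metadata(
      readSchema: String,
      filters: Seq[Filter],
      handledFilters: Set[Filter]): Map[String, String] = {
    // Prefix every handled filter with a star; leave the others unmarked.
    val markedFilters = filters.map { filter =>
      if (handledFilters.contains(filter)) s"*$filter" else s"$filter"
    }
    Map(
      "ReadSchema" -> readSchema,
      "PushedFilters" -> markedFilters.mkString("[", ", ", "]"))
  }

  def main(args: Array[String]): Unit = {
    val filters = Seq(EqualTo("id", 1), GreaterThan("age", 21))
    val handled: Set[Filter] = Set(EqualTo("id", 1))
    // prints: [*EqualTo(id,1), GreaterThan(age,21)]
    println(metadata("struct<id:int,age:int>", filters, handled)("PushedFilters"))
  }
}
```

In this sketch only `EqualTo(id,1)` is in the handled set, so it alone receives the star, while `GreaterThan(age,21)` still appears in PushedFilters unmarked.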
createUnsafeProjection is part of the
inputRDD is the RDD.
inputRDD is part of the