Scan
Scan is an abstraction of logical scans over data sources.
Contract
Description
String description()
Human-readable description of this scan (e.g. for logging purposes).
Default: the fully-qualified class name
Used when:
- BatchScanExec physical operator is requested for the simpleString
- DataSourceV2ScanExecBase physical operator is requested for the simpleString and verboseStringWithOperatorId
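The default and an override of description can be sketched as follows. This is a minimal, self-contained illustration: the nested Scan interface is a simplified stand-in rather than Spark's actual interface, and DefaultScan, CsvScan and the file path are hypothetical names made up for the example.

```java
public class ScanDescriptionSketch {

    /** Simplified stand-in for the contract; not Spark's actual Scan interface. */
    public interface Scan {
        // Assumption: models the default described above (the class name).
        default String description() {
            return getClass().getName();
        }
    }

    /** Hypothetical scan that keeps the default description. */
    public static class DefaultScan implements Scan {}

    /** Hypothetical scan that overrides description() for friendlier log output. */
    public static class CsvScan implements Scan {
        private final String path;

        public CsvScan(String path) {
            this.path = path;
        }

        @Override
        public String description() {
            return "CsvScan(path=" + path + ")";
        }
    }

    public static void main(String[] args) {
        // Prints the class name of DefaultScan.
        System.out.println(new DefaultScan().description());
        // Prints the custom, human-readable description.
        System.out.println(new CsvScan("/tmp/people.csv").description());
    }
}
```

Since the description ends up in the simpleString of physical operators, an override that names the underlying source tends to make query plans easier to read.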
Read Schema
StructType readSchema()
Read schema of this scan
Used when:
- FileScan is requested for the partition and data filters
- GroupBasedRowLevelOperationScanPlanning is executed
- PushDownUtils utility is used to pruneColumns
- V2ScanRelationPushDown logical optimization is executed (and requested to pushDownAggregates)
Supported Custom Metrics
CustomMetric[] supportedCustomMetrics()
Empty by default and expected to be overridden by implementations
Used when:
- DataSourceV2ScanExecBase physical operator is requested for the custom metrics
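The "empty by default, overridden by implementations" behaviour might look like the sketch below. The Scan and CustomMetric interfaces here are simplified stand-ins declared only for this example (not Spark's own types), and PlainScan, MeteredScan and RowsSkippedMetric are hypothetical names.

```java
public class CustomMetricsSketch {

    /** Stand-in for a custom metric; only what this sketch needs. */
    public interface CustomMetric {
        String name();
        String description();
        // Combines per-task values into a single displayable string.
        String aggregateTaskMetrics(long[] taskMetrics);
    }

    /** Simplified stand-in for the contract; not Spark's actual Scan interface. */
    public interface Scan {
        // Empty by default, as described above.
        default CustomMetric[] supportedCustomMetrics() {
            return new CustomMetric[0];
        }
    }

    /** Hypothetical scan that reports no custom metrics. */
    public static class PlainScan implements Scan {}

    /** Hypothetical metric that sums the values reported by all tasks. */
    public static class RowsSkippedMetric implements CustomMetric {
        @Override
        public String name() {
            return "rowsSkipped";
        }

        @Override
        public String description() {
            return "number of rows skipped";
        }

        @Override
        public String aggregateTaskMetrics(long[] taskMetrics) {
            long sum = 0;
            for (long value : taskMetrics) {
                sum += value;
            }
            return String.valueOf(sum);
        }
    }

    /** Hypothetical scan that overrides the default to expose one metric. */
    public static class MeteredScan implements Scan {
        @Override
        public CustomMetric[] supportedCustomMetrics() {
            return new CustomMetric[] { new RowsSkippedMetric() };
        }
    }

    public static void main(String[] args) {
        System.out.println(new PlainScan().supportedCustomMetrics().length);
        System.out.println(new MeteredScan().supportedCustomMetrics()[0].name());
    }
}
```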
Physical Representation for Batch Query
Batch toBatch()
By default, toBatch throws an UnsupportedOperationException (with the description):
[description]: Batch scan are not supported
Must be implemented (overridden) if the Table that created this Scan has the BATCH_READ capability (among the capabilities).
Used when:
- BatchScanExec physical operator is requested for the Batch and the filteredPartitions
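The throw-by-default behaviour, and what overriding it looks like, can be illustrated with the sketch below. The Scan and Batch types are simplified stand-ins declared for this example only (not Spark's interfaces), and StreamingOnlyScan, BatchCapableScan and tryToBatch are hypothetical names. Returning this from toBatch is just one simple arrangement that a scan implementing both roles could use.

```java
public class ScanToBatchSketch {

    /** Placeholder for the physical batch representation. */
    public interface Batch {}

    /** Simplified stand-in for the contract; not Spark's actual Scan interface. */
    public interface Scan {
        default String description() {
            return getClass().getName();
        }

        // Throws by default, prefixing the message with the description.
        default Batch toBatch() {
            throw new UnsupportedOperationException(
                description() + ": Batch scan are not supported");
        }
    }

    /** Hypothetical scan that does not support batch reads (keeps the default). */
    public static class StreamingOnlyScan implements Scan {
        @Override
        public String description() {
            return "StreamingOnlyScan";
        }
    }

    /** Hypothetical scan for a table with batch read support. */
    public static class BatchCapableScan implements Scan, Batch {
        @Override
        public Batch toBatch() {
            return this;
        }
    }

    /** Returns "ok" if the scan converts to a Batch, or the error message otherwise. */
    public static String tryToBatch(Scan scan) {
        try {
            scan.toBatch();
            return "ok";
        } catch (UnsupportedOperationException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryToBatch(new StreamingOnlyScan()));
        System.out.println(tryToBatch(new BatchCapableScan()));
    }
}
```

The same shape applies to the streaming conversions: a scan overrides only the methods that match the read capabilities of its table and leaves the others throwing.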
Converting to ContinuousStream
ContinuousStream toContinuousStream(
    String checkpointLocation)
By default, toContinuousStream throws an UnsupportedOperationException (with the description):
[description]: Continuous scan are not supported
Must be implemented (overridden) if the Table that created this Scan has the CONTINUOUS_READ capability (among the capabilities).
Used when:
- ContinuousExecution (Spark Structured Streaming) is requested for the logical plan (WriteToContinuousDataSource)
Converting to MicroBatchStream
MicroBatchStream toMicroBatchStream(
    String checkpointLocation)
By default, toMicroBatchStream throws an UnsupportedOperationException (with the description):
[description]: Micro-batch scan are not supported
Must be implemented (overridden) if the Table that created this Scan has the MICRO_BATCH_READ capability (among the capabilities).
Used when:
- MicroBatchExecution (Spark Structured Streaming) is requested for the logical plan