Scan¶
Scan is an abstraction of logical scans over data sources.
Contract¶
Description¶
String description()
Human-readable description of this scan (e.g. for logging purposes).
Default: the fully-qualified class name
Used when:
- BatchScanExec physical operator is requested for the simpleString
- DataSourceV2ScanExecBase physical operator is requested for the simpleString and verboseStringWithOperatorId
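The default and an override can be sketched without Spark on the classpath. `DescribedScan`, `PlainScan` and `ParquetLikeScan` below are hypothetical stand-ins (not Spark types), and the default body is an assumption that simply follows the "fully-qualified class name" wording above.

```java
// Hypothetical stand-in for Scan; only description() is modelled.
interface DescribedScan {
    // Assumed default, following the text above: the fully-qualified class name.
    default String description() {
        return getClass().getName();
    }
}

// Relies on the default description.
class PlainScan implements DescribedScan {}

// Overrides the description with something more useful in logs and plan strings.
class ParquetLikeScan implements DescribedScan {
    private final String path;

    ParquetLikeScan(String path) {
        this.path = path;
    }

    @Override
    public String description() {
        return "ParquetLikeScan(path=" + path + ")";
    }
}

class DescriptionDemo {
    public static void main(String[] args) {
        System.out.println(new PlainScan().description());
        System.out.println(new ParquetLikeScan("/tmp/events").description());
    }
}
```

Overriding is worthwhile mostly because the description ends up in the text representation of the physical operators listed above.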
Read Schema¶
StructType readSchema()
Read schema of this scan
Used when:
- FileScan is requested for the partition and data filters
- GroupBasedRowLevelOperationScanPlanning is executed
- PushDownUtils utility is used to pruneColumns
- V2ScanRelationPushDown logical optimization is executed (and requested to pushDownAggregates)
Supported Custom Metrics¶
CustomMetric[] supportedCustomMetrics()
Empty by default and expected to be overridden by implementations
See:
Used when:
- DataSourceV2ScanExecBase physical operator is requested for the custom metrics
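A minimal sketch of the empty default and of an implementation that reports one metric. All types here are hypothetical stand-ins: `SketchMetric` is loosely modelled on the shape of `CustomMetric` (a name, a description and an aggregation over per-task values), and `bytesRead` is an illustrative metric name only.

```java
import java.util.Arrays;

// Hypothetical stand-in loosely modelled on CustomMetric.
interface SketchMetric {
    String name();
    String description();
    // Combines the per-task values into a single displayable value.
    String aggregateTaskMetrics(long[] taskMetrics);
}

// Hypothetical stand-in for Scan; only supportedCustomMetrics() is modelled.
interface MeteredScan {
    // Empty by default, as described above.
    default SketchMetric[] supportedCustomMetrics() {
        return new SketchMetric[0];
    }
}

// Illustrative metric that sums the values reported by all tasks.
class BytesReadMetric implements SketchMetric {
    @Override public String name() { return "bytesRead"; }
    @Override public String description() { return "number of bytes read"; }
    @Override public String aggregateTaskMetrics(long[] taskMetrics) {
        return String.valueOf(Arrays.stream(taskMetrics).sum());
    }
}

// Keeps the default (no custom metrics).
class DefaultMetricsScan implements MeteredScan {}

// Overrides the default to advertise one custom metric.
class FileLikeScan implements MeteredScan {
    @Override
    public SketchMetric[] supportedCustomMetrics() {
        return new SketchMetric[] { new BytesReadMetric() };
    }
}
```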
Physical Representation for Batch Query¶
Batch toBatch()
By default, toBatch throws an UnsupportedOperationException (with the description):
[description]: Batch scan are not supported
See:
Must be implemented (overridden) if the Table that created this Scan has the BATCH_READ capability (among the capabilities).
Used when:
- BatchScanExec physical operator is requested for the Batch and the filteredPartitions
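The default-throws-unless-overridden behaviour can be sketched as follows. `SketchBatch`, `BatchCapableScan`, `StreamingOnlyScan` and `InMemoryScan` are hypothetical stand-ins that only model what this section describes. Letting the scan act as its own batch is one possible design, not a requirement of the contract.

```java
// Hypothetical stand-ins; the real Batch and Scan have more members.
interface SketchBatch {}

interface BatchCapableScan {
    default String description() {
        return getClass().getName();
    }

    // Default behaviour as described above: reject batch reads,
    // prefixing the message with the scan's description.
    default SketchBatch toBatch() {
        throw new UnsupportedOperationException(
            description() + ": Batch scan are not supported");
    }
}

// A scan whose table does not report BATCH_READ can keep the default.
class StreamingOnlyScan implements BatchCapableScan {
    @Override
    public String description() {
        return "StreamingOnlyScan";
    }
}

// A scan whose table reports BATCH_READ has to override toBatch.
class InMemoryScan implements BatchCapableScan, SketchBatch {
    @Override
    public SketchBatch toBatch() {
        return this; // in this sketch the scan doubles as its own batch
    }
}
```

Calling `toBatch` on `StreamingOnlyScan` fails with the message `StreamingOnlyScan: Batch scan are not supported`, while `InMemoryScan` returns itself.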
Converting to ContinuousStream¶
ContinuousStream toContinuousStream(
String checkpointLocation)
By default, toContinuousStream throws an UnsupportedOperationException (with the description):
[description]: Continuous scan are not supported
Must be implemented (overridden) if the Table that created this Scan has the CONTINUOUS_READ capability (among the capabilities).
Used when:
- ContinuousExecution (Spark Structured Streaming) is requested for the logical plan (WriteToContinuousDataSource)
Converting to MicroBatchStream¶
MicroBatchStream toMicroBatchStream(
String checkpointLocation)
By default, toMicroBatchStream throws an UnsupportedOperationException (with the description):
[description]: Micro-batch scan are not supported
Must be implemented (overridden) if the Table that created this Scan has the MICRO_BATCH_READ capability (among the capabilities).
Used when:
- MicroBatchExecution (Spark Structured Streaming) is requested for the logical plan
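Taken together, the three conversion methods on this page pair up with read capabilities of the Table. The sketch below only restates that pairing. `ReadCapability` and `RequiredOverrides` are hypothetical helpers for illustration, not part of Spark.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical mirror of the three read capabilities named on this page.
enum ReadCapability { BATCH_READ, MICRO_BATCH_READ, CONTINUOUS_READ }

class RequiredOverrides {
    // Maps a read capability to the Scan method that must be overridden for it.
    static String methodFor(ReadCapability capability) {
        switch (capability) {
            case BATCH_READ: return "toBatch";
            case MICRO_BATCH_READ: return "toMicroBatchStream";
            case CONTINUOUS_READ: return "toContinuousStream";
            default: throw new IllegalArgumentException("unknown capability: " + capability);
        }
    }

    // Collects the methods a Scan has to override for a table's capabilities.
    static Set<String> requiredFor(Set<ReadCapability> capabilities) {
        Set<String> methods = new LinkedHashSet<>();
        for (ReadCapability c : capabilities) {
            methods.add(methodFor(c));
        }
        return methods;
    }
}
```

For a table that reports BATCH_READ and CONTINUOUS_READ, `requiredFor` yields `toBatch` and `toContinuousStream`; `toMicroBatchStream` can keep its throwing default.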