PartitionReaderFactory¶
PartitionReaderFactory is an abstraction of partition reader factories that can create row-based or columnar partition readers.
Contract¶
Creating Columnar PartitionReader¶
PartitionReader<ColumnarBatch> createColumnarReader(
InputPartition partition)
Creates a PartitionReader for a columnar scan (to read data) from the given InputPartition.
By default, createColumnarReader throws an UnsupportedOperationException:
Cannot create columnar reader.
See:
Used when:
- DataSourceRDD is requested to compute a partition (with columnarReads enabled)
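The default behavior described above can be illustrated with a minimal sketch. It does not use Spark itself: InputPartition, PartitionReader, InternalRow and ColumnarBatch are empty stand-in types declared locally, and ColumnarDefaultSketch and tryColumnar are hypothetical names introduced only for this example. Only the default method body mirrors the documented behavior.

```java
public class ColumnarDefaultSketch {
    // Hypothetical stand-ins declared locally; these are NOT Spark's classes.
    interface InputPartition {}
    interface PartitionReader<T> {}
    static class InternalRow {}
    static class ColumnarBatch {}

    // Simplified model of the contract: createReader is abstract,
    // createColumnarReader has a default that throws.
    interface PartitionReaderFactory {
        PartitionReader<InternalRow> createReader(InputPartition partition);

        default PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition) {
            throw new UnsupportedOperationException("Cannot create columnar reader.");
        }
    }

    // Asks a factory for a columnar reader and reports what happened.
    static String tryColumnar(PartitionReaderFactory factory) {
        try {
            factory.createColumnarReader(new InputPartition() {});
            return "created";
        } catch (UnsupportedOperationException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A row-only factory: implements createReader and inherits the default.
        PartitionReaderFactory rowOnly = partition -> new PartitionReader<InternalRow>() {};
        System.out.println(tryColumnar(rowOnly)); // prints "Cannot create columnar reader."
    }
}
```

A factory that does support columnar scans would override createColumnarReader instead of inheriting the default.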
Creating PartitionReader¶
PartitionReader<InternalRow> createReader(
InputPartition partition)
Creates a PartitionReader for a row-based scan (to read data) from the given InputPartition.
Used when:
- DataSourceRDD is requested to compute a partition
- ContinuousDataSourceRDD (Spark Structured Streaming) is requested to compute a partition
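As a rough illustration of what a caller does with the reader returned by createReader, the sketch below models a reader with next, get and close, patterned after Spark's PartitionReader but declared locally and without checked exceptions. RangePartition, RowReaderSketch and readAll are hypothetical names, and plain integers stand in for InternalRow.

```java
import java.util.ArrayList;
import java.util.List;

public class RowReaderSketch {
    // Hypothetical partition describing a half-open range [start, end).
    static class RangePartition {
        final int start;
        final int end;
        RangePartition(int start, int end) { this.start = start; this.end = end; }
    }

    // Local stand-in patterned after PartitionReader; not Spark's interface.
    interface PartitionReader<T> extends AutoCloseable {
        boolean next();   // advances to the next record, false when exhausted
        T get();          // returns the current record
        @Override void close();
    }

    // Plays the role of createReader: one reader per partition.
    static PartitionReader<Integer> createReader(RangePartition partition) {
        return new PartitionReader<Integer>() {
            private int current = partition.start - 1;
            public boolean next() { current++; return current < partition.end; }
            public Integer get() { return current; }
            public void close() { /* nothing to release in this sketch */ }
        };
    }

    // Drains a partition the way a compute step might: next, get, then close.
    static List<Integer> readAll(RangePartition partition) {
        List<Integer> rows = new ArrayList<>();
        try (PartitionReader<Integer> reader = createReader(partition)) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(readAll(new RangePartition(0, 3))); // prints [0, 1, 2]
    }
}
```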
supportColumnarReads¶
boolean supportColumnarReads(
InputPartition partition)
Controls whether a columnar scan can be used (and hence createColumnarReader) or not.
By default, supportColumnarReads returns false (no support for columnar scans).
See:
Used when:
- DataSourceV2ScanExecBase is requested to supportsColumnar
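How the three methods of the contract relate can be sketched as follows. This is a simplified model, not Spark's code: the create methods return a description string instead of a reader, Partition is a local stand-in, and the rule that a scan is columnar only when every partition reports support is an assumption of this sketch. ReaderSelectionSketch, columnarReads, compute and ColumnarFactory are hypothetical names.

```java
import java.util.List;

public class ReaderSelectionSketch {
    // Hypothetical stand-in for an input partition; carries a label only.
    static class Partition {
        final String name;
        Partition(String name) { this.name = name; }
    }

    // Simplified factory: the create methods return a description, not a reader.
    interface PartitionReaderFactory {
        String createReader(Partition partition);

        default String createColumnarReader(Partition partition) {
            throw new UnsupportedOperationException("Cannot create columnar reader.");
        }

        default boolean supportColumnarReads(Partition partition) {
            return false;
        }
    }

    // Assumption of this sketch: columnar only if every partition supports it.
    static boolean columnarReads(PartitionReaderFactory factory, List<Partition> partitions) {
        return partitions.stream().allMatch(factory::supportColumnarReads);
    }

    // Picks the reader kind for one partition based on the flag.
    static String compute(PartitionReaderFactory factory, Partition partition, boolean columnarReads) {
        return columnarReads
            ? factory.createColumnarReader(partition)
            : factory.createReader(partition);
    }

    // A factory that opts in to columnar scans by overriding both defaults.
    static class ColumnarFactory implements PartitionReaderFactory {
        public String createReader(Partition p) { return "row reader for " + p.name; }
        @Override public String createColumnarReader(Partition p) { return "columnar reader for " + p.name; }
        @Override public boolean supportColumnarReads(Partition p) { return true; }
    }

    public static void main(String[] args) {
        List<Partition> partitions = List.of(new Partition("p0"), new Partition("p1"));
        PartitionReaderFactory rowOnly = p -> "row reader for " + p.name;
        PartitionReaderFactory columnar = new ColumnarFactory();
        for (PartitionReaderFactory factory : List.of(rowOnly, columnar)) {
            boolean flag = columnarReads(factory, partitions);
            System.out.println(compute(factory, partitions.get(0), flag));
        }
    }
}
```

With the defaults left in place, the flag stays false and createColumnarReader is never reached, so the exception in the default implementation only surfaces when a factory claims columnar support without overriding it.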
Implementations¶
- ContinuousPartitionReaderFactory
- FilePartitionReaderFactory
- KafkaBatchReaderFactory
- MemoryStreamReaderFactory
- RateStreamMicroBatchReaderFactory