PartitionReaderFactory
PartitionReaderFactory is an abstraction of partition reader factories that can create row-based (InternalRow) or columnar (ColumnarBatch) PartitionReaders for a given InputPartition.
Contract
Creating Columnar PartitionReader
PartitionReader<ColumnarBatch> createColumnarReader(
  InputPartition partition)
Creates a PartitionReader for a columnar scan (to read data as ColumnarBatches) from the given InputPartition.
By default, createColumnarReader throws an UnsupportedOperationException:
Cannot create columnar reader.
Used when:
DataSourceRDD is requested to compute a partition (with columnarReads enabled)
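The default behaviour described above can be illustrated with a minimal, self-contained sketch. All the types below (InputPartition, PartitionReader, InternalRow, ColumnarBatch, and the factory interface itself) are simplified stand-ins declared inside the example so that it compiles without Spark on the classpath; they only mirror the shape of the contract and are not the real Spark classes. RowOnlyFactory and defaultColumnarMessage are hypothetical names.

```java
import java.io.Serializable;

class DefaultColumnarSketch {
  // Hypothetical stand-ins for the Spark types (assumptions, not the real classes).
  interface InputPartition extends Serializable {}
  interface PartitionReader<T> { boolean next(); T get(); void close(); }
  static final class InternalRow {}
  static final class ColumnarBatch {}

  // Mirrors the contract: createReader is abstract,
  // createColumnarReader comes with a throwing default.
  interface PartitionReaderFactory extends Serializable {
    PartitionReader<InternalRow> createReader(InputPartition partition);

    default PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition) {
      throw new UnsupportedOperationException("Cannot create columnar reader.");
    }
  }

  // A row-only factory that does not override createColumnarReader.
  static final class RowOnlyFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<InternalRow> createReader(InputPartition partition) {
      return new PartitionReader<InternalRow>() {
        public boolean next() { return false; } // this sketch produces no rows
        public InternalRow get() { throw new IllegalStateException("no current row"); }
        public void close() {}
      };
    }
  }

  // Calls the inherited default and returns the message of the exception it throws.
  static String defaultColumnarMessage() {
    PartitionReaderFactory factory = new RowOnlyFactory();
    try {
      factory.createColumnarReader(new InputPartition() {});
      return "no exception";
    } catch (UnsupportedOperationException e) {
      return e.getMessage();
    }
  }
}
```

A factory that only implements createReader inherits the throwing default, so a caller has to consult supportColumnarReads before asking for a columnar reader.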
Creating PartitionReader
PartitionReader<InternalRow> createReader(
  InputPartition partition)
Creates a PartitionReader for a row-based scan (to read data as InternalRows) from the given InputPartition.
Used when:
DataSourceRDD is requested to compute a partition
ContinuousDataSourceRDD (Spark Structured Streaming) is requested to compute a partition
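As a rough illustration of how a caller might use createReader when computing a partition, the sketch below creates a reader and drains it with a next/get/close loop. It is a simplified sketch, not Spark's actual code: the interfaces are stand-ins declared inside the example, rows are plain Strings instead of InternalRows, and ListPartition, ListReaderFactory and readAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class RowReaderSketch {
  // Hypothetical stand-ins for the Spark types (assumptions, not the real classes).
  interface InputPartition {}
  interface PartitionReader<T> { boolean next(); T get(); void close(); }
  interface PartitionReaderFactory {
    PartitionReader<String> createReader(InputPartition partition);
  }

  // Hypothetical partition that carries its rows in memory.
  static final class ListPartition implements InputPartition {
    final List<String> rows;
    ListPartition(List<String> rows) { this.rows = rows; }
  }

  // Creates a reader that walks over the rows of a ListPartition.
  static final class ListReaderFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<String> createReader(InputPartition partition) {
      final List<String> rows = ((ListPartition) partition).rows;
      return new PartitionReader<String>() {
        private int index = -1;
        public boolean next() { index += 1; return index < rows.size(); } // advance, report availability
        public String get() { return rows.get(index); }                   // current row
        public void close() {}
      };
    }
  }

  // Caller side: create a reader for the partition, consume it, always close it.
  static List<String> readAll(PartitionReaderFactory factory, InputPartition partition) {
    PartitionReader<String> reader = factory.createReader(partition);
    List<String> out = new ArrayList<>();
    try {
      while (reader.next()) {
        out.add(reader.get());
      }
    } finally {
      reader.close();
    }
    return out;
  }
}
```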
supportColumnarReads
boolean supportColumnarReads(
  InputPartition partition)
Controls whether a columnar scan can be used for the given InputPartition (and hence whether createColumnarReader is used instead of createReader).
By default, supportColumnarReads indicates no support for columnar scans (and returns false).
Used when:
DataSourceV2ScanExecBase is requested to supportsColumnar
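The relationship between supportColumnarReads and the two factory methods can be sketched as a simple per-partition decision. This is an illustration under stated assumptions, not the logic of DataSourceV2ScanExecBase: the types are stand-ins declared inside the example, and RowOnlyFactory, ColumnarFactory, emptyReader and chooseReader are hypothetical names.

```java
class ColumnarDispatchSketch {
  // Hypothetical stand-ins for the Spark types (assumptions, not the real classes).
  interface InputPartition {}
  interface PartitionReader<T> { boolean next(); T get(); void close(); }
  static final class InternalRow {}
  static final class ColumnarBatch {}

  // Mirrors the contract: both columnar methods have defaults that opt out.
  interface PartitionReaderFactory {
    PartitionReader<InternalRow> createReader(InputPartition partition);

    default PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition) {
      throw new UnsupportedOperationException("Cannot create columnar reader.");
    }

    default boolean supportColumnarReads(InputPartition partition) {
      return false;
    }
  }

  // Reader with no elements; enough to exercise the dispatch.
  static <T> PartitionReader<T> emptyReader() {
    return new PartitionReader<T>() {
      public boolean next() { return false; }
      public T get() { throw new IllegalStateException("no current element"); }
      public void close() {}
    };
  }

  // Keeps the defaults: row-based only.
  static final class RowOnlyFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<InternalRow> createReader(InputPartition partition) {
      return emptyReader();
    }
  }

  // Opts in to columnar reads by overriding both columnar methods.
  static final class ColumnarFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<InternalRow> createReader(InputPartition partition) {
      return emptyReader();
    }
    @Override
    public PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition) {
      return emptyReader();
    }
    @Override
    public boolean supportColumnarReads(InputPartition partition) {
      return true;
    }
  }

  // Asks the factory first and only then picks the matching create method.
  static String chooseReader(PartitionReaderFactory factory, InputPartition partition) {
    if (factory.supportColumnarReads(partition)) {
      factory.createColumnarReader(partition).close();
      return "columnar";
    }
    factory.createReader(partition).close();
    return "row";
  }
}
```

Overriding supportColumnarReads without also overriding createColumnarReader would send the caller into the throwing default, so in this sketch the two overrides go together.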
Implementations
ContinuousPartitionReaderFactory
FilePartitionReaderFactory
KafkaBatchReaderFactory
MemoryStreamReaderFactory
RateStreamMicroBatchReaderFactory