ParquetScan

ParquetScan is a FileScan.

Creating Instance

ParquetScan takes the following to be created:

ParquetScan is created when:

createReaderFactory

createReaderFactory(): PartitionReaderFactory

createReaderFactory creates a ParquetPartitionReaderFactory (with the Hadoop Configuration broadcast).

createReaderFactory adds the following properties to the Hadoop Configuration before broadcasting it (to executors).

Name                                     Value
ParquetInputFormat.READ_SUPPORT_CLASS    ParquetReadSupport
others

createReaderFactory is part of the Batch abstraction.
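
The configure-then-broadcast step described above can be sketched as follows. This is only an illustration, not the actual body of createReaderFactory: it assumes a spark-shell session (or an application with Spark 3.x and parquet-hadoop on the classpath). The names hadoopConf, readSupportClass and broadcastConf are made up for the example, and the use of SerializableConfiguration as the wrapper is an assumption about how the configuration is shipped to executors.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.parquet.hadoop.ParquetInputFormat
import org.apache.spark.sql.SparkSession
import org.apache.spark.util.SerializableConfiguration

// Reuses the active session in spark-shell, or starts a local one otherwise.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Work on a copy so the session-wide Hadoop configuration stays untouched
// (copying is a choice made for this sketch only).
val hadoopConf = new Configuration(spark.sparkContext.hadoopConfiguration)

// Fully-qualified name of Spark's ParquetReadSupport, given as a string
// so the sketch does not depend on the visibility of the class itself.
val readSupportClass =
  "org.apache.spark.sql.execution.datasources.parquet.ParquetReadSupport"
hadoopConf.set(ParquetInputFormat.READ_SUPPORT_CLASS, readSupportClass)

// Hadoop's Configuration is not java.io.Serializable, hence the wrapper.
val broadcastConf =
  spark.sparkContext.broadcast(new SerializableConfiguration(hadoopConf))
```

On an executor, the property would then be read back with broadcastConf.value.value.get(ParquetInputFormat.READ_SUPPORT_CLASS).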

isSplitable

isSplitable(
  path: Path): Boolean

isSplitable is always true (regardless of the given path).

isSplitable is part of the FileScan abstraction.
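
One way to get hold of a ParquetScan and call isSplitable on it is sketched below. It assumes a spark-shell session on Spark 3.x and relies on two things not stated on this page: parquet is on the spark.sql.sources.useV1SourceList list by default, so the list has to be emptied for the DataSource V2 read path to be used, and the optimized logical plan exposes the scan through DataSourceV2ScanRelation. The temporary path and the val names are made up for the example.

```scala
import java.nio.file.Files

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanRelation
import org.apache.spark.sql.execution.datasources.v2.parquet.ParquetScan

// Reuses the active session in spark-shell, or starts a local one otherwise.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Write a small parquet dataset to a fresh (hypothetical) temporary location.
val path = Files.createTempDirectory("parquet-scan-demo").resolve("data").toString
spark.range(10).write.parquet(path)

// Assumption: with parquet removed from the V1 source list,
// reads go through the DataSource V2 code path (and so ParquetScan).
spark.conf.set("spark.sql.sources.useV1SourceList", "")

val df = spark.read.parquet(path)

// Collect the ParquetScan instances (if any) from the optimized logical plan.
val scans = df.queryExecution.optimizedPlan
  .collect { case r: DataSourceV2ScanRelation => r.scan }
  .collect { case s: ParquetScan => s }

// Ask every scan whether the dataset path can be split.
val splitable = scans.map(_.isSplitable(new Path(path)))
```

If no ParquetScan shows up in scans, the query most likely went through the V1 file source instead, and the configuration property above is the first thing to check.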