ParquetTable

ParquetTable is the FileTable of ParquetDataSourceV2 in the Parquet Data Source.

ParquetTable uses ParquetScanBuilder for scanning and ParquetWrite for writing.

Creating Instance

ParquetTable takes the following to be created:

ParquetTable is created when:

  • ParquetDataSourceV2 is requested for a Table
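As a rough illustration of when ParquetDataSourceV2 (and so ParquetTable) comes into play, the sketch below clears the `spark.sql.sources.useV1SourceList` configuration property so that a Parquet read is planned through the DataSource V2 path. The object name, the temporary directory and the `local[*]` master are assumptions made for the example, and the exact plan text varies across Spark versions.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical demo object (not part of Spark)
object ParquetV2ReadDemo {

  // Returns the physical plan (as text) of a Parquet read
  // with the V1 fallback list cleared
  def physicalPlan(): String = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("parquet-v2-read-demo")
      // Assumption: an empty list means no file format falls back to V1 for reads
      .config("spark.sql.sources.useV1SourceList", "")
      .getOrCreate()
    try {
      // Write a tiny dataset first so there is something to read back
      val path = Files.createTempDirectory("parquet-v2-demo").resolve("data").toString
      spark.range(5).write.parquet(path)

      spark.read.parquet(path).queryExecution.executedPlan.toString
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(physicalPlan())
}
```

With the property cleared, the printed plan should contain a BatchScan node (the V2 scan operator) rather than the V1 FileScan node.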

Format Name

Signature
formatName: String

formatName is part of the FileTable abstraction.

formatName is the following text:

Parquet

Schema Inference

Signature
inferSchema(
  files: Seq[FileStatus]): Option[StructType]

inferSchema is part of the FileTable abstraction.

inferSchema infers the schema from the input Hadoop FileStatuses, taking the options into account.
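The effect of options on schema inference can be observed from the user-facing API. The following is a minimal sketch, assuming the `mergeSchema` read option and an empty `spark.sql.sources.useV1SourceList` (to route the read through the V2 path); the object name, column names and directory layout are made up for the example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical demo object (not part of Spark)
object ParquetSchemaInferenceDemo {

  // Writes two Parquet directories with different columns
  // and returns the field names of the inferred (merged) schema
  def mergedFieldNames(): Seq[String] = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("parquet-schema-inference-demo")
      // Assumption: clearing the list makes the read use the V2 Parquet source
      .config("spark.sql.sources.useV1SourceList", "")
      .getOrCreate()
    import spark.implicits._
    try {
      val base = Files.createTempDirectory("parquet-schema-demo").toString
      Seq((1, "a")).toDF("id", "name").write.parquet(s"$base/part=1")
      Seq((2, 3.0)).toDF("id", "score").write.parquet(s"$base/part=2")

      // mergeSchema asks schema inference to reconcile the file schemas
      spark.read
        .option("mergeSchema", "true")
        .parquet(base)
        .schema
        .fieldNames
        .toSeq
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(mergedFieldNames().mkString(", "))
}
```

The returned names should include id, name and score (plus the discovered partition column), since the inferred schema combines the columns of both directories.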

Creating ScanBuilder

Signature
newScanBuilder(
  options: CaseInsensitiveStringMap): ParquetScanBuilder

newScanBuilder is part of the SupportsRead abstraction.
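The options parameter in the signature above is a CaseInsensitiveStringMap, so option keys are looked up regardless of letter case. A small sketch of that behaviour follows; the object name and the `mergeSchema` key are only illustrative assumptions.

```scala
import java.util.{HashMap => JHashMap}
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical demo object (not part of Spark)
object ScanOptionsDemo {

  // Builds the kind of options map that newScanBuilder receives
  def readOptions(): CaseInsensitiveStringMap = {
    val raw = new JHashMap[String, String]()
    raw.put("mergeSchema", "true") // illustrative option key
    new CaseInsensitiveStringMap(raw)
  }

  def main(args: Array[String]): Unit = {
    val options = readOptions()
    // Lookups ignore the case of the key
    println(options.get("MERGESCHEMA"))
    println(options.getBoolean("mergeschema", false))
  }
}
```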

newScanBuilder creates a ParquetScanBuilder with the following:

Creating WriteBuilder

Signature
newWriteBuilder(
  info: LogicalWriteInfo): WriteBuilder

newWriteBuilder is part of the SupportsWrite abstraction.

newWriteBuilder creates a WriteBuilder that creates a ParquetWrite (when requested to build a Write).

supportsDataType

Signature
supportsDataType(
  dataType: DataType): Boolean

supportsDataType is part of the FileTable abstraction.

supportsDataType supports all AtomicTypes and the following complex DataTypes with AtomicTypes: