ParquetTable¶
ParquetTable is a FileTable of ParquetDataSourceV2 in Parquet Data Source.
ParquetTable uses ParquetScanBuilder for scanning and ParquetWrite for writing.
Creating Instance¶
ParquetTable takes the following to be created:
- Name
- SparkSession
- Case-insensitive options
- Paths
- User-specified schema
- Fallback FileFormat
ParquetTable is created when:
- ParquetDataSourceV2 is requested for a Table
Format Name¶
formatName is the following text: Parquet
Schema Inference¶
Signature
inferSchema(
files: Seq[FileStatus]): Option[StructType]
inferSchema is part of the FileTable abstraction.
inferSchema infers the schema (with the options and the input Hadoop FileStatuses).
Creating ScanBuilder¶
Signature
newScanBuilder(
options: CaseInsensitiveStringMap): ParquetScanBuilder
newScanBuilder is part of the SupportsRead abstraction.
newScanBuilder creates a ParquetScanBuilder (with the given options).
Creating WriteBuilder¶
Signature
newWriteBuilder(
info: LogicalWriteInfo): WriteBuilder
newWriteBuilder is part of the SupportsWrite abstraction.
newWriteBuilder creates a WriteBuilder that creates a ParquetWrite (when requested to build a Write).
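The deferred creation described above (a WriteBuilder that only creates the Write when asked to build it) can be sketched in plain Scala. This is a minimal illustration, not Spark's code: `Write`, `WriteBuilder` and `LogicalWriteInfo` below are simplified stand-ins for the Spark types of the same names, and `ParquetLikeWrite`, `ParquetLikeTable` and `writesBuilt` are hypothetical names introduced only for this example.

```scala
object WriteBuilderSketch {
  // Simplified stand-ins (assumptions), not Spark's connector interfaces
  trait Write { def description: String }
  trait WriteBuilder { def build(): Write }
  final case class LogicalWriteInfo(queryId: String)

  // Hypothetical Write implementation that plays the role of ParquetWrite
  final case class ParquetLikeWrite(
      paths: Seq[String],
      formatName: String,
      info: LogicalWriteInfo) extends Write {
    def description: String =
      s"$formatName write to ${paths.mkString(",")} (query ${info.queryId})"
  }

  // Hypothetical table that plays the role of ParquetTable
  final class ParquetLikeTable(paths: Seq[String]) {
    val formatName: String = "Parquet"

    // Counts created Write instances (only here to make the laziness visible)
    var writesBuilt: Int = 0

    // Returns a builder; no Write exists until build() is called
    def newWriteBuilder(info: LogicalWriteInfo): WriteBuilder = new WriteBuilder {
      def build(): Write = {
        writesBuilt += 1
        ParquetLikeWrite(paths, formatName, info)
      }
    }
  }
}
```

In this sketch, calling `newWriteBuilder` alone leaves `writesBuilt` at 0; the Write is created only by `build()`.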
supportsDataType¶
Signature
supportsDataType(
dataType: DataType): Boolean
supportsDataType is part of the FileTable abstraction.
supportsDataType supports all AtomicTypes and the following complex DataTypes with AtomicTypes:
- ArrayType
- MapType
- StructType
- UserDefinedType
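One way to read the rule above is as a recursive check: atomic types are supported directly, and a complex type is supported when the types it is built from are supported. The sketch below illustrates that reading in plain Scala. It is an assumption-laden model, not Spark's implementation: every type in it is a simplified stand-in for the Spark type of the same name (no nullability flags, no metadata), and `UnsupportedType` is a hypothetical placeholder for any type outside the list.

```scala
object SupportsDataTypeSketch {
  // Simplified stand-ins (assumptions), not org.apache.spark.sql.types classes
  sealed trait DataType
  sealed trait AtomicType extends DataType
  case object IntegerType extends AtomicType
  case object StringType extends AtomicType

  // Hypothetical placeholder for a type that is neither atomic nor listed above
  case object UnsupportedType extends DataType

  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Seq[StructField]) extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  final case class UserDefinedType(sqlType: DataType) extends DataType

  // Recursive check: a complex type is supported only if all nested types are
  def supportsDataType(dataType: DataType): Boolean = dataType match {
    case _: AtomicType            => true
    case StructType(fields)       => fields.forall(f => supportsDataType(f.dataType))
    case ArrayType(elementType)   => supportsDataType(elementType)
    case MapType(keyType, valueType) =>
      supportsDataType(keyType) && supportsDataType(valueType)
    case UserDefinedType(sqlType) => supportsDataType(sqlType)
    case _                        => false
  }
}
```

Under this model, `ArrayType(StringType)` is accepted, while `ArrayType(UnsupportedType)` is rejected because the element type fails the check.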