ParquetTable¶
ConvertTargetTable¶
ParquetTable is a ConvertTargetTable.
Creating Instance¶
ParquetTable takes the following to be created:
- SparkSession
- Base Path
- Partition Schema (Option[StructType])
ParquetTable is created when:
- ConvertToDeltaCommand is requested to getTargetTable
numFiles¶
numFiles: Long
numFiles calls inferSchema when the _numFiles registry is uninitialized.
In the end, numFiles returns the value of the _numFiles registry.
numFiles is part of the ConvertTargetTable abstraction.
_numFiles¶
_numFiles: Option[Long]
ParquetTable defines the _numFiles internal registry.
_numFiles is None (uninitialized) when ParquetTable is created.
_numFiles is initialized once when ParquetTable is requested for the numFiles (and inferSchema).
_numFiles is used for the numFiles.
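The lazy initialization described above can be sketched in plain Scala. This is a simplified illustration, not the Delta Lake source: FileCountingTable, listFiles, and inferSchemaCalls are hypothetical names, and the stand-in inferSchema only counts files, whereas the real method is not documented here.

```scala
// Hypothetical stand-in for ParquetTable's lazily initialized registry.
// FileCountingTable, listFiles, and inferSchemaCalls are illustrative names only.
class FileCountingTable(listFiles: () => Seq[String]) {
  // None (uninitialized) until numFiles triggers inferSchema for the first time
  private var _numFiles: Option[Long] = None

  // Counts how many times inferSchema ran (for demonstration only)
  var inferSchemaCalls: Int = 0

  // Assumed behavior: scan the files once and fill the registry
  private def inferSchema(): Unit = {
    inferSchemaCalls += 1
    _numFiles = Some(listFiles().size.toLong)
  }

  // Runs inferSchema only while the registry is still uninitialized,
  // then returns the registry value
  def numFiles: Long = {
    if (_numFiles.isEmpty) {
      inferSchema()
    }
    _numFiles.get
  }
}

object NumFilesDemo {
  def main(args: Array[String]): Unit = {
    val table = new FileCountingTable(() => Seq("part-0.parquet", "part-1.parquet"))
    println(table.numFiles)         // prints 2
    println(table.numFiles)         // prints 2 (registry reused)
    println(table.inferSchemaCalls) // prints 1
  }
}
```

Keeping the registry as an Option makes the uninitialized state explicit, so the expensive scan runs at most once no matter how often numFiles is requested.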
inferSchema¶
inferSchema(): Unit
inferSchema
...FIXME
inferSchema is used when:
- ParquetTable is requested for the numFiles and tableSchema
getSchemaForBatch¶
getSchemaForBatch(
  spark: SparkSession,
  batch: Seq[SerializableFileStatus],
  serializedConf: SerializableConfiguration): StructType
getSchemaForBatch
...FIXME
mergeSchemasInParallel¶
mergeSchemasInParallel(
  sparkSession: SparkSession,
  filesToTouch: Seq[FileStatus],
  serializedConf: SerializableConfiguration): Option[StructType]
mergeSchemasInParallel
...FIXME