FileScan

FileScan is an extension of the Scan abstraction for scans in Batch queries.

FileScan is a SupportsReportStatistics.

Contract

dataFilters

dataFilters: Seq[Expression]

Used when...FIXME

fileIndex

fileIndex: PartitioningAwareFileIndex

Used when...FIXME

getFileUnSplittableReason

getFileUnSplittableReason(
  path: Path): String

Used when...FIXME

partitionFilters

partitionFilters: Seq[Expression]

Used when...FIXME

readDataSchema

readDataSchema: StructType

Used when...FIXME

readPartitionSchema

readPartitionSchema: StructType

Used when...FIXME

seqToString

seqToString(
  seq: Seq[Any]): String

Used when...FIXME

sparkSession

sparkSession: SparkSession

Used when...FIXME

withFilters

withFilters(
  partitionFilters: Seq[Expression],
  dataFilters: Seq[Expression]): FileScan

Used when...FIXME

Implementations

description

description(): String

description...FIXME

description is part of the Scan abstraction.

planInputPartitions

planInputPartitions(): Array[InputPartition]

planInputPartitions returns the FilePartitions (partitions) as an array.

planInputPartitions is part of the Batch abstraction.

FilePartitions

partitions: Seq[FilePartition]

partitions requests the PartitioningAwareFileIndex for the partition directories (selectedPartitions).

For every selected partition directory, partitions takes the Hadoop FileStatuses and splits each file into chunks of at most maxSplitBytes (only if isSplitable; otherwise the file is kept whole). The resulting file splits are sorted by size in descending order.

In the end, partitions returns the FilePartitions.
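The split-and-sort step above can be illustrated with a minimal, dependency-free sketch. It is not Spark's code: FileChunk, splitFile and the sample file list are hypothetical names introduced here for illustration, file sizes are plain Longs rather than Hadoop FileStatuses, and the sketch assumes that all splits are sorted together after every file has been processed.

```scala
object SplitSketch {
  // Hypothetical stand-in for a file split (path, start offset, length in bytes)
  final case class FileChunk(path: String, start: Long, length: Long)

  // Cuts a file into chunks of at most maxSplitBytes when it is splittable;
  // otherwise returns a single chunk that covers the whole file.
  // Assumes maxSplitBytes > 0.
  def splitFile(
      path: String,
      fileSize: Long,
      splittable: Boolean,
      maxSplitBytes: Long): Seq[FileChunk] =
    if (splittable && fileSize > 0) {
      (0L until fileSize by maxSplitBytes).map { offset =>
        FileChunk(path, offset, math.min(maxSplitBytes, fileSize - offset))
      }
    } else {
      Seq(FileChunk(path, 0L, fileSize))
    }

  // (path, size in bytes, splittable?) -- made-up sample input
  val files: Seq[(String, Long, Boolean)] =
    Seq(("a.parquet", 250L, true), ("b.gz", 300L, false))

  // Split every file, then sort all chunks by length, largest first
  val chunks: Seq[FileChunk] = files
    .flatMap { case (path, size, splittable) =>
      splitFile(path, size, splittable, maxSplitBytes = 100L)
    }
    .sortBy(_.length)(Ordering[Long].reverse)
}
```

Sorting the largest splits first is a common preparation for packing splits into partitions of roughly equal size, since big items are placed before small ones fill the gaps.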

estimateStatistics

estimateStatistics(): Statistics

estimateStatistics...FIXME

estimateStatistics is part of the SupportsReportStatistics abstraction.

toBatch

toBatch: Batch

toBatch returns this FileScan itself (a FileScan is also a Batch).

toBatch is part of the Scan abstraction.

readSchema

readSchema(): StructType

readSchema...FIXME

readSchema is part of the Scan abstraction.

isSplitable

isSplitable(
  path: Path): Boolean

isSplitable is false by default (and is expected to be overridden by implementations whose files can be split).
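As a rough illustration of the "false unless overridden" pattern, the sketch below models isSplitable on a simplified trait. SplitCheck, DefaultCheck and GzipAwareCheck are hypothetical names, the path is a plain String instead of a Hadoop Path, and the gzip rule is only an example of a format-specific decision, not a description of any particular Spark implementation.

```scala
object SplitableSketch {
  // Simplified stand-in for the part of the contract discussed here
  trait SplitCheck {
    // Conservative default: treat every file as non-splittable
    def isSplitable(path: String): Boolean = false
  }

  // Uses the default as-is
  object DefaultCheck extends SplitCheck

  // Hypothetical override: anything but a gzip-compressed file can be split
  object GzipAwareCheck extends SplitCheck {
    override def isSplitable(path: String): Boolean = !path.endsWith(".gz")
  }
}
```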

Used when: