DataFiltersBuilder¶
DataFiltersBuilder builds data filters for Data Skipping.
DataFiltersBuilder is used when DataSkippingReaderBase is requested for the filesForScan with data filters and spark.databricks.delta.stats.skipping enabled.
Creating Instance¶
DataFiltersBuilder takes the following to be created:
- SparkSession (Spark SQL)
- DeltaDataSkippingType
DataFiltersBuilder is created when:
- DataSkippingReaderBase is requested to filesForScan (with data filters and spark.databricks.delta.stats.skipping enabled)
StatsProvider¶
DataFiltersBuilder creates a StatsProvider (for the getStatsColumnOpt) when created.
Creating DataSkippingPredicate¶
apply(
dataFilter: Expression): Option[DataSkippingPredicate]
apply uses constructDataFilters to build a DataSkippingPredicate for the given dataFilter expression.
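The delegation and the optional result can be illustrated with a minimal sketch. It assumes a toy expression model; Expr, IsNull, Unsupported, SkippingPredicate and FiltersBuilderSketch are hypothetical stand-ins, not Delta Lake or Spark types.

```scala
// Hypothetical, simplified stand-ins; none of these are Delta Lake or Spark types.
sealed trait Expr
final case class IsNull(column: String) extends Expr
final case class Unsupported(description: String) extends Expr

final case class SkippingPredicate(description: String)

object FiltersBuilderSketch {
  // Returns Some(predicate) only for expressions that can be rewritten over file statistics.
  def constructDataFilters(dataFilter: Expr): Option[SkippingPredicate] = dataFilter match {
    case IsNull(column) => Some(SkippingPredicate(s"nullCount($column) > 0"))
    case _              => None
  }

  // apply only delegates, so the builder can be called like a function.
  def apply(dataFilter: Expr): Option[SkippingPredicate] =
    constructDataFilters(dataFilter)
}
```

In this sketch, a None result means the filter cannot be expressed over file statistics, so a caller would have no skipping predicate to apply for it.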
apply is used when:
- DataSkippingReaderBase is requested to filesForScan (with data filters and spark.databricks.delta.stats.skipping enabled)
constructDataFilters¶
constructDataFilters(
dataFilter: Expression): Option[DataSkippingPredicate]
constructDataFilters creates a DataSkippingPredicate for expression types that can be used for data skipping.
constructDataFilters ...FIXME
For IsNull with a skipping-eligible column, constructDataFilters requests the StatsProvider for the getPredicateWithStatType for nullCount to build a Catalyst expression to match files with null count larger than zero.
nullCount > Literal(0)
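The effect of the nullCount > 0 predicate on file selection can be sketched in plain Scala. FileStats and IsNullSkippingSketch are hypothetical names for illustration, and keeping files that have no null count for the column is an assumption of this sketch, not something stated above.

```scala
object IsNullSkippingSketch {
  // Hypothetical per-file statistics (not a Delta Lake type).
  final case class FileStats(path: String, numRecords: Long, nullCount: Map[String, Long])

  // A file can match `column IS NULL` only if its null count for the column is above zero.
  // Assumption: a file with no null count for the column cannot be ruled out, so it is kept.
  def mayContainNulls(column: String, file: FileStats): Boolean =
    file.nullCount.get(column).forall(_ > 0)

  // Keeps only the files that may hold at least one NULL in the column.
  def keepFiles(column: String, files: Seq[FileStats]): Seq[FileStats] =
    files.filter(file => mayContainNulls(column, file))
}
```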
For IsNotNull with a skipping-eligible column, constructDataFilters creates StatsColumns for the following:
- nullCount
- numRecords
constructDataFilters requests the StatsProvider for the getPredicateWithStatsColumns for the two StatsColumns to build a Catalyst expression to match files with null count less than the row count.
nullCount < numRecords
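A minimal plain-Scala sketch of the nullCount < numRecords check follows. FileStats and IsNotNullSkippingSketch are hypothetical names, and keeping files without a null count for the column is an assumption of this sketch.

```scala
object IsNotNullSkippingSketch {
  // Hypothetical per-file statistics (not a Delta Lake type).
  final case class FileStats(path: String, numRecords: Long, nullCount: Map[String, Long])

  // A file can match `column IS NOT NULL` only if fewer values are NULL than there are rows.
  // Assumption: a file with no null count for the column cannot be ruled out, so it is kept.
  def mayContainNonNulls(column: String, file: FileStats): Boolean =
    file.nullCount.get(column).forall(_ < file.numRecords)

  // Keeps only the files that may hold at least one non-NULL value in the column.
  def keepFiles(column: String, files: Seq[FileStats]): Seq[FileStats] =
    files.filter(file => mayContainNonNulls(column, file))
}
```

A file whose null count equals its row count holds only NULLs in that column, which is why the comparison is strict.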
constructDataFilters ...FIXME
constructLiteralInListDataFilters¶
constructLiteralInListDataFilters(
a: Expression,
possiblyNullValues: Seq[Any]): Option[DataSkippingPredicate]
constructLiteralInListDataFilters ...FIXME