Data Source Filter Predicate
`Filter` is the <<contract, contract>> of data source filter predicates.
Filter is used when:
- (Data Source API V1) `BaseRelation` is requested for unhandled filter predicates (and hence `BaseRelation` implementations, e.g. JDBCRelation)
- (Data Source API V1) `PrunedFilteredScan` is requested to build a scan (and hence `PrunedFilteredScan` implementations, e.g. JDBCRelation)
- `FileFormat` is requested to buildReader (and hence `FileFormat` implementations, e.g. `OrcFileFormat`, `CSVFileFormat`, `JsonFileFormat`, `TextFileFormat` and Spark MLlib's `LibSVMFileFormat`)
- `FileFormat` is requested to build a Data Reader with partition column values appended (and hence `FileFormat` implementations, e.g. `OrcFileFormat`, ParquetFileFormat)
- `RowDataSourceScanExec` is RowDataSourceScanExec.md#creating-instance[created] (for a DataSourceScanExec.md#simpleString[simple text representation (in a query plan tree)])
- `DataSourceStrategy` execution planning strategy is requested to pruneFilterProject (when executed for LogicalRelation.md[LogicalRelation] logical operators with a PrunedFilteredScan or a PrunedScan)
- `DataSourceStrategy` execution planning strategy is requested to selectFilters
[[contract]]
[source, scala]
----
package org.apache.spark.sql.sources

abstract class Filter {
  // only required methods that have no implementation
  // the others follow
  def references: Array[String]
}
----
.Filter Contract
[cols="1,2",options="header",width="100%"]
|===
| Method
| Description

| references
a| [[references]] Column references, i.e. the list of column names that a filter references

Used when:

- `Filter` is requested to <<findReferences, find column references in any value>>
- < >, < > and < > filters are requested for the < >
|===
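The contract can be illustrated with a small, self-contained sketch. It is a simplified stand-in written for this page, not Spark's source code: the `FilterContractSketch` object, the `IsNotNull` and `And` classes as defined here, and the `requiredColumns` helper are all assumptions made for illustration. It shows a leaf filter reporting one column and a composite filter combining the references of its children.

```scala
// Simplified stand-in for the Filter contract (an illustration, NOT Spark's code).
object FilterContractSketch {

  // The contract: every filter reports the column names it references.
  abstract class Filter {
    def references: Array[String]
  }

  // Hypothetical leaf filter that references exactly one column.
  final case class IsNotNull(attribute: String) extends Filter {
    override def references: Array[String] = Array(attribute)
  }

  // Hypothetical composite filter: references of both children, left first.
  final case class And(left: Filter, right: Filter) extends Filter {
    override def references: Array[String] = left.references ++ right.references
  }

  // Hypothetical helper: distinct columns needed to evaluate the given filters.
  def requiredColumns(filters: Seq[Filter]): Seq[String] =
    filters.flatMap(_.references.toSeq).distinct

  def main(args: Array[String]): Unit = {
    val filters = Seq(And(IsNotNull("id"), IsNotNull("name")), IsNotNull("id"))
    println(requiredColumns(filters).mkString(", "))
  }
}
```

A data source could use something like `requiredColumns` to decide which columns it must read in order to evaluate the pushed-down filters itself.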
=== [[findReferences]] Finding Column References in Any Value -- findReferences Method
[source, scala]
----
findReferences(value: Any): Array[String]
----
`findReferences` takes the <<references, column references>> of the `value` if it is a filter, or returns an empty array otherwise.
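The behaviour described above can be sketched as a pattern match on the value. This is a minimal model under stated assumptions, not Spark's implementation: the `FindReferencesSketch` object and the `EqualTo` class defined here are hypothetical and exist only to show a filter whose value may be either a literal or another filter.

```scala
// Minimal model of findReferences (an illustration, NOT Spark's code).
object FindReferencesSketch {

  abstract class Filter {
    def references: Array[String]

    // If the value is itself a filter, use its references; otherwise none.
    def findReferences(value: Any): Array[String] = value match {
      case f: Filter => f.references
      case _         => Array.empty[String]
    }
  }

  // Hypothetical filter whose value is a literal or a nested filter.
  final case class EqualTo(attribute: String, value: Any) extends Filter {
    override def references: Array[String] =
      Array(attribute) ++ findReferences(value)
  }

  def main(args: Array[String]): Unit = {
    // Literal value: only the attribute itself is referenced.
    println(EqualTo("a", 1).references.mkString(", "))
    // Nested filter value: its references are appended.
    println(EqualTo("a", EqualTo("b", 2)).references.mkString(", "))
  }
}
```

In this sketch a literal value contributes no column names, while a nested filter contributes all of its own references after the enclosing filter's attribute.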