DataSourceV2Utils Utility

DataSourceV2Utils is a utility with helper methods (extractSessionConfigs, getTableFromProvider and loadV2Source) for batch and streaming reads and writes.

extractSessionConfigs

extractSessionConfigs(
  source: TableProvider,
  conf: SQLConf): Map[String, String]

Note

extractSessionConfigs supports data sources with SessionConfigSupport only.

extractSessionConfigs requests the SessionConfigSupport data source for its custom key prefix. It then finds all configuration options in the given SQLConf whose keys start with spark.datasource.[keyPrefix].

extractSessionConfigs returns the matching options with the spark.datasource.[keyPrefix] prefix removed from their keys (e.g. spark.datasource.keyPrefix.k1 becomes k1).
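The prefix handling described above can be illustrated with a minimal sketch in plain Scala. It is not Spark's implementation: the object name SessionConfigsSketch is hypothetical, a plain Map[String, String] stands in for the options held by SQLConf, the key prefix is passed in directly rather than requested from a SessionConfigSupport data source, and simple string-prefix matching is assumed.

```scala
// A sketch only: a plain Map stands in for SQLConf and
// keyPrefix stands in for what a SessionConfigSupport source would report.
object SessionConfigsSketch {

  def extractSessionConfigs(
      keyPrefix: String,
      confs: Map[String, String]): Map[String, String] = {
    // Full prefix to look for, including the trailing dot
    val prefix = s"spark.datasource.$keyPrefix."
    confs.collect {
      // Keep only keys that have something left after the prefix
      case (key, value) if key.startsWith(prefix) && key.length > prefix.length =>
        key.stripPrefix(prefix) -> value
    }
  }

  def main(args: Array[String]): Unit = {
    val confs = Map(
      "spark.datasource.demo.k1" -> "v1",
      "spark.datasource.demo.nested.k2" -> "v2",
      "spark.datasource.other.k3" -> "v3",
      "spark.sql.shuffle.partitions" -> "200")
    // Only the two options under the "demo" prefix are kept
    println(extractSessionConfigs("demo", confs))
  }
}
```

In this sketch, options of other data sources (spark.datasource.other.k3) and unrelated Spark properties are ignored, and only the part of the key after the prefix is kept.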

extractSessionConfigs is used when:

  • DataFrameReader is requested to load data
  • DataFrameWriter is requested to save data
  • (Spark Structured Streaming) DataStreamReader is requested to load data from a streaming data source
  • (Spark Structured Streaming) DataStreamWriter is requested to start a streaming query

Creating Table (using TableProvider)

getTableFromProvider(
  provider: TableProvider,
  options: CaseInsensitiveStringMap,
  userSpecifiedSchema: Option[StructType]): Table

getTableFromProvider creates a Table for the given TableProvider, options and optional user-specified schema.


getTableFromProvider is used when:

loadV2Source

loadV2Source(
  sparkSession: SparkSession,
  provider: TableProvider,
  userSpecifiedSchema: Option[StructType],
  extraOptions: CaseInsensitiveMap[String],
  source: String,
  paths: String*): Option[DataFrame]

loadV2Source creates a DataFrame for the given TableProvider, if possible (as the Option[DataFrame] return type indicates).


loadV2Source is used when: