= DeltaSourceSnapshot

[[SnapshotIterator]][[StateCache]] DeltaSourceSnapshot is a SnapshotIterator with StateCache.

DeltaSourceSnapshot is created when DeltaSource is requested for the <>.

[[version]] When created, DeltaSourceSnapshot requests the <> for the <> that it uses for the <> (a new column and the name of the cached RDD).

== [[creating-instance]] Creating DeltaSourceSnapshot Instance

DeltaSourceSnapshot takes the following to be created:

  • [[spark]] SparkSession
  • [[snapshot]] <>
  • [[filters]] Filter expressions (Seq[Expression])

== [[initialFiles]] Initial Files (Indexed AddFiles) -- initialFiles Method

[source, scala]
----
initialFiles: Dataset[IndexedFile]
----

initialFiles requests the <> for <> (Dataset[AddFile]) and sorts them by <> and <> in ascending order.

initialFiles zips the <> with indices (using RDD.zipWithIndex operator), adds two new columns with the <> and isLast as false, and finally creates a Dataset[IndexedFile].

In the end, initialFiles <> with the following name (with the <> and the <> of the <>):

Delta Source Snapshot #[version] - [redactedPath]

NOTE: initialFiles is used exclusively when SnapshotIterator is requested for a <>.
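
The steps that initialFiles goes through can be sketched on local Scala collections rather than a Spark Dataset. This is only an illustration, not the Delta Lake implementation: the `AddFile` and `IndexedFile` case classes are simplified stand-ins, the sort keys (`modificationTime`, then `path`) are an assumption because the text above does not name them, and `InitialFilesSketch`, `cacheName` and `redactedPath` are hypothetical names introduced for the example.

```scala
// Simplified stand-ins for Delta's AddFile and IndexedFile (fields are assumptions).
final case class AddFile(path: String, modificationTime: Long)
final case class IndexedFile(version: Long, index: Long, add: AddFile, isLast: Boolean = false)

object InitialFilesSketch {
  // Mirrors the described steps on a local Seq instead of a Dataset/RDD.
  def initialFiles(version: Long, allFiles: Seq[AddFile]): Seq[IndexedFile] =
    allFiles
      .sortBy(f => (f.modificationTime, f.path)) // assumed sort keys, ascending
      .zipWithIndex                               // local analogue of RDD.zipWithIndex
      .map { case (add, index) =>
        // Two extra values per file: the snapshot version and isLast = false.
        IndexedFile(version, index.toLong, add, isLast = false)
      }

  // Builds a name that follows the pattern shown in the text.
  def cacheName(version: Long, redactedPath: String): String =
    s"Delta Source Snapshot #$version - $redactedPath"
}
```

Because the index is assigned after sorting, every file gets a stable position within the snapshot, which is presumably what lets a streaming source resume from a given (version, index) pair.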


Last update: 2020-09-24