DeltaFileOperations Utilities

Utilities

recursiveListDirs

recursiveListDirs(
  spark: SparkSession,
  subDirs: Seq[String],
  hadoopConf: Broadcast[SerializableConfiguration],
  hiddenFileNameFilter: String => Boolean = defaultHiddenFileFilter,
  fileListingParallelism: Option[Int] = None): Dataset[SerializableFileStatus]

recursiveListDirs… FIXME

recursiveListDirs is used when:

  • ManualListingFileManifest (of ConvertToDeltaCommandBase) is requested to doList

  • VacuumCommand utility is used to gc

tryDeleteNonRecursive

tryDeleteNonRecursive(
  fs: FileSystem,
  path: Path,
  tries: Int = 3): Boolean

tryDeleteNonRecursive… FIXME

tryDeleteNonRecursive is used when VacuumCommandImpl is requested to delete.
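
The signature carries a tries budget (3 by default), and isThrottlingError is listed as one of the checks this method relies on. A minimal sketch of how such a bounded retry could look follows. It is not the Delta Lake implementation: DeleteRetrySketch, tryDelete, looksLikeThrottling, the fixed 10 ms pause, and the choice to rethrow every other exception are all assumptions made for illustration, and a plain function stands in for the FileSystem and Path pair.

```scala
import java.util.Locale
import scala.util.control.NonFatal

object DeleteRetrySketch {
  // Assumption: a throttling error is recognised by "slow down" in the message.
  def looksLikeThrottling(t: Throwable): Boolean =
    Option(t.getMessage).exists(_.toLowerCase(Locale.ROOT).contains("slow down"))

  // Runs `delete` and retries it up to `tries` more times, but only when the
  // failure looks like throttling. Any other exception is rethrown unchanged.
  def tryDelete(delete: () => Boolean, tries: Int = 3): Boolean =
    try delete() catch {
      case NonFatal(e) if looksLikeThrottling(e) && tries > 0 =>
        Thread.sleep(10) // hypothetical fixed pause before the next attempt
        tryDelete(delete, tries - 1)
    }
}
```

With the default budget, a delete that is throttled twice and then succeeds is attempted three times in total. A delete that is throttled on every attempt surfaces the last exception after the initial attempt plus three retries.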

Internal Methods

recurseDirectories

recurseDirectories(
  logStore: LogStore,
  filesAndDirs: Iterator[SerializableFileStatus],
  hiddenFileNameFilter: String => Boolean): Iterator[SerializableFileStatus]

recurseDirectories… FIXME

recurseDirectories is used when DeltaFileOperations is requested to recursiveListDirs and listUsingLogStore.

listUsingLogStore

listUsingLogStore(
  logStore: LogStore,
  subDirs: Iterator[String],
  recurse: Boolean,
  hiddenFileNameFilter: String => Boolean): Iterator[SerializableFileStatus]

listUsingLogStore… FIXME

listUsingLogStore is used when DeltaFileOperations is requested to recursiveListDirs and recurseDirectories.
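
Going by the two "is used when" notes, recurseDirectories and listUsingLogStore call each other. The sketch below shows one way such a mutually recursive listing could be arranged. It is an illustration under stated assumptions, not the Delta Lake code: java.io.File stands in for LogStore, and ListingSketch, Entry, list and recurseDirs are hypothetical names.

```scala
import java.io.File

object ListingSketch {
  // Hypothetical stand-in for SerializableFileStatus.
  final case class Entry(path: String, isDir: Boolean)

  // Lists the direct children of every directory in `dirs`, skipping names the
  // `hidden` filter accepts. With `recurse` set, hands the children over to
  // `recurseDirs` so that subdirectories are expanded as well.
  def list(
      dirs: Iterator[String],
      recurse: Boolean,
      hidden: String => Boolean): Iterator[Entry] = {
    val children = dirs.flatMap { dir =>
      // listFiles returns null for a missing or unreadable directory.
      Option(new File(dir).listFiles()).map(_.toList).getOrElse(Nil)
        .filterNot(f => hidden(f.getName))
        .map(f => Entry(f.getPath, f.isDirectory))
    }
    if (recurse) recurseDirs(children, hidden) else children
  }

  // Keeps every entry and, for a directory, appends its (recursive) listing.
  def recurseDirs(
      entries: Iterator[Entry],
      hidden: String => Boolean): Iterator[Entry] =
    entries.flatMap { e =>
      if (e.isDir) Iterator.single(e) ++ list(Iterator.single(e.path), recurse = true, hidden)
      else Iterator.single(e)
    }
}
```

Both functions return iterators, so in this sketch a subdirectory is only listed when the consumer reaches it.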

isThrottlingError

isThrottlingError(
  t: Throwable): Boolean

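isThrottlingError returns true when the given Throwable contains the text "slow down" (which, going by the method name, marks a request-throttling error), and false otherwise.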
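
A minimal sketch of such a check follows. Which part of the Throwable is inspected and whether the match ignores case are assumptions here (the sketch reads the message and lowercases it), and ThrottlingSketch and looksLikeThrottling are hypothetical names rather than Delta Lake API.

```scala
import java.util.Locale

object ThrottlingSketch {
  // Assumption: the exception message is what gets searched, ignoring case.
  // Option(...) guards against exceptions created without a message.
  def looksLikeThrottling(t: Throwable): Boolean =
    Option(t.getMessage)
      .exists(_.toLowerCase(Locale.ROOT).contains("slow down"))
}
```

Under these assumptions, an exception whose message includes "Slow Down" is classified as throttling, while one with an unrelated message, or with no message at all, is not.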

isThrottlingError is used when DeltaFileOperations is requested to listUsingLogStore and tryDeleteNonRecursive.