HadoopFileLinesReader
HadoopFileLinesReader is a Scala Iterator of Apache Hadoop's org.apache.hadoop.io.Text.
HadoopFileLinesReader is created by the following:
- SimpleTextSource
- LibSVMFileFormat
- TextInputCSVDataSource
- TextInputJsonDataSource
- TextFileFormat
HadoopFileLinesReader uses the internal iterator to read the lines of a file.
Creating Instance
HadoopFileLinesReader takes the following when created:
- [[file]] PartitionedFile
- [[conf]] Hadoop's Configuration
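The following is a minimal usage sketch, not code from Spark itself. It assumes a Spark release in which PartitionedFile takes the file path as a String and HadoopFileLinesReader can be created with the two arguments listed above (believed to hold for the Spark 2.x line; later releases changed PartitionedFile). The names HadoopFileLinesReaderDemo and readAllLines are hypothetical.

```scala
import java.io.File

import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.datasources.{HadoopFileLinesReader, PartitionedFile}

// Hypothetical helper object, for illustration only.
object HadoopFileLinesReaderDemo {

  // Reads every line of a local file through HadoopFileLinesReader.
  def readAllLines(f: File): List[String] = {
    // Assumption: PartitionedFile(partitionValues, filePath: String, start, length).
    // The whole file is treated as a single split with no partition values.
    val file = PartitionedFile(InternalRow.empty, f.toURI.toString, 0L, f.length())
    val reader = new HadoopFileLinesReader(file, new Configuration())
    try {
      // Copy each Text to a String before advancing, since the underlying
      // Hadoop reader may reuse the same Text object between calls.
      reader.map(_.toString).toList
    } finally {
      reader.close()
    }
  }
}
```

Converting to a List before close forces the lazy map to run while the reader is still open.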
=== [[iterator]] iterator Internal Property
[source, scala]
iterator: RecordReaderIterator[Text]
When created, HadoopFileLinesReader creates an internal iterator that uses Hadoop's https://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/mapreduce/lib/input/FileSplit.html[org.apache.hadoop.mapreduce.lib.input.FileSplit] with Hadoop's https://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/fs/Path.html[org.apache.hadoop.fs.Path] and the file (the PartitionedFile given at creation).
iterator creates Hadoop's TaskAttemptID, TaskAttemptContextImpl and LineRecordReader.
iterator initializes LineRecordReader and passes it on to a RecordReaderIterator.
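The steps above can be approximated with Hadoop's public API alone. This is a sketch under assumptions rather than Spark's actual code: it reads a local java.io.File as a single split, uses placeholder job and task IDs, and collects the lines into a List instead of wrapping the reader in a RecordReaderIterator. The names LineRecordReaderSketch and readLines are hypothetical.

```scala
import java.io.File

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.{JobID, TaskAttemptID, TaskID, TaskType}
import org.apache.hadoop.mapreduce.lib.input.{FileSplit, LineRecordReader}
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

import scala.collection.mutable.ArrayBuffer

// Hypothetical helper object, for illustration only.
object LineRecordReaderSketch {

  def readLines(file: File, conf: Configuration = new Configuration()): List[String] = {
    // A FileSplit covering the whole file: path, start offset, length, preferred hosts.
    val split = new FileSplit(new Path(file.toURI), 0L, file.length(), Array.empty[String])

    // Placeholder IDs; LineRecordReader only needs a context that carries the Configuration.
    val attemptId = new TaskAttemptID(new TaskID(new JobID(), TaskType.MAP, 0), 0)
    val context = new TaskAttemptContextImpl(conf, attemptId)

    val reader = new LineRecordReader()
    val lines = ArrayBuffer.empty[String]
    try {
      reader.initialize(split, context)
      // nextKeyValue advances to the next line; getCurrentValue returns it as a Text.
      while (reader.nextKeyValue()) {
        // Copy to a String because the reader reuses the same Text instance.
        lines += reader.getCurrentValue.toString
      }
    } finally {
      reader.close()
    }
    lines.toList
  }
}
```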
NOTE: iterator is used for Iterator-specific methods, i.e. hasNext, next and close.
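The delegation described in the note can be illustrated without any Hadoop or Spark dependency. The sketch below is loosely modelled on that design and is not the actual implementation: LineSource is a hypothetical stand-in for Hadoop's RecordReader, SourceIterator for RecordReaderIterator, and LinesReader for HadoopFileLinesReader.

```scala
import java.io.Closeable

// Hypothetical stand-in for a Hadoop RecordReader: advance first, then read the current value.
trait LineSource extends Closeable {
  def nextKeyValue(): Boolean
  def getCurrentValue: String
}

// In-memory source used only to exercise the sketch.
class InMemorySource(lines: Seq[String]) extends LineSource {
  private val underlying = lines.iterator
  private var current: String = null
  var closed = false
  def nextKeyValue(): Boolean =
    if (underlying.hasNext) { current = underlying.next(); true } else false
  def getCurrentValue: String = current
  def close(): Unit = closed = true
}

// Simplified wrapper that adapts the advance-then-read protocol to a Scala Iterator.
class SourceIterator(private var source: LineSource) extends Iterator[String] with Closeable {
  private var havePair = false  // a value has been fetched but not yet returned
  private var finished = false  // the source is exhausted

  override def hasNext: Boolean = {
    if (!finished && !havePair) {
      finished = !source.nextKeyValue()
      if (finished) close()     // release the source as soon as it is exhausted
      havePair = !finished
    }
    !finished
  }

  override def next(): String = {
    if (!hasNext) throw new NoSuchElementException("End of stream")
    havePair = false
    source.getCurrentValue
  }

  override def close(): Unit =
    if (source != null) {
      try source.close() finally source = null
    }
}

// The outer reader forwards hasNext, next and close to its internal iterator.
class LinesReader(source: LineSource) extends Iterator[String] with Closeable {
  private val iterator = new SourceIterator(source)
  override def hasNext: Boolean = iterator.hasNext
  override def next(): String = iterator.next()
  override def close(): Unit = iterator.close()
}
```

Closing the source inside hasNext once it is exhausted means a caller that drains the iterator does not leak the underlying resource, while an explicit close still covers early termination.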