KafkaScan¶
KafkaScan
is a Scan (a logical scan over data in Apache Kafka).
Creating Instance¶
KafkaScan
takes the following to be created:
- Case-Insensitive Options
KafkaScan
is created when:
KafkaTable
is requested for a ScanBuilder
Read Schema¶
readSchema(): StructType
readSchema
is part of the Scan abstraction.
readSchema
builds the read schema (possibly with records headers based on includeHeaders option).
Supported Custom Metrics¶
supportedCustomMetrics(): Array[CustomMetric]
supportedCustomMetrics
is part of the Scan abstraction.
supportedCustomMetrics
gives the following CustomMetrics:
OffsetOutOfRangeMetric
DataLossMetric
toMicroBatchStream¶
toMicroBatchStream(
checkpointLocation: String): MicroBatchStream
toMicroBatchStream
is part of the Scan abstraction.
toMicroBatchStream
validateStreamOptions.
toMicroBatchStream
streamingUniqueGroupId.
toMicroBatchStream
convertToSpecifiedParams.
toMicroBatchStream
determines KafkaOffsetRangeLimit based on the following options (with LatestOffsetRangeLimit
as the default):
toMicroBatchStream
builds a KafkaOffsetReader for the following:
Argument | Value |
---|---|
ConsumerStrategy | strategy |
driverKafkaParams | Kafka Configuration Properties for Driver |
readerOptions | options |
driverGroupIdPrefix | streamingUniqueGroupId with -driver suffix |
In the end, toMicroBatchStream
creates a KafkaMicroBatchStream
(Spark Structured Streaming) with the following: