KafkaScan¶
KafkaScan is a Scan (a logical scan over data in Apache Kafka).
Creating Instance¶
KafkaScan takes the following to be created:
- Case-Insensitive Options
KafkaScan is created when:
- KafkaTable is requested for a ScanBuilder
Read Schema¶
readSchema(): StructType
readSchema is part of the Scan abstraction.
readSchema builds the read schema, which includes record headers when the includeHeaders option is enabled.
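As a rough illustration of how the includeHeaders option could change the read schema, here is a minimal sketch. It assumes spark-sql is on the classpath and models the Kafka record columns as documented for the Spark Kafka source. The object name `ReadSchemaSketch` and the helper `kafkaReadSchema` are hypothetical and are not part of KafkaScan.

```scala
import org.apache.spark.sql.types._

// Hypothetical sketch, not the connector's own code.
object ReadSchemaSketch {
  // Columns of a Kafka record as exposed by the Spark Kafka source.
  private val baseFields: Seq[StructField] = Seq(
    StructField("key", BinaryType),
    StructField("value", BinaryType),
    StructField("topic", StringType),
    StructField("partition", IntegerType),
    StructField("offset", LongType),
    StructField("timestamp", TimestampType),
    StructField("timestampType", IntegerType))

  // Extra column that is only present when headers are requested.
  private val headersField: StructField = StructField(
    "headers",
    ArrayType(StructType(Seq(
      StructField("key", StringType),
      StructField("value", BinaryType)))))

  // Assumed behaviour: append the headers column only if includeHeaders is true.
  def kafkaReadSchema(includeHeaders: Boolean): StructType =
    StructType(if (includeHeaders) baseFields :+ headersField else baseFields)
}
```

Calling `ReadSchemaSketch.kafkaReadSchema(includeHeaders = true).printTreeString()` prints the schema with the additional headers column.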
Supported Custom Metrics¶
supportedCustomMetrics(): Array[CustomMetric]
supportedCustomMetrics is part of the Scan abstraction.
supportedCustomMetrics gives the following CustomMetrics:
- OffsetOutOfRangeMetric
- DataLossMetric
toMicroBatchStream¶
toMicroBatchStream(
  checkpointLocation: String): MicroBatchStream
toMicroBatchStream is part of the Scan abstraction.
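toMicroBatchStream validates the streaming options (validateStreamOptions).
toMicroBatchStream generates a unique group ID for the streaming query (streamingUniqueGroupId).
toMicroBatchStream extracts the user-specified Kafka parameters from the options (convertToSpecifiedParams).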
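The parameter-extraction step can be pictured with a small, self-contained sketch. It assumes that Kafka client settings are passed as options prefixed with `kafka.` (as in `kafka.bootstrap.servers`) and that the prefix is dropped before the settings reach the Kafka consumer. The object `SpecifiedParamsSketch` is hypothetical and only mirrors the name of the real helper for readability.

```scala
import java.util.Locale

// Hypothetical sketch, not the connector's own code.
object SpecifiedParamsSketch {
  private val Prefix = "kafka."

  // Keeps only the options whose key starts with "kafka." (case-insensitively)
  // and strips that prefix from the key.
  def convertToSpecifiedParams(options: Map[String, String]): Map[String, String] =
    options.collect {
      case (key, value) if key.toLowerCase(Locale.ROOT).startsWith(Prefix) =>
        key.substring(Prefix.length) -> value
    }
}
```

For example, the options `kafka.bootstrap.servers=localhost:9092` and `subscribe=t1` would yield a single Kafka parameter, `bootstrap.servers=localhost:9092`, because `subscribe` is a source option and not a Kafka client setting.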
toMicroBatchStream determines KafkaOffsetRangeLimit based on the following options (with LatestOffsetRangeLimit as the default):
toMicroBatchStream builds a KafkaOffsetReader for the following:
| Argument | Value |
|---|---|
| ConsumerStrategy | strategy |
| driverKafkaParams | Kafka Configuration Properties for Driver |
| readerOptions | options |
| driverGroupIdPrefix | streamingUniqueGroupId with -driver suffix |
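To make the driverGroupIdPrefix row concrete, the sketch below shows one way a per-query group ID and its `-driver` variant could be derived. The shape of the ID (a prefix, a random UUID and the hash code of the checkpoint location) and the default prefix `spark-kafka-source` are assumptions made for illustration. `GroupIdSketch` and both of its methods are hypothetical names.

```scala
import java.util.UUID

// Hypothetical sketch, not the connector's own code.
object GroupIdSketch {
  // Assumed format: <prefix>-<random UUID>-<hash of the checkpoint location>.
  def streamingUniqueGroupId(
      checkpointLocation: String,
      groupIdPrefix: String = "spark-kafka-source"): String =
    s"$groupIdPrefix-${UUID.randomUUID}-${checkpointLocation.hashCode}"

  // The driver-side offset reader uses the unique group ID with a -driver suffix.
  def driverGroupIdPrefix(uniqueGroupId: String): String =
    s"$uniqueGroupId-driver"
}
```

Because of the random UUID, each call yields a different ID, so two running queries would not share a consumer group under this scheme.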
In the end, toMicroBatchStream creates a KafkaMicroBatchStream (Spark Structured Streaming) with the following: