# KafkaScan
KafkaScan is a logical Scan (Spark SQL). Head over to The Internals of Spark SQL to learn more about the Scan abstraction.
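As a rough picture of that contract, the following is a minimal sketch of the Scan methods this page covers (the class name and method bodies are illustrative, not the connector's actual code):

```scala
import org.apache.spark.sql.connector.metric.CustomMetric
import org.apache.spark.sql.connector.read.Scan
import org.apache.spark.sql.connector.read.streaming.MicroBatchStream
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Illustrative only: the parts of the Scan contract that matter here.
class KafkaLikeScan(options: CaseInsensitiveStringMap) extends Scan {
  // Schema of the rows this scan produces
  override def readSchema(): StructType = ???
  // Streaming (micro-batch) entry point, described below
  override def toMicroBatchStream(checkpointLocation: String): MicroBatchStream = ???
  // Custom metrics, described below
  override def supportedCustomMetrics(): Array[CustomMetric] = Array.empty
}
```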
## Creating Instance
KafkaScan takes the following to be created:

* Options (CaseInsensitiveStringMap)
KafkaScan is created when:

* KafkaTable is requested for a ScanBuilder
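For context, a Kafka streaming query like the one below ends up with a KafkaScan during query planning (this uses the standard public API; an active SparkSession named `spark` is assumed):

```scala
// Assumes an active SparkSession named `spark`.
val records = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // a kafka.-prefixed option
  .option("subscribe", "topic1")
  .load()
```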
## toMicroBatchStream
```scala
toMicroBatchStream(
  checkpointLocation: String): MicroBatchStream
```
toMicroBatchStream is part of the Scan (Spark SQL) abstraction.
toMicroBatchStream validates the streaming part of the options.

toMicroBatchStream generates a unique group ID.

toMicroBatchStream removes the kafka. prefix from the keys in the options.
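As an illustration of the prefix handling (plain Scala, not the connector's actual helper):

```scala
// Illustrative only: strip the "kafka." prefix so the remaining keys can be
// passed to the Kafka consumer as-is (e.g. "bootstrap.servers").
val options = Map(
  "kafka.bootstrap.servers" -> "localhost:9092",
  "subscribe" -> "topic1")
val kafkaParams = options.collect {
  case (key, value) if key.startsWith("kafka.") =>
    key.stripPrefix("kafka.") -> value
}
assert(kafkaParams == Map("bootstrap.servers" -> "localhost:9092"))
```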
In the end, toMicroBatchStream creates a KafkaMicroBatchStream with the following (put together in the sketch after this list):

* A new KafkaOffsetReader with the strategy (from the options) and the kafkaParamsForDriver
* kafkaParamsForExecutors (the options with the kafka. prefix removed) and the unique group ID
* The given checkpointLocation
* Starting offsets based on the options (startingtimestamp, startingoffsetsbytimestamp and startingoffsets); LatestOffsetRangeLimit is the default
* failOnDataLoss option
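Put together, toMicroBatchStream can be pictured as follows. This is a condensed sketch only: the helper names follow the text above, while the exact signatures in the Kafka connector may differ.

```scala
// Condensed sketch of the steps above; not the connector's verbatim code.
override def toMicroBatchStream(checkpointLocation: String): MicroBatchStream = {
  validateStreamOptions(options)                   // streaming-side validation
  val uniqueGroupId = streamingUniqueGroupId(options, checkpointLocation)
  val specifiedKafkaParams = convertToSpecifiedParams(options) // "kafka." prefix removed
  val startingStreamOffsets = getKafkaOffsetRangeLimit(
    options, "startingtimestamp", "startingoffsetsbytimestamp", "startingoffsets",
    LatestOffsetRangeLimit)                        // LatestOffsetRangeLimit is the default
  val kafkaOffsetReader = new KafkaOffsetReader(
    strategy(options),
    kafkaParamsForDriver(specifiedKafkaParams),
    options,
    driverGroupIdPrefix = s"$uniqueGroupId-driver")
  new KafkaMicroBatchStream(
    kafkaOffsetReader,
    kafkaParamsForExecutors(specifiedKafkaParams, uniqueGroupId),
    options,
    checkpointLocation,
    startingStreamOffsets,
    failOnDataLoss(options))
}
```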
## supportedCustomMetrics
```scala
supportedCustomMetrics(): Array[CustomMetric]
```
supportedCustomMetrics is part of the Scan (Spark SQL) abstraction.
supportedCustomMetrics returns the following metrics:

* dataLoss
* offsetOutOfRange
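A sketch of what this can look like (the metric class bodies and descriptions below are illustrative; CustomSumMetric is Spark SQL's summing base class for custom metrics):

```scala
import org.apache.spark.sql.connector.metric.{CustomMetric, CustomSumMetric}

// Illustrative metric classes named after the two metrics above.
class DataLossMetric extends CustomSumMetric {
  override def name(): String = "dataLoss"
  override def description(): String = "number of data loss errors"
}

class OffsetOutOfRangeMetric extends CustomSumMetric {
  override def name(): String = "offsetOutOfRange"
  override def description(): String = "number of offsets out of range"
}

// What supportedCustomMetrics returns, per the list above.
def supportedCustomMetrics(): Array[CustomMetric] =
  Array(new OffsetOutOfRangeMetric, new DataLossMetric)
```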