Skip to content

KafkaSource

KafkaSource is a streaming source that loads data from Apache Kafka.

Creating Instance

KafkaSource takes the following to be created:

KafkaSource is created when:

Metadata Log Directory

KafkaSource uses the metadata log directory to persist offsets. The directory is the source ID under the sources directory in the checkpointRoot (of the StreamExecution).

Note

The checkpointRoot directory is one of the following:

Logging

Enable ALL logging level for org.apache.spark.sql.kafka010.KafkaSource logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.kafka010.KafkaSource=ALL

Refer to Logging.