KafkaDataConsumer

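KafkaDataConsumer is the <<contract, contract>> for <<implementations, Kafka data consumers>> that use an <<internalConsumer, InternalKafkaConsumer>> for the following:

  • <<get, Getting a single Kafka ConsumerRecord>>

  • <<getAvailableOffsetRange, Getting a single AvailableOffsetRange>>

KafkaDataConsumer has to be <<release, released>> explicitly.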
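Since a KafkaDataConsumer has to be released explicitly, one general Scala way to make sure that happens is a `try`/`finally` wrapper. The following is a minimal sketch only: `Releasable`, `RecordingConsumer` and `withConsumer` are hypothetical stand-ins for illustration and are not part of Spark.

```scala
object ReleaseSketch {
  // Hypothetical stand-in for the release() part of the contract (not Spark's trait).
  trait Releasable {
    def release(): Unit
  }

  // Hypothetical consumer that only records whether release() was called.
  final class RecordingConsumer extends Releasable {
    var released: Boolean = false
    override def release(): Unit = { released = true }
  }

  // Runs `body` with the consumer and always releases it afterwards,
  // also when `body` throws.
  def withConsumer[C <: Releasable, A](consumer: C)(body: C => A): A =
    try body(consumer)
    finally consumer.release()
}
```

With this wrapper the consumer is released whether the body returns normally or fails.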

[[contract]]
[source, scala]
----
package org.apache.spark.sql.kafka010

sealed trait KafkaDataConsumer {
  // only required properties (vals and methods) that have no implementation
  // the others follow
  def internalConsumer: InternalKafkaConsumer
  def release(): Unit
}
----

.KafkaDataConsumer Contract
[cols="1m,2",options="header",width="100%"]
|===
| Property
| Description

| internalConsumer
a| [[internalConsumer]] Used when:

  • KafkaDataConsumer is requested to <<get, get a single Kafka ConsumerRecord>> and <<getAvailableOffsetRange, get a single AvailableOffsetRange>>

  • CachedKafkaDataConsumer and NonCachedKafkaDataConsumer are requested to <<release, release>> the InternalKafkaConsumer

| release
a| [[release]] Used when:

  • KafkaSourceRDD is requested to compute a partition

  • (Spark Structured Streaming) KafkaContinuousDataReader is requested to close

|===

[[implementations]]
.KafkaDataConsumers
[cols="1,2",options="header",width="100%"]
|===
| KafkaDataConsumer
| Description

| CachedKafkaDataConsumer
| [[CachedKafkaDataConsumer]]

| NonCachedKafkaDataConsumer
| [[NonCachedKafkaDataConsumer]]
|===

NOTE: KafkaDataConsumer is a Scala sealed trait, which means that all the <<implementations, implementations>> are in the same compilation unit (a single file).
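Because every subtype of a sealed trait lives in one file, the Scala compiler can check pattern matches over it for exhaustiveness. The sketch below illustrates that with simplified stand-ins: `Consumer`, `Cached`, `NonCached` and `describe` are hypothetical names that only mirror the two roles listed above and are not Spark's actual classes.

```scala
object SealedSketch {
  // Simplified stand-ins for a sealed contract with two implementations.
  sealed trait Consumer {
    def release(): Unit
  }
  final class Cached extends Consumer {
    def release(): Unit = () // no-op in this sketch
  }
  final class NonCached extends Consumer {
    def release(): Unit = () // no-op in this sketch
  }

  // The match covers every subtype of the sealed trait; dropping a case
  // makes the compiler warn that the match may not be exhaustive.
  def describe(c: Consumer): String = c match {
    case _: Cached    => "cached"
    case _: NonCached => "non-cached"
  }
}
```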

=== [[get]] Getting Single Kafka ConsumerRecord -- get Method

[source, scala]
----
get(
  offset: Long,
  untilOffset: Long,
  pollTimeoutMs: Long,
  failOnDataLoss: Boolean): ConsumerRecord[Array[Byte], Array[Byte]]
----

get simply requests the <<internalConsumer, InternalKafkaConsumer>> to get a single Kafka ConsumerRecord.

get is used when:

  • KafkaSourceRDD is requested to compute a partition

  • (Spark Structured Streaming) KafkaContinuousDataReader is requested to next
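The delegation that get performs can be sketched as a trait method that forwards its arguments unchanged. This is an illustration under assumptions, not Spark's source: `Record`, `InternalConsumer` and `DataConsumer` are hypothetical in-memory stand-ins for ConsumerRecord, InternalKafkaConsumer and KafkaDataConsumer.

```scala
object DelegationSketch {
  // Hypothetical stand-in for ConsumerRecord[Array[Byte], Array[Byte]].
  final case class Record(offset: Long, value: String)

  // Hypothetical stand-in for InternalKafkaConsumer, backed by an in-memory map.
  final class InternalConsumer(records: Map[Long, Record]) {
    def get(
        offset: Long,
        untilOffset: Long,
        pollTimeoutMs: Long,
        failOnDataLoss: Boolean): Record =
      records(offset) // the sketch ignores the remaining parameters
  }

  // Stand-in for the contract: get forwards all four arguments as they are.
  trait DataConsumer {
    def internalConsumer: InternalConsumer

    def get(
        offset: Long,
        untilOffset: Long,
        pollTimeoutMs: Long,
        failOnDataLoss: Boolean): Record =
      internalConsumer.get(offset, untilOffset, pollTimeoutMs, failOnDataLoss)
  }
}
```

An implementation then only has to supply `internalConsumer`; callers keep using `get` on the outer type.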

=== [[getAvailableOffsetRange]] Getting Single AvailableOffsetRange -- getAvailableOffsetRange Method

[source, scala]
----
getAvailableOffsetRange(): AvailableOffsetRange
----

getAvailableOffsetRange simply requests the InternalKafkaConsumer to get a single AvailableOffsetRange.

getAvailableOffsetRange is used when: