Skip to content

Partitioner

Partitioner is an abstraction of partitioners that define how the elements in a key-value pair RDD are partitioned by key.

Partitioner maps keys to partition IDs (from 0 to numPartitions exclusive).

Partitioner ensures that records with the same key are in the same partition.

Partitioner is a Java Serializable.

Contract

Partition for Key

getPartition(
  key: Any): Int

Partition ID for the given key

Number of Partitions

numPartitions: Int

Implementations