Partitioner¶
Partitioner is an abstraction of partitioners that define how the elements in a key-value pair RDD are partitioned by key.
Partitioner maps keys to partition IDs (from 0 to numPartitions exclusive).
Partitioner ensures that records with the same key are in the same partition.
Partitioner is a Java Serializable.
Contract¶
Partition for Key¶
getPartition(
key: Any): Int
Partition ID for the given key
Number of Partitions¶
numPartitions: Int