Partitioning

Partitioning is an abstraction of output data partitioning requirements (data distribution) of a Spark SQL connector.

Note

This Partitioning interface for Spark SQL developers mimics the internal Catalyst Partitioning, into which it is converted with the help of DataSourcePartitioning.

Contract

Number of Partitions

int numPartitions()

Used when:

Satisfying Distribution

boolean satisfy(
  Distribution distribution)

Used when:

Implementations

  • KeyGroupedPartitioning
  • UnknownPartitioning
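
The two contract methods can be illustrated with a minimal, Spark-free sketch. Every type below (Distribution, Partitioning, ClusteredBy, HashedBy, NoGuarantee) is a hypothetical stand-in declared only for this example. None of them are the Spark classes, and the rule used in satisfy is one plausible reading of "satisfying a distribution": a layout hashed by a set of columns co-locates rows for any clustering requirement that includes all of those columns.

```java
// Hypothetical stand-in for the distribution a partitioning is checked against.
interface Distribution {}

// Stand-in mirroring the two contract methods described on this page.
interface Partitioning {
  int numPartitions();
  boolean satisfy(Distribution distribution);
}

// Illustrative requirement: rows with equal values in `columns` must be co-located.
record ClusteredBy(java.util.List<String> columns) implements Distribution {}

// Illustrative partitioning: output is hash-partitioned by `columns`.
// The record component `numPartitions` supplies the numPartitions() accessor.
record HashedBy(java.util.List<String> columns, int numPartitions) implements Partitioning {
  @Override
  public boolean satisfy(Distribution distribution) {
    // Co-location holds when every partitioning column is also a clustering column.
    return distribution instanceof ClusteredBy c
        && !columns.isEmpty()
        && c.columns().containsAll(columns);
  }
}

// Illustrative partitioning that promises nothing about the data layout.
record NoGuarantee(int numPartitions) implements Partitioning {
  @Override
  public boolean satisfy(Distribution distribution) {
    return false;
  }
}

final class PartitioningSketch {
  public static void main(String[] args) {
    Distribution required = new ClusteredBy(java.util.List.of("country", "city"));
    Partitioning byCountry = new HashedBy(java.util.List.of("country"), 8);
    Partitioning unknown = new NoGuarantee(8);

    System.out.println(byCountry.numPartitions());     // 8
    System.out.println(byCountry.satisfy(required));   // true
    System.out.println(unknown.satisfy(required));     // false
  }
}
```

In this sketch a partitioning that cannot vouch for its layout simply answers false, which leaves the caller free to repartition the data itself.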