
ParallelCollectionRDD

ParallelCollectionRDD is an RDD of a collection of elements with numSlices partitions and optional locationPrefs.

ParallelCollectionRDD is the RDD that the SparkContext.parallelize and SparkContext.makeRDD methods create.

The data collection is split into numSlices slices.

Each slice is represented by a ParallelCollectionPartition.
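
The splitting of a collection into numSlices slices can be illustrated with a minimal sketch in plain Scala, with no Spark dependency. It assumes the slice boundaries are computed by integer division of the element positions, so that slice sizes differ by at most one element. The names SliceSketch, sliceBounds and slice are hypothetical helpers for illustration and not part of the Spark API.

```scala
// Sketch only: SliceSketch, sliceBounds and slice are hypothetical names,
// not Spark classes or methods.
object SliceSketch {

  // Half-open [start, end) index ranges, one per slice.
  // Assumption: boundaries come from integer division, so sizes differ by at most 1.
  def sliceBounds(length: Int, numSlices: Int): Seq[(Int, Int)] = {
    require(numSlices >= 1, "numSlices must be positive")
    (0 until numSlices).map { i =>
      // Use Long arithmetic to avoid overflow for large collections.
      val start = ((i * length.toLong) / numSlices).toInt
      val end = (((i + 1) * length.toLong) / numSlices).toInt
      (start, end)
    }
  }

  // Cut the collection into numSlices contiguous slices.
  def slice[T](data: Seq[T], numSlices: Int): Seq[Seq[T]] =
    sliceBounds(data.length, numSlices).map { case (start, end) =>
      data.slice(start, end)
    }

  def main(args: Array[String]): Unit = {
    // Print every slice of a 10-element collection split three ways.
    slice(1 to 10, 3).zipWithIndex.foreach { case (part, idx) =>
      println(s"slice $idx: ${part.mkString(", ")}")
    }
  }
}
```

In Spark, each such slice would become the data of one partition. With a running SparkContext, sc.parallelize(1 to 10, 3).glom().collect() is one way to inspect the actual per-partition contents.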


Last update: 2020-10-14