Skip to content

ParallelCollectionRDD

ParallelCollectionRDD is an RDD of a collection of elements with numSlices partitions and optional locationPrefs.

ParallelCollectionRDD is the result of SparkContext.parallelize and SparkContext.makeRDD methods.

The data collection is split on to numSlices slices.

It uses ParallelCollectionPartition.