ParallelCollectionRDD¶
ParallelCollectionRDD
is an RDD of a collection of elements with numSlices
partitions and optional locationPrefs
.
ParallelCollectionRDD
is the result of SparkContext.parallelize
and SparkContext.makeRDD
methods.
The data collection is split on to numSlices
slices.
It uses ParallelCollectionPartition
.