LocalRDDCheckpointData¶
LocalRDDCheckpointData is an RDDCheckpointData.
Creating Instance¶
LocalRDDCheckpointData takes an RDD to be created.
LocalRDDCheckpointData is created when an RDD is requested to localCheckpoint.
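As a minimal sketch of the trigger from user code, the following marks an RDD for local checkpointing and then runs an action. It assumes Spark (spark-core) is on the classpath; the object name `LocalCheckpointDemo`, the method `run`, and the application name are illustrative, not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative demo object (hypothetical name), assuming spark-core is available.
object LocalCheckpointDemo {
  // Returns RDD.isCheckpointed as observed before and after the first action.
  def run(): (Boolean, Boolean) = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("local-checkpoint-demo")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 100, numSlices = 4).map(_ * 2)
      rdd.localCheckpoint()          // marks the RDD for local checkpointing
      val before = rdd.isCheckpointed
      rdd.count()                    // an action, so a Spark job runs over the RDD
      val after = rdd.isCheckpointed
      (before, after)
    } finally sc.stop()
  }
}
```

Comparing the two returned flags is a simple way to observe that marking an RDD and materializing its checkpoint are separate steps.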
doCheckpoint¶
doCheckpoint(): CheckpointRDD[T]
doCheckpoint is part of the RDDCheckpointData abstraction.
doCheckpoint creates a LocalCheckpointRDD with the RDD.
doCheckpoint triggers caching of any missing partitions, that is, partitions whose RDDBlockIds are not available in the BlockManagerMaster.
Extra Spark Job
If there are any missing partitions (RDDBlockIds), doCheckpoint requests the SparkContext to run a Spark job with the RDD and the missing partitions.
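The missing-partition check can be modelled without Spark as a filter over partition indices. This is a simplified sketch, not Spark's code: `BlockId`, `MissingPartitions`, and the `cached` set (standing in for the BlockManagerMaster lookup) are hypothetical names introduced for illustration.

```scala
// Hypothetical stand-in for Spark's RDDBlockId (RDD id plus partition index).
final case class BlockId(rddId: Int, partition: Int)

object MissingPartitions {
  // Returns the partition indices whose blocks are not in `cached`.
  // `cached` models the set of blocks the BlockManagerMaster reports as available.
  def missingPartitions(rddId: Int, numPartitions: Int, cached: Set[BlockId]): Seq[Int] =
    (0 until numPartitions).filterNot(i => cached.contains(BlockId(rddId, i)))
}
```

For example, `MissingPartitions.missingPartitions(7, 4, Set(BlockId(7, 0), BlockId(7, 2)))` yields partitions 1 and 3. In the description above, a non-empty result is what leads to the extra Spark job over just those partitions.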
doCheckpoint makes sure that the StorageLevel of the RDD uses disk (among other persistence storages). If not, doCheckpoint throws an AssertionError:
Storage level [level] is not appropriate for local checkpointing
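The disk requirement can be illustrated with a small stand-alone guard. This sketch assumes the check is expressed with Scala's `Predef.assume`, which throws a `java.lang.AssertionError` when the condition is false; `Level` and `LocalCheckpointGuard` are hypothetical stand-ins, not Spark's StorageLevel or its actual code.

```scala
// Hypothetical, simplified stand-in for Spark's StorageLevel.
final case class Level(useDisk: Boolean, useMemory: Boolean)

object LocalCheckpointGuard {
  // Throws java.lang.AssertionError when the level does not use disk
  // (unless assertions are elided at compile time).
  def requireDisk(level: Level): Unit =
    assume(level.useDisk, s"Storage level $level is not appropriate for local checkpointing")
}
```

Calling `LocalCheckpointGuard.requireDisk(Level(useDisk = true, useMemory = true))` returns normally, while a memory-only level such as `Level(useDisk = false, useMemory = true)` fails the guard.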