BlockStoreShuffleReader¶
BlockStoreShuffleReader[K, C] is a ShuffleReader of K keys and C values.
Creating Instance¶
BlockStoreShuffleReader takes the following to be created:
- BaseShuffleHandle
- Blocks by Address (
Iterator[(BlockManagerId, Seq[(BlockId, Long, Int)])]) - TaskContext
- ShuffleReadMetricsReporter
- SerializerManager
- BlockManager
- MapOutputTracker
-
shouldBatchFetchflag (default:false)
BlockStoreShuffleReader is created when:
SortShuffleManageris requested for a ShuffleReader (for aShuffleHandleand a range of reduce partitions)
Reading Combined Records (for Reduce Task)¶
read creates a ShuffleBlockFetcherIterator.
read...FIXME
fetchContinuousBlocksInBatch¶
fetchContinuousBlocksInBatch: Boolean
fetchContinuousBlocksInBatch reads the following configuration properties to determine whether continuous shuffle block fetching could be used or not:
- spark.io.encryption.enabled
- spark.shuffle.compress
- spark.shuffle.useOldFetchProtocol
- supportsRelocationOfSerializedObjects (of the Serializer of the ShuffleDependency of this BaseShuffleHandle)
fetchContinuousBlocksInBatch prints out the following DEBUG message when continuous shuffle block fetching is requested yet not satisfied by the configuration:
The feature tag of continuous shuffle block fetching is set to true, but
we can not enable the feature because other conditions are not satisfied.
Shuffle compress: [compressed], serializer relocatable: [serializerRelocatable],
codec concatenation: [codecConcatenation], use old shuffle fetch protocol:
[useOldFetchProtocol], io encryption: [ioEncryption].