ShuffleInMemorySorter is used by ShuffleExternalSorter to sort pointers of key-value records and partition IDs using radix or tim sort algorithms.

Figure 1. ShuffleInMemorySorter and ShuffleExternalSorter

Creating Instance

ShuffleInMemorySorter takes the following to be created:

ShuffleInMemorySorter requests the given MemoryConsumer to allocate an array of the given initial size for the Unsafe LongArray of Record Pointers and Partition IDs.

ShuffleInMemorySorter is created for a ShuffleExternalSorter.

Iterator of Records Sorted

ShuffleSorterIterator getSortedIterator()


getSortedIterator is used when ShuffleExternalSorter is requested to writeSortedFile.


void reset()


reset is used when…​FIXME

numRecords Method

int numRecords()


numRecords is used when…​FIXME

Calculating Usable Capacity

int getUsableCapacity()

getUsableCapacity calculates the capacity that is a half or two-third of the memory used for the LongArray.

getUsableCapacity is used when…​FIXME


Enable ALL logging level for org.apache.spark.shuffle.sort.ShuffleExternalSorter logger to see what happens inside.

Add the following line to conf/

Refer to Logging.

Internal Properties

Unsafe LongArray of Record Pointers and Partition IDs

ShuffleInMemorySorter uses a LongArray.

Usable Capacity