UnsafeExternalRowSorter¶
UnsafeExternalRowSorter is a facade of UnsafeExternalSorter (Spark Core) to allow sorting InternalRows.
Creating Instance¶
UnsafeExternalRowSorter takes the following to be created:
- Output Schema
-
RecordComparatorSupplier -
PrefixComparator -
UnsafeExternalRowSorter.PrefixComputer - Page Size (bytes)
-
canUseRadixSortflag
UnsafeExternalRowSorter is created using createWithRecordComparator and create utilities.
Spilling¶
UnsafeExternalRowSorter (UnsafeExternalSorter actually) may be forced to spill in-memory data when the number of elements reaches spark.shuffle.spill.numElementsForceSpillThreshold (Spark Core).
UnsafeExternalSorter¶
UnsafeExternalRowSorter creates an UnsafeExternalSorter (Spark Core) when created.
Creating UnsafeExternalRowSorter¶
UnsafeExternalRowSorter create(
StructType schema,
Ordering<InternalRow> ordering,
PrefixComparator prefixComparator,
UnsafeExternalRowSorter.PrefixComputer prefixComputer,
long pageSizeBytes,
boolean canUseRadixSort)
create creates an UnsafeExternalRowSorter (with a new RowComparator when requested).
create is used when:
SortExecphysical operator is requested to create one
Sorting¶
Iterator<InternalRow> sort()
Iterator<InternalRow> sort(
Iterator<UnsafeRow> inputIterator) // (1)!
- Inserts all the input rows and calls no-argument
sort
sort...FIXME
sort is used when:
DynamicPartitionDataConcurrentWriteris requested to writeWithIteratorShuffleExchangeExecphysical operator is requested to prepareShuffleDependencySortExecphysical operator is executed