UnsafeExternalRowSorter¶
UnsafeExternalRowSorter is a facade of UnsafeExternalSorter (Spark Core) to allow sorting InternalRows.
Creating Instance¶
UnsafeExternalRowSorter takes the following to be created:
- Output Schema
- RecordComparator Supplier
- PrefixComparator
- UnsafeExternalRowSorter.PrefixComputer
- Page Size (bytes)
- canUseRadixSort flag
UnsafeExternalRowSorter is created using the createWithRecordComparator and create utilities.
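The roles of PrefixComputer, PrefixComparator and RecordComparator can be illustrated with a small stand-alone sketch. It assumes the usual prefix-sort arrangement: a cheap fixed-width prefix is compared first, and the full record comparison only breaks ties. Everything here (the PrefixSortSketch class, computePrefix, and the use of plain String values as rows) is hypothetical and is not Spark's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy illustration only: Strings stand in for rows; none of these names exist in Spark.
final class PrefixSortSketch {

    // Hypothetical prefix computer: packs the first two UTF-16 code units into a long.
    static long computePrefix(String row) {
        long c0 = row.length() > 0 ? row.charAt(0) : 0;
        long c1 = row.length() > 1 ? row.charAt(1) : 0;
        return (c0 << 16) | c1;
    }

    // Compares the cheap prefixes first; the full comparison runs only on a prefix tie.
    static Comparator<String> prefixThenRecord() {
        return (a, b) -> {
            int byPrefix = Long.compare(computePrefix(a), computePrefix(b));
            if (byPrefix != 0) {
                return byPrefix;
            }
            return a.compareTo(b); // stand-in for the record comparator
        };
    }

    // Returns a sorted copy and leaves the input untouched.
    static List<String> sort(List<String> rows) {
        List<String> copy = new ArrayList<>(rows);
        copy.sort(prefixThenRecord());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList("spark", "sql", "sort", "spill", "scala")));
    }
}
```

The prefix must be order-preserving with respect to the full comparison, otherwise the two stages would disagree. Here that holds because the prefix is built from the leading characters of the key.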
Spilling¶
UnsafeExternalRowSorter (or, more precisely, the underlying UnsafeExternalSorter) may be forced to spill in-memory data when the number of elements reaches spark.shuffle.spill.numElementsForceSpillThreshold (Spark Core).
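A threshold-based forced spill can be sketched as follows. This is a simplified model and not Spark's implementation: integers stand in for records, in-memory lists stand in for spill files, and the ForcedSpillSketch class with its forceSpillThreshold parameter is a hypothetical name chosen to echo the configuration property above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: lists stand in for spill files; none of these names exist in Spark.
final class ForcedSpillSketch {
    private final int forceSpillThreshold; // hypothetical stand-in for the threshold setting
    private final List<Integer> inMemory = new ArrayList<>();
    private final List<List<Integer>> spilledRuns = new ArrayList<>();

    ForcedSpillSketch(int forceSpillThreshold) {
        this.forceSpillThreshold = forceSpillThreshold;
    }

    // Spills first if the buffer has already reached the threshold, then buffers the record.
    void insert(int record) {
        if (inMemory.size() >= forceSpillThreshold) {
            spill();
        }
        inMemory.add(record);
    }

    // Sorts the buffered records into a run and empties the buffer.
    private void spill() {
        List<Integer> run = new ArrayList<>(inMemory);
        Collections.sort(run);
        spilledRuns.add(run);
        inMemory.clear();
    }

    int spillCount() {
        return spilledRuns.size();
    }

    // Merges all spilled runs with whatever is still buffered in memory.
    List<Integer> sort() {
        List<List<Integer>> runs = new ArrayList<>(spilledRuns);
        List<Integer> last = new ArrayList<>(inMemory);
        Collections.sort(last);
        runs.add(last);
        // Heap entries are {value, runIndex, positionInRun}.
        PriorityQueue<int[]> heads = new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heads.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            out.add(head[0]);
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heads.add(new int[] {run.get(next), head[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ForcedSpillSketch sorter = new ForcedSpillSketch(3);
        for (int v : new int[] {5, 3, 9, 1, 8, 2, 7}) {
            sorter.insert(v);
        }
        System.out.println(sorter.spillCount() + " spills, sorted: " + sorter.sort());
    }
}
```

A low threshold bounds the number of records held in memory at the cost of more runs to merge at the end.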
UnsafeExternalSorter¶
UnsafeExternalRowSorter creates an UnsafeExternalSorter (Spark Core) when created.
Creating UnsafeExternalRowSorter¶
UnsafeExternalRowSorter create(
  StructType schema,
  Ordering<InternalRow> ordering,
  PrefixComparator prefixComparator,
  UnsafeExternalRowSorter.PrefixComputer prefixComputer,
  long pageSizeBytes,
  boolean canUseRadixSort)
create creates an UnsafeExternalRowSorter (with a new RowComparator when requested).
create is used when:
- SortExec physical operator is requested to create one
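The "new RowComparator when requested" part suggests a supplier that defers building the comparator until it is actually needed. A minimal sketch of that pattern, assuming a plain java.util.function.Supplier and using hypothetical names (ComparatorSupplierSketch, RowComparatorSketch) with Strings in place of rows:

```java
import java.util.Comparator;
import java.util.function.Supplier;

// Toy illustration only: none of these names exist in Spark.
final class ComparatorSupplierSketch {
    // Counts constructed comparators so the laziness is observable.
    static int created = 0;

    // Hypothetical stand-in for RowComparator: adapts an ordering to a comparator object.
    static final class RowComparatorSketch implements Comparator<String> {
        private final Comparator<String> ordering;

        RowComparatorSketch(Comparator<String> ordering) {
            this.ordering = ordering;
            created++;
        }

        @Override
        public int compare(String a, String b) {
            return ordering.compare(a, b);
        }
    }

    // Wraps the ordering in a supplier; no comparator is built until get() is called.
    static Supplier<Comparator<String>> create(Comparator<String> ordering) {
        return () -> new RowComparatorSketch(ordering);
    }

    public static void main(String[] args) {
        Supplier<Comparator<String>> supplier = create(Comparator.<String>naturalOrder());
        System.out.println("before get(): " + created);
        supplier.get();
        System.out.println("after get(): " + created);
    }
}
```

Each call to get() yields a fresh comparator, which is useful when a comparator keeps per-use state and must not be shared.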
Sorting¶
Iterator<InternalRow> sort()
Iterator<InternalRow> sort(
Iterator<UnsafeRow> inputIterator) // (1)!
- Inserts all the input rows and calls the no-argument sort
sort ...FIXME
sort is used when:
- DynamicPartitionDataConcurrentWriter is requested to writeWithIterator
- ShuffleExchangeExec physical operator is requested to prepareShuffleDependency
- SortExec physical operator is executed
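The relationship between the two sort variants (the one-argument variant inserts every input row and then delegates to the no-argument one) can be sketched with a toy facade. The RowSorterFacadeSketch class and its insertRow method are illustrative assumptions, and a plain list stands in for the underlying sorter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Toy facade only: a list stands in for the underlying sorter; names are hypothetical.
final class RowSorterFacadeSketch<R> {
    private final List<R> buffer = new ArrayList<>();
    private final Comparator<R> ordering;

    RowSorterFacadeSketch(Comparator<R> ordering) {
        this.ordering = ordering;
    }

    // Hands a single row to the underlying sorter (here: appends to the buffer).
    void insertRow(R row) {
        buffer.add(row);
    }

    // Sorts everything inserted so far and returns an iterator over the result.
    Iterator<R> sort() {
        buffer.sort(ordering);
        return buffer.iterator();
    }

    // Inserts all the input rows, then delegates to the no-argument sort.
    Iterator<R> sort(Iterator<R> inputIterator) {
        while (inputIterator.hasNext()) {
            insertRow(inputIterator.next());
        }
        return sort();
    }

    public static void main(String[] args) {
        RowSorterFacadeSketch<Integer> sorter =
            new RowSorterFacadeSketch<>(Comparator.<Integer>naturalOrder());
        Iterator<Integer> sorted = sorter.sort(Arrays.asList(3, 1, 2).iterator());
        while (sorted.hasNext()) {
            System.out.println(sorted.next());
        }
    }
}
```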