UnsafeExternalRowSorter

UnsafeExternalRowSorter is a facade over UnsafeExternalSorter (Spark Core) that sorts InternalRows.

Creating Instance

UnsafeExternalRowSorter takes the following to be created:

  • Output Schema
  • RecordComparator Supplier
  • PrefixComparator
  • UnsafeExternalRowSorter.PrefixComputer
  • Page Size (bytes)
  • canUseRadixSort flag

UnsafeExternalRowSorter is created using the createWithRecordComparator and create utilities.

Spilling

UnsafeExternalRowSorter (strictly speaking, the underlying UnsafeExternalSorter) may be forced to spill in-memory data once the number of elements reaches spark.shuffle.spill.numElementsForceSpillThreshold (Spark Core).
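
To make the forced-spill idea concrete, here is a minimal, self-contained sketch in plain Java. It is a toy model, not Spark's UnsafeExternalSorter: the class ToySpillingSorter, its methods, and the in-memory "spill" lists are all hypothetical, and forceSpillThreshold merely stands in for spark.shuffle.spill.numElementsForceSpillThreshold. A real sorter would write each run to disk instead of keeping it in a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy illustration only: NOT Spark's UnsafeExternalSorter.
final class ToySpillingSorter {
    // Hypothetical stand-in for spark.shuffle.spill.numElementsForceSpillThreshold.
    private final int forceSpillThreshold;
    private final List<Long> inMemory = new ArrayList<>();
    // Each spill is a sorted run; a real sorter would write runs to disk.
    private final List<List<Long>> spills = new ArrayList<>();

    ToySpillingSorter(int forceSpillThreshold) {
        this.forceSpillThreshold = forceSpillThreshold;
    }

    void insert(long record) {
        // Force a spill once the buffer already holds threshold-many records.
        if (inMemory.size() >= forceSpillThreshold) {
            spill();
        }
        inMemory.add(record);
    }

    private void spill() {
        List<Long> run = new ArrayList<>(inMemory);
        Collections.sort(run);
        spills.add(run);
        inMemory.clear();
    }

    int spillCount() {
        return spills.size();
    }

    List<Long> sorted() {
        // The records still in memory form one last sorted run.
        List<List<Long>> runs = new ArrayList<>(spills);
        List<Long> last = new ArrayList<>(inMemory);
        Collections.sort(last);
        runs.add(last);

        // k-way merge; each heap entry is {value, run index, position in run}.
        PriorityQueue<long[]> heap =
            new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new long[] {runs.get(i).get(0), i, 0});
            }
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            long[] head = heap.poll();
            out.add(head[0]);
            int run = (int) head[1];
            int next = (int) head[2] + 1;
            if (next < runs.get(run).size()) {
                heap.add(new long[] {runs.get(run).get(next), run, next});
            }
        }
        return out;
    }
}
```

With a threshold of 3, inserting seven records triggers two forced spills, and sorted() merges the two spilled runs with the records left in memory.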

UnsafeExternalSorter

UnsafeExternalRowSorter creates an UnsafeExternalSorter (Spark Core) when created.

Creating UnsafeExternalRowSorter

UnsafeExternalRowSorter create(
  StructType schema,
  Ordering<InternalRow> ordering,
  PrefixComparator prefixComparator,
  UnsafeExternalRowSorter.PrefixComputer prefixComputer,
  long pageSizeBytes,
  boolean canUseRadixSort)

create creates an UnsafeExternalRowSorter whose RecordComparator supplier builds a new RowComparator (for the given ordering) when requested.
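
The "new comparator when requested" behaviour can be sketched with a plain java.util.function.Supplier. The snippet below is an illustration under assumptions, not Spark code: LazyComparatorDemo, comparatorSupplier, and the created counter are hypothetical names, and java.util.Comparator stands in for both Ordering and RecordComparator.

```java
import java.util.Comparator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Illustration only: plain-Java stand-ins, not Spark classes.
final class LazyComparatorDemo {
    // Counts how many comparators have been built (for demonstration only).
    static final AtomicInteger created = new AtomicInteger();

    // Hypothetical helper: wraps an ordering in a supplier so that a fresh
    // comparator is built on each request rather than up front.
    static <T> Supplier<Comparator<T>> comparatorSupplier(Comparator<T> ordering) {
        return () -> {
            created.incrementAndGet();
            return (a, b) -> ordering.compare(a, b);
        };
    }
}
```

Creating the supplier builds nothing; a comparator only comes into existence when get() is called, so the consumer of the supplier decides when (and how many times) to pay for one.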


create is used when:

  • SortExec physical operator is requested to create an UnsafeExternalRowSorter

Sorting

Iterator<InternalRow> sort()
Iterator<InternalRow> sort(
  Iterator<UnsafeRow> inputIterator) // Inserts all the input rows and calls the no-argument sort
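
The relationship between the two sort variants can be sketched in plain Java. This is a toy stand-in, not the Spark class: ToyRowSorter, its insertRow method, and the list-backed buffer are hypothetical, and a java.util.Comparator replaces the real comparator machinery.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Toy stand-in for a row sorter; NOT the Spark class.
final class ToyRowSorter<T> {
    private final Comparator<T> ordering;
    private final List<T> buffer = new ArrayList<>();

    ToyRowSorter(Comparator<T> ordering) {
        this.ordering = ordering;
    }

    // Hypothetical name: buffers a single row for later sorting.
    void insertRow(T row) {
        buffer.add(row);
    }

    // No-argument variant: sorts whatever has been inserted so far.
    Iterator<T> sort() {
        buffer.sort(ordering);
        return buffer.iterator();
    }

    // Iterator variant: inserts every input row, then delegates to sort().
    Iterator<T> sort(Iterator<T> input) {
        while (input.hasNext()) {
            insertRow(input.next());
        }
        return sort();
    }
}
```

Keeping the iterator variant as a thin wrapper means all sorting logic lives in one place, and callers that insert rows themselves can use the no-argument variant directly.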

sort...FIXME


sort is used when: