BytesToBytesMap

BytesToBytesMap is a memory consumer that supports spilling.

Spark SQL

BytesToBytesMap is used in Spark SQL only in the following:

Creating Instance

BytesToBytesMap takes the following to be created:

BytesToBytesMap is created when:

  • UnsafeFixedWidthAggregationMap (Spark SQL) is created
  • UnsafeHashedRelation (Spark SQL) is created

Destructive MapIterator

MapIterator destructiveIterator

BytesToBytesMap defines a reference to a "destructive" MapIterator (if ever created for UnsafeFixedWidthAggregationMap (Spark SQL)).

The destructiveIterator reference can be in one of two states:

  • Undefined (null) initially when BytesToBytesMap is created
  • The MapIterator if created

Creating Destructive MapIterator

MapIterator destructiveIterator()

destructiveIterator calls updatePeakMemoryUsed and then creates a MapIterator with the following:

  • numValues for the number of records
  • A new Location
  • Destructive flag enabled (true)

destructiveIterator is used when:

  • UnsafeFixedWidthAggregationMap (Spark SQL) is created

Spilling

long spill(
  long size,
  MemoryConsumer trigger)

spill is part of the MemoryConsumer abstraction.


spill requests the destructive MapIterator to spill (the given size bytes) only when both of the following hold: the given trigger MemoryConsumer is not this BytesToBytesMap, and the destructive MapIterator has been created.

Otherwise (the trigger is this BytesToBytesMap or there is no destructiveIterator in use), spill returns 0. When the destructiveIterator does spill, spill returns the number of bytes it managed to release.
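The decision described above can be sketched in a few lines of Java. This is a minimal sketch, not Spark's source: ToyMap and ToyDestructiveIterator are hypothetical stand-ins for BytesToBytesMap and MapIterator, and the way the toy iterator releases bytes is an assumption made only so that the example runs.

```java
class SpillSketch {
  /** Hypothetical stand-in for the destructive MapIterator (not Spark's class). */
  static class ToyDestructiveIterator {
    private long spillable; // bytes this toy iterator could still release

    ToyDestructiveIterator(long spillable) {
      this.spillable = spillable;
    }

    /** Releases up to `size` bytes and returns the number of bytes released. */
    long spill(long size) {
      long released = Math.min(size, spillable);
      spillable -= released;
      return released;
    }
  }

  /** Hypothetical stand-in for BytesToBytesMap that keeps only the spill decision. */
  static class ToyMap {
    // Undefined (null) until a destructive iterator is created
    ToyDestructiveIterator destructiveIterator;

    long spill(long size, Object trigger) {
      // Spill only for a foreign trigger and only if the destructive iterator exists
      if (trigger != this && destructiveIterator != null) {
        return destructiveIterator.spill(size);
      }
      return 0L;
    }
  }

  public static void main(String[] args) {
    ToyMap map = new ToyMap();
    Object otherConsumer = new Object();
    System.out.println(map.spill(100L, otherConsumer)); // no destructive iterator yet
    map.destructiveIterator = new ToyDestructiveIterator(64L);
    System.out.println(map.spill(100L, map));           // trigger is the map itself
    System.out.println(map.spill(100L, otherConsumer)); // foreign trigger, iterator exists
  }
}
```

The two guard conditions are checked together, so a map that triggers its own spill never asks its destructive iterator to release memory.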

numValues

numValues registry is 0 after reset.

numValues is incremented when Location is requested to append.

numValues can never exceed the maximum capacity of this BytesToBytesMap or growthThreshold.

Maximum Capacity

BytesToBytesMap supports up to 1 << 29 keys.

When created, BytesToBytesMap makes sure that the given initialCapacity does not exceed this maximum capacity.
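As an illustration of the limit, 1 << 29 is 536,870,912 keys. The following is a minimal sketch of such a creation-time check. The names CapacityCheckSketch and checkInitialCapacity are hypothetical, and rejecting an oversized value with an IllegalArgumentException is an assumption of this sketch rather than something this page states.

```java
class CapacityCheckSketch {
  // Maximum number of keys, as stated above: 1 << 29 = 536,870,912
  static final int MAX_CAPACITY = 1 << 29;

  /**
   * Returns the given capacity if it does not exceed MAX_CAPACITY.
   * Throwing IllegalArgumentException otherwise is an assumption of this sketch.
   */
  static int checkInitialCapacity(int initialCapacity) {
    if (initialCapacity > MAX_CAPACITY) {
      throw new IllegalArgumentException(
          "Initial capacity " + initialCapacity
              + " exceeds maximum capacity of " + MAX_CAPACITY);
    }
    return initialCapacity;
  }

  public static void main(String[] args) {
    System.out.println(MAX_CAPACITY);
    System.out.println(checkInitialCapacity(64));
  }
}
```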

Allocating Memory

void allocate(
  int capacity)

allocate...FIXME


allocate is used when:

Growing Memory And Rehashing

void growAndRehash()

growAndRehash...FIXME


growAndRehash is used when:

  • Location is requested to append (a new value for a key)
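The general idea of growing and rehashing on append can be shown with a toy open-addressing table. This sketch is not Spark's implementation of growAndRehash: the class GrowAndRehashSketch, the on-heap arrays, the load factor of 0.5, the doubling policy, and the trigger condition (key count reaching growthThreshold) are all assumptions chosen for illustration.

```java
class GrowAndRehashSketch {
  private long[] keys;         // slot contents
  private boolean[] used;      // which slots are occupied
  private int numKeys = 0;
  private int growthThreshold; // assumed: half of the current capacity

  GrowAndRehashSketch(int capacity) {
    if (capacity < 1 || Integer.bitCount(capacity) != 1) {
      throw new IllegalArgumentException("capacity must be a power of two");
    }
    allocate(capacity);
  }

  private void allocate(int capacity) {
    keys = new long[capacity];
    used = new boolean[capacity];
    growthThreshold = capacity / 2; // assumed load factor of 0.5
  }

  /** Linear probing: returns the slot holding the key or the first free slot. */
  private int slotFor(long key) {
    int mask = keys.length - 1;
    int pos = Long.hashCode(key) & mask;
    while (used[pos] && keys[pos] != key) {
      pos = (pos + 1) & mask;
    }
    return pos;
  }

  void append(long key) {
    int pos = slotFor(key);
    if (used[pos]) {
      return; // key already present
    }
    keys[pos] = key;
    used[pos] = true;
    numKeys++;
    if (numKeys >= growthThreshold) {
      growAndRehash(); // assumed trigger condition
    }
  }

  /** Doubles the table and re-inserts every existing key into its new slot. */
  private void growAndRehash() {
    long[] oldKeys = keys;
    boolean[] oldUsed = used;
    allocate(oldKeys.length * 2);
    for (int i = 0; i < oldKeys.length; i++) {
      if (oldUsed[i]) {
        int pos = slotFor(oldKeys[i]);
        keys[pos] = oldKeys[i];
        used[pos] = true;
      }
    }
  }

  boolean contains(long key) { return used[slotFor(key)]; }
  int capacity() { return keys.length; }
  int size() { return numKeys; }

  public static void main(String[] args) {
    GrowAndRehashSketch table = new GrowAndRehashSketch(4);
    for (long k = 1; k <= 5; k++) {
      table.append(k);
    }
    System.out.println(table.size() + " keys in " + table.capacity() + " slots");
  }
}
```

Rehashing is needed because a key's slot depends on the table size, so every key has to be placed again after the table grows.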