Skip to content


UnsafeFixedWidthAggregationMap is a tiny layer (extension) around Spark Core's BytesToBytesMap with UnsafeRow keys and values.


boolean supportsAggregationBufferSchema(
  StructType schema)

supportsAggregationBufferSchema is enabled (true) if all of the top-level fields (of the given schema) are mutable.

import org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap
import org.apache.spark.sql.types._
val schemaWithImmutableField = StructType(
  StructField("name", StringType) :: Nil)
assert(UnsafeFixedWidthAggregationMap.supportsAggregationBufferSchema(schemaWithImmutableField) == false)
val schemaWithMutableFields = StructType(
  StructField("id", IntegerType) :: StructField("bool", BooleanType) :: Nil)

supportsAggregationBufferSchema is used when:

Review Me

Whenever requested for performance metrics (i.e. <> and <>), UnsafeFixedWidthAggregationMap simply requests the underlying <>.

UnsafeFixedWidthAggregationMap is <> when:

[[internal-registries]] .UnsafeFixedWidthAggregationMap's Internal Properties (e.g. Registries, Counters and Flags) [cols="1m,2",options="header",width="100%"] |=== | Name | Description

| currentAggregationBuffer | [[currentAggregationBuffer]] Re-used pointer (as an <> with the number of fields to match the <>) to the current aggregation buffer

Used exclusively when UnsafeFixedWidthAggregationMap is requested to <>.

| emptyAggregationBuffer | [[emptyAggregationBuffer-byte-array]] <> (encoded in UnsafeRow format)

| groupingKeyProjection | [[groupingKeyProjection]] UnsafeProjection for the <> (to encode grouping keys as UnsafeRows)

| map a| [[map]] Spark Core's BytesToBytesMap with the <>, <>, <> and performance metrics enabled |===

Creating Instance

UnsafeFixedWidthAggregationMap takes the following when created:

  • [[emptyAggregationBuffer]] Empty aggregation buffer (as an InternalRow)
  • [[aggregationBufferSchema]] Aggregation buffer schema
  • [[groupingKeySchema]] Grouping key schema
  • [[taskMemoryManager]] Spark Core's TaskMemoryManager
  • [[initialCapacity]] Initial capacity
  • [[pageSizeBytes]] Page size (in bytes)

=== [[getAggregationBufferFromUnsafeRow]] getAggregationBufferFromUnsafeRow Method

[source, scala]

UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key) // <1> UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key, int hash)

<1> Uses the hash code of the key

getAggregationBufferFromUnsafeRow requests the <> to lookup the input key (to get a BytesToBytesMap.Location).



getAggregationBufferFromUnsafeRow is used when:

  • TungstenAggregationIterator is requested to <> (exclusively when TungstenAggregationIterator is <>)

* (for testing only) UnsafeFixedWidthAggregationMap is requested to <>

=== [[getAggregationBuffer]] getAggregationBuffer Method

[source, java]

UnsafeRow getAggregationBuffer(InternalRow groupingKey)


NOTE: getAggregationBuffer seems to be used exclusively for testing.

=== [[iterator]] Getting KVIterator -- iterator Method

[source, java]

KVIterator iterator()


iterator is used when:

=== [[getPeakMemoryUsedBytes]] getPeakMemoryUsedBytes Method

[source, java]

long getPeakMemoryUsedBytes()


getPeakMemoryUsedBytes is used when:

=== [[getAverageProbesPerLookup]] getAverageProbesPerLookup Method

[source, java]

double getAverageProbesPerLookup()


getAverageProbesPerLookup is used when:

=== [[free]] free Method

[source, java]

void free()


free is used when:

=== [[destructAndCreateExternalSorter]] destructAndCreateExternalSorter Method

[source, java]

UnsafeKVExternalSorter destructAndCreateExternalSorter() throws IOException


destructAndCreateExternalSorter is used when: