ArrayFilter Expression

ArrayFilter is an ArrayBasedSimpleHigherOrderFunction with CodegenFallback that keeps only the elements of an array for which a lambda function holds.

Creating Instance

ArrayFilter takes an argument Expression (the array to filter) and a function Expression (the lambda predicate) to be created.

ArrayFilter is created for the filter standard function (Scala) and the filter SQL function.

Demo

Scala

import org.apache.spark.sql.Column
// The predicate keeps odd values (x % 2 === 1), so it is named accordingly
val odd: (Column => Column) = x => x % 2 === 1
val filter_collect = filter(collect_set("id") as "ids", odd) as "odds"
val q = spark.range(5).groupBy($"id" % 2 as "gid").agg(filter_collect)
scala> q.show
+---+------+
|gid|  odds|
+---+------+
|  0|    []|
|  1|[1, 3]|
+---+------+
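
As a mental model only, the demo above can be mirrored with plain Scala collections. This is a minimal sketch that does not use Spark and says nothing about how ArrayFilter is evaluated internally; `ArrayFilterSketch`, `filterGroups` and `isOdd` are hypothetical names introduced for illustration.

```scala
// Plain Scala collections only; no Spark required.
// A mental model of the query above, not how Spark evaluates ArrayFilter.
object ArrayFilterSketch {

  // Hypothetical helper: groups ids by id % 2, then keeps the elements
  // of each group's array that satisfy the predicate.
  def filterGroups(ids: Seq[Long], predicate: Long => Boolean): Map[Long, Seq[Long]] =
    ids
      .groupBy(_ % 2)                       // groupBy($"id" % 2)
      .map { case (gid, group) =>
        val idSet = group.distinct.sorted   // collect_set("id"); sorted only for a stable result
        gid -> idSet.filter(predicate)      // filter(ids, predicate)
      }

  // Same predicate as in the demo: true for odd values
  val isOdd: Long => Boolean = _ % 2 == 1

  // Corresponds to spark.range(5), i.e. ids 0 to 4
  val result: Map[Long, Seq[Long]] = filterGroups(0L until 5L, isOdd)
}
```

The group with gid 0 holds only even ids, so the predicate leaves it with an empty array, while the group with gid 1 keeps all of its ids. That matches the output shown above.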

SQL

spark.range(5).createOrReplaceTempView("nums")
val q = sql("""
SELECT id % 2 gid, filter(collect_set(id), x -> x % 2 == 1) odds
FROM nums
GROUP BY id % 2
""")
scala> q.show
+---+------+
|gid|  odds|
+---+------+
|  0|    []|
|  1|[1, 3]|
+---+------+