PandasGroupedOpsMixin

PandasGroupedOpsMixin is a Python mixin for the GroupedData class.

applyInPandas

applyInPandas(
  self,
  func: "PandasGroupedMapFunction",  # type alias defined below
  schema: Union[StructType, str]
) -> DataFrame

from pandas.core.frame import DataFrame as PandasDataFrame
DataFrameLike = PandasDataFrame
PandasGroupedMapFunction = Union[
  # func: pandas.DataFrame -> pandas.DataFrame
  Callable[[DataFrameLike], DataFrameLike],
  # func: (groupKey(s), pandas.DataFrame) -> pandas.DataFrame
  Callable[[Any, DataFrameLike], DataFrameLike],
]
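The two callable shapes accepted by `func` can be illustrated with plain pandas, without a Spark session. This is a minimal sketch: the function names `center_values` and `tag_with_key` and the sample data are hypothetical, and the two-argument form assumes the grouping key arrives as a tuple, as described in the PySpark documentation.

```python
import pandas as pd


def center_values(pdf: pd.DataFrame) -> pd.DataFrame:
    # One-argument shape: receives all rows of a single group.
    return pdf.assign(v=pdf["v"] - pdf["v"].mean())


def tag_with_key(key, pdf: pd.DataFrame) -> pd.DataFrame:
    # Two-argument shape: `key` is assumed to be a tuple of grouping values.
    return pd.DataFrame({"id": [key[0]], "total": [pdf["v"].sum()]})


# Hypothetical rows of one group (id == 1), built by hand for illustration.
group = pd.DataFrame({"id": [1, 1], "v": [1.0, 3.0]})
centered = center_values(group)
summary = tag_with_key((1,), group)
```

Either function could be passed as `func`, provided the `schema` argument matches the columns of the returned pandas DataFrame.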

applyInPandas creates a pandas_udf with the following:

| pandas_udf Parameter | Value |
|----------------------|-------|
| `f` | The given `func` |
| `returnType` | The given `schema` |
| `functionType` | `PandasUDFType.GROUPED_MAP` |

applyInPandas creates a Column with the pandas_udf applied to all the columns of the DataFrame of this GroupedData.

applyInPandas requests the RelationalGroupedDataset to flatMapGroupsInPandas with the underlying Catalyst expression of the Column with the pandas_udf.

In the end, applyInPandas creates a DataFrame with the result.
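From the caller's side, the steps above are triggered by a single `applyInPandas` call on a GroupedData. The following is a usage sketch, not the internal implementation: `subtract_mean`, `run_example`, and the sample rows are hypothetical, and `spark` is assumed to be an existing SparkSession with pandas and PyArrow available.

```python
import pandas as pd


def subtract_mean(pdf: pd.DataFrame) -> pd.DataFrame:
    # Called once per group with that group's rows as a pandas DataFrame.
    return pdf.assign(v=pdf["v"] - pdf["v"].mean())


def run_example(spark):
    # `spark` is assumed to be an existing SparkSession (not created here).
    df = spark.createDataFrame(
        [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)],
        ("id", "v"),
    )
    # groupBy returns a GroupedData; the schema string describes the
    # columns of the pandas DataFrame returned by subtract_mean.
    return df.groupBy("id").applyInPandas(subtract_mean, schema="id long, v double")
```

Calling `run_example(spark).show()` would execute the grouped map and print the resulting DataFrame.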

cogroup

cogroup(
  self,
  other: "GroupedData") -> "PandasCogroupedOps"

cogroup creates a PandasCogroupedOps for this and the other GroupedDatas.
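A typical use of `cogroup` is to follow it with `applyInPandas` on the resulting PandasCogroupedOps, which passes one pandas DataFrame per side to the function. The sketch below assumes that usage: `merge_groups`, `run_cogroup_example`, and the sample rows are hypothetical, and `spark` is assumed to be an existing SparkSession.

```python
import pandas as pd


def merge_groups(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    # Receives the rows of one cogroup from each side and joins them on id.
    return pd.merge(left, right, on="id", how="inner")


def run_cogroup_example(spark):
    # `spark` is assumed to be an existing SparkSession (not created here).
    df1 = spark.createDataFrame([(1, 1.0), (2, 2.0)], ("id", "v1"))
    df2 = spark.createDataFrame([(1, "x"), (2, "y")], ("id", "v2"))
    # Both sides are grouped by the same key before being cogrouped.
    return (
        df1.groupBy("id")
        .cogroup(df2.groupBy("id"))
        .applyInPandas(merge_groups, schema="id long, v1 double, v2 string")
    )
```

The schema string has to list the columns of the merged pandas DataFrame in the order the function returns them.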