PandasGroupedOpsMixin¶
PandasGroupedOpsMixin is a Python mixin for the GroupedData class.
applyInPandas¶
applyInPandas(
self,
func: "PandasGroupedMapFunction", # (1)!
schema: Union[StructType, str]
) -> DataFrame
from pandas.core.frame import DataFrame as PandasDataFrame

DataFrameLike = PandasDataFrame

PandasGroupedMapFunction = Union[
    # func: pandas.DataFrame -> pandas.DataFrame
    Callable[[DataFrameLike], DataFrameLike],
    # func: (groupKey(s), pandas.DataFrame) -> pandas.DataFrame
    Callable[[Any, DataFrameLike], DataFrameLike],
]
applyInPandas creates a pandas_udf with the following:
| pandas_udf | Value |
|---|---|
| f | The given func |
| returnType | The given schema |
| functionType | PandasUDFType.GROUPED_MAP |
applyInPandas creates a Column with the pandas_udf applied to all the columns of the DataFrame of this GroupedData.
applyInPandas requests the RelationalGroupedDataset to flatMapGroupsInPandas with the underlying Catalyst expression of the Column with the pandas_udf.
In the end, applyInPandas creates a DataFrame with the result.
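The two shapes of PandasGroupedMapFunction can be illustrated without a Spark cluster. The following is a minimal sketch in plain pandas: `run_grouped_map` is a hypothetical local stand-in (not part of PySpark) that mimics the per-group semantics by counting the function's parameters, and `subtract_mean` and `mean_by_key` are illustrative names. It skips everything Spark does around the call (Arrow serialization, schema enforcement, distribution).

```python
import inspect
from typing import Any

import pandas as pd


def subtract_mean(pdf: pd.DataFrame) -> pd.DataFrame:
    # One-argument form: receives only the rows of a single group.
    return pdf.assign(v=pdf["v"] - pdf["v"].mean())


def mean_by_key(key: Any, pdf: pd.DataFrame) -> pd.DataFrame:
    # Two-argument form: receives the grouping key(s) as a tuple, then the rows.
    return pd.DataFrame({"id": [key[0]], "v": [pdf["v"].mean()]})


def run_grouped_map(pdf: pd.DataFrame, by: list, func) -> pd.DataFrame:
    """Hypothetical local stand-in for groupby(...).applyInPandas(func, schema)."""
    n_args = len(inspect.signature(func).parameters)
    results = []
    for key, group in pdf.groupby(by, sort=True):
        # Normalize the key to a tuple regardless of the pandas version.
        key = key if isinstance(key, tuple) else (key,)
        group = group.reset_index(drop=True)
        results.append(func(group) if n_args == 1 else func(key, group))
    # Combine the per-group results into a single frame.
    return pd.concat(results, ignore_index=True)


data = pd.DataFrame({"id": [1, 1, 2, 2, 2], "v": [1.0, 2.0, 3.0, 5.0, 10.0]})
centered = run_grouped_map(data, ["id"], subtract_mean)
means = run_grouped_map(data, ["id"], mean_by_key)
```

With a SparkSession and a Spark DataFrame `df` holding the same columns, the first function would be passed as `df.groupby("id").applyInPandas(subtract_mean, schema="id long, v double")`.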
cogroup¶
cogroup(
self,
other: "GroupedData") -> "PandasCogroupedOps"
cogroup creates a PandasCogroupedOps for this and the other GroupedData.
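What a cogrouped function sees can be sketched in plain pandas as well. In the example below, `run_cogrouped_map` is a hypothetical local helper (not a PySpark API) that pairs up the groups of two frames by key and calls the function once per key. It assumes that a key present on only one side yields an empty frame for the other side, and the names `sum_both`, `v1` and `v2` are illustrative.

```python
import pandas as pd


def sum_both(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    # Cogrouped function: receives the rows of both sides that share one key.
    return pd.DataFrame({"total": [float(left["v1"].sum() + right["v2"].sum())]})


def run_cogrouped_map(left: pd.DataFrame, right: pd.DataFrame, by: str, func) -> pd.DataFrame:
    """Hypothetical local stand-in for cogroup(...).applyInPandas(func, schema)."""
    left_groups = {key: group for key, group in left.groupby(by)}
    right_groups = {key: group for key, group in right.groupby(by)}
    results = []
    for key in sorted(set(left_groups) | set(right_groups)):
        # Assumption: a missing side is represented by an empty frame
        # that keeps the original columns and dtypes.
        left_part = left_groups.get(key, left.iloc[0:0])
        right_part = right_groups.get(key, right.iloc[0:0])
        results.append(
            func(left_part.reset_index(drop=True), right_part.reset_index(drop=True))
        )
    return pd.concat(results, ignore_index=True)


left_df = pd.DataFrame({"id": [1, 1, 2], "v1": [1.0, 2.0, 3.0]})
right_df = pd.DataFrame({"id": [1, 3], "v2": [10.0, 20.0]})
totals = run_cogrouped_map(left_df, right_df, "id", sum_both)
```

With Spark DataFrames `df1` and `df2`, the corresponding call would be `df1.groupby("id").cogroup(df2.groupby("id")).applyInPandas(sum_both, schema="total double")`.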