PandasConversionMixin
PandasConversionMixin is a Python mixin of DataFrame that converts a DataFrame to pandas (pandas.DataFrame).
toPandas
toPandas(self)
toPandas can only be used with DataFrame.
With Arrow optimization enabled, toPandas runs to_arrow_schema on the DataFrame schema.
pyarrow
Arrow optimization uses the pyarrow module.
toPandas renames the columns to the col_[index] format and calls _collect_as_arrow (with split_batches based on the arrowPySparkSelfDestructEnabled configuration property).
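The renaming step can be illustrated with a minimal, standard-library sketch. The helper name temp_column_names is hypothetical (it is not part of PySpark), and the motivation given in the comments, keeping names unique even when the DataFrame has duplicate column names, is an assumption about why positional names are used.

```python
def temp_column_names(columns):
    """Return positional placeholder names: col_0, col_1, ..."""
    # Hypothetical helper for illustration only; not a PySpark API.
    return [f"col_{i}" for i in range(len(columns))]


# A column list with a duplicate name ("id" appears twice).
original = ["id", "name", "id"]

# Positional names are unique regardless of duplicates in the original list.
temporary = temp_column_names(original)
print(temporary)
```

Because each temporary name encodes only the column position, the original names can later be restored by position.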
toPandas creates a pyarrow.Table (from the RecordBatches) and converts the table to a pandas-compatible NumPy array or DataFrame. toPandas then renames the columns back to the initial column names.
Note
Column order is assumed to be preserved, since the initial column names are reassigned by position.
With Arrow optimization disabled, toPandas collects the records (DataFrame.collect) and creates a pandas.DataFrame (with some type munging).
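The non-Arrow path can be approximated with pandas alone. In this sketch the list of tuples is an assumed stand-in for the rows returned by DataFrame.collect, and the single dtype cast is only an illustrative example of the kind of type adjustment involved, not the actual rules PySpark applies.

```python
import pandas as pd

columns = ["id", "name"]

# Plain tuples standing in for collected rows (assumption for illustration).
rows = [(1, "a"), (2, "b")]

# Build the pandas.DataFrame from the records and the column names.
pdf = pd.DataFrame.from_records(rows, columns=columns)

# Illustrative type adjustment: narrow the integer column to 32 bits.
pdf["id"] = pdf["id"].astype("int32")
print(pdf.dtypes)
```

Compared with the Arrow path, every row here passes through Python objects before reaching pandas, which is usually the slower route for large results.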