PandasConversionMixin
PandasConversionMixin is a Python mixin of DataFrame that converts a DataFrame to pandas (pandas.DataFrame).
toPandas
toPandas(self)
toPandas can only be used with DataFrame.
With Arrow optimization enabled, toPandas calls to_arrow_schema on the DataFrame's schema.
pyarrow
Arrow optimization uses the pyarrow module.
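As a minimal sketch of how Arrow optimization might be switched on before calling toPandas: the helper name to_pandas_with_arrow is hypothetical, and the property key spark.sql.execution.arrow.pyspark.enabled is assumed to be the one behind the "Arrow optimization enabled" check (it is not named on this page). The spark and df arguments are expected to be an existing SparkSession and DataFrame.

```python
def to_pandas_with_arrow(spark, df):
    """Enable Arrow optimization on the session, then convert df to pandas.

    Hypothetical helper: `spark` is assumed to be a SparkSession and
    `df` a DataFrame created from it.
    """
    # Assumed property key for Arrow-based conversion in PySpark
    spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
    return df.toPandas()
```

Setting the property on the session affects later toPandas calls on that session as well, so a caller may prefer to set it once at session creation instead.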
toPandas renames the columns to the col_[index] format and calls _collect_as_arrow (with split_batches based on the arrowPySparkSelfDestructEnabled configuration property).
toPandas creates a pyarrow.Table (from the RecordBatches) and converts the table to a pandas-compatible NumPy array or DataFrame. toPandas renames the columns back to the initial column names.
Note
Column order is assumed to be preserved: the original column names are reassigned by position.
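The rename-and-restore step can be illustrated with plain Python. This is only a sketch of the idea, not the PySpark code: the sample column names and data are made up, and the motivation (duplicate column names would otherwise collide) is an assumption rather than something stated above.

```python
# Duplicate column names are possible in a Spark DataFrame (assumed motivation)
original_columns = ["id", "name", "id"]

# Step 1: temporary position-based names, unique by construction
temp_columns = [f"col_{i}" for i in range(len(original_columns))]

# Made-up stand-in for the collected Arrow data, keyed by temporary name
data = {"col_0": [1, 2], "col_1": ["a", "b"], "col_2": [10, 20]}

# Step 2: restore the original names purely by position
restored = [
    (original, data[temp])
    for original, temp in zip(original_columns, temp_columns)
]
```

Because the names are restored by position only, the result is correct only if the column order is unchanged between the two steps.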
With Arrow optimization disabled, toPandas collects the records (DataFrame.collect) and creates a pandas.DataFrame (with some type munging).
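A rough sketch of the non-Arrow path, assuming the pandas package is installed. The rows and columns values are made-up stand-ins for the output of DataFrame.collect and the DataFrame's column names, and the single astype call only hints at what the type munging might look like.

```python
import pandas as pd

# Made-up stand-ins for DataFrame.collect() and DataFrame.columns
rows = [(1, "a"), (2, "b")]
columns = ["id", "name"]

# Build a pandas.DataFrame from the collected records
pdf = pd.DataFrame.from_records(rows, columns=columns)

# Illustrative type correction: narrow the integer column to 32 bits
pdf["id"] = pdf["id"].astype("int32")
```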