DataSourceV2Strategy Execution Planning Strategy
DataSourceV2Strategy is an execution planning strategy that plans the following logical operators as the corresponding physical operators:
| Logical Operator | Physical Operator |
|---|---|
| DataSourceV2ScanRelation with V1Scan | RowDataSourceScanExec |
| DataSourceV2ScanRelation | BatchScanExec |
| StreamingDataSourceV2Relation | |
| WriteToDataSourceV2 (Spark Structured Streaming) | WriteToDataSourceV2Exec (Spark Structured Streaming) |
| CreateTableAsSelect | AtomicCreateTableAsSelectExec or CreateTableAsSelectExec |
| RefreshTable | RefreshTableExec |
| ReplaceTable | AtomicReplaceTableExec or ReplaceTableExec |
| ReplaceTableAsSelect | AtomicReplaceTableAsSelectExec or ReplaceTableAsSelectExec |
| AppendData | AppendDataExecV1 or AppendDataExec |
| OverwriteByExpression with a DataSourceV2Relation | OverwriteByExpressionExecV1 or OverwriteByExpressionExec |
| OverwritePartitionsDynamic | OverwritePartitionsDynamicExec |
| DeleteFromTable with DataSourceV2ScanRelation | DeleteFromTableExec |
| WriteToContinuousDataSource | WriteToContinuousDataSourceExec |
| DescribeNamespace | DescribeNamespaceExec |
| DescribeRelation | DescribeTableExec |
| DropTable | DropTableExec |
| NoopDropTable | LocalTableScanExec |
| AlterTable | AlterTableExec |
| others | |
Creating Instance
DataSourceV2Strategy takes the following to be created:
DataSourceV2Strategy is created when:
SparkPlanner is requested for the strategies
Executing Rule
apply(
plan: LogicalPlan): Seq[SparkPlan]
apply is part of the GenericStrategy abstraction.
apply branches off per the type of the given logical operator.
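The branching can be pictured with a toy model. The sketch below does not use Spark: every `Toy*` type is a hypothetical stand-in invented for illustration, and only three of the mappings from the table above are mirrored. It assumes that an operator the strategy does not handle yields an empty sequence, which is the usual convention for planning strategies.

```scala
// Hypothetical stand-ins for Catalyst logical operators (not Spark classes).
sealed trait ToyLogicalPlan
case class ToyScanRelation(table: String, hasV1Scan: Boolean) extends ToyLogicalPlan
case class ToyAppendData(table: String) extends ToyLogicalPlan
case class ToyProject(child: ToyLogicalPlan) extends ToyLogicalPlan

// Hypothetical stand-ins for physical operators (not Spark classes).
sealed trait ToyPhysicalPlan
case class ToyRowDataSourceScanExec(table: String) extends ToyPhysicalPlan
case class ToyBatchScanExec(table: String) extends ToyPhysicalPlan
case class ToyAppendDataExec(table: String) extends ToyPhysicalPlan

object ToyStrategy {
  // Pattern-matches on the type of the logical operator and returns
  // candidate physical plans. An empty Seq means "not handled here".
  def apply(plan: ToyLogicalPlan): Seq[ToyPhysicalPlan] = plan match {
    // A scan relation backed by a V1 scan is planned as a row-based scan.
    case ToyScanRelation(table, true) => Seq(ToyRowDataSourceScanExec(table))
    // Any other scan relation is planned as a batch scan.
    case ToyScanRelation(table, false) => Seq(ToyBatchScanExec(table))
    case ToyAppendData(table) => Seq(ToyAppendDataExec(table))
    // Every other operator is left to other strategies (assumption of this sketch).
    case _ => Nil
  }
}
```

For example, `ToyStrategy(ToyScanRelation("t", hasV1Scan = false))` selects the batch-scan branch, while `ToyStrategy(ToyProject(ToyAppendData("t")))` falls through to the last case because only the top-level operator is matched.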
Logging
Enable the ALL logging level for the org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.DataSourceV2Strategy.name = org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy
logger.DataSourceV2Strategy.level = all
Refer to Logging.