DataSourceV2Strategy Execution Planning Strategy
DataSourceV2Strategy is an execution planning strategy.
Logical Operator | Physical Operator |
---|---|
DataSourceV2ScanRelation with V1Scan | RowDataSourceScanExec |
DataSourceV2ScanRelation | BatchScanExec |
StreamingDataSourceV2Relation | |
WriteToDataSourceV2 (Spark Structured Streaming) | WriteToDataSourceV2Exec (Spark Structured Streaming) |
CreateTableAsSelect | AtomicCreateTableAsSelectExec or CreateTableAsSelectExec |
RefreshTable | RefreshTableExec |
ReplaceTable | AtomicReplaceTableExec or ReplaceTableExec |
ReplaceTableAsSelect | AtomicReplaceTableAsSelectExec or ReplaceTableAsSelectExec |
AppendData | AppendDataExecV1 or AppendDataExec |
OverwriteByExpression with a DataSourceV2Relation | OverwriteByExpressionExecV1 or OverwriteByExpressionExec |
OverwritePartitionsDynamic | OverwritePartitionsDynamicExec |
DeleteFromTable with DataSourceV2ScanRelation | DeleteFromTableExec |
WriteToContinuousDataSource | WriteToContinuousDataSourceExec |
DescribeNamespace | DescribeNamespaceExec |
DescribeRelation | DescribeTableExec |
DropTable | DropTableExec |
NoopDropTable | LocalTableScanExec |
AlterTable | AlterTableExec |
others | |
Creating Instance
DataSourceV2Strategy takes the following to be created:
DataSourceV2Strategy is created when SparkPlanner is requested for the strategies.
Executing Rule
apply(
  plan: LogicalPlan): Seq[SparkPlan]
apply is part of the GenericStrategy abstraction.
apply branches on the type of the given logical operator and plans the physical operator listed for it in the table above.
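The branching can be pictured with a small, self-contained sketch. It is a toy model and not Spark's code: the case classes below only borrow names from the table above, the `isV1Scan` and `v1Fallback` flags are hypothetical stand-ins for whatever condition selects the V1 variant, and returning an empty sequence for unhandled operators is an assumption about how the "others" row behaves.

```scala
// Toy model only: none of these types are Spark classes.
// They borrow operator names from the table for illustration.
sealed trait LogicalOp
final case class ScanRelation(isV1Scan: Boolean) extends LogicalOp // stands in for DataSourceV2ScanRelation
final case class AppendData(v1Fallback: Boolean) extends LogicalOp // hypothetical flag
case object RefreshTable extends LogicalOp
case object SomeOtherOperator extends LogicalOp                    // an operator the strategy does not handle

sealed trait PhysicalOp
case object RowDataSourceScanExec extends PhysicalOp
case object BatchScanExec extends PhysicalOp
case object AppendDataExecV1 extends PhysicalOp
case object AppendDataExec extends PhysicalOp
case object RefreshTableExec extends PhysicalOp

object ToyV2Strategy {
  // Mirrors the shape of apply(plan: LogicalPlan): Seq[SparkPlan]:
  // one pattern-matching branch per logical operator type.
  def apply(plan: LogicalOp): Seq[PhysicalOp] = plan match {
    case ScanRelation(true)  => Seq(RowDataSourceScanExec)
    case ScanRelation(false) => Seq(BatchScanExec)
    case AppendData(true)    => Seq(AppendDataExecV1)
    case AppendData(false)   => Seq(AppendDataExec)
    case RefreshTable        => Seq(RefreshTableExec)
    case _                   => Nil // assumption: nothing is planned for other operators
  }
}
```

In this sketch, `ToyV2Strategy(ScanRelation(isV1Scan = false))` yields a one-element sequence holding `BatchScanExec`, while `ToyV2Strategy(SomeOtherOperator)` yields an empty sequence.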
Logging
Enable ALL logging level for the org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.DataSourceV2Strategy.name = org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy
logger.DataSourceV2Strategy.level = all
Refer to Logging.