WriteToDataSourceV2Exec Physical Operator
WriteToDataSourceV2Exec is a V2TableWriteExec (Spark SQL) that represents a WriteToDataSourceV2 logical operator at execution time.
Creating Instance
WriteToDataSourceV2Exec takes the following to be created:
- BatchWrite (Spark SQL)
- Refresh Cache Function (() => Unit)
- Physical Query Plan (Spark SQL)
- Write CustomMetrics (Spark SQL)
WriteToDataSourceV2Exec is created when:
- DataSourceV2Strategy (Spark SQL) execution planning strategy is requested to plan a logical query plan (that is a WriteToDataSourceV2 logical operator)
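
The planning step can be pictured with a toy model. The sketch below is not Spark code: every type in it is a simplified, hypothetical stand-in (a String takes the place of the BatchWrite, and planChild takes the place of whatever Spark uses to plan the child query). It only illustrates the idea that a planning strategy pattern-matches on the logical operator and produces the physical one, handing it a refresh cache function.

```scala
// Simplified stand-ins (hypothetical); these are NOT Spark's classes.
sealed trait LogicalPlan
case class LocalRelation(name: String) extends LogicalPlan
case class WriteToDataSourceV2(batchWrite: String, query: LogicalPlan) extends LogicalPlan

sealed trait SparkPlan
case class LocalScanExec(name: String) extends SparkPlan
case class WriteToDataSourceV2Exec(
    batchWrite: String,          // stand-in for the BatchWrite
    refreshCache: () => Unit,    // refresh cache function
    query: SparkPlan             // physical plan of the query to write out
) extends SparkPlan

object ToyV2Strategy {
  // Plans the child query (hypothetical helper for this sketch only).
  private def planChild(plan: LogicalPlan): SparkPlan = plan match {
    case LocalRelation(name) => LocalScanExec(name)
    case other => throw new IllegalArgumentException(s"Unsupported child: $other")
  }

  // Returns a physical plan only for the logical write operator.
  def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case WriteToDataSourceV2(batchWrite, query) =>
      val refreshCache: () => Unit = () => () // no-op in this sketch
      Seq(WriteToDataSourceV2Exec(batchWrite, refreshCache, planChild(query)))
    case _ => Nil
  }
}
```

In this toy model, ToyV2Strategy(WriteToDataSourceV2("bw", LocalRelation("t"))) gives a one-element sequence holding a WriteToDataSourceV2Exec, and any other logical plan gives an empty sequence.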
Executing Physical Operator
run(): Seq[InternalRow]
run is part of the V2CommandExec (Spark SQL) abstraction.
run writes rows out (Spark SQL) using the BatchWrite and then refreshes the cache (using the refresh cache function).
In the end, run returns the rows written out.
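
The order of steps in run (write first, refresh the cache second, return the write result last) can be sketched as follows. This is a minimal, self-contained model and not Spark's implementation: InternalRow is aliased to Seq[Any], and the BatchWrite trait with its write method, the WriteExecSketch class and the demo helper are all hypothetical names introduced for the illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object RunSketch {
  // Hypothetical stand-ins; Spark's InternalRow and BatchWrite look different.
  type InternalRow = Seq[Any]

  trait BatchWrite {
    def write(rows: Seq[InternalRow]): Seq[InternalRow]
  }

  final class WriteExecSketch(
      batchWrite: BatchWrite,
      refreshCache: () => Unit,
      input: Seq[InternalRow]) {

    def run(): Seq[InternalRow] = {
      val writtenRows = batchWrite.write(input) // 1. write rows out
      refreshCache()                            // 2. refresh the cache
      writtenRows                               // 3. return the write result
    }
  }

  // Records the order of calls and returns it with the result of run().
  def demo(): (Seq[String], Seq[InternalRow]) = {
    val events = ArrayBuffer.empty[String]
    val recordingWrite = new BatchWrite {
      def write(rows: Seq[InternalRow]): Seq[InternalRow] = {
        events += "write"
        rows
      }
    }
    val refresh: () => Unit = () => { events += "refresh"; () }
    val exec = new WriteExecSketch(recordingWrite, refresh, Seq(Seq(1, "a")))
    val result = exec.run()
    (events.toSeq, result)
  }
}
```

Calling RunSketch.demo() records "write" before "refresh", which mirrors the ordering described above: the cache is refreshed only after the write has finished.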
Logging
Enable ALL logging level for org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec logger to see what happens inside.
Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec=ALL
Refer to Logging.