WriteToDataSourceV2Exec Physical Operator
WriteToDataSourceV2Exec is a V2TableWriteExec (Spark SQL) that represents the WriteToDataSourceV2 logical operator at execution time.
Creating Instance
WriteToDataSourceV2Exec takes the following to be created:
- BatchWrite (Spark SQL)
- Refresh Cache Function (() => Unit)
- Physical Query Plan (Spark SQL)
- Write CustomMetrics (Spark SQL)
WriteToDataSourceV2Exec is created when:
- DataSourceV2Strategy (Spark SQL) execution planning strategy is requested to plan a logical query plan (that is a WriteToDataSourceV2 logical operator)
Executing Physical Operator
run(): Seq[InternalRow]
run is part of the V2CommandExec (Spark SQL) abstraction.
run writes rows out (Spark SQL) using the BatchWrite and then refreshes the cache (using the refresh cache function).
In the end, run returns the rows written out.
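The control flow of run described above can be sketched without a Spark dependency. This is a simplified model, not Spark's code: Row, WriteExecModel, writeRows and demo are hypothetical stand-ins introduced only for illustration, and the actual write is reduced to a function that yields rows.

```scala
import scala.collection.mutable.ArrayBuffer

// A minimal sketch of the run() control flow. None of these names are
// Spark classes; they only mirror the steps described in the text.
object WriteToDataSourceV2ExecSketch {
  // Hypothetical stand-in for Spark's InternalRow
  type Row = Seq[Any]

  final case class WriteExecModel(
      writeRows: () => Seq[Row],  // stand-in for writing rows out with the BatchWrite
      refreshCache: () => Unit) { // the refresh cache function

    def run(): Seq[Row] = {
      val writtenRows = writeRows() // 1. write the rows out
      refreshCache()                // 2. refresh the cache afterwards
      writtenRows                   // 3. return the result of the write step
    }
  }

  // Records the order in which the two steps are executed.
  def demo(): (Seq[String], Seq[Row]) = {
    val events = ArrayBuffer.empty[String]
    val op = WriteExecModel(
      writeRows = () => { events += "write"; Seq(Seq(1, "a")) },
      refreshCache = () => { events += "refresh"; () })
    val result = op.run()
    (events.toSeq, result)
  }
}
```

The point of the sketch is the ordering: the cache is refreshed only after the write step has finished, so a failing write never triggers a refresh.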
Logging
Enable ALL logging level for the org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec logger to see what happens inside.
Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec=ALL
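On Spark releases that ship Log4j 2 (3.3 and later), the configuration file is conf/log4j2.properties and the syntax differs. A possible equivalent is sketched below, assuming the default properties-based setup; the logger key v2write is an arbitrary label chosen for this example.

```properties
# Assumption: Spark 3.3+ with Log4j 2; "v2write" is an arbitrary logger key
logger.v2write.name = org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec
logger.v2write.level = all
```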
Refer to Logging.