# DataWritingCommand Logical Commands

`DataWritingCommand` is an extension of the `UnaryCommand` abstraction for logical commands that, when executed, write the result of executing a query (the query data) to a relation.
## Performance Metrics
Key | Name (in web UI) | Description |
---|---|---|
numFiles | number of written files | |
numOutputBytes | bytes of written output | |
numOutputRows | number of output rows | |
numParts | number of dynamic part | |
taskCommitTime | task commit time | |
jobCommitTime | job commit time | |
## Contract

### Output Column Names

```scala
outputColumnNames: Seq[String]
```

The names of the output columns of the analyzed input query plan.

Used when:

- `DataWritingCommand` is requested for the output columns
### Query

```scala
query: LogicalPlan
```

The analyzed `LogicalPlan` representing the data to write (i.e., whose result will be inserted into a relation).

Used when:

- `BasicOperators` execution planning strategy is executed
- `DataWritingCommand` is requested for the child logical operator and the output columns
### Executing

```scala
run(
  sparkSession: SparkSession,
  child: SparkPlan): Seq[Row]
```

Used when:

- `CreateHiveTableAsSelectBase` is requested to run
- `DataWritingCommandExec` physical operator is requested for the `sideEffectResult`
## Implementations

- `CreateDataSourceTableAsSelectCommand`
- `CreateHiveTableAsSelectBase`
- `InsertIntoHadoopFsRelationCommand`
- `SaveAsHiveFile`
## Execution Planning

`DataWritingCommand` is resolved to a `DataWritingCommandExec` physical operator by the `BasicOperators` execution planning strategy.
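This resolution step can be illustrated with a toy planning rule. All types below are simplified stand-ins for Spark's planner machinery, not the real classes:

```scala
// Hedged sketch of the planning step: a strategy pattern-matches on the
// logical operator and, for a write command, emits a DataWritingCommandExec.
// WriteCommand plays the role of DataWritingCommand; PlanLater stands for
// "leave it to other strategies".
sealed trait LogicalOperator
case class WriteCommand(table: String) extends LogicalOperator
case class Project(columns: Seq[String]) extends LogicalOperator

sealed trait PhysicalOperator
case class DataWritingCommandExec(cmd: WriteCommand) extends PhysicalOperator
case class PlanLater(op: LogicalOperator) extends PhysicalOperator

// the BasicOperators-like rule: write commands become DataWritingCommandExec
def planWrite(op: LogicalOperator): PhysicalOperator = op match {
  case w: WriteCommand => DataWritingCommandExec(w)
  case other           => PlanLater(other)
}
```

The real strategy does the same kind of pattern match over the logical plan, wrapping the command and its planned child query in the physical operator.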
## BasicWriteJobStatsTracker

```scala
basicWriteJobStatsTracker(
  hadoopConf: Configuration): BasicWriteJobStatsTracker
```

`basicWriteJobStatsTracker` creates a new `BasicWriteJobStatsTracker` (with the given Hadoop `Configuration` and the metrics).

`basicWriteJobStatsTracker` is used when:

- `InsertIntoHadoopFsRelationCommand` logical command is executed
- `SaveAsHiveFile` logical command is executed (and requested to `saveAsHiveFile`)
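The metrics wiring can be sketched as follows. `SQLMetric` and the tracker body here are simplified stand-ins (including the made-up `fileWritten` helper); only the idea that the tracker carries the command's metrics so that write statistics surface in the web UI is taken from the section above:

```scala
// Hedged sketch: the tracker is handed the command's metrics so that values
// collected during the write (numFiles, numOutputRows, ...) can be reported.
// SQLMetric, the tracker body and fileWritten are simplified stand-ins.
final case class SQLMetric(name: String) { var value: Long = 0L }

class BasicWriteJobStatsTracker(val metrics: Map[String, SQLMetric]) {
  // record that one more file with the given row count was written
  def fileWritten(rows: Long): Unit = {
    metrics("numFiles").value += 1
    metrics("numOutputRows").value += rows
  }
}

def basicWriteJobStatsTracker(metrics: Map[String, SQLMetric]): BasicWriteJobStatsTracker =
  new BasicWriteJobStatsTracker(metrics)
```

In Spark itself the real tracker also needs the Hadoop `Configuration` (to inspect written files) and updates the metrics at task and job commit time.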