MicroBatchWrite¶
MicroBatchWrite is a BatchWrite (Spark SQL) for WriteToDataSourceV2 logical operator in Micro-Batch Stream Processing.
WriteToMicroBatchDataSource
WriteToDataSourceV2 logical operator replaces WriteToMicroBatchDataSource logical operator at logical optimization (using V2Writes logical optimization).
MicroBatchWrite is a very thin wrapper over StreamingWrite: it does nothing but delegate (relay) all the important execution-specific calls to it.
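The delegation described above can be sketched as follows. This is a minimal, self-contained illustration rather than Spark's source code: the traits are simplified stand-ins for the Spark SQL connector interfaces of the same names, and `MicroBatchWriteSketch`, `RowsWritten` and `RecordingWrite` are hypothetical names introduced only for the demo.

```scala
// Simplified stand-ins for Spark's connector interfaces (NOT the real Spark API).
object MicroBatchWriteSketch {
  trait WriterCommitMessage

  // Hypothetical commit message used only in this demo
  final case class RowsWritten(count: Long) extends WriterCommitMessage

  // Streaming side: a commit is always tied to an epoch (micro-batch) ID
  trait StreamingWrite {
    def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit
  }

  // Batch side: a commit has no notion of an epoch
  trait BatchWrite {
    def commit(messages: Array[WriterCommitMessage]): Unit
  }

  // The wrapper pins a single epoch ID and relays the call to the StreamingWrite
  class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite) extends BatchWrite {
    override def commit(messages: Array[WriterCommitMessage]): Unit =
      streamingWrite.commit(epochId, messages)
  }

  // Hypothetical StreamingWrite that records what it was asked to do
  class RecordingWrite extends StreamingWrite {
    val log = scala.collection.mutable.ArrayBuffer.empty[String]
    override def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit = {
      log += s"commit(epoch=$epochId, messages=${messages.length})"
      ()
    }
  }

  def demo(): Seq[String] = {
    val sink = new RecordingWrite
    val write = new MicroBatchWrite(epochId = 7L, streamingWrite = sink)
    write.commit(Array[WriterCommitMessage](RowsWritten(10), RowsWritten(5)))
    sink.log.toSeq
  }
}
```

In this sketch, the batch-style caller never sees the epoch ID. The wrapper supplies it on every call, which is what lets a streaming sink be driven through a batch-oriented interface.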
Creating Instance¶
MicroBatchWrite takes the following to be created:
- Epoch ID
- StreamingWrite
MicroBatchWrite is created when:
- V2Writes (Spark SQL) logical optimization is requested to optimize a logical plan (with a WriteToMicroBatchDataSource)
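The rewrite performed at this point can be pictured with a small pattern match. The sketch below is an assumption-laden simplification: the case classes carry only the fields needed for the illustration (the real logical operators have more), `optimize` stands in for the V2Writes rule, and `Query` and `NoopWrite` are hypothetical placeholders.

```scala
// Simplified model of the plan rewrite (NOT Spark's actual classes or rule).
object V2WritesSketch {
  sealed trait LogicalPlan

  // Hypothetical placeholder for the streaming query being written out
  case object Query extends LogicalPlan

  trait StreamingWrite
  // Hypothetical StreamingWrite with no behavior, used only for the demo
  case object NoopWrite extends StreamingWrite

  final case class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite)

  // Assumed field layout: the operator knows the streaming write and the batch ID
  final case class WriteToMicroBatchDataSource(
      write: StreamingWrite,
      query: LogicalPlan,
      batchId: Long) extends LogicalPlan

  final case class WriteToDataSourceV2(
      batchWrite: MicroBatchWrite,
      query: LogicalPlan) extends LogicalPlan

  // Replaces the streaming-specific operator with the batch one,
  // creating a MicroBatchWrite along the way
  def optimize(plan: LogicalPlan): LogicalPlan = plan match {
    case WriteToMicroBatchDataSource(write, query, batchId) =>
      WriteToDataSourceV2(MicroBatchWrite(batchId, write), query)
    case other => other
  }

  def demo(): LogicalPlan =
    optimize(WriteToMicroBatchDataSource(NoopWrite, Query, batchId = 3L))
}
```

Here the batch ID of the micro-batch is assumed to become the epoch ID of the MicroBatchWrite. Plans without a WriteToMicroBatchDataSource pass through unchanged.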
Committing Writing Job¶
commit(
messages: Array[WriterCommitMessage]): Unit
commit is part of the BatchWrite (Spark SQL) abstraction.
commit requests the StreamingWrite to commit (with the epoch ID and the given WriterCommitMessages).
Creating DataWriterFactory for Batch Write¶
createBatchWriterFactory(
info: PhysicalWriteInfo): DataWriterFactory
createBatchWriterFactory is part of the BatchWrite (Spark SQL) abstraction.
createBatchWriterFactory requests the StreamingWrite to create a StreamingDataWriterFactory.
In the end, createBatchWriterFactory creates a MicroBatchWriterFactory (with the given epochId and the StreamingDataWriterFactory).
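The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not Spark's implementation: the traits are reduced stand-ins for the Spark SQL interfaces of the same names, `DataWriter` is collapsed to a single `describe` method, and `DescribingFactory` and `DescribingWrite` are hypothetical helpers for the demo.

```scala
// Simplified stand-ins for Spark's connector interfaces (NOT the real Spark API).
object MicroBatchWriterFactorySketch {
  final case class PhysicalWriteInfo(numPartitions: Int)

  // Reduced to a description so the demo can show which arguments arrived
  trait DataWriter { def describe: String }

  // Streaming side: a writer is created per partition, task and epoch
  trait StreamingDataWriterFactory {
    def createWriter(partitionId: Int, taskId: Long, epochId: Long): DataWriter
  }

  // Batch side: a writer is created per partition and task only
  trait DataWriterFactory {
    def createWriter(partitionId: Int, taskId: Long): DataWriter
  }

  trait StreamingWrite {
    def createStreamingWriterFactory(info: PhysicalWriteInfo): StreamingDataWriterFactory
  }

  // Adapts the streaming factory to the batch interface by fixing the epoch ID
  class MicroBatchWriterFactory(
      epochId: Long,
      streamingFactory: StreamingDataWriterFactory) extends DataWriterFactory {
    override def createWriter(partitionId: Int, taskId: Long): DataWriter =
      streamingFactory.createWriter(partitionId, taskId, epochId)
  }

  class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite) {
    def createBatchWriterFactory(info: PhysicalWriteInfo): DataWriterFactory = {
      // Step 1: request the StreamingWrite for a StreamingDataWriterFactory
      val streamingFactory = streamingWrite.createStreamingWriterFactory(info)
      // Step 2: wrap it together with the epoch ID
      new MicroBatchWriterFactory(epochId, streamingFactory)
    }
  }

  // Hypothetical factory whose writers simply report their arguments
  object DescribingFactory extends StreamingDataWriterFactory {
    override def createWriter(partitionId: Int, taskId: Long, epochId: Long): DataWriter =
      new DataWriter {
        val describe = s"partition=$partitionId, task=$taskId, epoch=$epochId"
      }
  }

  // Hypothetical StreamingWrite that always hands out the factory above
  object DescribingWrite extends StreamingWrite {
    override def createStreamingWriterFactory(info: PhysicalWriteInfo): StreamingDataWriterFactory =
      DescribingFactory
  }

  def demo(): String = {
    val write = new MicroBatchWrite(epochId = 7L, streamingWrite = DescribingWrite)
    val factory = write.createBatchWriterFactory(PhysicalWriteInfo(numPartitions = 2))
    factory.createWriter(partitionId = 0, taskId = 42L).describe
  }
}
```

In this sketch, the epoch ID is captured once when the factory is created, so every writer created by the factory belongs to the same micro-batch.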