MicroBatchWrite¶
MicroBatchWrite is a BatchWrite (Spark SQL) for the WriteToDataSourceV2 logical operator in Micro-Batch Stream Processing.
WriteToMicroBatchDataSource
The WriteToDataSourceV2 logical operator replaces the WriteToMicroBatchDataSource logical operator during logical optimization (by the V2Writes logical optimization).
MicroBatchWrite is a thin wrapper over StreamingWrite: it does nothing but delegate (relay) all the important execution-specific calls to it.
Creating Instance¶
MicroBatchWrite takes the following to be created:
- Epoch ID
- StreamingWrite
MicroBatchWrite is created when:
- V2Writes (Spark SQL) logical optimization is requested to optimize a logical plan (with a WriteToMicroBatchDataSource)
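The rewrite described above can be pictured with a toy model. This is only a sketch: the case classes below are simplified stand-ins that borrow the names from this page, not Spark's actual classes, and `V2WritesSketch.optimize` is a hypothetical helper that takes the StreamingWrite as a parameter (the way Spark obtains it is not shown here).

```scala
// Toy stand-ins (hypothetical); NOT Spark's actual classes.
sealed trait LogicalPlan

// Some child query that produces the rows to write
final case class Query(name: String) extends LogicalPlan

// Streaming-specific write node that carries the micro-batch (epoch) ID
final case class WriteToMicroBatchDataSource(
    query: LogicalPlan,
    batchId: Long) extends LogicalPlan

// Generic write node that carries a batch-style write
final case class WriteToDataSourceV2(
    batchWrite: MicroBatchWrite,
    query: LogicalPlan) extends LogicalPlan

final case class StreamingWrite(sink: String)
final case class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite)

object V2WritesSketch {
  // Replaces the streaming-specific node with the generic one,
  // wrapping the StreamingWrite in a MicroBatchWrite for this epoch.
  def optimize(plan: LogicalPlan, streamingWrite: StreamingWrite): LogicalPlan =
    plan match {
      case WriteToMicroBatchDataSource(query, batchId) =>
        WriteToDataSourceV2(MicroBatchWrite(batchId, streamingWrite), query)
      case other => other // every other plan is left untouched
    }
}
```

In this model the epoch ID of the MicroBatchWrite is simply the batch ID carried by the WriteToMicroBatchDataSource node.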
Committing Writing Job¶
commit(
messages: Array[WriterCommitMessage]): Unit
commit is part of the BatchWrite (Spark SQL) abstraction.
commit requests the StreamingWrite to commit (with the epoch ID and the given messages).
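The delegation can be sketched in a few lines. This is a minimal, self-contained model, not the Spark source: the traits are simplified stand-ins for the Spark SQL connector interfaces, `TaskCommit` and `RecordingStreamingWrite` are hypothetical helpers, and passing the epoch ID along with the messages is an assumption of the sketch.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-ins (hypothetical) for the Spark SQL connector interfaces
trait WriterCommitMessage
final case class TaskCommit(partitionId: Int) extends WriterCommitMessage

trait StreamingWrite {
  def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit
}

trait BatchWrite {
  def commit(messages: Array[WriterCommitMessage]): Unit
}

// The wrapper adds nothing but the epoch ID it was created with
class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite)
    extends BatchWrite {
  override def commit(messages: Array[WriterCommitMessage]): Unit =
    streamingWrite.commit(epochId, messages)
}

// Hypothetical StreamingWrite that records (epochId, number of messages)
class RecordingStreamingWrite extends StreamingWrite {
  val commits = ArrayBuffer.empty[(Long, Int)]
  override def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit =
    commits += ((epochId, messages.length))
}
```

A batch-style caller only ever sees `commit(messages)`; the epoch ID is fixed at construction time, which is why a new MicroBatchWrite is needed for every micro-batch.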
Creating DataWriterFactory for Batch Write¶
createBatchWriterFactory(
info: PhysicalWriteInfo): DataWriterFactory
createBatchWriterFactory is part of the BatchWrite (Spark SQL) abstraction.
createBatchWriterFactory requests the StreamingWrite to create a StreamingDataWriterFactory.
In the end, createBatchWriterFactory creates a MicroBatchWriterFactory (with the given epochId and the StreamingDataWriterFactory).
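The two steps above can be sketched as follows. The types are simplified stand-ins, not Spark's actual interfaces: `DataWriter` is reduced to a case class, `SimpleStreamingWrite` is a hypothetical implementation, and the way MicroBatchWriterFactory hands the epoch ID to the streaming factory is an assumption of this sketch.

```scala
// Simplified stand-ins (hypothetical) for the Spark SQL connector types
final case class PhysicalWriteInfo(numPartitions: Int)
final case class DataWriter(partitionId: Int, taskId: Long, epochId: Long)

// Streaming flavour: the epoch ID is supplied per writer
trait StreamingDataWriterFactory {
  def createWriter(partitionId: Int, taskId: Long, epochId: Long): DataWriter
}

// Batch flavour: no notion of an epoch
trait DataWriterFactory {
  def createWriter(partitionId: Int, taskId: Long): DataWriter
}

trait StreamingWrite {
  def createStreamingWriterFactory(info: PhysicalWriteInfo): StreamingDataWriterFactory
}

// Adapts the streaming factory to the batch one by fixing the epoch ID
class MicroBatchWriterFactory(
    epochId: Long,
    streamingFactory: StreamingDataWriterFactory) extends DataWriterFactory {
  override def createWriter(partitionId: Int, taskId: Long): DataWriter =
    streamingFactory.createWriter(partitionId, taskId, epochId)
}

class MicroBatchWrite(epochId: Long, streamingWrite: StreamingWrite) {
  def createBatchWriterFactory(info: PhysicalWriteInfo): DataWriterFactory = {
    // Step 1: ask the StreamingWrite for its streaming factory
    val streamingFactory = streamingWrite.createStreamingWriterFactory(info)
    // Step 2: wrap it together with the epoch ID
    new MicroBatchWriterFactory(epochId, streamingFactory)
  }
}

// Hypothetical StreamingWrite whose writers just remember their coordinates
object SimpleStreamingWrite extends StreamingWrite {
  override def createStreamingWriterFactory(
      info: PhysicalWriteInfo): StreamingDataWriterFactory =
    new StreamingDataWriterFactory {
      override def createWriter(
          partitionId: Int, taskId: Long, epochId: Long): DataWriter =
        DataWriter(partitionId, taskId, epochId)
    }
}
```

With this shape, every writer created through the batch-style factory ends up tagged with the epoch ID of the micro-batch it belongs to.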