DeletionVectorDescriptor¶
DeletionVectorDescriptor describes a deletion vector attached to a file.
Creating Instance¶
DeletionVectorDescriptor takes the following to be created:
- Storage Type
- Path or an inline deletion vector
- Offset
- Size (in bytes)
- Cardinality
- maxRowIndex
DeletionVectorDescriptor is created using the following utilities:
- onDiskWithRelativePath
- inlineInLog
- onDiskWithAbsolutePath
Storage Type¶
storageType: String
DeletionVectorDescriptor is given a storage type that indicates how the deletion vector is stored.
The storage type can be one of the following:
Storage Type | Format | Description |
---|---|---|
p (path) | <absolute path> | Stored in a file that is available at an absolute path |
i (inline) | <base85 encoded bytes> | Stored inline in the transaction log |
u (uuid) | <random prefix - optional><base85 encoded uuid> | (UUID-based) Stored in a file with a path relative to the data directory of a delta table |
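The three markers can be modelled as a small closed set of values. The following is a minimal sketch, not Delta Lake code: the `StorageTypes` object, the `StorageType` class and `fromMarker` are hypothetical names used only for illustration.

```scala
// Hypothetical model of the storage-type markers listed in the table above.
object StorageTypes {
  sealed abstract class StorageType(val marker: String)
  case object AbsolutePath extends StorageType("p") // file at an absolute path
  case object Inline extends StorageType("i")       // bytes stored in the transaction log
  case object Uuid extends StorageType("u")         // file relative to the table's data directory

  // Resolve a marker string to a storage type, or None for an unknown marker.
  def fromMarker(marker: String): Option[StorageType] =
    Seq(AbsolutePath, Inline, Uuid).find(_.marker == marker)
}
```

Returning an `Option` keeps unknown markers explicit instead of throwing, which is a design choice of this sketch only.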
Creating Empty Deletion Vector¶
EMPTY: DeletionVectorDescriptor
EMPTY is an empty deletion vector (DeletionVectorDescriptor) with the following:
Property | Value |
---|---|
storageType | i |
pathOrInlineDv | (empty) |
sizeInBytes | 0 |
cardinality | 0 |
EMPTY is used when:
- DeletionVectorWriter is requested to storeSerializedBitmap
- StoredBitmap is requested for the stored bitmap of an empty deletion vector
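The properties of the empty descriptor can be illustrated with a simplified stand-in. This is a sketch under stated assumptions: `EmptyDvSketch`, `Descriptor` and `Empty` are hypothetical names, and the case class only mirrors the properties listed in this page, not the actual Delta Lake class.

```scala
object EmptyDvSketch {
  // Simplified stand-in for DeletionVectorDescriptor (hypothetical, for illustration).
  final case class Descriptor(
      storageType: String,
      pathOrInlineDv: String,
      offset: Option[Int],
      sizeInBytes: Int,
      cardinality: Long,
      maxRowIndex: Option[Long] = None)

  // Inline storage type, no payload, zero size and zero deleted rows.
  val Empty: Descriptor = Descriptor(
    storageType = "i",
    pathOrInlineDv = "",
    offset = None,
    sizeInBytes = 0,
    cardinality = 0L)
}
```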
onDiskWithRelativePath¶
onDiskWithRelativePath(
id: UUID,
randomPrefix: String = "",
sizeInBytes: Int,
cardinality: Long,
offset: Option[Int] = None,
maxRowIndex: Option[Long] = None): DeletionVectorDescriptor
onDiskWithRelativePath creates a DeletionVectorDescriptor with the following:
Property | Value |
---|---|
storageType | u |
pathOrInlineDv | encodeUUID with the given id and randomPrefix |
offset | The given offset |
sizeInBytes | The given sizeInBytes |
cardinality | The given cardinality |
maxRowIndex | The given maxRowIndex |
onDiskWithRelativePath is used when:
- DeletionVectorWriter is requested to storeSerializedBitmap
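The property mapping above can be sketched as a factory method. This is not the Delta Lake implementation: `RelativePathSketch` and `Descriptor` are hypothetical, and `encodeUuid` is a placeholder that emits plain hex, whereas the real descriptor stores a Base85-encoded UUID.

```scala
import java.util.UUID

object RelativePathSketch {
  // Simplified stand-in for DeletionVectorDescriptor (hypothetical).
  final case class Descriptor(
      storageType: String,
      pathOrInlineDv: String,
      offset: Option[Int],
      sizeInBytes: Int,
      cardinality: Long,
      maxRowIndex: Option[Long])

  // Placeholder encoder (assumption): hex digits instead of the Base85 encoding.
  def encodeUuid(id: UUID, randomPrefix: String): String =
    randomPrefix + id.toString.replace("-", "")

  // Mirrors the signature shown above and fills the properties from the table.
  def onDiskWithRelativePath(
      id: UUID,
      randomPrefix: String = "",
      sizeInBytes: Int,
      cardinality: Long,
      offset: Option[Int] = None,
      maxRowIndex: Option[Long] = None): Descriptor =
    Descriptor("u", encodeUuid(id, randomPrefix), offset, sizeInBytes, cardinality, maxRowIndex)
}
```

Because `randomPrefix` has a default value but precedes required parameters, callers of this sketch pass `sizeInBytes` and `cardinality` by name.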
inlineInLog¶
inlineInLog(
data: Array[Byte],
cardinality: Long): DeletionVectorDescriptor
inlineInLog creates a DeletionVectorDescriptor with the following:
Property | Value |
---|---|
storageType | i |
pathOrInlineDv | encodeData for the given data |
sizeInBytes | The size of the given data |
cardinality | The given cardinality |
inlineInLog is used when:
- CDCReaderImpl is requested to generateFileActionsWithInlineDv
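A minimal sketch of the inline case follows. It is illustrative only: `InlineSketch` and `Descriptor` are hypothetical names, and `encodeData` here uses Base64 from the JDK as a stand-in for the Base85 encoding that the real descriptor uses.

```scala
import java.util.Base64

object InlineSketch {
  // Simplified stand-in for DeletionVectorDescriptor (hypothetical).
  final case class Descriptor(
      storageType: String,
      pathOrInlineDv: String,
      sizeInBytes: Int,
      cardinality: Long)

  // Stand-in encoder (assumption): Base64 replaces the Base85 encoding.
  def encodeData(data: Array[Byte]): String =
    Base64.getEncoder.encodeToString(data)

  // The size is taken from the raw data, not from the encoded string.
  def inlineInLog(data: Array[Byte], cardinality: Long): Descriptor =
    Descriptor("i", encodeData(data), data.length, cardinality)
}
```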
onDiskWithAbsolutePath¶
onDiskWithAbsolutePath(
path: String,
sizeInBytes: Int,
cardinality: Long,
offset: Option[Int] = None,
maxRowIndex: Option[Long] = None): DeletionVectorDescriptor
Note
onDiskWithAbsolutePath is used for testing only.
copyWithAbsolutePath¶
copyWithAbsolutePath(
tableLocation: Path): DeletionVectorDescriptor
copyWithAbsolutePath creates a new copy of this DeletionVectorDescriptor.
For uuid storage type, copyWithAbsolutePath replaces the following:
Attribute | New Value |
---|---|
Storage type | p |
Path | The absolute path based on the given tableLocation |
copyWithAbsolutePath is used when:
- DeltaFileOperations is requested to makePathsAbsolute
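The replacement for the uuid storage type can be sketched with a case class `copy`. This is a sketch under assumptions: `CopySketch` and `Descriptor` are hypothetical, `resolve` is a hypothetical callback standing in for absolutePath with a table location, and leaving the other storage types unchanged is an assumption not stated in this page.

```scala
object CopySketch {
  // Simplified stand-in for DeletionVectorDescriptor (hypothetical).
  final case class Descriptor(storageType: String, pathOrInlineDv: String)

  // resolve stands in for absolutePath(tableLocation) and returns the absolute path as a string.
  def copyWithAbsolutePath(dv: Descriptor, resolve: Descriptor => String): Descriptor =
    dv.storageType match {
      // uuid-based descriptors become path-based ones with an absolute path
      case "u" => dv.copy(storageType = "p", pathOrInlineDv = resolve(dv))
      // assumption: other storage types are copied without changes
      case _ => dv.copy()
    }
}
```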
Absolute Path¶
absolutePath(
tableLocation: Path): Path
absolutePath
...FIXME
absolutePath is used when:
- DeletionVectorDescriptor is requested to copyWithAbsolutePath (for SHALLOW CLONE command)
- DeletionVectorStoredBitmap is requested for the absolute path of this on-disk deletion vector
- VACUUM command is executed (and getDeletionVectorRelativePath)
assembleDeletionVectorPath¶
assembleDeletionVectorPath(
targetParentPath: Path,
id: UUID,
prefix: String = ""): Path
assembleDeletionVectorPath creates a new Path (Apache Hadoop) for the given targetParentPath and a fileName derived from the given id (and the optional prefix).
assembleDeletionVectorPath is used when:
- DeletionVectorDescriptor is requested to absolutePath (for the uuid marker)
- DeletionVectorStoreUtils is requested to assembleDeletionVectorPathWithFileSystem
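The path assembly can be sketched without a Hadoop dependency. The following uses `java.nio.file.Path` in place of the Hadoop `Path`, and the `deletion_vector_<id>.bin` file-name pattern and the treatment of the prefix as a subdirectory are assumptions made for illustration; `AssembleSketch` and `fileName` are hypothetical names.

```scala
import java.nio.file.Path
import java.util.UUID

object AssembleSketch {
  // Assumed file-name pattern; this page only says a file name is used.
  def fileName(id: UUID): String = s"deletion_vector_${id}.bin"

  // A non-empty prefix becomes a subdirectory of the parent path (assumption).
  def assembleDeletionVectorPath(targetParentPath: Path, id: UUID, prefix: String = ""): Path =
    if (prefix.nonEmpty) targetParentPath.resolve(prefix).resolve(fileName(id))
    else targetParentPath.resolve(fileName(id))
}
```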
isOnDisk¶
isOnDisk: Boolean
isOnDisk is the negation (opposite) of the isInline flag.
isOnDisk is used when:
- VacuumCommandImpl is requested for the path of an on-disk deletion vector
- DeletionVectorStoredBitmap is requested to isOnDisk
- StoredBitmap utility is requested to create a StoredBitmap
isInline¶
isInline: Boolean
isInline is true when the storageType is i.
isInline is used when:
- DeletionVectorDescriptor is requested to inlineData, isOnDisk
- DeletionVectorStoredBitmap is requested to isInline
- StoredBitmap is requested to inline
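The relationship between the two flags can be shown in a few lines. This is a minimal sketch with a hypothetical `Descriptor` stand-in that carries only the storage type.

```scala
object FlagsSketch {
  // Simplified stand-in for DeletionVectorDescriptor (hypothetical).
  final case class Descriptor(storageType: String) {
    // Inline descriptors carry the "i" marker.
    def isInline: Boolean = storageType == "i"
    // Everything that is not inline is considered on disk.
    def isOnDisk: Boolean = !isInline
  }
}
```

Under this model both the p and u storage types report `isOnDisk` as true.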