DeletionVectorDescriptor¶
DeletionVectorDescriptor describes a deletion vector attached to a file.
Creating Instance¶
DeletionVectorDescriptor takes the following to be created:
- Storage Type
- Path or an inline deletion vector
- Offset
- Size (in bytes)
- Cardinality
- maxRowIndex
DeletionVectorDescriptor is created using the following utilities:
- onDiskWithRelativePath
- inlineInLog
- onDiskWithAbsolutePath
Storage Type¶
storageType: String
DeletionVectorDescriptor is given a storage type that indicates how the deletion vector is stored.
The storage type of a deletion vector can be one of the following:
| Storage Type | Format | Description |
|---|---|---|
| p(ath) | <absolute path> | Stored in a file that is available at an absolute path |
| i(nline) | <base85 encoded bytes> | Stored inline in the transaction log |
| u(uid) | <random prefix - optional><base85 encoded uuid> | (UUID-based) Stored in a file with a path relative to the data directory of a delta table |
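The three markers in the table can be modelled as a small lookup. This is a minimal sketch, not the Delta Lake source: `StorageKind` and `fromMarker` are hypothetical names introduced only for illustration.

```scala
// Hypothetical model of the three storage type markers (not the Delta Lake source).
sealed abstract class StorageKind(val marker: String)

object StorageKind {
  case object AbsolutePath extends StorageKind("p") // file available at an absolute path
  case object Inline extends StorageKind("i")       // Base85-encoded bytes in the transaction log
  case object Uuid extends StorageKind("u")         // file relative to the table's data directory

  private val all: Seq[StorageKind] = Seq(AbsolutePath, Inline, Uuid)

  // Looks up the kind for a storageType marker; None for an unknown marker.
  def fromMarker(marker: String): Option[StorageKind] =
    all.find(_.marker == marker)
}
```

Returning an `Option` keeps an unexpected marker from being silently treated as one of the three known storage types.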
Creating Empty Deletion Vector¶
EMPTY: DeletionVectorDescriptor
EMPTY is an empty deletion vector (DeletionVectorDescriptor) with the following:
| Property | Value |
|---|---|
| storageType | i |
| pathOrInlineDv | (empty) |
| sizeInBytes | 0 |
| cardinality | 0 |
EMPTY is used when:
- DeletionVectorWriter is requested to storeSerializedBitmap
- StoredBitmap is requested for the stored bitmap of an empty deletion vector
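As a rough illustration of the properties above, the descriptor and its empty value could be sketched as a case class. `DvDescriptorSketch` is a hypothetical stand-in, and the `offset` of `None` for `EMPTY` is an assumption, since the table does not list it.

```scala
// Simplified stand-in for DeletionVectorDescriptor; field names follow the tables above.
final case class DvDescriptorSketch(
    storageType: String,
    pathOrInlineDv: String,
    offset: Option[Int],
    sizeInBytes: Int,
    cardinality: Long,
    maxRowIndex: Option[Long] = None)

object DvDescriptorSketch {
  // An inline descriptor with no payload and no deleted rows.
  // Assumption: offset is None (the table above does not specify it).
  val EMPTY: DvDescriptorSketch = DvDescriptorSketch(
    storageType = "i",
    pathOrInlineDv = "",
    offset = None,
    sizeInBytes = 0,
    cardinality = 0L)
}
```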
onDiskWithRelativePath¶
onDiskWithRelativePath(
  id: UUID,
  randomPrefix: String = "",
  sizeInBytes: Int,
  cardinality: Long,
  offset: Option[Int] = None,
  maxRowIndex: Option[Long] = None): DeletionVectorDescriptor
onDiskWithRelativePath creates a DeletionVectorDescriptor with the following:
| Property | Value |
|---|---|
| storageType | u |
| pathOrInlineDv | encodeUUID with the given id and randomPrefix |
| offset | The given offset |
| sizeInBytes | The given sizeInBytes |
| cardinality | The given cardinality |
| maxRowIndex | The given maxRowIndex |
onDiskWithRelativePath is used when:
- DeletionVectorWriter is requested to storeSerializedBitmap
inlineInLog¶
inlineInLog(
  data: Array[Byte],
  cardinality: Long): DeletionVectorDescriptor
inlineInLog creates a DeletionVectorDescriptor with the following:
| Property | Value |
|---|---|
| storageType | i |
| pathOrInlineDv | encodeData for the given data |
| sizeInBytes | The size of the given data |
| cardinality | The given cardinality |
inlineInLog is used when:
- CDCReaderImpl is requested to generateFileActionsWithInlineDv
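The mapping in the table above can be sketched as follows. `InlineDvSketch` and `encodeDataStandIn` are hypothetical names. The real `encodeData` produces Base85; the JDK's Base64 encoder is swapped in here only to keep the sketch self-contained.

```scala
import java.util.Base64

// Hypothetical, trimmed-down descriptor with only the fields inlineInLog sets.
final case class InlineDvSketch(
    storageType: String,
    pathOrInlineDv: String,
    sizeInBytes: Int,
    cardinality: Long)

object InlineDvSketch {
  // Stand-in for encodeData: Base64 instead of the Base85 encoding used for inline deletion vectors.
  private def encodeDataStandIn(data: Array[Byte]): String =
    Base64.getEncoder.encodeToString(data)

  // Storage type is always "i"; the size is taken from the raw (unencoded) data.
  def inlineInLog(data: Array[Byte], cardinality: Long): InlineDvSketch =
    InlineDvSketch("i", encodeDataStandIn(data), data.length, cardinality)
}
```

Note that `sizeInBytes` reflects the length of the given data, not the length of the encoded string.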
onDiskWithAbsolutePath¶
onDiskWithAbsolutePath(
  path: String,
  sizeInBytes: Int,
  cardinality: Long,
  offset: Option[Int] = None,
  maxRowIndex: Option[Long] = None): DeletionVectorDescriptor
Note
onDiskWithAbsolutePath is used for testing only.
copyWithAbsolutePath¶
copyWithAbsolutePath(
  tableLocation: Path): DeletionVectorDescriptor
copyWithAbsolutePath creates a new copy of this DeletionVectorDescriptor.
For the uuid storage type, copyWithAbsolutePath replaces the following:
| Attribute | New Value |
|---|---|
| Storage type | p |
| Path | The absolute path based on the given tableLocation |
copyWithAbsolutePath is used when:
- DeltaFileOperations is requested to makePathsAbsolute
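The switch from `u` to `p` can be sketched as below. `PathDvSketch` and `absolutePathStandIn` are hypothetical names, and the table location is a plain `String` rather than a Hadoop `Path`. The stand-in simply joins the table location with the stored value. It does not decode a Base85 UUID, so the path it produces is illustrative only.

```scala
// Hypothetical, trimmed-down descriptor with only the fields copyWithAbsolutePath changes.
final case class PathDvSketch(storageType: String, pathOrInlineDv: String) {

  // Stand-in for absolutePath: joins the table location and the stored value.
  // Assumption: the real resolution of a uuid-based path is more involved and is not modelled here.
  private def absolutePathStandIn(tableLocation: String): String =
    s"${tableLocation.stripSuffix("/")}/$pathOrInlineDv"

  def copyWithAbsolutePath(tableLocation: String): PathDvSketch = storageType match {
    // uuid-based descriptors become path-based ones pointing at an absolute location
    case "u" => copy(storageType = "p", pathOrInlineDv = absolutePathStandIn(tableLocation))
    // assumption: path-based and inline descriptors are copied unchanged
    case _ => copy()
  }
}
```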
Absolute Path¶
absolutePath(
  tableLocation: Path): Path
absolutePath...FIXME
absolutePath is used when:
- DeletionVectorDescriptor is requested to copyWithAbsolutePath (for SHALLOW CLONE command)
- DeletionVectorStoredBitmap is requested for the absolute path of this on-disk deletion vector
- VACUUM command is executed (and getDeletionVectorRelativePath)
assembleDeletionVectorPath¶
assembleDeletionVectorPath(
  targetParentPath: Path,
  id: UUID,
  prefix: String = ""): Path
assembleDeletionVectorPath creates a new Path (Apache Hadoop) under the given targetParentPath for a file name derived from the given id (and the optional prefix).
assembleDeletionVectorPath is used when:
- DeletionVectorDescriptor is requested to absolutePath (for the uuid marker)
- DeletionVectorStoreUtils is requested to assembleDeletionVectorPathWithFileSystem
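A path assembly along these lines might look as follows. This is a sketch under stated assumptions: it uses `java.nio.file.Path` instead of the Hadoop `Path` to stay self-contained, the `deletion_vector_<id>.bin` file name pattern is assumed, and treating a non-empty prefix as a subdirectory of the parent path is also an assumption. `DvPathSketch` and `fileNameFor` are hypothetical names.

```scala
import java.nio.file.{Path, Paths}
import java.util.UUID

object DvPathSketch {
  // Assumed file name pattern derived from the id.
  private def fileNameFor(id: UUID): String = s"deletion_vector_${id}.bin"

  // Assumption: a non-empty prefix becomes a subdirectory under the parent path.
  def assemble(targetParentPath: Path, id: UUID, prefix: String = ""): Path = {
    val parent =
      if (prefix.nonEmpty) targetParentPath.resolve(prefix) else targetParentPath
    parent.resolve(fileNameFor(id))
  }
}
```

With an empty prefix the file lands directly under the parent path; otherwise it is nested one level deeper.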
isOnDisk¶
isOnDisk: Boolean
isOnDisk is the negation (opposite) of the isInline flag.
isOnDisk is used when:
- VacuumCommandImpl is requested for the path of an on-disk deletion vector
- DeletionVectorStoredBitmap is requested to isOnDisk
- StoredBitmap utility is requested to create a StoredBitmap
isInline¶
isInline: Boolean
isInline is true when the storageType is i.
isInline is used when:
- DeletionVectorDescriptor is requested to inlineData, isOnDisk
- DeletionVectorStoredBitmap is requested to isInline
- StoredBitmap is requested to inline
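The relationship between the isInline and isOnDisk flags can be sketched in a few lines. `FlagsDvSketch` is a hypothetical stand-in that carries only the storage type.

```scala
// Hypothetical stand-in carrying only the storage type marker.
final case class FlagsDvSketch(storageType: String) {
  // Inline deletion vectors use the "i" marker.
  def isInline: Boolean = storageType == "i"

  // Everything that is not inline ("p" or "u") is stored on disk.
  def isOnDisk: Boolean = !isInline
}
```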