GeneratedColumn Utility¶
GeneratedColumn is a utility for Generated Columns.
import org.apache.spark.sql.delta.GeneratedColumn
isGeneratedColumn¶
isGeneratedColumn(
protocol: Protocol,
field: StructField): Boolean
isGeneratedColumn(
field: StructField): Boolean
isGeneratedColumn returns true when all of the following hold:
- satisfyGeneratedColumnProtocol holds (for the given Protocol, when one is given)
- The metadata of the given StructField (Spark SQL) contains (a binding for) the delta.generationExpression key.
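The two conditions can be sketched in plain Scala as follows. This is a minimal illustration, not the Delta Lake source: Protocol and Field are hypothetical stand-ins for the real Protocol and StructField (Spark SQL), and it assumes the single-argument overload checks only the metadata.

```scala
object IsGeneratedColumnSketch {
  // Hypothetical stand-ins for Delta Lake's Protocol and Spark SQL's StructField.
  final case class Protocol(minReaderVersion: Int, minWriterVersion: Int)
  final case class Field(name: String, metadata: Map[String, String])

  // Metadata key named in the description above.
  val GenerationExpressionKey = "delta.generationExpression"

  // Writer version threshold as described for satisfyGeneratedColumnProtocol.
  def satisfyGeneratedColumnProtocol(protocol: Protocol): Boolean =
    protocol.minWriterVersion >= 4

  // Metadata-only check (assumed behaviour of the single-argument overload).
  def isGeneratedColumn(field: Field): Boolean =
    field.metadata.contains(GenerationExpressionKey)

  // Protocol-aware check: both conditions must hold.
  def isGeneratedColumn(protocol: Protocol, field: Field): Boolean =
    satisfyGeneratedColumnProtocol(protocol) && isGeneratedColumn(field)
}
```

With these stand-ins, a field that carries the key is reported as generated only when the protocol's writer version is high enough.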
isGeneratedColumn is used when:
- ColumnWithDefaultExprUtils utility is used to removeDefaultExpressions and columnHasDefaultExpr
- GeneratedColumn utility is used to hasGeneratedColumns, getGeneratedColumns, enforcesGeneratedColumns and validateGeneratedColumns
getGeneratedColumns¶
getGeneratedColumns(
snapshot: Snapshot): Seq[StructField]
getGeneratedColumns checks satisfyGeneratedColumnProtocol (with the protocol of the given Snapshot) and returns the generated columns (based on the schema of the Metadata of the given Snapshot).
getGeneratedColumns is used when:
- PreprocessTableUpdate logical resolution rule is executed (and toCommand)
enforcesGeneratedColumns¶
enforcesGeneratedColumns(
protocol: Protocol,
metadata: Metadata): Boolean
enforcesGeneratedColumns is true when all of the following hold:
- satisfyGeneratedColumnProtocol with the given Protocol
- There is at least one generated column in the table schema (of the given Metadata)
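As a rough sketch of how the two conditions combine, the following plain Scala models the check with hypothetical stand-in types (Protocol, Field and Metadata here are simplified illustrations, not the real Delta Lake classes). The writer version threshold and the metadata key are the ones named on this page.

```scala
object EnforcesGeneratedColumnsSketch {
  // Hypothetical, simplified stand-ins for the real Delta Lake types.
  final case class Protocol(minWriterVersion: Int)
  final case class Field(name: String, metadata: Map[String, String] = Map.empty)
  final case class Metadata(schema: Seq[Field])

  def satisfyGeneratedColumnProtocol(protocol: Protocol): Boolean =
    protocol.minWriterVersion >= 4

  def isGeneratedColumn(field: Field): Boolean =
    field.metadata.contains("delta.generationExpression")

  // True only when the protocol qualifies AND the schema has a generated column.
  def enforcesGeneratedColumns(protocol: Protocol, metadata: Metadata): Boolean =
    satisfyGeneratedColumnProtocol(protocol) &&
      metadata.schema.exists(isGeneratedColumn)
}
```

A table without generated columns, or one written under an older writer version, does not enforce anything in this model.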
enforcesGeneratedColumns is used when:
- TransactionalWrite is requested to write data out (and normalizeData)
satisfyGeneratedColumnProtocol¶
satisfyGeneratedColumnProtocol(
protocol: Protocol): Boolean
satisfyGeneratedColumnProtocol is true when the minWriterVersion of the given Protocol is at least 4.
satisfyGeneratedColumnProtocol is used when:
- ColumnWithDefaultExprUtils utility is used to satisfyProtocol
- GeneratedColumn utility is used to isGeneratedColumn, getGeneratedColumns, enforcesGeneratedColumns and generatePartitionFilters
- AlterTableChangeColumnDeltaCommand is executed
- ImplicitMetadataOperation is requested to mergeSchema
addGeneratedColumnsOrReturnConstraints¶
addGeneratedColumnsOrReturnConstraints(
deltaLog: DeltaLog,
queryExecution: QueryExecution,
schema: StructType,
df: DataFrame): (DataFrame, Seq[Constraint])
addGeneratedColumnsOrReturnConstraints returns a DataFrame with the generated columns added (those missing from the given DataFrame) together with constraints for the generated columns that the DataFrame already provides.
addGeneratedColumnsOrReturnConstraints finds generated columns (among the top-level columns in the given schema (StructType)).
For every generated column, addGeneratedColumnsOrReturnConstraints creates a Check constraint with the following:
- Generated Column name
- EqualNullSafe expression that compares the generated column expression with the value provided by the user
In the end, addGeneratedColumnsOrReturnConstraints uses select operator on the given DataFrame.
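The null-safe comparison behind such a Check constraint can be illustrated without Spark. The sketch below is an assumption-laden model, not the actual implementation: Row, GeneratedColumn, equalNullSafe and the `doubled` column are hypothetical names, and None stands in for SQL NULL.

```scala
object GeneratedColumnConstraintSketch {
  // A row maps column names to nullable values (None stands in for SQL NULL).
  type Row = Map[String, Option[Int]]

  // Hypothetical model of a generated column: a name plus its generation expression.
  final case class GeneratedColumn(name: String, expr: Row => Option[Int])

  // Null-safe equality: two NULLs compare equal, unlike SQL's plain `=`.
  def equalNullSafe(a: Option[Int], b: Option[Int]): Boolean = a == b

  // The check passes when the user-provided value matches the generated one.
  def satisfies(col: GeneratedColumn, row: Row): Boolean =
    equalNullSafe(row.getOrElse(col.name, None), col.expr(row))

  // Example column for illustration: `doubled` is generated as `id * 2`.
  val doubled: GeneratedColumn =
    GeneratedColumn("doubled", row => row.getOrElse("id", None).map(_ * 2))
}
```

The null-safe form matters because a generation expression may legitimately evaluate to NULL; a plain equality would then yield NULL rather than true, and the constraint could not confirm the match.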
addGeneratedColumnsOrReturnConstraints is used when:
- TransactionalWrite is requested to write data out (and normalizeData)
hasGeneratedColumns¶
hasGeneratedColumns(
schema: StructType): Boolean
hasGeneratedColumns returns true if any of the top-level columns in the given StructType (Spark SQL) is a generated column.
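To make the "top-level" restriction concrete, here is a small sketch in plain Scala. Field is a hypothetical stand-in for StructField, with `children` modelling a nested struct; the sketch assumes nested fields are never visited, as the description above suggests.

```scala
object HasGeneratedColumnsSketch {
  // Hypothetical stand-in for a Spark SQL StructField; `children` models a nested struct.
  final case class Field(
      name: String,
      metadata: Map[String, String] = Map.empty,
      children: Seq[Field] = Nil)

  def isGeneratedColumn(field: Field): Boolean =
    field.metadata.contains("delta.generationExpression")

  // Only the top-level fields are inspected; children are not traversed.
  def hasGeneratedColumns(schema: Seq[Field]): Boolean =
    schema.exists(isGeneratedColumn)
}
```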
hasGeneratedColumns is used when:
- OptimisticTransactionImpl is requested to verify a new metadata
- Protocol is requested for the required minimum protocol
- SchemaUtils utility is used to findDependentGeneratedColumns
validateGeneratedColumns¶
validateGeneratedColumns(
spark: SparkSession,
schema: StructType): Unit
validateGeneratedColumns
...FIXME
validateGeneratedColumns is used when:
- OptimisticTransactionImpl is requested to verify a new metadata