SQL

Spark Declarative Pipelines supports SQL statements for defining the elements of a data processing pipeline (e.g., streaming tables and flows).

Pipeline elements are defined in files with the .sql file extension.
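
For illustration, here is a minimal sketch of a SQL definitions file (the file, table, and view names are hypothetical):

```sql
-- transformations/events.sql (hypothetical file name)

-- A streaming table that incrementally ingests a streaming source
CREATE STREAMING TABLE raw_events AS
SELECT * FROM STREAM source_events;

-- A materialized view computed from the streaming table
CREATE MATERIALIZED VIEW daily_event_counts AS
SELECT date_trunc('DAY', event_time) AS event_date, count(*) AS events
FROM raw_events
GROUP BY date_trunc('DAY', event_time);
```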

The SQL files are included as libraries in a pipeline specification file.
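
As an illustrative sketch only (the file name and field layout are assumptions, not an authoritative reference), a pipeline specification file could reference the SQL files with a glob pattern:

```yaml
# pipeline.yml (hypothetical name and layout)
name: my_pipeline
libraries:
  - glob:
      include: transformations/**/*.sql
```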

SqlGraphRegistrationContext is used on the Spark Connect Server to handle SQL statements (from SQL definition files and Python decorators).

A streaming table can be defined without a query, as a streaming table's data can be backed by standalone flows. If no query is specified in the create statement itself, pipeline execution validates that the streaming table has at least one standalone flow writing to it.
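
As a sketch (the table and flow names are hypothetical, and the flow syntax assumes the CREATE FLOW ... AS INSERT INTO ... BY NAME form), a query-less streaming table backed by a standalone flow could look like this:

```sql
-- A streaming table defined without a query
CREATE STREAMING TABLE all_events;

-- A standalone flow that writes to the streaming table;
-- without at least one such flow, pipeline validation fails.
CREATE FLOW ingest_region_a AS
INSERT INTO all_events BY NAME
SELECT * FROM STREAM region_a_events;
```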