CatalystSqlParser¶
CatalystSqlParser
is an AbstractSqlParser for DataTypes.
CatalystSqlParser
uses AstBuilder for parsing SQL texts.
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.internal.SQLConf
val catalystSqlParser = new CatalystSqlParser(SQLConf.get)
scala> :type catalystSqlParser.astBuilder
org.apache.spark.sql.catalyst.parser.AstBuilder
CatalystSqlParser
is used to translate DataTypes from their canonical string representation (e.g. when adding fields to a schema or casting column to a different data type) or StructTypes.
import org.apache.spark.sql.types.StructType
scala> val struct = new StructType().add("a", "int")
struct: org.apache.spark.sql.types.StructType = StructType(StructField(a,IntegerType,true))
scala> val asInt = expr("token = 'hello'").cast("int")
asInt: org.apache.spark.sql.Column = CAST((token = hello) AS INT)
When parsing, you should see INFO messages in the logs:
Parsing command: int
It is also used in HiveClientImpl
(when converting columns from Hive to Spark) and in OrcFileOperator
(when inferring the schema for ORC files).
Creating Instance¶
CatalystSqlParser
takes the following to be created:
CatalystSqlParser
is created when:
- SessionCatalog is created
Logging¶
Enable ALL
logging level for org.apache.spark.sql.catalyst.parser.CatalystSqlParser
logger to see what happens inside.
Add the following line to conf/log4j2.properties
:
log4j.logger.org.apache.spark.sql.catalyst.parser.CatalystSqlParser=ALL
Refer to Logging.