SparkConnectService¶
SparkConnectService is a BindableService (gRPC).
SparkConnectService is started as a gRPC service in the following ways:
- In Apache Spark applications (as a Spark driver plugin)
- On the command line (as the SparkConnectServer standalone application)
Right after it has started, SparkConnectService posts a SparkListenerConnectServiceStarted event with the network connectivity information (the host address and the port of the server).
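As an illustration of how an application could react to this event, here is a minimal sketch of a SparkListener that watches for it. The class name ConnectServiceStartedListener and the seen buffer are hypothetical. The sketch matches the event by class name only, so it makes no assumptions about the event's fields.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
import scala.collection.mutable.ArrayBuffer

// Hypothetical listener (not part of Spark); the name and the buffer are illustrative.
class ConnectServiceStartedListener extends SparkListener {
  // Events captured so far, kept only so callers can inspect them
  val seen: ArrayBuffer[SparkListenerEvent] = ArrayBuffer.empty

  // Events without a dedicated SparkListener callback are delivered to onOtherEvent
  override def onOtherEvent(event: SparkListenerEvent): Unit = {
    // Match by class name so the sketch does not depend on the event's fields
    if (event.getClass.getName.contains("SparkListenerConnectServiceStarted")) {
      seen += event
      println(s"Spark Connect service started: $event")
    }
  }
}
```

The listener has to be registered before the service starts to observe the event. Listing the class in spark.extraListeners is one option to try.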
Creating Instance¶
SparkConnectService takes the following to be created:
- debug flag
SparkConnectService is created when:
- SparkConnectService is requested to start a gRPC service
gRPC Server¶
SparkConnectService creates and starts a Server (gRPC) when starting the gRPC Service.
Start Spark Connect Service¶
start starts a gRPC service (with a SparkConnectService) and then creates a listener and the UI.
In the end, start posts a SparkListenerConnectServiceStarted event.
start is used when:
- SparkConnectPlugin is requested for the Spark driver plugin
- SparkConnectServer standalone application is started
Start gRPC Service¶
startGRPCService reads the values of the following configuration properties:
| Configuration Property | Default Value |
|---|---|
| spark.connect.grpc.debug.enabled | true |
| spark.connect.grpc.binding.port | 15002 |
| spark.connect.grpc.maxInboundMessageSize | 128 * 1024 * 1024 |
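The lookup of these properties with their defaults can be sketched with the public SparkConf API. This is only an illustration under stated assumptions: the object ConnectGrpcConfSketch and its read method are hypothetical, the "128m" default is this sketch's way of writing 128 * 1024 * 1024 bytes, and the actual code in Spark may read the values differently.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper (not part of Spark): reads the three properties from the
// table above, falling back to the documented defaults.
object ConnectGrpcConfSketch {
  def read(conf: SparkConf): (Boolean, Int, Long) = {
    val debug = conf.getBoolean("spark.connect.grpc.debug.enabled", true)
    val port = conf.getInt("spark.connect.grpc.binding.port", 15002)
    // "128m" is parsed as 128 * 1024 * 1024 bytes
    val maxInbound =
      conf.getSizeAsBytes("spark.connect.grpc.maxInboundMessageSize", "128m")
    (debug, port, maxInbound)
  }
}
```

For example, `ConnectGrpcConfSketch.read(new SparkConf(false).set("spark.connect.grpc.binding.port", "15003"))` would override the port only and keep the other defaults.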
startGRPCService builds a NettyServerBuilder with the spark.connect.grpc.binding.port and a SparkConnectService.
startGRPCService registers interceptors.
startGRPCService builds the server and starts it.
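The steps above map onto the grpc-java builder API roughly as follows. This is a sketch, not Spark's code: GrpcServerSketch and startServer are hypothetical names, and the service and the interceptors are passed in as parameters so that the example does not depend on any Spark Connect classes.

```scala
import io.grpc.{BindableService, Server, ServerInterceptor}
import io.grpc.netty.NettyServerBuilder

// Hypothetical helper (not part of Spark) that mirrors the three steps:
// build a NettyServerBuilder, register interceptors, build and start the server.
object GrpcServerSketch {
  def startServer(
      port: Int,
      maxInboundMessageSize: Int,
      service: BindableService,
      interceptors: Seq[ServerInterceptor]): Server = {
    // Step 1: a builder bound to the port, with the service registered
    val builder = NettyServerBuilder
      .forPort(port)
      .maxInboundMessageSize(maxInboundMessageSize)
      .addService(service)
    // Step 2: register the interceptors
    interceptors.foreach(i => builder.intercept(i))
    // Step 3: build the server and start it (start throws IOException on bind failure)
    builder.build().start()
  }
}
```

Passing port 0 lets the operating system pick a free port, which `server.getPort` then reports. That can be handy in tests.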
createListenerAndUI¶
createListenerAndUI creates a SparkConnectServerTab (when spark.ui.enabled is enabled).
Post SparkListenerConnectServiceStarted¶
postSparkConnectServiceStarted posts a SparkListenerConnectServiceStarted event (with this server's hostAddress, port and the current time).
Handle Add Artifacts Request¶
Generated by gRPC Proto Compiler
addArtifacts(
responseObserver: StreamObserver[AddArtifactsResponse]
): StreamObserver[AddArtifactsRequest]
addArtifacts is generated by the gRPC proto compiler from spark/connect/base.proto.
addArtifacts creates a new SparkConnectAddArtifactsHandler for the given responseObserver.
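The signature above is the usual gRPC client-streaming shape: the method receives the observer for responses and returns the observer that will receive the streamed requests. Here is a minimal sketch of that pattern, with String standing in for the request and response messages. CollectingHandler and ClientStreamingSketch are hypothetical names and do not reflect what SparkConnectAddArtifactsHandler actually does.

```scala
import io.grpc.stub.StreamObserver
import scala.collection.mutable.ArrayBuffer

// Hypothetical handler: buffers the streamed chunks and answers once the
// client has finished sending.
class CollectingHandler(responseObserver: StreamObserver[String])
    extends StreamObserver[String] {
  private val chunks = ArrayBuffer.empty[String]

  // Called once per streamed request
  override def onNext(chunk: String): Unit = chunks += chunk

  // Forward client-side failures to the response stream
  override def onError(t: Throwable): Unit = responseObserver.onError(t)

  // Called when the client closes its side of the stream
  override def onCompleted(): Unit = {
    responseObserver.onNext(s"received ${chunks.size} chunk(s)")
    responseObserver.onCompleted()
  }
}

object ClientStreamingSketch {
  // Same shape as addArtifacts: one handler per call, bound to the response observer
  def addArtifacts(responseObserver: StreamObserver[String]): StreamObserver[String] =
    new CollectingHandler(responseObserver)
}
```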
Handle Analyze Plan Request¶
Generated by gRPC Proto Compiler
analyzePlan(
request: proto.AnalyzePlanRequest,
responseObserver: StreamObserver[proto.AnalyzePlanResponse]): Unit
analyzePlan is generated by the gRPC proto compiler from spark/connect/base.proto.
analyzePlan creates a new SparkConnectAnalyzeHandler to handle the given AnalyzePlanRequest.
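The unary methods of this service (analyzePlan and the ones that follow) share one shape: create a handler for the responseObserver and let it handle the request. A minimal sketch of that delegation, with String standing in for the protobuf messages, could look like this. UppercaseHandler and UnarySketch are hypothetical names, and the "analysis" is just an upper-casing placeholder.

```scala
import io.grpc.stub.StreamObserver
import scala.util.{Failure, Success, Try}

// Hypothetical per-request handler: computes a response and completes the call.
class UppercaseHandler(responseObserver: StreamObserver[String]) {
  def handle(request: String): Unit =
    // Placeholder work; an empty request is treated as an error for illustration
    Try {
      require(request.nonEmpty, "empty request")
      request.toUpperCase
    } match {
      case Success(response) =>
        responseObserver.onNext(response)   // exactly one response for a unary call
        responseObserver.onCompleted()
      case Failure(e) =>
        responseObserver.onError(e)         // terminates the call with an error
    }
}

object UnarySketch {
  // Same shape as analyzePlan: create a handler and delegate the request to it
  def analyzePlan(request: String, responseObserver: StreamObserver[String]): Unit =
    new UppercaseHandler(responseObserver).handle(request)
}
```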
Handle Execute Plan Request¶
Generated by gRPC Proto Compiler
executePlan(
request: proto.ExecutePlanRequest,
responseObserver: StreamObserver[proto.ExecutePlanResponse]): Unit
executePlan is generated by the gRPC proto compiler from spark/connect/base.proto.
executePlan creates a SparkConnectExecutePlanHandler to handle the given ExecutePlanRequest.
Handle Release Session Request¶
Generated by gRPC Proto Compiler
releaseSession(
request: proto.ReleaseSessionRequest,
responseObserver: StreamObserver[proto.ReleaseSessionResponse]): Unit
releaseSession is generated by the gRPC proto compiler from spark/connect/base.proto.
releaseSession creates a SparkConnectReleaseSessionHandler to handle the given ReleaseSessionRequest.
Logging¶
Enable ALL logging level for the org.apache.spark.sql.connect.service.SparkConnectService logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.SparkConnectService.name = org.apache.spark.sql.connect.service.SparkConnectService
logger.SparkConnectService.level = all
Refer to Logging.