SparkConnectService¶
SparkConnectService is a BindableService (gRPC).
SparkConnectService is started as a gRPC service in either of the following ways:
- In an Apache Spark application (as a Spark driver plugin)
- On the command line (as a SparkConnectServer standalone application)
Upon start, SparkConnectService posts a SparkListenerConnectServiceStarted event with all the network connectivity information.
Creating Instance¶
SparkConnectService takes the following to be created:
- debug flag
SparkConnectService is created when:
- SparkConnectService is requested to start a gRPC service
gRPC Server¶
SparkConnectService creates and starts a Server (gRPC) when starting the gRPC Service.
Start Spark Connect Service¶
start starts a gRPC service (with a SparkConnectService) and then creates a listener and the UI.
In the end, start posts a SparkListenerConnectServiceStarted event.
start is used when:
- SparkConnectPlugin is requested for the Spark driver plugin
- SparkConnectServer standalone application is started
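The driver-plugin route can be illustrated with a small helper that assembles the `--conf` arguments for `spark-submit`. This is only a sketch: the helper name `plugin_conf_args` is hypothetical, and the plugin class name `org.apache.spark.sql.connect.SparkConnectPlugin` is an assumption taken from Spark's public documentation rather than from this page.

```python
# Assumption: fully-qualified plugin class name, as used in Spark's public docs.
CONNECT_PLUGIN = "org.apache.spark.sql.connect.SparkConnectPlugin"


def plugin_conf_args(port=15002):
    """Return spark-submit arguments that load the Spark Connect driver plugin.

    `port` is passed on as spark.connect.grpc.binding.port.
    """
    return [
        "--conf", f"spark.plugins={CONNECT_PLUGIN}",
        "--conf", f"spark.connect.grpc.binding.port={port}",
    ]


# The arguments would be placed between `spark-submit` and the application.
print(" ".join(plugin_conf_args()))
```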
Start gRPC Service¶
startGRPCService reads the values of the following configuration properties:
| Configuration Property | Default Value |
|---|---|
| spark.connect.grpc.debug.enabled | false |
| spark.connect.grpc.binding.address | |
| spark.connect.grpc.binding.port | 15002 |
| spark.connect.grpc.maxInboundMessageSize | 128 * 1024 * 1024 |
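The defaults in the table can be read as a simple "defaults overlaid with user settings" lookup. The following is a minimal sketch of that idea in plain Python, not Spark's configuration code: the `resolve` helper is hypothetical, and the empty default of spark.connect.grpc.binding.address is modelled as `None` (an assumption).

```python
# Defaults copied from the table above.
DEFAULTS = {
    "spark.connect.grpc.debug.enabled": False,
    "spark.connect.grpc.binding.address": None,  # assumption: empty cell means undefined
    "spark.connect.grpc.binding.port": 15002,
    "spark.connect.grpc.maxInboundMessageSize": 128 * 1024 * 1024,  # bytes
}


def resolve(overrides=None):
    """Return the effective settings: table defaults overlaid with overrides."""
    overrides = dict(overrides or {})
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"not one of the properties in the table: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


effective = resolve({"spark.connect.grpc.binding.port": 15003})
print(effective["spark.connect.grpc.binding.port"])           # 15003
print(effective["spark.connect.grpc.maxInboundMessageSize"])  # 134217728
```

The default maximum inbound message size works out to 134217728 bytes (128 MiB).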
startGRPCService creates a SparkConnectService (with the value of spark.connect.grpc.debug.enabled).
With spark.connect.grpc.debug.enabled enabled, startGRPCService creates a ProtoReflectionService.
startGRPCService creates configured ServerInterceptors (based on spark.connect.grpc.interceptor.classes).
With spark.connect.grpc.binding.address defined, startGRPCService prints out an INFO message to the logs.
startGRPCService builds a NettyServerBuilder for the hostname (if defined) and spark.connect.grpc.binding.port.
startGRPCService adds the SparkConnectService.
With an authentication token defined, startGRPCService creates a PreSharedKeyAuthenticationInterceptor.
startGRPCService registers all the configured interceptors.
In debug mode (spark.connect.grpc.debug.enabled enabled), startGRPCService adds the ProtoReflectionService.
startGRPCService builds the server and starts it.
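The order of the steps above, and which of them are conditional, can be summarised with a small model. This is a plain-Python sketch that only mirrors the prose of this section. It does not call gRPC or Spark, and the function and parameter names are hypothetical.

```python
def grpc_startup_steps(debug=False, address=None, port=15002,
                       token=None, interceptors=()):
    """Return the startGRPCService steps, in order, for the given settings."""
    steps = [f"create SparkConnectService (debug={debug})"]
    if debug:
        steps.append("create ProtoReflectionService")
    steps.append(f"create {len(interceptors)} configured ServerInterceptor(s)")
    if address is not None:
        steps.append("log INFO message")
    where = f"{address}:{port}" if address is not None else f"port {port}"
    steps.append(f"build NettyServerBuilder for {where}")
    steps.append("add SparkConnectService")
    if token is not None:
        steps.append("create PreSharedKeyAuthenticationInterceptor")
    steps.append("register interceptors")
    if debug:
        steps.append("add ProtoReflectionService")
    steps.append("build and start server")
    return steps


for step in grpc_startup_steps(debug=True, address="localhost"):
    print(step)
```

With all defaults, only the unconditional steps remain: the service is created, added to a builder for port 15002, and the server is built and started.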
createListenerAndUI¶
createListenerAndUI creates a SparkConnectServerTab (when spark.ui.enabled is enabled).
Post SparkListenerConnectServiceStarted¶
postSparkConnectServiceStarted posts a SparkListenerConnectServiceStarted event (with this server's hostAddress, port and the current time).
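To illustrate what posting such an event amounts to, here is a minimal sketch with a toy listener bus. None of these classes are Spark's: `ConnectServiceStarted`, `ListenerBus` and `post_service_started` are hypothetical stand-ins, and the event carries only the three values named above.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectServiceStarted:
    """Stand-in for the event: host address, port and event time (ms)."""
    host_address: str
    port: int
    event_time_ms: int


class ListenerBus:
    """Toy bus that delivers every posted event to all registered callbacks."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def post(self, event):
        for callback in self._listeners:
            callback(event)


def post_service_started(bus, host_address, port, now_ms=None):
    """Build the event with the current time (unless given) and post it."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    event = ConnectServiceStarted(host_address, port, now_ms)
    bus.post(event)
    return event


bus = ListenerBus()
bus.add_listener(lambda e: print(f"service started at {e.host_address}:{e.port}"))
post_service_started(bus, "127.0.0.1", 15002)
```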
Handle Add Artifacts Request¶
Generated by gRPC Proto Compiler
addArtifacts(
responseObserver: StreamObserver[AddArtifactsResponse]
): StreamObserver[AddArtifactsRequest]
addArtifacts is generated by the gRPC proto compiler from spark/connect/base.proto.
addArtifacts creates a new SparkConnectAddArtifactsHandler for the given responseObserver.
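Unlike the request/response methods in this document, addArtifacts takes only a response observer and returns a request observer, which is the shape of a client-streaming call. The sketch below models that shape in plain Python. It is not Spark's handler: the class names, the snake_case observer methods and the summary reply are all hypothetical.

```python
class CollectingObserver:
    """Stands in for the caller's responseObserver and records what it receives."""

    def __init__(self):
        self.responses = []
        self.completed = False

    def on_next(self, value):
        self.responses.append(value)

    def on_error(self, error):
        raise error

    def on_completed(self):
        self.completed = True


class AddArtifactsHandlerSketch:
    """Request observer: buffers streamed chunks, replies once the stream ends."""

    def __init__(self, response_observer):
        self._out = response_observer
        self._chunks = []

    def on_next(self, chunk):
        self._chunks.append(chunk)

    def on_error(self, error):
        self._chunks.clear()  # drop partial uploads

    def on_completed(self):
        summary = {"chunks": len(self._chunks),
                   "bytes": sum(len(c) for c in self._chunks)}
        self._out.on_next(summary)
        self._out.on_completed()


def add_artifacts(response_observer):
    """Mirror of the signature above: one handler per response observer."""
    return AddArtifactsHandlerSketch(response_observer)


client = CollectingObserver()
requests = add_artifacts(client)
requests.on_next(b"abc")
requests.on_next(b"de")
requests.on_completed()
print(client.responses)  # [{'chunks': 2, 'bytes': 5}]
```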
Handle Analyze Plan Request¶
Generated by gRPC Proto Compiler
analyzePlan(
request: proto.AnalyzePlanRequest,
responseObserver: StreamObserver[proto.AnalyzePlanResponse]): Unit
analyzePlan is generated by the gRPC proto compiler from spark/connect/base.proto.
analyzePlan creates a new SparkConnectAnalyzeHandler to handle the AnalyzePlanRequest request.
Handle Execute Plan Request¶
Generated by gRPC Proto Compiler
executePlan(
request: proto.ExecutePlanRequest,
responseObserver: StreamObserver[proto.ExecutePlanResponse]): Unit
executePlan is generated by the gRPC proto compiler from spark/connect/base.proto.
executePlan creates a SparkConnectExecutePlanHandler to handle the ExecutePlanRequest request.
Handle Release Session Request¶
Generated by gRPC Proto Compiler
releaseSession(
request: proto.ReleaseSessionRequest,
responseObserver: StreamObserver[proto.ReleaseSessionResponse]): Unit
releaseSession is generated by the gRPC proto compiler from spark/connect/base.proto.
releaseSession creates a SparkConnectReleaseSessionHandler to handle the ReleaseSessionRequest request.
Logging¶
Enable ALL logging level for org.apache.spark.sql.connect.service.SparkConnectService logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.SparkConnectService.name = org.apache.spark.sql.connect.service.SparkConnectService
logger.SparkConnectService.level = all
Refer to Logging.