Skip to content

SparkConnectService

SparkConnectService is a BindableService (gRPC).

SparkConnectService is started as a gRPC service for the following:

Right after having been started, SparkConnectService posts a SparkListenerConnectServiceStarted event with all the network connectivity information.

Creating Instance

SparkConnectService takes the following to be created:

  • debug flag

SparkConnectService is created when:

gRPC Server

SparkConnectService creates and starts a Server (gRPC) when starting the gRPC Service.

Start Spark Connect Service

start(
  sc: SparkContext): Unit

start starts a gRPC service (with a SparkConnectService) and then creates a listener and the UI.

In the end, start posts a SparkListenerConnectServiceStarted event.


start is used when:

Start gRPC Service

startGRPCService(): Unit

startGRPCService reads the values of the following configuration properties:

Configuration Property Default Value
spark.connect.grpc.debug.enabled true
spark.connect.grpc.binding.port 15002
spark.connect.grpc.maxInboundMessageSize 128 * 1024 * 1024

startGRPCService builds a NettyServerBuilder with the spark.connect.grpc.binding.port and a SparkConnectService.

startGRPCService registers interceptors.

startGRPCService builds the server and starts it.

createListenerAndUI

createListenerAndUI(
  sc: SparkContext): Unit

createListenerAndUI creates a SparkConnectServerTab (for spark.ui.enabled enabled).

Post SparkListenerConnectServiceStarted

postSparkConnectServiceStarted(): Unit

postSparkConnectServiceStarted posts a SparkListenerConnectServiceStarted event (with this server's hostAddress, port and the current time)

Handle Add Artifacts Request

Generated by gRPC Proto Compiler
addArtifacts(
  responseObserver: StreamObserver[AddArtifactsResponse]
): StreamObserver[AddArtifactsRequest]

addArtifacts is generated by the gRPC proto compiler from spark/connect/base.proto.

addArtifacts creates a new SparkConnectAddArtifactsHandler for the given responseObserver.

Handle Analyze Plan Request

Generated by gRPC Proto Compiler
analyzePlan(
  request: proto.AnalyzePlanRequest,
  responseObserver: StreamObserver[proto.AnalyzePlanResponse]): Unit

analyzePlan is generated by the gRPC proto compiler from spark/connect/base.proto.

analyzePlan creates a new SparkConnectAnalyzeHandler to handle the AnalyzePlanRequest request.

Handle Execute Plan Request

Generated by gRPC Proto Compiler
executePlan(
  request: proto.ExecutePlanRequest,
  responseObserver: StreamObserver[proto.ExecutePlanResponse]): Unit

executePlan is generated by the gRPC proto compiler from spark/connect/base.proto.

executePlan creates a SparkConnectExecutePlanHandler to handle the ExecutePlanRequest request.

Handle Execute Plan Request

Generated by gRPC Proto Compiler
releaseSession(
  request: proto.ReleaseSessionRequest,
  responseObserver: StreamObserver[proto.ReleaseSessionResponse]): Unit

releaseSession is generated by the gRPC proto compiler from spark/connect/base.proto.

releaseSession creates a SparkConnectReleaseSessionHandler to handle the ReleaseSessionRequest request.

Logging

Enable ALL logging level for org.apache.spark.sql.connect.service.SparkConnectService logger to see what happens inside.

Add the following line to conf/log4j2.properties:

logger.SparkConnectService.name = org.apache.spark.sql.connect.service.SparkConnectService
logger.SparkConnectService.level = all

Refer to Logging.