The Internals of Apache Spark 2.4.3

Welcome to The Internals of Apache Spark gitbook! I’m very excited to have you here and hope you will enjoy exploring the internals of Apache Spark (Core) as much as I have.

I write to discover what I know.
— Flannery O'Connor

I’m Jacek Laskowski, a freelance IT consultant, software engineer and technical instructor specializing in Apache Spark, Apache Kafka and Kafka Streams (with Scala and sbt).

I offer software development and consultancy services with hands-on, in-depth workshops and mentoring. Reach out to me at @jaceklaskowski to discuss opportunities.

Consider joining me at the Warsaw Scala Enthusiasts and Warsaw Spark meetups in Warsaw, Poland.

I’m also writing other books in the "The Internals of" series about Spark SQL, Spark Structured Streaming, Apache Kafka and Kafka Streams.

Expect text and code snippets from a variety of public sources. Attribution follows.

Now, let me introduce you to Apache Spark.