The Internals of PySpark (Apache Spark 3.1.1)
Welcome to The Internals of PySpark online book! 🤙
I'm Jacek Laskowski, an IT freelancer specializing in Apache Spark, Delta Lake and Apache Kafka (with brief forays into a wider data engineering space, e.g. Trino and ksqlDB, mostly during Warsaw Data Engineering meetups).
I'm very excited to have you here and hope you will enjoy exploring the internals of PySpark as much as I have.
I write to discover what I know.
"The Internals Of" series
I'm also writing other online books in the "The Internals Of" series. Please visit "The Internals Of" Online Books home page.
Expect text and code snippets from a variety of public sources. Attribution follows.
Now, let's take a deep dive into PySpark 🔥