The Internals Online Books

Welcome to “The Internals Of” Online Books project! 🤙

I’m Jacek Laskowski, an IT freelancer specializing in Apache Spark, Delta Lake and Apache Kafka (with brief forays into a wider data engineering space, e.g. Trino and ksqlDB, mostly during Warsaw Data Engineering meetups).

I’m very excited to have you here and hope you will enjoy exploring the internals of the open source projects together (in no particular order):

  1. Apache Spark
  2. Spark SQL
  3. Spark Structured Streaming
  4. Delta Lake
  5. Spark on Kubernetes
  6. PySpark
  7. Apache Kafka
  8. Kafka Streams
  9. Apache Beam

Please note that some books are less up to date than others, but that’s expected in a one-person project where so many things are interesting and thus time-consuming. Life’s too short to taste everything :/

The aim of this project is to host all the current and future internals books under a single organization on GitHub and publish to a single domain via GitHub Pages (until I find a better way to publish the books).

Custom Docker Image

The book projects use a custom Docker image (based on the Insiders image).

The official Docker image does not include all the plugins the books need, hence this custom image.

Review the Dockerfile and requirements.txt files to learn more.

Build Books Docker Image

export INSIDERS_TAG=7.0.1-insiders-2.0.0
docker build \
  --build-arg INSIDERS_TAG \
  --tag jaceklaskowski/mkdocs-material-insiders \
  --tag jaceklaskowski/mkdocs-material-insiders:$INSIDERS_TAG \
  .

NOTE: Learn more about the docker build command in the official Docker documentation.
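
As a minimal sketch of how the two tags relate to INSIDERS_TAG, the snippet below prints the image references that the build above assigns. The IMAGE variable and the image_tags function are hypothetical names introduced for illustration, and nothing here calls Docker. Passing --build-arg INSIDERS_TAG without a value makes docker build take the value from the environment, which is why the variable is exported first.

```shell
#!/bin/sh
# Sketch only: IMAGE and image_tags are illustrative names, not part of the project.
export INSIDERS_TAG=7.0.1-insiders-2.0.0
IMAGE=jaceklaskowski/mkdocs-material-insiders

# Print the two references the --tag options assign:
# the bare name (Docker treats it as :latest) and the versioned one.
image_tags() {
  printf '%s\n' "$IMAGE" "$IMAGE:$INSIDERS_TAG"
}

image_tags
```

Because the first tag carries no version, the docker run commands in the following sections pick up whichever image was built last.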

Build Book

Use the docker run command with the build argument to build a book.

docker run \
  -it \
  -p 8000:8000 \
  -v ${PWD}:/docs \
  jaceklaskowski/mkdocs-material-insiders \
  build --clean

TIP: Consult the Material for MkDocs documentation to get started.

Live Editing

Use the docker run command with the serve argument (with --dirtyreload for faster reloads) in the project root (the folder with mkdocs.yml).

docker run \
  -it \
  -p 8000:8000 \
  -v ${PWD}:/docs \
  jaceklaskowski/mkdocs-material-insiders \
  serve --dirtyreload --verbose --dev-addr
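
The build and live-editing commands share everything except the arguments passed to the image, and a small wrapper can make that explicit. This is a sketch under assumptions: book_cmd is a hypothetical helper, it only prints the command rather than running Docker, and 0.0.0.0:8000 is an assumed value for --dev-addr chosen to match the published port 8000.

```shell
#!/bin/sh
# Sketch only: book_cmd is a hypothetical helper, not part of the project.
IMAGE=jaceklaskowski/mkdocs-material-insiders

# Print (do not run) the docker run command for a given mkdocs subcommand.
book_cmd() {
  echo docker run -it -p 8000:8000 -v "${PWD}:/docs" "$IMAGE" "$@"
}

# One-off build of the static site.
book_cmd build --clean

# Live editing; the address is an assumption matching -p 8000:8000.
book_cmd serve --dirtyreload --verbose --dev-addr 0.0.0.0:8000
```

Dropping the echo turns the helper from a dry run into the real command.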