Learning Path
Featured Chapters
15 chapters of interactive lessons, animated DAGs, and bilingual explanations.
Foundations
Horizontal Scaling, Nodes, and Clusters
4+ chapters · Beginner
Shuffles & SQL
Wide Dependencies, Stage Boundaries, and Shuffle Mechanism
3+ chapters · Intermediate
Memory & Joins
Broadcast Join, Sort-Merge Join, Shuffle Hash Join, and Aggregation Strategies
2+ chapters · Intermediate
Performance & Production
AQE, Data Skew, Predicate Pushdown, and Shuffle Optimization
4+ chapters · Advanced
The Method
Learning that actually sticks
01
Read the story panel
Every concept is introduced through a character-driven narrative with clear, jargon-free explanations before any code appears.
02
Try the interactive demo
Step through a live DAG, fire Actions, and watch execution animate. All in the browser, no Spark cluster needed.
03
Switch PySpark and SQL
Every example has a dual-syntax toggle. See for yourself that the Catalyst optimizer compiles equivalent PySpark and SQL queries to the same plan.
Why This Project
We won't go deep on theory. We'll get you to the know-how.
Spark Lessons skips the dense academic papers and config-flag deep dives. Each chapter builds a working mental model fast: story first, then a simulated DAG you can poke at. You walk away knowing how Spark actually behaves, not just what the docs say.
Free · No account needed · EN / TH
Ready to spark something?
Open the first chapter and start building your mental model of Apache Spark, one interactive lesson at a time.
