We may earn an affiliate commission when you visit our partners.

Learning Spark

Jules S. Damji, Brooke Wenig, and Denny Lee

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark.

Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you'll be able to:

Learn the high-level Structured APIs in Python, SQL, Scala, or Java

Understand Spark operations and the SQL engine

Inspect, tune, and debug Spark operations with Spark configurations and Spark UI

Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka

Perform analytics on batch and streaming data using Structured Streaming

Build reliable data pipelines with open source Delta Lake and Spark

Develop machine learning pipelines with MLlib and productionize models using MLflow
