
Big Data Analysis with Scala and Spark

This course is a part of Functional Programming in Scala, a 5-course Specialization series from Coursera.

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written in Scala. In this course, we'll see how the data parallel paradigm can be extended to the distributed case, using Spark throughout. We'll cover Spark's programming model in detail, being careful to understand how and when it differs from familiar programming models, like shared-memory parallel collections or sequential Scala collections. Through hands-on examples in Spark and Scala, we'll learn when important issues related to distribution, like latency and network communication, should be considered and how they can be addressed effectively for improved performance.

Learning Outcomes. By the end of this course you will be able to:

  • read data from persistent storage and load it into Apache Spark,
  • manipulate data with Spark and Scala,
  • express algorithms for data analysis in a functional style,
  • recognize how to avoid shuffles and recomputation in Spark.

Recommended background: You should have at least one year of programming experience. Proficiency with Java or C# is ideal, but experience with other languages such as C/C++, Python, JavaScript or Ruby is also sufficient. You should have some familiarity with using the command line. This course is intended to be taken after Parallel Programming:

Get Details and Enroll Now

OpenCourser is an affiliate partner of Coursera.

École polytechnique fédérale de Lausanne

Rating: 4.6 based on 309 ratings
Length: 5 weeks
Starts: Feb 11 (10 weeks ago)
Cost: $79
From: École polytechnique fédérale de Lausanne via Coursera
Instructor: Dr. Heather Miller
Download Videos: On all desktop and mobile devices
Language: English
Subjects: Programming
Tags: Computer Science, Algorithms

What people are saying

We analyzed reviews for this course to surface learners' thoughts about it

introduction to spark in 12 reviews

Excellent introduction to Spark.

Great introduction to Spark accessed through Scala.

Good introduction to Spark, the most useful big data framework!

Great introduction to Spark.

Concepts covered here are very helpful though and it is a useful introduction to Spark.

I enjoyed the exercises and thought it gave a good introduction to Spark.

A great introduction to Spark !!!

Good introduction to Spark.

very interesting in 10 reviews

Outstanding. Very interesting course!

Very interesting lectures, slides and assignments. Great approach to learning about Spark in practice. Very useful.

Very interesting one!

I learned some very interesting things, especially in the last 2 weeks about partitioning and Spark SQL.

Very interesting course!!

Very interesting course about Spark, it covers a lot of key concepts!

Very, very interesting and helpful! The slides' layout is very clear and step by step for each important topic. The motivation for why we need DataFrames and Datasets, and how they differ, is explained in a logical and reasonable way!

A very interesting and useful course. Highly recommended!

very good course in 9 reviews

Super course, well done Heather! It is indeed a very good course, but the 2nd assignment was tough.

Very good course!

Very good course and good materials for learning. With this course, I surely improved my knowledge about Spark...

Very good course, it is a must for anyone who is starting out in Spark using Scala. Thanks a lot, it really did help me. Nice introduction to Spark with details about how Spark works internally.

Very good course, really enjoyed it. The sessions were clearly explained and focused.

Very good course, but it needs more details and examples.

It is very good course material for Spark with Scala.

Kudos to Professor Miller, we love you :-) Very good course.

big data in 8 reviews

It was a super interesting course. Dear Heather, your course on big data with Scala is the very first online course I have participated in. I enjoy the way you explain the material and receive a real aesthetic pleasure.

Very good for Scala beginners and students who are entering the world of Big Data. The material of the fourth week is quite dense; this could be split over two weeks (including splitting it into two exercises).

More and more concentration and analysis on this big data research. I wish more courses about parallel clustering using Spark were available to the many.

you cannot complete them just by following the course material, forcing you to waste quite a lot of time either: (1) learning from other sources; (2) looking for answers on the forum; or (3) brute forcing an answer till rage quitting :) Another bad point: the course is supposed to be focused on Spark & big data analysis but it has 1-2 lectures (around 40-60 mins) pretty much devoted to showing some SQL.

It is not a general Big Data course, neither is it an easy one.

The course gave me insight into the world of big data batch processing and how Spark solves it.

Good as an introduction to Spark and big data.

really enjoyed in 8 reviews

I really enjoyed the course, especially the first 3 weeks.

Really enjoyed this course.

I had previous experience with Scala and Spark and I really enjoyed the course.

I really enjoyed going through the course, and I learned a lot.

I learned a lot and I really enjoyed the course.

I really enjoyed coding the assignments.

I really enjoyed this course!

programming assignments in 8 reviews

You can quite freely apply the course material to the programming assignments.

An introductory course on Spark programming; lectures are well balanced between theory and boilerplate code, but the programming assignments are mainly about teaching you the APIs.

Would be great only if there were a bit more programming assignments, with more fine grained structure, so that one could practice more in simple things, not only trying to fill out ???

The only thing I think could be improved is that each programming assignment has unit tests that drive students towards the final solution.

Programming assignments very nice!

Doing the programming assignments properly requires reading a lot through the Spark documentation, which I personally liked as part of the challenge, but beware if you are not that type of person or aim at finishing the course as quickly as possible.

Great introductory course on Spark with excellent materials and hands-on programming assignments.


An overview of related careers and their average salaries in the US. Bars indicate income percentile.

Volunteer Big Data Engineer $48k

Informatica PowerCenter with Big Data $69k

Oracle Big Data Appliance $76k

Corporate Technology- Scala/Spark/Hadoop Engineer $76k

Big data developer with AWS $78k

Senior Big Data Engineer 2 $93k

Big Data Architect Consultant $132k

Big Data Specialist $149k

Big Data Practice Architect $162k

Big Data Architect Lead $177k

Principal Big Data Architect $180k

Big Data Enterprise Architect $202k


Sorted by most helpful reviews first

Guest says:

This is a nice introduction to Spark. It's worth noting that even though it's part of a "Specialization" you can take this course individually if you're already familiar with Scala, which is what I did. Otherwise I imagine the other courses in this series are excellent too.



