Apache Spark with Scala - Hands On with Big Data!

New. Updated for Spark 3.0.0.

“Big data” analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those same techniques, using your own Windows system right at home. It's easier than you might think, and you'll be learning from a former engineer and senior manager at Amazon and IMDb.

Spark works best when using the Scala programming language, and this course includes a crash course in Scala to get you up to speed quickly. For those more familiar with Python, however, a Python version of this class is also available: "Taming Big Data with Apache Spark and Python - Hands On".

Learn and master the art of framing data analysis problems as Spark problems through over 20 hands-on examples, and then scale them up to run on cloud computing services in this course.

  • Learn the concepts of Spark's Resilient Distributed Datasets (RDDs)

  • Get a crash course in the Scala programming language

  • Develop and run Spark jobs quickly using Scala

  • Translate complex analysis problems into iterative or multi-stage Spark scripts

  • Scale up to larger data sets using Amazon's Elastic MapReduce service

  • Understand how Hadoop YARN distributes Spark across computing clusters

  • Practice using other Spark technologies, like Spark SQL, DataFrames, Datasets, Spark Streaming, and GraphX

By the end of this course, you'll be running code that analyzes gigabytes' worth of information – in the cloud – in a matter of minutes.

We'll have some fun along the way. You'll get warmed up with some simple examples of using Spark to analyze movie ratings data and text in a book. Once you've got the basics under your belt, we'll move to some more complex and interesting tasks. We'll use a million movie ratings to find movies that are similar to each other, and you may even discover some new movies you'll like in the process. We'll analyze a social graph of superheroes, and learn who the most “popular” superhero is – and develop a system to find “degrees of separation” between superheroes. Are all Marvel superheroes within a few degrees of being connected to Spider-Man? You'll find the answer.

This course is very hands-on; you'll spend most of your time following along with the instructor as we write, analyze, and run real code together – both on your own system, and in the cloud using Amazon's Elastic MapReduce service. 7.5 hours of video content is included, with over 20 real examples of increasing complexity you can build, run and study yourself. Move through them at your own pace, on your own schedule. The course wraps up with an overview of other Spark-based technologies, including Spark SQL, Spark Streaming, and GraphX.

Enroll now, and enjoy the course.

"I studied Spark for the first time using Frank's course 'Apache Spark 2 with Scala - Hands On with Big Data'. It was a great starting point for me, gaining knowledge in Scala and, most importantly, practical examples of Spark applications. It gave me an understanding of all the relevant Spark core concepts: RDDs, DataFrames & Datasets, Spark Streaming, AWS EMR. Within a few months of completion, I used the knowledge gained from the course to propose in my current company to work primarily on Spark applications. Since then I have continued to work with Spark. I would highly recommend any of Frank's courses as he simplifies concepts well and his teaching manner is easy to follow and continue with." - Joey Faherty

OpenCourser is an affiliate partner of Udemy.

Rating: 4.4 based on 1,438 ratings
Length: 7.5 total hours
Starts: On Demand (Start anytime)
Cost: $9
From: Udemy
Instructors: Sundog Education by Frank Kane, Frank Kane
Download Videos: Only via the Udemy mobile app
Language: English
Subjects: Data Science, Business
Tags: Data Science, Business, Development, Data & Analytics

What people are saying

easy to follow

Trying it yourself helps to lock in the information. The instructor knows the subject very well, presents each subject clearly, and makes each class easy to follow.

Excellent course for a good introduction to Spark with a lot of easy to follow hands-on exercises.

Easy to follow, nice to watch.

Easy to follow. Excellent boot camp on Scala.

The course is detailed and easy to follow.

I am very excited about this course - so far everything is easy to follow.

Frank makes it super easy to follow along as he works through all the set up and examples with you.

big data

Thank you for this awesome experience for the new guys to big data like me.

I repeat again: this is a course worth taking for freshers in big data, and I assure you it will not disappoint.

Additionally, it would be helpful to have a bit more content orienting folks to the wider big data universe.

At this stage I have a good overview of Spark and Scala and continue to stay excited to dive more into big data concepts with Spark and Scala.

I feel confident to perform some basic data analysis on big data using Spark with Scala.

- Perry Rajagopal / Sr. Data Engineer

Disclaimer: This is the first course that I purchased and completed from Frank's (the instructor) big data series.

I'm halfway through the lectures and planning on getting his other classes on Big Data related subjects.

clear and concise

All the lectures are very clear and concise.

VERY good teacher, always clear and concise.

Easy to follow, and the examples were good for showing the techniques. Clear and concise instructions, progressing step-by-step to more complex examples.

Instructor is incredibly responsive and lectures are clear and concise. This guy is brilliant.

The trainer is clear and concise.

Easy to understand for beginners :) Very detailed, good explanations, great learning. Great to understand Spark. Clear and concise beginner course; so far going good.

It's clear and concise.

step by step

I already tried what they told me, changing the Scala versions, and I keep getting this in the exercises: "Error: the main class was not found or loaded". I already reviewed everything step by step and apparently it is fine. In addition, the instructor makes the videos very repetitive, I have not received a satisfactory answer about the main class that is not found or loaded, and there are no exercises showing how to run the program in a more practical way. Frank explains things very simply.

I was able to set up my environment by following the course step by step, which will be very helpful to start playing with.

Very detailed explanation of the concepts with step by step instructions along with loads of examples to try.

It energizes you while following step by step. Very well explained material.

Yes, very simple to understand; very good, step by step instructions to install the required things. Frank Kane, thank you for this superb, extraordinary course. I am especially thankful for your real-time Twitter data analysis lectures and machine learning lectures.

Best. Nice explanation of code step by step, really enjoying it. I am a ZERO in Scala and Spark and hoping to be a HERO by this video.

Various examples and step by step instructions to solve different problems will definitely put you on the best track to start your big data career or even a new page or stage of your career.

so far so

So far so good.

So far So Good!

Clear instructions. Yes, so far so good. This is an excellent course on Spark & Scala that I have never seen in any of the online courses.

I loved it. So far so good. It goes at a nice pace so far.

So far so good. I've taken some of Frank's classes before and I highly recommend them.

So far so good... hoping same for future lectures as well.

The style of teaching is good but this is the first lecture I have viewed so far so I can't say what will happen next.

real world

But, request to add/create a course for working in real world projects.

The way the presenter explains real world problems and how analytical data is extracted from it is very convincing.

One area for improvement would be still more real world examples.

It would be better to provide some scenarios/use cases from real world projects.

Real world example is the main beauty of this course.

The materials touch upon all the features of spark, while solving real world problems using each one of them.

He really uses a lot of real world examples and applications to build up your skills.

An overview of related careers and their average salaries in the US.

Volunteer Big Data Engineer $48k

Data Scientist - Big Data $68k

Big Data and AWS Data Lake $73k

Big Data Developer (Streaming Data) $77k

Big data developer with AWS $78k

Research Scientist Big Data $94k

Big Data Developer Consultant $98k

Big Data Engineer 6 $107k

Big data and ETL specialist $121k

Big Data Specialist $149k

Principal Big Data Architect $180k

Senior Big Data Sales $181k
