Big Data Analysis with Apache Spark

Organizations use their data to support and influence decisions and build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term ‘data science’.

This statistics and data analysis course aims to articulate the expected output of data scientists and then teaches students how to use PySpark, the Python API for Apache Spark, to deliver against these expectations. The course assignments include log mining, textual entity recognition, and collaborative filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises use PySpark (the Python API for Spark), and previous experience with Spark, equivalent to the course Introduction to Apache Spark, is required.

OpenCourser is an affiliate partner of edX.

Rating: Not enough ratings
Length: 4 weeks
Effort: 5-10 hours per week
Starts: On Demand (Start anytime)
Cost: $0
From: Berkeley via edX
Instructor: Anthony D. Joseph
Download Videos: On all desktop and mobile devices
Language: English
Subjects: Programming, Data Science
Tags: Computer Science, Data Analysis & Statistics

Careers

An overview of related careers and their average salaries in the US.

Volunteer Big Data Engineer $48k

Informatica PowerCenter with Big Data $69k

Oracle Big Data Appliance $76k

Big Data Developer (Streaming Data) $77k

Big data developer with AWS $78k

Senior Big Data Engineer 2 $93k

Senior Big Data Engineer 6 $100k

Big Data Specialist $149k

Big Data Practice Architect $162k

Big Data Architect Lead $177k

Principal Big Data Architect $180k

Big Data Enterprise Architect $202k
