Distributed Machine Learning with Apache Spark

Heads up! This course may be archived and/or unavailable.

Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability, and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to cutting-edge applications like self-driving cars and personalized medicine. In the age of ‘big data’, with datasets rapidly growing in size and complexity and cloud computing becoming more pervasive, machine learning techniques are fast becoming a core component of large-scale data processing pipelines.

This statistics and data analysis course introduces the underlying statistical and algorithmic principles required to develop scalable real-world machine learning pipelines. We present an integrated view of data processing by highlighting the various components of these pipelines, including exploratory data analysis, feature extraction, supervised learning, and model evaluation. You will gain hands-on experience applying these principles using Spark, a cluster computing system well suited to large-scale machine learning tasks, and its packages spark.ml and spark.mllib. You will implement distributed algorithms for fundamental statistical models (linear regression, logistic regression, principal component analysis) while tackling key problems from domains such as online advertising and cognitive neuroscience.

Rating: Not enough ratings
Length: 4 weeks
Effort: 5-10 hours per week
Starts: On Demand (Start anytime)
Cost: $0
From: Berkeley via edX
Instructors: Ameet Talwalkar, Jon Bates
Download Videos: On all desktop and mobile devices
Language: English
Subjects: Programming, Data Science
Tags: Computer Science, Data Analysis & Statistics

Careers

An overview of related careers and their average salaries in the US.

Research Scientist-Machine Learning $55k

Cloud Architect - Azure / Machine Learning $75k

Watson Machine Learning Engineer $81k

Machine Learning Software Developer $103k

Software Engineer (Machine Learning) $116k

Applied Scientist, Machine Learning $130k

Autonomy and Machine Learning Solutions Architect $131k

Applied Scientist - Machine Learning -... $136k

Research Scientist (Machine Learning) $147k

Machine Learning Engineer 2 $161k

Machine Learning Scientist Manager $170k

Machine Learning Scientist, Personalization $213k
