Distributed Machine Learning with Apache Spark
Heads up! This course may be archived and/or unavailable.
Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability, and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to bleeding-edge applications like self-driving cars and personalized medicine. In the age of ‘big data’, with datasets rapidly growing in size and complexity and cloud computing becoming more pervasive, machine learning techniques are fast becoming a core component of large-scale data processing pipelines.
This statistics and data analysis course introduces the underlying statistical and algorithmic principles required to develop scalable real-world machine learning pipelines. We present an integrated view of data processing by highlighting the various components of these pipelines, including exploratory data analysis, feature extraction, supervised learning, and model evaluation. You will gain hands-on experience applying these principles using Spark, a cluster computing system well suited to large-scale machine learning tasks, and its packages spark.ml and spark.mllib. You will implement distributed algorithms for fundamental statistical models (linear regression, logistic regression, principal component analysis) while tackling key problems from domains such as online advertising and cognitive neuroscience.
Rating | Not enough ratings |
---|---|
Length | 4 weeks |
Effort | 5-10 hours per week |
Starts | On Demand (Start anytime) |
Cost | $0 |
From | Berkeley via edX |
Instructors | Ameet Talwalkar, Jon Bates |
Download Videos | On all desktop and mobile devices |
Language | English |
Subjects | Programming, Data Science |
Tags | Computer Science, Data Analysis & Statistics |
Careers
An overview of related careers and their average salaries in the US.
Research Scientist-Machine Learning $55k
Cloud Architect - Azure / Machine Learning $75k
Watson Machine Learning Engineer $81k
Machine Learning Software Developer $103k
Software Engineer (Machine Learning) $116k
Applied Scientist, Machine Learning $130k
Autonomy and Machine Learning Solutions Architect $131k
Applied Scientist - Machine Learning -... $136k
RESEARCH SCIENTIST (MACHINE LEARNING) $147k
Machine Learning Engineer 2 $161k
Machine Learning Scientist Manager $170k
Machine Learning Scientist, Personalization $213k