Rafael Irizarry

The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. The HarvardX Data Science program prepares you with the necessary knowledge base and useful skills to tackle real-world data analysis challenges. The program covers concepts such as probability, inference, regression, and machine learning and helps you develop an essential skill set that includes R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with Unix/Linux, version control with git and GitHub, and reproducible document preparation with RStudio.

In each course, we use motivating case studies, ask specific questions, and learn by answering these through data analysis. Case studies include: Trends in World Health and Economics, US Crime Rates, The Financial Crisis of 2007-2008, Election Forecasting, Building a Baseball Team (inspired by Moneyball), and Movie Recommendation Systems.

Throughout the program, we will use the R software environment. You will learn R, statistical concepts, and data analysis techniques simultaneously. We believe that you retain R knowledge better when you learn it while solving a specific problem.

What you'll learn

  • Fundamental R programming skills
  • Statistical concepts such as probability, inference, and modeling, and how to apply them in practice
  • Experience with the tidyverse, including data visualization with ggplot2 and data wrangling with dplyr
  • Familiarity with essential tools for practicing data scientists, such as Unix/Linux, git and GitHub, and RStudio
  • Implementation of machine learning algorithms
  • In-depth knowledge of fundamental data science concepts through motivating real-world case studies

What's inside

Nine courses

Data Science: Inference and Modeling

(12 hours)
Statistical inference and modeling are crucial for data scientists. This course teaches these concepts through an election forecasting case study. You'll learn to define estimates and margins of error, use models to aggregate data, and understand Bayesian statistics and predictive modeling.

Data Science: Capstone

(35 hours)
To become an expert data scientist, you need practice and experience. This capstone project will test your skills in data visualization, probability, inference and modeling, data wrangling, data organization, regression, and machine learning.

Data Science: Visualization

(12 hours)
As part of our Professional Certificate Program in Data Science, this course covers the basics of data visualization and exploratory data analysis. We will use ggplot2, a data visualization package for the statistical programming language R. We will start with simple datasets and then graduate to case studies about world health, economics, and infectious disease trends in the United States.

Data Science: Wrangling

(12 hours)
In this course, we cover several standard steps of the data wrangling process, such as importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining.

Data Science: R Basics

(12 hours)
The first in our Professional Certificate Program in Data Science, this course will introduce you to the basics of R programming. You'll use a real-world dataset about crime in the United States to learn R skills needed to answer essential questions about differences in crime across the different states.

Data Science: Probability

(12 hours)
In this course, part of our Professional Certificate Program in Data Science, you will learn valuable concepts in probability theory, including random variables, independence, Monte Carlo simulations, expected values, standard errors, and the Central Limit Theorem.

Data Science: Linear Regression

(12 hours)
Linear regression quantifies the relationship between variables and is commonly used to adjust for confounding. This course covers how to implement linear regression and adjust for confounding in practice using R.

Data Science: Machine Learning

(24 hours)
Perhaps the most popular data science methodologies come from machine learning, which builds prediction algorithms using data. Popular products that use machine learning include handwriting readers, speech recognition, movie recommendation systems, and spam detectors.

Data Science: Productivity Tools

(12 hours)
A typical data analysis project involves managing files, directories, and scripts. This course teaches you how to use Unix/Linux to organize your file system and git for version control. You'll also learn about GitHub, R Markdown, and RStudio.
