Carrie Wright, PhD, Shannon Ellis, PhD, Stephanie Hicks, PhD, and Roger D. Peng, PhD

Data never arrive in the condition you need for effective data analysis. They need to be reshaped, rearranged, and reformatted before they can be visualized or fed into a machine learning algorithm. This course addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.


This course covers many of the critical details about handling tidy and non-tidy data in R such as converting from wide to long formats, manipulating tables with the dplyr package, understanding different R data types, processing text data with regular expressions, and conducting basic exploratory data analyses. Investing the time to learn these data wrangling techniques will make your analyses more efficient, more reproducible, and more understandable to your data science team.
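
As a rough illustration of the wide-to-long conversion mentioned above, the sketch below uses `pivot_longer()` from the tidyr package on a small made-up table. The table, its column names (`clinic`, `visits_2019`, `visits_2020`), and the values are hypothetical and are not taken from the course materials.

```r
library(tidyr)

# Hypothetical wide table: one row per clinic, one column per year.
wide <- data.frame(
  clinic = c("A", "B"),
  visits_2019 = c(120, 95),
  visits_2020 = c(135, 88)
)

# Reshape to long (tidy) form: one row per clinic-year observation.
long <- pivot_longer(
  wide,
  cols = c(visits_2019, visits_2020),
  names_to = "year",
  names_prefix = "visits_",  # strip the prefix so only the year remains
  values_to = "visits"
)

print(long)
```

In the long form, each variable (clinic, year, visits) has its own column and each observation has its own row, which is the layout most tidyverse functions expect.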

This specialization assumes familiarity with the R programming language. If you are not yet familiar with R, we suggest completing the R Programming course first and then returning to this one.

What's inside

Syllabus

Wrangling Data in the Tidyverse
Data never arrive in the condition you need for effective data analysis. They need to be reshaped, rearranged, and reformatted before they can be visualized or fed into a machine learning algorithm. This module addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.
Working With Factors, Dates, and Times
In R, categorical data are handled as factors. By definition, a categorical variable can take only a fixed set of possible values. For example, there are 12 months in a calendar year, so each observation of a month variable must take one of these twelve values, which makes month a categorical variable. Categorical data, referred to as factors for the rest of this lesson, appear regularly in real datasets, and learning to work with them effectively will be incredibly helpful.
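
The month example above can be sketched with base R's `factor()` function. This is a minimal illustration only: the variable names and sample values are hypothetical, and the course itself may use other tidyverse tools for working with factors.

```r
# Hypothetical month observations recorded as plain text.
months_raw <- c("Mar", "Jan", "Dec", "Jan")

# month.abb is a built-in constant holding the 12 abbreviated month names.
# Passing it as `levels` fixes the set of allowed values and their order.
months_fct <- factor(months_raw, levels = month.abb)

nlevels(months_fct)   # number of possible values the variable can take
table(months_fct)     # counts per month, listed in calendar order
sort(months_fct)      # sorted by calendar order, not alphabetically
```

Declaring the levels explicitly is what lets R order and tabulate the months by calendar position instead of treating them as arbitrary strings.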
Working With Strings and Text and Functional Programming
Working with text data is increasingly common in data science projects. Text manipulation is often needed to clean up messy datasets and to create numerical measurements from text input. In addition, the text itself is often the data, and this module covers tools to extract information from it.
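
To make the idea of turning text into numerical measurements concrete, here is a small sketch using base R string functions and a regular expression. The input strings and variable names are invented for illustration and do not come from the course.

```r
# Hypothetical free-text entries with inconsistent spacing and case.
entries <- c(" Weight: 72kg", "weight:65 kg ", "WEIGHT: 80 KG")

# Clean up: trim surrounding whitespace and standardize case.
clean <- tolower(trimws(entries))

# Extract the first run of digits from each entry and convert to numeric.
weight_kg <- as.numeric(regmatches(clean, regexpr("[0-9]+", clean)))

print(weight_kg)
```

The pattern `[0-9]+` matches one or more consecutive digits, so this sketch assumes every entry contains a whole-number measurement.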
Exploratory Data Analysis
The goal of an exploratory analysis is to examine, or explore, the data and find relationships that weren’t previously known. Exploratory analyses explore how different measures might be related to each other but do not confirm that relationship as causal, i.e., one variable causing another. You’ve probably heard the phrase “Correlation does not imply causation,” and exploratory analyses lie at the root of this saying. Observing a relationship between two variables during exploratory analysis does not mean that one necessarily causes the other.
Case Studies
This module demonstrates how to import data using the case study examples. When working through the steps of the case studies, you can use either RStudio on your own computer or the Coursera lab spaces provided for each case study.
Project: Wrangling data in the Tidyverse
In this project, you will practice data exploration and data wrangling with the tidyverse using consumer complaint data from the Consumer Financial Protection Bureau (CFPB).

Good to know

Know what's good, what to watch for, and possible dealbreakers
Develops tidyverse skills, which are valued in most industries that use R
Taught by professors from Johns Hopkins University, who are known for their work in biostatistics
Covers data wrangling and transformation, which helps learners clean and manipulate data efficiently
Uses the dplyr package, a widely used tool for data manipulation in R
Provides hands-on practice through a project using real-world data
Requires familiarity with R programming, which may be a barrier for beginners

Reviews summary

Practice-oriented data manipulation course

Learners say this course offers a balance of engaging assignments and clear didactic material. While learners largely appreciated the opportunity to put their skills to use with frequent quizzes and assignments, some learners felt bogged down by the specific subject matter (e.g. healthcare data). However, students appreciated the real-world examples and found the lectures easy to understand and follow.
Examples and assignments are taken from real-world scenarios.
"Loved it! I really liked that it was all reading and based in real examples!"
Clear lectures coupled with frequent assignments and quizzes.
"Great course with clearly understandable lectures."
"This course strikes a good balance between didactic material and the work required to do the Quizzes."
Forum questions sometimes went unanswered.
"In addition, there are a year or more,questions in the forum that no one bothered to answered them.."
A particular assignment was unclear and caused frustration.
"However especially the last assignment is really unclear and was quite annoying."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Wrangling Data in the Tidyverse with these activities:
Review R Basics
Refresh your understanding of R programming concepts before starting the course.
  • Review the RStudio cheatsheet
  • Complete the 'Introduction to R' exercises on DataCamp
Watch Tutorials on Data Reshaping
Gain a visual understanding of data reshaping techniques through interactive tutorials.
  • Watch the 'Reshaping Data with R' tutorial on DataCamp
  • Follow along with the 'tidyr' package tutorial on RStudio
Create a Data Dictionary
Organize and document your data to facilitate efficient and accurate analysis.
  • Identify the variables in your dataset
  • Define the name, type, and description of each variable
  • Create a table or spreadsheet to document the data dictionary
Read The Data Science Handbook
Learn the fundamentals of data wrangling and tidy data from a trusted resource.
  • Read chapters 1-4 to understand the basics of data wrangling and tidy data
  • Work through the exercises in the book to practice your skills
Practice Data Manipulation
Refine your data manipulation skills through guided exercises.
  • Complete the data manipulation exercises on DataCamp
  • Work through the R for Data Science cheat sheet and practice the provided exercises
Attend a Data Science Meetup
Connect with professionals in the field and exchange knowledge and insights.
  • Find a local data science meetup group on Meetup.com
  • Attend a meetup and participate in discussions
Data Analysis Project
Apply your data wrangling and analysis skills to a real-world dataset.
  • Identify a dataset of interest
  • Clean and prepare the data
  • Perform exploratory data analysis
  • Create visualizations and report on your findings

Career center

Learners who complete Wrangling Data in the Tidyverse will develop knowledge and skills that may be useful to these careers:
Data Analyst
Data Analysts gather and analyze data to identify trends and patterns. They use this information to help businesses make better decisions. The Wrangling Data in the Tidyverse course can help you develop valuable skills for Data Analysts, such as data cleaning, manipulation, and analysis. With these skills, you can help businesses understand their data and make better use of it.
Data Scientist
Data Scientists use data to solve business problems. They develop and implement models to predict outcomes and make recommendations. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Data Scientist, such as data wrangling, analysis, and visualization. With these skills, you can help businesses make better decisions and improve their bottom line.
Data Engineer
Data Engineers build and maintain the systems that store and process data. They ensure that data is accurate, accessible, and secure. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Data Engineer, such as data wrangling, management, and analysis. With these skills, you can help businesses build and maintain the data systems they need to succeed.
Statistician
Statisticians collect, analyze, and interpret data to draw conclusions. They use their findings to help businesses make better decisions. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Statistician, such as data wrangling, analysis, and visualization. With these skills, you can help businesses understand their data and make better use of it.
Machine Learning Engineer
Machine Learning Engineers develop and implement machine learning models. They use these models to automate tasks and make predictions. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Machine Learning Engineer, such as data wrangling, analysis, and visualization. With these skills, you can help businesses automate tasks and make better decisions.
Software Engineer
Software Engineers design, develop, and maintain software systems. They use their skills to solve business problems and improve efficiency. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Software Engineer, such as data wrangling, analysis, and visualization. With these skills, you can help businesses build and maintain the software systems they need to succeed.
Business Analyst
Business Analysts help businesses understand their data and make better decisions. They use their skills to identify trends, patterns, and opportunities. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Business Analyst, such as data wrangling, analysis, and visualization. With these skills, you can help businesses understand their data and make better use of it.
Consultant
Consultants help businesses improve their performance. They use their expertise to identify problems and develop solutions. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Consultant, such as data wrangling, analysis, and visualization. With these skills, you can help businesses understand their data and make better use of it.
Researcher
Researchers conduct studies to answer questions and solve problems. They use their findings to improve our understanding of the world. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Researcher, such as data wrangling, analysis, and visualization. With these skills, you can help conduct studies and solve problems.
Teacher
Teachers help students learn and grow. They use their knowledge and skills to create a positive learning environment. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Teacher, such as data wrangling, analysis, and visualization. With these skills, you can help students understand data and make better use of it.
Journalist
Journalists report on news and current events. They use their skills to inform the public and hold those in power accountable. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Journalist, such as data wrangling, analysis, and visualization. With these skills, you can help the public understand data and make better use of it.
Librarian
Librarians help people find and use information. They use their skills to organize and manage libraries and archives. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Librarian, such as data wrangling, analysis, and visualization. With these skills, you can help people find and use data more effectively.
Museum Curator
Museum Curators manage and care for museum collections. They use their skills to preserve and interpret cultural artifacts. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Museum Curator, such as data wrangling, analysis, and visualization. With these skills, you can help preserve and interpret cultural artifacts for the public.
Archivist
Archivists manage and care for historical records. They use their skills to preserve and interpret these records for future generations. The Wrangling Data in the Tidyverse course can help you develop the skills you need to be a successful Archivist, such as data wrangling, analysis, and visualization. With these skills, you can help preserve and interpret historical records for the public.
Historian
Historians study the past to understand the present. They use their skills to research and write about historical events. The Wrangling Data in the Tidyverse course may help you develop some skills that are useful for Historians, such as data wrangling, analysis, and visualization. With these skills, you can help research and write about historical events more effectively.
