Data Science: Wrangling
In this course, part of our Professional Certificate Program in Data Science, we cover several standard steps of the data wrangling process, such as importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all these wrangling steps necessary in a single analysis, but a data scientist will likely face them all at some point.
Data is rarely easy to access in a data science project. It is more likely to be in a file, a database, or extracted from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse package. The steps that convert data from its raw form to the tidy form are called data wrangling.
This process is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to make critical insights that would otherwise be hidden.
What you'll learn
- Importing data into R from different file formats
- Web scraping
- How to tidy data using the tidyverse to better facilitate analysis
- String processing with regular expressions (regex)
- Wrangling data using dplyr
- How to work with dates and times as file formats
- Text mining
Rating | Not enough ratings |
---|---|
Length | 8 weeks |
Effort | 8 weeks, 1–2 hours per week |
Starts | On Demand (Start anytime) |
Cost | $109 |
From | Harvard University, HarvardX via edX |
Instructor | Rafael Irizarry |
Download Videos | On all desktop and mobile devices |
Language | English |
Subjects | Programming Data Science |
Tags | Computer Science Data Analysis & Statistics |
Careers
An overview of related careers and their average salaries in the US. Bars indicate income percentile.
Event Specialist and Data Collection Specialist $40k
Data Steward/Analyst $52k
Sn Data Analyst $75k
LAN/WAN Data Analyst $84k
Mobile Data Analyst $92k
Analyst, Fixed Income Data Management Consultant $101k
Assistant Research and Data Analyst Consultant $104k
Vice Assistant President Data Scientist Development Program $106k
Deputy Data Management Specialist $127k
Big Data Analytics Platform Architect and Business Intelligence Leader $143k
Data warehouse & BI professional $175k
Data Warehouse Solutions Architect $192k