Getting and Cleaning Data

This course is part of Data Science, an 11-course Specialization series from Coursera.

Before you can work with data, you have to get some. This course covers the basic ways data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. Finally, it covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data, along with the basics needed for collecting, cleaning, and sharing data.

OpenCourser is an affiliate partner of Coursera.

Coursera & Johns Hopkins University

Rating 4.2 based on 805 ratings
Length 5 weeks
Starts Jun 15 (8 weeks ago)
Cost $49
From Johns Hopkins University via Coursera
Instructors Roger D. Peng, PhD, Jeff Leek, PhD, Brian Caffo, PhD
Download Videos On all desktop and mobile devices
Language English
Subjects Data Science, Mathematics
Tags Data Science, Data Analysis, Probability And Statistics

What people are saying

According to other learners, here's what you need to know

course project in 25 reviews

The course project seemed a little funky, especially creating the codebook for an already existing set of data, but was a useful teaching aid.

I struggled with the Course Project Assignment because I didn't understand what I was supposed to do exactly.

I just found Week 2 to be a bit redundant in comparison to Week 1 and the Course Project instructions to be a bit vague.

The Course Project was daunting at first, but I reviewed my notes over and over again, tried reading from the site where the raw data was made available, and constructed images of how the TIDY data should look.

So honestly, I feel like I'm being ripped off a bit here. I did really enjoy the course project though.

Excellent. Course Project was absolutely brilliant!

So, when I took on the course project, I was struggling to work out what I should do.

learned a lot in 17 reviews

Learned a lot.

I learned a lot, but my usually happy & grateful attitude was sorely challenged by the fact that so many facts in the videos and obvious course material were, well, wrong.

I learned a lot!

Good intro to R. Yes, very difficult, but I learned a lot. The course material needs an update.

Nice and challenging exercises. Best course ever. Great course, I've learned a lot about analyzing data sets and creating tidy data sets.

It is a topic which is very often underestimated, and we all need to learn to be more productive at it, as most of the time is spent on it in the "real world". Thanks. Learned a lot. This course covers the essentials. Explanation could be more elaborate, like the earlier courses. Great course to take!

Have learned a lot and discovered powerful tools and approaches.

getting and cleaning in 15 reviews

Getting and Cleaning Data is the third course in the first wave of Johns Hopkins’s data science specialization track on Coursera.

Getting and Cleaning Data promises to teach students how to extract data from common data storage formats (including databases, specifically SQL, XML, JSON, and HDF5), and from the web using APIs and web scraping.

Nice course. Excellent course. It gave me an overview of how to get and clean data. The course is best suited for anyone who is a novice in data science. A good course. Great course, very useful and insightful; challenging final project.

Before taking "Getting and Cleaning Data", I had no prior R programming experience aside from completing the R programming course in the data science specialization on Coursera.

I feel that the Getting and Cleaning Data course is highly important, since this is where 80% of the work in data science is done.

Excellent information on getting and cleaning data.

Gets you through the basics and beyond in getting and cleaning data from diverse sources.

discussion forums in 8 reviews

All the help is in the discussion forums already anyway, so I'm not sure why they need more Mentors.

I spent way too much time on the exams and projects, because I believe not enough information was given (I had to spend a lot of time searching through discussion forums, Stack Overflow, help files, etc., and while that is useful experience, it was a lot more time commitment than expected from the course description). Not worth the effort.

The instructors didn't respond to questions on the discussion forums about quiz items, the majority of assessment items seem to be available on Google, and 50% of the peer-reviewed assessments I checked used plagiarized solutions.

To make the situation even worse, THE TEACHERS SHOW ABSOLUTELY NO SUPPORT on the discussion forums.

My issues in dealing with these errors would have been alleviated had the professor advocated use of the discussion forums as a good place to discuss issues with the quizzes, errors, software, and more.

As I found in the Maps Coursera course I took, the discussion forums are a great place to supplement your learning from the course.

Overall I learned:

  • methods for good variable naming
  • lots about R programming
  • manipulating data and reading various forms (csv, xml, sql, etc.) into R
  • merging data
  • keeping just the data you need, deleting what you don't need
  • dealing with null values

While the course had frequent frustrating moments, I would say that I did learn a lot, but in order for the course to be more effective, the lectures need to be drastically re-tooled and the discussion forums need to be used to their full potential.

data science specialization in 7 reviews

This is the third course in the Data Science specialization.

Thanks to Coursera. This course is another step forward towards my data science specialization.

It is recommended as the third course in the data science specialization.

It's a required part of the Data Science specialization, but at this point I am seriously contemplating dropping it and not finishing the specialization.

real world in 7 reviews

I learned a lot of skills I need for my job and university. Thank you very much; this is a very good course. Very useful for getting hands-on experience in data science to solve real world problems!

Learnt a lot of real world applications: fetching data from databases, APIs, the internet, etc., and the basics of how to tidy data.

They do not even bother to provide any useful information (god knows why, maybe they're trying to mimic "real world conditions", but in the real world you can interact with users...

The final project pretty fairly replicated what happens in the real world when you are given a disgustingly awful looking data set and are asked to do something with it.

Careers

An overview of related careers and their average salaries in the US. Bars indicate income percentile.

Graphic Designer/Book Cover Designer $37k

Veterinary/Processing/Animal Care Technician Also Enrichment Coordinator $40k

Cover Rep. $47k

Assistant Cover Editor $54k

Supervisor Concurrent Review Nurse and also Case management $60k

Seasonal Cover Art Designer $67k

cover designer $73k

Cover Producer $81k

Senior Cover Story Editor $84k

Recruiter (Also held role of District Lease Analyst) $86k
