Principles, Statistical and Computational Tools for Reproducible Data Science

Today the principles and techniques of reproducible research are more important than ever, across diverse disciplines from astrophysics to political science. No one wants to do research that can’t be reproduced, so this course is for anyone doing data-intensive research. While many of us come from a biomedical background, the course is designed for a broad audience of data scientists.

To meet the needs of the scientific community, this course examines the fundamentals of methods and tools for reproducible research. Led by experienced faculty from the Harvard T.H. Chan School of Public Health, the course consists of six modules, including several case studies that illustrate the significant impact of reproducible research methods on scientific discovery.

This course will appeal to students and professionals in biostatistics, computational biology, bioinformatics, and data science. The course content blends video lectures, case studies, peer-to-peer engagement, and the use of computational tools and platforms (such as R/RStudio and Git/GitHub), culminating in the presentation of a final reproducible research project.

We’ll cover Fundamentals of Reproducible Science; Case Studies; Data Provenance; Statistical Methods for Reproducible Science; Computational Tools for Reproducible Science; and Reproducible Reporting Science. These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.

Consider this course a survey of best practices: we’d like to make you aware of pitfalls in reproducible data science, some past failures and successes, and tools and design patterns that might help make it all easier. Ultimately, though, it will be up to you to take the skills you learn from this course and create your own environment in which you can easily carry out reproducible research, and to encourage and integrate with similar environments for your collaborators and colleagues. We look forward to seeing you in this course and the research you do in the future!

What you'll learn

  • Concepts, thought patterns, analysis paradigms, and computational and statistical tools that together support data science and reproducible research
  • Fundamentals of reproducible science using case studies that illustrate various practices
  • Key elements for ensuring data provenance and reproducible experimental design
  • Statistical methods for reproducible data analysis
  • Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows
  • How to develop new methods and tools for reproducible research and reporting
  • How to write your own reproducible paper

Rating: Not enough ratings
Length: 8 weeks
Effort: 3–8 hours per week
Starts: On demand (start anytime)
Cost: $149
From: Harvard University, HarvardX via edX
Instructors: Curtis Huttenhower, John Quackenbush, Lorenzo Trippa, Christine Choirat
Download Videos: On all desktop and mobile devices
Language: English
Subjects: Data Science, Science
Tags: Data Analysis & Statistics, Science

Careers

An overview of related careers and their average salaries in the US.

Communications and Research Coordinator $49k

Junior Research Recruiter $62k

Assistant Faculty Research Analyst $63k

Research Associate - Asner Lab - Carnegie Spectranomics $63k

Adjunct Professor - Statistics and Research Methods $69k

Coordinator of Oncology Research $71k

Ph.D. Candidate in Industrial Engineering and Operations Research Manager $96k

Assistant Research and Data Analyst Consultant $104k

Technical Research and Development Engineer $109k

Research economist / Econometrician $117k

Research Plant Pathologist Lead $153k

Distinguished Associate Research Professor $289k
