Roger D. Peng, PhD, Jeff Leek, PhD, and Brian Caffo, PhD

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility lets people focus on the actual content of a data analysis rather than on superficial details reported in a written summary. It also makes an analysis more useful to others, because the data and code used to conduct the analysis are available. This course focuses on literate statistical analysis tools, which make it possible to publish a data analysis as a single document that others can easily execute to obtain the same results.

What's inside

Syllabus

Week 1: Concepts, Ideas, & Structure
This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story.
Week 2: Markdown & knitr
This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr.
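The idea behind weaving code into a Markdown document can be sketched in a few lines. The block below is written in Python rather than R and is not knitr. It is a minimal illustration of literate programming in which the `{python}` chunk tag, the `weave` function, and the sample document are all assumptions made for this sketch. It runs each code chunk found in a Markdown string and places the printed output directly after the code.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example never nests literal fences

# Hypothetical chunk syntax for this sketch: a fenced block tagged {python}.
CHUNK = re.compile(FENCE + r"\{python\}\n(.*?)" + FENCE, re.DOTALL)


def weave(source: str) -> str:
    """Run every chunk in `source` and insert its printed output after the code."""
    env = {}  # shared namespace, so later chunks can use earlier variables

    def run(match):
        code = match.group(1)
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, env)  # execute the chunk and capture what it prints
        return (
            f"{FENCE}python\n{code}{FENCE}\n\n"
            f"{FENCE}\n{buffer.getvalue()}{FENCE}"
        )

    return CHUNK.sub(run, source)


# A tiny sample document: prose plus one executable chunk.
document = (
    "# Mean of three values\n\n"
    + FENCE + "{python}\n"
    + "values = [2, 4, 9]\n"
    + "print(sum(values) / len(values))\n"
    + FENCE + "\n"
)

woven = weave(document)
print(woven)
```

Because the output is produced by running the code at weave time, the reported number cannot drift out of sync with the analysis, which is the property that makes this style of document reproducible.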
Week 3: Reproducible Research Checklist & Evidence-based Data Analysis
This week covers what one could call a basic checklist for ensuring that a data analysis is reproducible. Following the checklist is not, strictly speaking, sufficient on its own, but it provides a necessary minimum standard that would be applicable to almost any area of analysis.
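This page does not list the checklist items, so the following is only a sketch of two practices that such checklists commonly include: fixing the random seed and recording a fingerprint of the input data. It is written in Python for illustration. The function names, the sample data, and the seed value are assumptions, not course material.

```python
import hashlib
import random
import statistics


def data_fingerprint(values):
    """Return a SHA-256 digest of the data so readers can confirm they hold the same input."""
    payload = ",".join(repr(v) for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def bootstrap_mean(values, n_resamples=200, seed=2024):
    """Bootstrap estimate of the mean, drawn from a private, seeded generator."""
    rng = random.Random(seed)  # fixed seed: every run draws the same resamples
    means = [
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(means)


data = [2.1, 3.4, 1.9, 5.6, 4.2]  # made-up values for illustration

first = bootstrap_mean(data)
second = bootstrap_mean(data)

print(first == second)  # True: the same seed and data give the same estimate
print(data_fingerprint(data) == data_fingerprint(list(data)))  # True: same input, same digest
```

Publishing the seed and the digest alongside a report gives readers a quick way to check that they are rerunning the same computation on the same data.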
Week 4: Case Studies & Commentaries
This week there are two case studies involving the importance of reproducibility in science for you to watch.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches skills highly relevant in academia
Develops skills for publishing analyses that peers can rerun to obtain the same results
Taught by instructors recognized for their work in reproducible research
Part of a multi-modal learning experience
Builds a foundation for beginners in reproducible research
Provides hands-on experience in reproducible research tools

Reviews summary

Introductory R course

Learners say that Reproducible Research is a largely positive introduction to knitr and research reproducibility. According to students, the course assumes familiarity with R Markdown, R coding, and basic statistics, so it helps to have that background going in. Roger Peng is engaging, and his content is well-organized and provides numerous case studies. However, some students feel there is not enough content and that the course is repetitive.
Instructor is engaging.
"Thankfully, Roger Peng has added in a little box with his face in it as he talks over his slides for many of his videos, which makes the content a lot more engaging than it is in some of the other Johns Hopkins courses that only have voiceovers."
Content is well-organized and provides case studies.
"The first 2.5 weeks of lecture material is great. It provides a well-organized overview of how to create reproducible research in R using R markdown and the knitr package, taking plenty of time to talk about best practices."
R Markdown, R coding, and basic statistical knowledge are helpful.
"Background knowledge in using R and running basic stats is necessary, as the course assumes you already have that going in."
Insufficient content.
"Not much content. Only introduced and taught one main topic: knitr package in R."
Lectures are repetitive.
"Much of course spent repetitively advocating for reproducible research with case studies and peer reviewed assignments. Second peer reviewed assignment was essentially the same as the first in terms of learning new techniques."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Reproducible Research with these activities:
Find a mentor in the field of reproducible research
Seeking guidance from experienced professionals in the field can greatly enhance your learning journey.
  • Identify potential mentors in your network or through online platforms
  • Reach out to mentors and express your interest
  • Set up regular meetings to discuss your progress and seek advice
  • Attend events or workshops where you can connect with potential mentors
Review R Markdown and knitr
These tools are used extensively throughout the course, and reviewing them provides a foundation for developing reproducible documents.
  • Review the R Markdown documentation
  • Review the knitr documentation
  • Create a simple R Markdown document
  • Knit the R Markdown document to HTML
  • Explore the features of R Markdown and knitr
Create a reproducible report using R Markdown
This course focuses on the usage of R Markdown for creating reproducible data analyses and scientific reports, and this activity serves as a practical application of those concepts.
  • Choose a dataset and analyze it
  • Write a report in R Markdown
  • Knit the report to HTML
  • Share the report with others
  • Get feedback on the report
Participate in peer review of reproducible reports
The course emphasizes the importance of peer review in the process of creating reproducible reports, and this activity allows students to practice this.
  • Find a peer to review your report
  • Review the report and provide feedback
  • Incorporate the feedback into your report
  • Discuss the review process with your peer
Create a collection of resources on reproducible research
This activity encourages students to immerse themselves in the topic by compiling relevant resources that expand upon the course's materials.
  • Search for articles, tutorials, and other resources on reproducible research
  • Organize the resources into a structured format
  • Add annotations and summaries to each resource
  • Share the collection with other students
Contribute to an open-source data science project
Contributing to an open-source project not only reinforces the concepts learned in the course but also provides an opportunity to make a tangible impact on the community.
  • Find an open-source data science project to contribute to
  • Review the project's documentation and codebase
  • Identify an area where you can contribute
  • Make your contributions and submit a pull request
  • Collaborate with other contributors and the project maintainers
Develop a reproducible data analysis workflow for a real-world dataset
This activity provides an opportunity for students to apply their skills to a practical problem.
  • Choose a real-world dataset
  • Define the research question or problem you want to solve
  • Develop a reproducible data analysis workflow
  • Write a report summarizing your findings
  • Share your project with others

Career center

Learners who complete Reproducible Research will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists are responsible for collecting, cleaning, and analyzing data to extract insights and inform decision-making. This course in Reproducible Research would be a valuable asset to aspiring Data Scientists as it provides a foundation in the principles and tools for conducting and reporting reproducible data analyses. The course emphasizes the importance of transparency, replicability, and collaboration in data science, which are essential skills for professionals in this field.
Statistician
Statisticians use statistical methods to collect, analyze, interpret, and present data. The course in Reproducible Research is highly relevant to this field, as it focuses on the concepts and tools for reporting modern data analyses in a reproducible manner. The course covers topics such as literate statistical analysis tools, which allow Statisticians to publish data analyses in a single document that can be easily executed by others to obtain the same results.
Data Analyst
Data Analysts use data to identify trends, patterns, and insights that can help businesses make better decisions. This course in Reproducible Research is highly relevant to this field, as it provides a foundation in the principles and tools for conducting and reporting reproducible data analyses. The course emphasizes the importance of transparency, accuracy, and collaboration in data analysis, which are essential skills for Data Analysts.
Research Scientist
Research Scientists conduct scientific research to advance knowledge and develop new technologies. The course in Reproducible Research is highly applicable to this field, as it provides a foundation in the principles and practices of reproducible research. The course emphasizes the importance of transparency, replicability, and collaboration in scientific research, which are essential skills for Research Scientists.
Data Engineer
Data Engineers design, build, and maintain data pipelines and infrastructure. This course may be useful to aspiring Data Engineers: it does not teach data engineering itself, but its emphasis on organizing an analysis and publishing it with its data and code may carry over to building pipelines whose results others can rerun and verify.
Quantitative Analyst
Quantitative Analysts use mathematical and statistical methods to analyze financial data. This course may be useful to aspiring Quantitative Analysts: it does not teach quantitative finance, but it covers how to document an analysis together with its data and code so that colleagues can rerun it and check the results.
Machine Learning Engineer
Machine Learning Engineers build and maintain machine learning models. This course may be useful to aspiring Machine Learning Engineers: it does not teach machine learning, but its focus on publishing data and code alongside results may help with reporting modeling work that others can rerun and verify.
Business Analyst
Business Analysts use data and analysis to help businesses make better decisions. This course may be useful to aspiring Business Analysts because it covers how to structure a data analysis and report it in a single document that others can execute to obtain the same results.
Software Engineer
Software Engineers design, develop, and maintain software systems. This course may be useful to aspiring Software Engineers who work with data: it does not teach software development, but it introduces literate programming with knitr and Markdown, in which code and its explanation are kept together in one executable document.
Epidemiologist
Epidemiologists investigate the causes and patterns of disease and injury in populations. This course may be useful to aspiring Epidemiologists: it does not teach epidemiology, but it covers how to publish an analysis with its data and code so that other researchers can verify the findings and build upon them.
Biostatistician
Biostatisticians use statistical methods to design and analyze studies in the health sciences. This course may be useful to aspiring Biostatisticians: it does not teach biostatistical methods, but it covers literate statistical analysis tools and a basic checklist for making an analysis reproducible.
UX Researcher
UX Researchers conduct research to improve the user experience of products and services. This course may be useful to aspiring UX Researchers who analyze quantitative data: it does not teach UX research, but it shows how to report an analysis so that teammates can rerun it and reach the same results.
Market Researcher
Market Researchers conduct research to understand the needs and wants of consumers. This course may be useful to aspiring Market Researchers: it does not teach market research, but it covers how to organize a data analysis and share it with the data and code needed to reproduce it.
Actuary
Actuaries use mathematical and statistical methods to assess risk and uncertainty. This course may be useful to aspiring Actuaries: it does not teach actuarial science, but it covers tools for publishing a statistical analysis as a single document that reviewers can execute to confirm the results.
Data Journalist
Data Journalists use data to tell stories and inform the public. This course may be useful to aspiring Data Journalists: it does not teach journalism, but it covers how to publish an analysis with its data and code so that readers and editors can verify the findings.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Reproducible Research.
Provides a comprehensive introduction to reproducible research in R. It covers topics such as version control, data management, and literate programming. It is a valuable resource for practitioners who want to learn how to conduct reproducible research.
Classic and authoritative reference for data science practitioners who use R. Beginners will likely need to read this book slowly while learning and experimenting with R, but it is an excellent reference. It has recently been updated for the latest version of R. This book is commonly used as a textbook for university-level courses.
Provides a comprehensive introduction to all of the skills a professional needs to become an R programmer. It covers topics such as data manipulation, statistical modeling, and graphics. It is a valuable resource for practitioners of all levels.
Provides a comprehensive introduction to ggplot2, a popular R package for data visualization. It covers topics such as creating graphs, customizing themes, and adding annotations. It is a valuable resource for practitioners who want to learn how to create beautiful and informative visualizations with R.
Provides a comprehensive introduction to R Markdown. It covers topics such as creating documents and including code and visualizations. It is a valuable resource for practitioners who want to learn how to use R Markdown.
Provides a comprehensive introduction to deep learning. It covers topics such as convolutional neural networks, recurrent neural networks, and deep learning architectures. It is a valuable resource for practitioners who want to learn the foundations of deep learning.
Provides a Bayesian perspective on statistical modeling. It covers topics such as hierarchical models, Markov chain Monte Carlo, and model checking. It is a valuable resource for practitioners who want to learn Bayesian methods. It is commonly used as a textbook in university-level courses, and it is a popular reference among professional statisticians.
Provides a comprehensive introduction to statistical inference. It covers topics such as point estimation, hypothesis testing, and confidence intervals. It is a valuable resource for practitioners who want to learn the foundations of statistical inference.
Provides a comprehensive introduction to causal inference. It covers topics such as graph theory, counterfactuals, and causal models. It is a valuable resource for practitioners who want to learn how to draw causal conclusions from data.
Provides a comprehensive introduction to deep learning in R. It covers topics such as convolutional neural networks, recurrent neural networks, and deep learning architectures. It is a valuable resource for practitioners who want to learn deep learning.

Similar courses

Here are nine courses similar to Reproducible Research.
Introduction to Reproducibility in Cancer Informatics
Most relevant
Advanced Reproducibility in Cancer Informatics
Most relevant
Wrangling Data in the Tidyverse
Most relevant
Data Visualization and Transformation with R
Advanced Bioconductor
Communicating Data Science Results
Principles, Statistical and Computational Tools for...
Getting Started with MLflow
Transparent and Open Social Science Research

© 2016 - 2024 OpenCourser