Advanced Reproducibility in Cancer Informatics from Coursera

This course introduces tools that help enhance reproducibility and replicability in the context of cancer informatics. It uses hands-on exercises to show, in practical terms, how to get started with these tools, but it is by no means a comprehensive dive into them. The tools and concepts introduced include Git and GitHub, code review, Docker, and GitHub Actions.

Target Audience

The course is intended for students in the biomedical sciences and researchers who use informatics tools in their research. It is the follow-up to the Introduction to Reproducibility in Cancer Informatics course. Learners who take this course should:

- Have some familiarity with R or Python

- Have taken the Introduction to Reproducibility in Cancer Informatics course

- Have some familiarity with GitHub

Motivation

Data analyses are generally not reproducible without direct contact with the original researchers and a substantial amount of time and effort (Beaulieu-Jones, 2017). Reproducibility in cancer informatics (as in other fields) is still not monitored or incentivized, even though it is fundamental to the scientific method. Despite this lack of incentive, many researchers strive for reproducibility in their own work but often lack the skills or training to achieve it effectively.

Equipping researchers with the skills to create reproducible data analyses increases the efficiency of everyone involved. Reproducible analyses are more likely to be understood, applied, and replicated by others. This helps expedite the scientific process by helping researchers avoid false positive dead ends. Open source clarity in reproducible methods also saves researchers' time so they don't have to reinvent the proverbial wheel for methods that everyone in the field is already performing.

Curriculum

The course includes hands-on exercises that show learners how to apply reproducible code concepts to their own code. Individuals who take this course are encouraged to complete these activities as they follow along with the course material to help increase the reproducibility of their analyses.

**Goal of this course:**

To equip learners with a deeper knowledge of the capabilities of reproducibility tools and how they can apply them to their existing analysis scripts and projects.

**What is NOT the goal of this course:**

To be a comprehensive dive into each of the tools discussed.

How to use the course

Each chapter has associated exercises that you are encouraged to complete in order to get the full benefit of the course.

This course is designed with busy professional learners in mind, who may have to pick up and put down the course as their schedule allows. In general, you can skip to the chapters you find most useful (the one instance where a prior chapter is required is noted).

What's inside

Syllabus

Getting started in this course

This section describes the rationale and context for this course as well as its target audience.

Defining Reproducibility

This section defines reproducibility for the purposes of this course.

Version control with GitHub

This section discusses how to get started with creating branches and pull requests on GitHub.

Code review - as an author

In this section we discuss the responsibility of an author of a pull request in code review.

Code review - as a reviewer

In this section we discuss the responsibility of a reviewer of a pull request in code review.

Launching Docker

This section walks through how to get started with Docker.

Modifying a Docker image

This section describes how to modify an existing Docker image.

Automation as a reproducibility tool

This section describes the motivation for using automation tools to enhance reproducibility.
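
As a minimal sketch of the kind of check an automation tool such as GitHub Actions could run on every change, the Python below reruns an analysis several times and compares checksums of its output. The function names (`is_reproducible`, `seeded_analysis`, `unseeded_analysis`) and the toy analyses are hypothetical illustrations, not material from the course itself.

```python
import hashlib
import random
from typing import Callable


def digest(output: str) -> str:
    """Return the SHA-256 hex digest of an analysis output."""
    return hashlib.sha256(output.encode("utf-8")).hexdigest()


def is_reproducible(analysis: Callable[[], str], n_runs: int = 3) -> bool:
    """Run the analysis n_runs times and check that every output is identical."""
    digests = {digest(analysis()) for _ in range(n_runs)}
    return len(digests) == 1


def seeded_analysis() -> str:
    """Toy analysis with a fixed seed: the same numbers on every run."""
    rng = random.Random(42)
    return ",".join(f"{rng.random():.6f}" for _ in range(5))


def unseeded_analysis() -> str:
    """Toy analysis without a fixed seed: the numbers differ between runs."""
    rng = random.Random()  # seeded from system entropy
    return ",".join(f"{rng.random():.6f}" for _ in range(5))


if __name__ == "__main__":
    print("seeded analysis reproducible:", is_reproducible(seeded_analysis))
    print("unseeded analysis reproducible:", is_reproducible(unseeded_analysis))
```

Running a check like this automatically, rather than by hand, means a change that breaks reproducibility is flagged soon after it is introduced.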

Good to know

Know what's good, what to watch for, and possible dealbreakers

Helps equip learners with reproducibility tools such as Git, Docker, and GitHub Actions

Increases efficiency for everyone involved by equipping researchers with skills to create reproducible data analyses

Designed for learners with some familiarity with Git, GitHub, and R or Python

Provides hands-on exercises for applying reproducible code concepts

Suitable for students in biomedical sciences and researchers using informatics tools

Assumes some familiarity with GitHub and Python or R

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Advanced Reproducibility in Cancer Informatics with these activities:

Organize Course Resources

Organizing your course resources will help you stay on top of the material and easily access important information.

Create a dedicated folder or notebook for course materials
Download and save all relevant lecture notes, slides, and assignments
Categorize and label your materials for easy retrieval

Review Version Control

Reviewing the concepts and fundamentals of version control in advance will help you complete code challenges more effectively.

Read the documentation of the version control tool you plan to use for this course
Watch a video tutorial on version control
Install the version control tool on your computer
Create a new repository and commit your changes
Fork and clone an existing repository

Find a Kubernetes Mentor

Finding a Kubernetes mentor can help you navigate the course material more effectively and gain valuable insights from an experienced professional.

Identify potential mentors through online platforms and professional networks
Reach out to your mentors and express your interest
Establish clear expectations and a meeting schedule

Guided Tutorials: Docker

Engaging in guided tutorials will help reinforce your understanding of Docker by providing you with hands-on experience.

Find and access a relevant and reputable tutorial on Docker
Set up your development environment
Follow the tutorial to create your first Docker image
Run your Docker image and verify its operation

Solve Docker Exercises

Sharpen your Docker skills by completing practice exercises and reinforce the concepts covered in the course.

Find online Docker exercises or tutorials
Solve exercises to create and manage Docker containers

Code Review Practice

Practicing code review will improve your ability to critically evaluate and identify areas for improvement in your own code.

Find resources and tutorials on code review best practices
Identify a codebase or project where you can contribute code reviews
Review and provide feedback on code changes
Incorporate the feedback you receive on your own code

Explore GitHub Actions Tutorial

Enhance your understanding of GitHub Actions by following guided tutorials and deepen your knowledge beyond the course materials.

Find and follow a comprehensive GitHub Actions tutorial
Build and test a workflow using GitHub Actions

Write a Docker Tutorial

Creating a Docker tutorial will deepen your understanding of the subject and reinforce your ability to explain complex concepts clearly.

Identify the specific topic or aspect of Docker you want to cover
Research and gather relevant information and resources
Outline the structure and content of your tutorial
Write the tutorial, ensuring clarity and accessibility
Publish and share your tutorial with others

Mentor a Junior Coder

Mentoring a junior coder will reinforce your knowledge of the course material and enhance your communication and teaching skills.

Identify a junior coder who is interested in learning about reproducible code practices
Establish a regular meeting schedule and communication channel
Review their code, provide constructive feedback, and guide their learning
Share resources, articles, and tutorials to support their growth

Build a Reproducible Analysis Pipeline

Apply the concepts learned in the course by building a reproducible analysis pipeline for a real-world dataset.

Choose a dataset and define the analysis goals
Write a code script or notebook for the analysis
Containerize the analysis environment with Docker
Automate the pipeline using GitHub Actions
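
The analysis-script step above can be sketched roughly as follows, assuming a toy simulated dataset in place of a real one. The function names (`run_analysis`, `build_record`), the seed value, and the fields of the provenance record are hypothetical choices for illustration. The idea is to fix the random seed and store the results alongside the information someone else would need to rerun and verify them.

```python
import hashlib
import json
import platform
import random


def run_analysis(seed: int, n_samples: int = 100) -> dict:
    """Toy analysis: summarize simulated measurements drawn with a fixed seed."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    values = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    mean = sum(values) / n_samples
    return {"n_samples": n_samples, "mean": round(mean, 6)}


def build_record(seed: int) -> dict:
    """Bundle the results with the provenance needed to rerun and verify them."""
    results = run_analysis(seed)
    # Serialize with sorted keys so the checksum does not depend on key order.
    payload = json.dumps(results, sort_keys=True).encode("utf-8")
    return {
        "seed": seed,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "results": results,
        "results_sha256": hashlib.sha256(payload).hexdigest(),
    }


if __name__ == "__main__":
    print(json.dumps(build_record(seed=2024), indent=2))
```

Running the same script inside a Docker container pins the environment, and a GitHub Actions workflow could rerun it on each change and compare the `results_sha256` value against a stored one.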

Contribute to a Docker Project

Contributing to an open-source Docker project will give you valuable hands-on experience and allow you to learn from the best in the field.

Identify an open-source Docker project that aligns with your interests
Review the project's documentation and codebase
Identify an area where you can make a contribution
Submit a pull request with your proposed changes
Collaborate with the project maintainers to refine and merge your contribution

Career center

Learners who complete Advanced Reproducibility in Cancer Informatics will develop knowledge and skills that may be useful to these careers:

Data Scientist

A Data Scientist uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from data in various forms, both structured and unstructured. Studying the Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Statistician

A Statistician collects, analyzes, interprets, and presents data to provide insights and make predictions. The Advanced Reproducibility in Cancer Informatics course can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics, which may be useful for this role.

Computational Biologist

A Computational Biologist uses computational tools to analyze and interpret biological data. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Research Scientist

A Research Scientist conducts research to advance scientific knowledge and develop new technologies. The Advanced Reproducibility in Cancer Informatics course can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics, which may be useful for this role.

Bioinformatics Scientist

A Bioinformatics Scientist develops software tools to collect, manage, analyze, and interpret biological data. Studying the Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Research Analyst

A Research Analyst conducts research and analyzes data to provide insights and recommendations to clients. Studying the Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Quantitative Analyst

A Quantitative Analyst develops and implements mathematical and statistical models to analyze data and make predictions. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Data Analyst

A Data Analyst collects, processes, and analyzes data to extract meaningful insights. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Database Administrator

A Database Administrator designs, develops, and maintains databases. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Software Developer

A Software Developer designs, develops, and maintains software applications. Studying the Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Systems Analyst

A Systems Analyst designs, develops, and maintains computer systems. The Advanced Reproducibility in Cancer Informatics course can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics, which may be useful for this role.

Machine Learning Engineer

A Machine Learning Engineer designs, develops, and deploys machine learning models to solve real-world problems. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Health Informatics Specialist

A Health Informatics Specialist uses health information technology to improve healthcare delivery and patient outcomes. The Advanced Reproducibility in Cancer Informatics course can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics, which may be useful for this role.

User Experience Designer

A User Experience Designer designs and evaluates user interfaces to ensure that they are user-friendly and efficient. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.

Web Developer

A Web Developer designs, develops, and maintains websites. The Advanced Reproducibility in Cancer Informatics course may be useful, as it can help build a foundation in using tools that enhance reproducibility and replicability in the context of cancer informatics.
