Candace Savonen, MS

Goal of this course:

Equip learners with the basic skills and confidence to use containers in the context of scientific software analyses.

Expectations:

This course is not meant to teach learners how to create complex containers, but instead to introduce learners to the fundamentals of continuous integration and continuous deployment (CI/CD). This course focuses on containers (Docker or Podman) and will not cover any other (perfectly fine) tools for CI/CD.

Equipping researchers with the skills to create reproducible data analyses increases the efficiency of everyone involved. By recognizing that writing biological data analysis code is a form of software development, we can try to adapt good development practices to scientific analyses and software contexts.

Scientific software projects may include (but aren’t limited to):

- Software built as tools to be utilized by others to analyze biologically derived data

- Code that is built primarily for analyzing one project’s data

- Code that is built as a workflow for a series of steps and analyses that might be reused among collaborators or within a lab

- Any scripts and code that are built to handle data in a research setting

- Any scripts and code a researcher might interact with

Containers are one tool among many for creating reproducible analyses. A container is a lightweight, portable, and isolated environment that encapsulates an application and its dependencies, enabling it to run consistently across different computing environments. Many individuals performing analyses on cancer data may not have formal training in software development and may be unfamiliar with the idea of containers.
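To make "an application and its dependencies" concrete, here is a minimal Python sketch. It is not taken from the course. The helper name `environment_snapshot` and the package names are hypothetical placeholders. It records the interpreter version, operating system, and installed package versions, which are the kinds of details that differ between machines and that a container image holds fixed.

```python
import json
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Return interpreter, OS, and package versions for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Placeholder package names; substitute whatever the analysis imports.
    snapshot = environment_snapshot(["numpy", "pandas"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Running the script on two different machines will often print different values. Inside the same container image, the values should match wherever the container is run.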

Unique Features of This Course

- Hands-on exercises exploring real uses of containers for scientific research and software

- Activities that demonstrate common pitfalls of using containers

- Information about how to use two of the most common tools for containers: Docker and Podman

Key Words

Reproducibility, Containers, Podman, Docker, Scientific Software Development, Biomedical Research

Intended Audience/Required Knowledge

- The course is intended for researchers and research staff who might be interested in learning about using containers to make their research or scientific software more reproducible.

- Some familiarity with biomedical or health-related research and with programming (including bash and the command line) is required.

Learning Objectives

- Understand that computing environments are moving targets

- Use containers to share a controlled computing environment

- Pull and use a Docker image from an online registry

- Modify a Docker image

- Build a Docker image from scratch

- Troubleshoot the most common Docker-related errors
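As a rough illustration of the pull, run, and build objectives above, the Python sketch below only assembles the command lines and does not execute them. It is not course material. The function names, the image name `ubuntu:22.04`, and the tag `my-analysis:0.1` are placeholders chosen for this example. The `pull`, `run --rm`, and `build -t` subcommands and flags exist in both the Docker and Podman command-line tools.

```python
import shlex


def pull_command(image, engine="docker"):
    """Command that downloads an image from a registry."""
    return [engine, "pull", image]


def run_command(image, command, engine="docker"):
    """Command that runs `command` in a new container; --rm deletes the container on exit."""
    return [engine, "run", "--rm", image, *command]


def build_command(tag, context=".", engine="docker"):
    """Command that builds an image from the Dockerfile in `context` and tags it."""
    return [engine, "build", "-t", tag, context]


if __name__ == "__main__":
    # Placeholder image and tag names, printed as shell-ready strings.
    print(shlex.join(pull_command("ubuntu:22.04")))
    print(shlex.join(run_command("ubuntu:22.04", ["cat", "/etc/os-release"])))
    print(shlex.join(build_command("my-analysis:0.1", engine="podman")))
```

Building each command as a list of arguments means it can later be passed to `subprocess.run` without shell quoting problems. Switching `engine` between `"docker"` and `"podman"` is the only change needed for these three subcommands.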

Accessibility

We are committed to making our content accessible and available to all. We welcome any feedback you might have at https://forms.gle/SzuZjct4ZQyt3Cos7. Questions related to accessibility accommodations should be directed to https://studentserviceportal.force.com/s/.

What's inside

Syllabus

Introduction
In this first module, we will cover how this course works and the motivation for using containers for reproducible research.
Getting started with Containers

Career center

Learners who complete Wrangling Computing Environments: Using Docker for Research will develop knowledge and skills that may be useful to these careers:
Bioinformatics Scientist
A Bioinformatics Scientist applies computational methods to understand biological data, often developing and utilizing complex analysis pipelines. This course helps build a foundation for a Bioinformatics Scientist by directly addressing the critical need for reproducible research. With hands-on exercises using Docker and Podman, it equips learners with essential skills to manage computing environments, ensuring analyses are consistent and shareable. Understanding how to pull, modify, build, and troubleshoot containers will be particularly helpful in deploying scientific software and collaborating effectively within research teams. The course is especially relevant given its focus on adapting good development practices to scientific analyses, a key aspect of modern bioinformatics. This role typically requires an advanced degree.
Research Software Engineer
As a Research Software Engineer, you bridge the gap between scientific research and robust software development, creating tools and platforms that enable cutting-edge discoveries. This course is highly relevant for a Research Software Engineer because it focuses on adapting good development practices to scientific software contexts, emphasizing reproducibility. The practical experience gained in understanding and using containers like Docker and Podman provides a strong foundation for managing complex software dependencies and ensuring consistent execution environments. Learning to troubleshoot container-related errors, modify images, and implement CI/CD fundamentals directly supports the development of reliable and shareable scientific software. This role often requires an advanced degree or extensive experience.
Computational Biologist
A Computational Biologist develops and applies computational models and algorithms to solve biological problems, often requiring advanced programming and data analysis skills. This course helps a Computational Biologist by equipping you with the fundamental skills to utilize containers for creating reproducible data analyses, which is crucial for validating and sharing your models. The hands-on work with Docker and Podman, including learning to use, modify, and build images, provides a portable solution for encapsulating your computational environments. Understanding container best practices and troubleshooting common issues will enhance your ability to maintain consistency across different computing setups, a key aspect of scientific software development in this field. This role typically requires a Master's or PhD degree.
Bioinformatics Software Developer
A Bioinformatics Software Developer designs, builds, and maintains specialized software tools and pipelines for analyzing biological and genomic data, contributing significantly to scientific research. This course is exceptionally relevant for a Bioinformatics Software Developer because it emphasizes adapting good development practices to scientific software contexts, with a strong focus on containerization. The hands-on experience with Docker and Podman, including learning to build and modify images, use them as a development space, and troubleshoot issues, directly translates to creating robust, reproducible, and shareable bioinformatics tools. These skills are crucial for ensuring the reliability and consistency of complex analytical workflows. This role may require an advanced degree, or strong development experience.
Genomics Data Scientist
A Genomics Data Scientist specializes in analyzing large-scale genomic data, applying computational and statistical methods to uncover biological insights and contribute to precision medicine. For a Genomics Data Scientist, mastering reproducible data analyses is paramount due to the complexity and volume of genomic information. This course directly addresses this by providing hands-on experience with Docker and Podman, allowing you to create stable and shareable computing environments for your intricate pipelines. The ability to pull, modify, and build Docker images, alongside troubleshooting skills, will be particularly helpful in managing diverse bioinformatics tools and ensuring consistent results across research projects. This role usually requires a Master's or PhD.
Biomedical Data Analyst
A Biomedical Data Analyst extracts insights from complex health-related datasets, such as cancer data, playing a crucial role in scientific discovery and clinical translation. This course is particularly valuable for a Biomedical Data Analyst as it directly addresses the challenge of ensuring reproducible data analyses within research settings. By gaining hands-on experience with Docker and Podman, you will develop the capacity to create and share controlled computing environments, ensuring that your analytical pipelines yield consistent results regardless of the platform. The focus on troubleshooting containers and understanding common pitfalls will also be very helpful in maintaining robust and reliable data analysis workflows. This role often requires a Master's degree.
Scientific Programmer
A Scientific Programmer develops specialized software and scripts to support scientific research, translating complex algorithms into functional and efficient code. For a Scientific Programmer, this course helps build a foundation in adapting good development practices to scientific analyses and software contexts, particularly through the use of containers. The hands-on exercises with Docker and Podman, focusing on how to use, modify, and troubleshoot containers, directly empower you to create isolated and reproducible environments for your code. This ensures that your software runs consistently and is easily shareable among collaborators, increasing efficiency and reliability in research settings where scripts and code are built to handle data. This role may require an advanced degree depending on the specialization.
Data Scientist
A Data Scientist extracts insights from vast datasets, building models and performing analyses to inform decisions across various industries and research fields. This course helps a Data Scientist by addressing the fundamental challenge of ensuring reproducible data analyses, which is vital for maintaining the integrity and trustworthiness of your work. The practical skills gained in utilizing Docker and Podman to manage computing environments mean you can confidently share your analytical setups and models, knowing they will perform consistently elsewhere. Learning about CI/CD fundamentals and troubleshooting container errors will be particularly helpful in streamlining your workflow and deploying robust analytical solutions. This role often requires a Master's degree, and sometimes a PhD.
DevOps Engineer
A DevOps Engineer focuses on optimizing the software development lifecycle, emphasizing continuous integration, continuous delivery, and infrastructure as code. This course may be useful for a DevOps Engineer as it introduces learners to basic fundamentals of continuous integration and continuous deployment, with a strong focus on containers. The hands-on experience with Docker and Podman, including learning to use, modify, and troubleshoot these environments, provides a foundational understanding of key containerization technologies. While not a comprehensive DevOps curriculum, the course offers valuable skills in managing reproducible computing environments, which are essential for creating efficient and reliable deployment pipelines. This role typically does not require an advanced degree.
Systems Administrator, Research Computing
A Systems Administrator for Research Computing manages and maintains the computing infrastructure specifically tailored for scientific research, ensuring optimal performance and support for complex computational workflows. This course is highly relevant for this role, as it provides crucial skills in utilizing containers to manage diverse scientific software. The hands-on experience with Docker and Podman, from pulling and modifying images to troubleshooting common errors, directly equips you to provide stable, isolated, and reproducible computing environments for researchers. This capability significantly streamlines software deployment and enhances collaborative efforts within a research setting. This role typically does not require an advanced degree, but deep technical knowledge is essential.
Cloud Engineer
As a Cloud Engineer, you design, implement, and manage cloud infrastructure and services, often leveraging virtualization and containerization for scalable applications. This course may be useful for a Cloud Engineer because it provides hands-on experience with Docker and Podman, which are foundational technologies in modern cloud environments. Learning to pull, modify, and build container images, as well as troubleshoot common errors, directly supports deploying and managing containerized applications on cloud platforms. While not covering the full scope of cloud infrastructure, the skills in creating reproducible and isolated computing environments are directly transferable to building robust and efficient cloud solutions. This role typically does not require an advanced degree.
Data Engineer
A Data Engineer designs, builds, and maintains robust data pipelines and infrastructure, ensuring data is accessible, reliable, and optimized for analysis and machine learning. This course may be useful for a Data Engineer seeking to enhance reproducibility within data processing workflows. The practical skills gained in creating and managing containerized environments using Docker and Podman are directly applicable to isolating dependencies and ensuring consistent execution of data transformations and analytical jobs. Learning about basic CI/CD fundamentals and troubleshooting containers will be helpful in building more reliable and portable data solutions, especially when dealing with complex scientific data. This role typically does not require an advanced degree.
Technical Support Specialist, Scientific Software
A Technical Support Specialist for Scientific Software provides crucial assistance to researchers, helping them resolve issues with specialized scientific applications and computing environments. This course may be useful for this role by equipping you with a foundational understanding of containers, which are often at the core of complex scientific software deployments. The hands-on training in troubleshooting the most common Docker-related errors, recognizing pitfalls, and understanding how to share and modify containerized environments directly enhances your ability to diagnose and resolve user issues related to software reproducibility and environment configuration. This role typically does not require an advanced degree, but domain-specific knowledge is important.
Machine Learning Engineer
A Machine Learning Engineer builds and deploys machine learning models, ensuring they are robust, scalable, and operate effectively in production environments. This course may be useful for a Machine Learning Engineer, particularly in research settings, by helping to address the critical need for reproducible experimental environments. The practical skills learned with Docker and Podman, including creating, modifying, and sharing controlled computing environments, are directly applicable to packaging and deploying machine learning models and their complex dependencies. Understanding troubleshooting for container issues will be helpful in maintaining consistent model performance and facilitating collaboration. This role often requires a Master's degree.
Quantitative Analyst
A Quantitative Analyst uses mathematical, statistical, and computational methods to develop models and analyze data, often in complex domains like finance or scientific research. This course may be useful for a Quantitative Analyst, particularly one involved in research, by providing essential skills for ensuring the reproducibility of computational models and analyses. The hands-on experience with Docker and Podman enables you to create isolated and consistent environments for your simulations and data processing, which is crucial for validation and collaboration. Understanding how to manage and troubleshoot containerized setups will be helpful in maintaining the integrity and reliability of your quantitative workflows. This role typically requires an advanced degree, such as a Master's or PhD.

© 2016 - 2025 OpenCourser