Rafael Irizarry, Michael Love, and Vincent Carey

We begin with an introduction to the relevant biology, explaining what we measure and why. Then we focus on the two main measurement technologies: next-generation sequencing and microarrays. We then move on to describing how raw data and experimental information are imported into R and how we use Bioconductor classes to organize these data, whether generated locally or harvested from public repositories or institutional archives. Genomic features are generally identified using intervals in genomic coordinates, and highly efficient algorithms for computing with genomic intervals will be examined in detail. Statistical methods for testing gene-centric or pathway-centric hypotheses with genome-scale data are found in packages such as limma; some of these techniques will be illustrated in lectures and labs.
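
To make the idea of computing with genomic intervals concrete, here is a minimal sketch of an overlap query. The course itself works in R with Bioconductor classes; this example is written in plain Python only to illustrate the concept and is not the Bioconductor implementation. The function name `find_overlaps`, the `(chrom, start, end)` tuple layout, and the sample `genes` and `reads` are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from bisect import bisect_right
from collections import defaultdict


def find_overlaps(query, subject):
    """Return sorted (query_index, subject_index) pairs of overlapping intervals.

    Each interval is a (chrom, start, end) tuple with inclusive coordinates.
    """
    # Group subject intervals by chromosome so that intervals on different
    # chromosomes are never compared.
    by_chrom = defaultdict(list)
    for j, (chrom, start, end) in enumerate(subject):
        by_chrom[chrom].append((start, end, j))

    # Sort each group by start and keep the starts in a separate list
    # so they can be binary searched.
    index = {}
    for chrom, items in by_chrom.items():
        items.sort()
        index[chrom] = ([s for s, _, _ in items], items)

    hits = []
    for i, (chrom, q_start, q_end) in enumerate(query):
        if chrom not in index:
            continue
        starts, items = index[chrom]
        # Only subjects that start at or before q_end can overlap the query.
        upper = bisect_right(starts, q_end)
        for s_start, s_end, j in items[:upper]:
            # Of those candidates, keep the ones that end at or after q_start.
            if s_end >= q_start:
                hits.append((i, j))
    return sorted(hits)


# Hypothetical example data: three annotated features and four reads.
genes = [("chr1", 100, 200), ("chr1", 500, 900), ("chr2", 100, 200)]
reads = [("chr1", 150, 160), ("chr1", 190, 510), ("chr2", 300, 400), ("chr3", 1, 50)]

overlaps = find_overlaps(reads, genes)
print(overlaps)  # [(0, 0), (1, 0), (1, 1)]
```

The binary search only discards subjects that start after the query ends; the remaining candidates are still scanned linearly. Dedicated interval indexes avoid that scan, which is one reason specialized interval algorithms matter at genome scale.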

Given the diversity in educational background of our students, we have divided the series into seven parts. You can take the entire series or individual courses that interest you. If you are a statistician, you should consider skipping the first two or three courses; similarly, if you are a biologist, you should consider skipping some of the introductory biology lectures. Note that the statistics and programming aspects of the class ramp up in difficulty relatively quickly across the first three courses. By the third course we will be teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced software engineering skills such as parallel computing and reproducible research concepts.

These courses make up two Professional Certificates and are self-paced:

Data Analysis for Life Sciences:

Genomics Data Analysis:

This class was supported in part by NIH grant R25GM114818.

HarvardX requires individuals who enroll in its courses on edX to abide by the terms of the edX honor code. HarvardX will take appropriate corrective action in response to violations of the edX honor code, which may include dismissal from the HarvardX course; revocation of any certificates received for the HarvardX course; or other remedies as circumstances warrant. No refunds will be issued in the case of corrective action for such violations. Enrollees who are taking HarvardX courses as part of another program will also be governed by the academic policies of those programs.

HarvardX pursues the science of learning. By registering as an online learner in an HX course, you will also participate in research about learning. Read our research statement to learn more.

Harvard University and HarvardX are committed to maintaining a safe and healthy educational and work environment in which no member of the community is excluded from participation in, denied the benefits of, or subjected to discrimination or harassment in our program. All members of the HarvardX community are expected to abide by Harvard policies on nondiscrimination, including sexual harassment, and the edX Terms of Service. If you have any questions or concerns, please contact [email protected] and/or report your experience through the edX contact form.

What's inside

Learning objectives

  • What we measure with high-throughput technologies and why
  • Introduction to high-throughput technologies
  • Next-generation sequencing
  • Microarrays
  • Preprocessing and normalization
  • The Bioconductor genomic ranges utilities
  • Genomic annotation

Traffic lights

Examines next-generation sequencing and microarrays, which are core technologies in bioinformatics
Focuses on real-world applications in biology
Introduces students to Bioconductor, a powerful open-source software platform for bioinformatics
Provides a strong foundation for advanced statistical concepts such as hierarchical models
Develops advanced software engineering skills, including parallel computing and reproducible research concepts
Taught by Rafael Irizarry, Vincent Carey, and Michael Love, who are recognized for their work in genomics and bioinformatics

Reviews summary

Bioconductor for genomics data analysis

According to learners, Introduction to Bioconductor provides a solid foundation for analyzing high-throughput biological data using R and the Bioconductor ecosystem. Students highlight that it effectively covers key concepts and packages such as GenomicRanges and limma. The practical labs and exercises are frequently mentioned as very helpful for applying the concepts. However, many reviewers caution about the steep learning curve, noting that the course assumes significant prerequisites in R programming and statistics, which can make the pacing feel fast and lead to a challenging experience for those without sufficient background. It is considered a crucial and essential course for researchers in genomics and life sciences.
Labs and exercises are very helpful.
"The labs are helpful."
"The exercises were practical."
"Labs are well-designed."
Essential for genomics data analysis.
"Essential for anyone in genomics."
"Crucial course for the Genomics Data Analysis certificate."
"Best course on Bioconductor I've found. Practical and theoretically sound."
Introduces essential Bioconductor tools.
"Covers key concepts and Bioconductor packages like `GenomicRanges` and `limma` effectively."
"Made learning Bioconductor accessible and covered essential tools."
"Helpful for getting started. `GenomicRanges` section was excellent."
Content moves quickly.
"Some parts felt rushed."
"Could use more depth on advanced topics."
"`limma` part felt a bit brief."
Needs solid R and stats foundation.
"Assumes more background in stats/R than expected from an 'Introduction'."
"Too difficult. Expected a gentler introduction to R/stats before diving into Bioconductor."
"Pacing is quick on the stats/programming side, as warned."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Introduction to Bioconductor with these activities:
Find a mentor
Finding a mentor can provide you with guidance and support throughout your learning journey.
  • Identify the skills or knowledge you want to develop.
  • Find someone who has those skills or knowledge and is willing to mentor you.
  • Meet with your mentor regularly and ask questions.
Review prerequisites
Reviewing the prerequisites for this course will help you refresh your knowledge and make sure you have a solid foundation before starting the course.
  • Identify the prerequisites for the course.
  • Review your notes or textbooks from previous courses.
  • Complete practice problems or exercises.
Review the textbook
Reading the textbook will help you reinforce the concepts learned in class and gain a deeper understanding of the material.
  • Read the textbook chapters assigned for each module.
  • Take notes and highlight important passages.
  • Complete the end-of-chapter exercises.
Follow online tutorials
Following online tutorials will allow you to learn new skills and concepts at your own pace and reinforce what you have learned in class.
  • Identify the topics you need to learn more about.
  • Find online tutorials on these topics.
  • Follow the tutorials and complete the exercises.
Participate in study groups
Participating in study groups will allow you to discuss the course material with your peers, ask questions, and get feedback on your work.
  • Find a study group to join.
  • Attend study group meetings regularly.
  • Participate in discussions and ask questions.
Solve practice problems
Solving practice problems will help you reinforce the concepts learned in class and improve your problem-solving skills.
  • Identify the topic you need to practice.
  • Find practice problems online or in textbooks.
  • Solve the problems and check your answers.
Create a cheat sheet
Creating a cheat sheet will help you summarize the key concepts and formulas from the course, which can be useful for quick reference during exams or when working on projects.
  • Identify the key concepts and formulas you need to include.
  • Create a visually appealing and easy-to-read cheat sheet.
  • Review your cheat sheet regularly.
Develop a data analysis pipeline
Developing a data analysis pipeline will give you hands-on experience with the entire data analysis process, from data preprocessing to statistical analysis.
  • Identify the data you need to analyze.
  • Preprocess the data.
  • Perform statistical analysis on the data.
  • Interpret the results of your analysis.

Career center

Learners who complete Introduction to Bioconductor will develop knowledge and skills that may be useful to these careers:
Laboratory Technician
Laboratory technicians perform experiments and tests in a laboratory setting. They use their knowledge of science and technology to operate and maintain laboratory equipment, and to collect and analyze data. This course provides a strong foundation in the laboratory techniques used in the life sciences, and it would be particularly useful for laboratory technicians who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Computational Biologist
Computational biologists use their knowledge of computer science and biology to develop and apply computational methods to answer questions about the natural world. This course provides a strong foundation in the computational methods used in the life sciences, and it would be particularly useful for computational biologists who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Research Scientist
Research scientists conduct experiments and studies to investigate natural phenomena and develop new knowledge. They use their knowledge of science and technology to design and conduct experiments, and to analyze and interpret data. This course provides a strong foundation in the scientific principles used in research, and it would be particularly useful for research scientists who want to work with high-throughput data in the life sciences. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Data Scientist
Data scientists use their knowledge of statistics, computer science, and business to solve problems in a variety of industries. This course provides a strong foundation in the statistical and computational methods used in data science, and it would be particularly useful for data scientists who want to work with high-throughput data in the life sciences. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Scientific Programmer
Scientific programmers use their knowledge of computer science and science to develop and apply computational methods to solve problems in science and engineering. This course provides a strong foundation in the computational methods used in the life sciences, and it would be particularly useful for scientific programmers who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Molecular Biologist
Molecular biologists study the structure and function of molecules in living organisms. They use their knowledge of molecular biology to understand how cells work and how they interact with each other. This course provides a strong foundation in the molecular principles used in the life sciences, and it would be particularly useful for molecular biologists who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Biostatistician
Biostatisticians play a vital role in the design and analysis of experiments in the life sciences. They use their knowledge of statistics and biology to develop and apply statistical methods to answer questions about the natural world. This course provides a strong foundation in the statistical methods used in the life sciences, and it would be particularly useful for biostatisticians who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Statistician
Statisticians collect, analyze, and interpret data. They use their knowledge of statistics to design and conduct studies, and to analyze and interpret data. This course provides a strong foundation in the statistical methods used in the life sciences, and it would be particularly useful for statisticians who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Genomicist
Genomicists study the genomes of living organisms. They use their knowledge of genomics to understand how organisms function and how they evolve. This course provides a strong foundation in the genomic principles used in the life sciences, and it would be particularly useful for genomicists who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Geneticist
Geneticists study the genes and chromosomes of living organisms. They use their knowledge of genetics to understand how organisms inherit traits and how traits are passed down from one generation to the next. This course provides a strong foundation in the genetic principles used in the life sciences, and it would be particularly useful for geneticists who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Software Engineer
Software engineers design, develop, and maintain software applications. They use their knowledge of computer science and software engineering to create software that meets the needs of users. This course provides a strong foundation in the software engineering principles used in the life sciences, and it would be particularly useful for software engineers who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Pharmacologist
Pharmacologists study the effects of drugs on living organisms. They use their knowledge of pharmacology to develop new drugs and treatments for diseases. This course provides a strong foundation in the pharmacological principles used in the life sciences, and it would be particularly useful for pharmacologists who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Healthcare Data Analyst
Healthcare data analysts use their knowledge of statistics, computer science, and healthcare to solve problems in the healthcare industry. This course provides a strong foundation in the statistical and computational methods used in healthcare data analysis, and it would be particularly useful for healthcare data analysts who want to work with high-throughput data in the life sciences. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Medical Scientist
Medical scientists use their knowledge of science and medicine to develop new drugs and treatments for diseases. This course provides a strong foundation in the scientific principles used in medical research, and it would be particularly useful for medical scientists who want to work with high-throughput data in the life sciences. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.
Physician
Physicians diagnose and treat diseases in patients. They use their knowledge of medicine to provide medical care to patients and to help them maintain their health. This course provides a strong foundation in the medical principles used in the life sciences, and it would be particularly useful for physicians who want to work with high-throughput data. The course covers topics such as preprocessing and normalization of data, the use of genomic ranges utilities, and genomic annotation.

Reading list

We've selected 21 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Introduction to Bioconductor.
This text from the authors of the popular Bioconductor package manager and developer tools provides practical examples of data processing.
Provides a comprehensive introduction to the Bioconductor software suite, which is widely used for the analysis and visualization of high-throughput biological data. It covers a wide range of topics, from data preprocessing and normalization to statistical analysis and visualization.
Provides case studies that illustrate how to use Bioconductor for real-world genomics data analysis tasks. It is a valuable resource for anyone who wants to learn how to use Bioconductor effectively.
Provides a broad overview of bioinformatics and functional genomics, from fundamental concepts to advanced topics. Serves as a valuable reference for learners with a background in the field.
Covers the methods and software used in the analysis of gene expression data. A comprehensive resource for learners who want to gain a deeper understanding of this topic.
This text introduces the methods for the analysis of gene expression data from microarrays.
Provides a comprehensive overview of statistical methods for analyzing data from high-throughput biological experiments.
Covers the core concepts and tools used in bioinformatics. Suitable for learners with little to no prior knowledge in the field.
Provides a comprehensive introduction to statistical methods and R programming. Valuable for learners who need to strengthen their statistical foundation.
Provides a comprehensive overview of statistical learning methods. It covers a wide range of topics, from data preprocessing and model selection to statistical inference and prediction.
Provides a comprehensive overview of genomics. It covers a wide range of topics, from the structure and function of DNA to the analysis of genomic data.
Provides a comprehensive overview of the methods used for analyzing next-generation sequencing data. It is a valuable resource for anyone who wants to learn more about the computational challenges of genomics data analysis.
Provides a comprehensive overview of molecular biology. It covers a wide range of topics, from the structure and function of cells to the regulation of gene expression.
Provides a comprehensive overview of biochemistry. It covers a wide range of topics, from the structure and function of proteins to the metabolism of carbohydrates, lipids, and nucleic acids.
Provides a comprehensive overview of statistical inference. It covers a wide range of topics, from the foundations of probability to the analysis of variance and regression.
Provides a comprehensive overview of linear models. It covers a wide range of topics, from the basics of linear regression to the analysis of variance and covariance.
Provides a comprehensive overview of multivariate statistical analysis. It covers a wide range of topics, from the basics of multivariate regression to the analysis of variance and covariance.
Provides a comprehensive overview of the statistical methods used for analyzing bioinformatics data. It is a valuable resource for anyone who wants to learn more about the statistical foundations of genomics data analysis.
Provides a comprehensive overview of the algorithms used for analyzing bioinformatics data. It is a valuable resource for anyone who wants to learn more about the computational challenges of genomics data analysis.

© 2016 - 2025 OpenCourser