We may earn an affiliate commission when you visit our partners.
Pavel Pevzner and Phillip Compeau

In previous courses in the Specialization, we have discussed how to sequence and compare genomes. This course will cover advanced topics in finding mutations lurking within DNA and proteins.

In the first half of the course, we would like to ask how an individual's genome differs from the "reference genome" of the species. Our goal is to take small fragments of DNA from the individual and "map" them to the reference genome. We will see that the combinatorial pattern matching algorithms solving this problem are elegant and extremely efficient, requiring a surprisingly small amount of runtime and memory.
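As a rough illustration of what "mapping" a read means, the following minimal sketch finds every exact occurrence of a short read in a reference string using a suffix array and binary search. The function names and the toy reference are hypothetical, the suffix array construction is deliberately naive, and exact matching ignores the sequencing errors and mutations that a practical read mapper has to tolerate.

```python
def build_suffix_array(reference):
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction: it sorts full suffixes, which is fine for a toy
    reference but far too slow and memory-hungry for a real genome.
    """
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def map_read(reference, suffix_array, read):
    """Return the sorted positions where `read` occurs exactly in `reference`."""
    k = len(read)

    def prefix(rank):
        # First k characters of the suffix with the given rank.
        start = suffix_array[rank]
        return reference[start:start + k]

    # Lower bound: first suffix whose k-prefix is >= read.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < read:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose k-prefix is > read.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= read:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    reference = "panamabananas"          # toy stand-in for a reference genome
    sa = build_suffix_array(reference)
    print(map_read(reference, sa, "ana"))  # [1, 7, 9]
```

Each read costs two binary searches over the suffix array, so the reference is indexed once and then reused for every read.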

In the second half of the course, we will learn how to identify the function of a protein even if it has been bombarded by so many mutations compared to similar proteins with known functions that it has become barely recognizable. This is the case, for example, in HIV studies, since the virus often mutates so quickly that researchers can struggle to study it. The approach we will use is based on a powerful machine learning tool called a hidden Markov model.

Finally, you will learn how to use popular bioinformatics software tools that apply hidden Markov models to compare a protein against a related family of proteins.

What's inside

Syllabus

Week 1: Introduction to Read Mapping

Welcome to our class! We are glad that you decided to join us.

In this class, we will consider the following two central biological questions (the computational approaches needed to solve them are shown in parentheses):

  1. How Do We Locate Disease-Causing Mutations? (Combinatorial Pattern Matching)
  2. Why Have Biologists Still Not Developed an HIV Vaccine? (Hidden Markov Models)

As in previous courses, each of these two chapters is accompanied by a Bioinformatics Cartoon created by talented artist Randall Christopher and serving as a chapter header in the Specialization's bestselling print companion. You can find the first chapter's cartoon at the bottom of this message.

Week 2: The Burrows-Wheeler Transform

Welcome to week 2 of the class!

This week, we will introduce a paradigm called the Burrows-Wheeler transform; after seeing how it can be used in string compression, we will demonstrate that it is also the foundation of modern read-mapping algorithms.
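For readers who have not met the transform before, here is a minimal sketch of the Burrows-Wheeler transform and its inverse. It assumes the text ends with a `$` sentinel that sorts before every other symbol, and it uses the simple "sort all rotations" definition rather than the efficient constructions a real read mapper would rely on. The function names are illustrative only.

```python
def bwt(text):
    """Return the Burrows-Wheeler transform of `text` (must end in '$')."""
    if not text.endswith("$"):
        raise ValueError("append a '$' sentinel to the text first")
    # Sort every cyclic rotation, then read off the last column.
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column):
    """Recover the original text from its transform (naive method)."""
    n = len(last_column)
    table = [""] * n
    # Repeatedly prepend the last column and re-sort; after n rounds the
    # table holds all rotations of the original text in sorted order.
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The rotation that ends with the sentinel is the original text.
    return next(row for row in table if row.endswith("$"))


if __name__ == "__main__":
    transformed = bwt("banana$")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana$
```

The transform tends to group identical symbols into runs, which is what makes it useful for compression, and the fact that it is invertible is what lets it double as a search index.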

Week 3: Speeding Up Burrows-Wheeler Read Mapping

Welcome to week 3 of class!

Last week, we saw how the Burrows-Wheeler transform could be applied to multiple pattern matching. This week, we will speed up our algorithm and generalize it to the case in which patterns have errors, which models the biological problem of mapping reads with errors to a reference genome.
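The kind of speed-up described here can be sketched with backward search over the transform. The example below is an assumption-laden illustration, not the course's own code: it precomputes a first-occurrence table and full count arrays (real implementations typically store only periodic checkpoints to save memory), and it handles exact matches only, leaving mismatches aside. All names are hypothetical.

```python
def bwt(text):
    """Naive Burrows-Wheeler transform of a '$'-terminated text."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_matches(last_column, pattern):
    """Count exact occurrences of `pattern` using only the BWT."""
    # first_occurrence[c]: row of the first c in the sorted first column.
    first_occurrence = {}
    for row, symbol in enumerate(sorted(last_column)):
        first_occurrence.setdefault(symbol, row)

    # counts[c][i]: number of times c appears in last_column[:i].
    counts = {c: [0] * (len(last_column) + 1) for c in first_occurrence}
    for i, symbol in enumerate(last_column):
        for c in counts:
            counts[c][i + 1] = counts[c][i] + (c == symbol)

    # Walk the pattern right to left, narrowing the range of matching rows.
    top, bottom = 0, len(last_column) - 1
    for symbol in reversed(pattern):
        if symbol not in first_occurrence:
            return 0
        top = first_occurrence[symbol] + counts[symbol][top]
        bottom = first_occurrence[symbol] + counts[symbol][bottom + 1] - 1
        if top > bottom:
            return 0
    return bottom - top + 1


if __name__ == "__main__":
    index = bwt("panamabananas$")
    print(count_matches(index, "ana"))  # 3
```

After the tables are built, each pattern is processed in time proportional to its length, independent of the length of the reference.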

Week 4: Introduction to Hidden Markov Models

Welcome to week 4 of class!

This week, we will start examining the case of aligning sequences with many mutations -- such as related genes from different HIV strains -- and see that our problem formulation for sequence alignment is not adequate for highly diverged sequences.

To improve our algorithms, we will introduce a machine-learning paradigm called a hidden Markov model and see how dynamic programming helps us answer questions about these models.
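One of the questions dynamic programming answers about a hidden Markov model is which sequence of hidden states most likely produced the observed symbols. The sketch below implements the Viterbi algorithm for that question. The two-state "fair versus biased coin" model and every probability in it are made-up illustrative values, and the code assumes all probabilities are strictly positive so that logarithms are defined.

```python
import math


def viterbi(observations, states, start, transition, emission):
    """Return the most probable hidden-state path as a string.

    Works in log space to avoid numeric underflow on long sequences.
    Assumes every probability used is strictly positive.
    """
    # scores[k][s]: best log-probability of any path that ends in state s
    # after emitting observations[:k + 1].
    scores = [{s: math.log(start[s]) + math.log(emission[s][observations[0]])
               for s in states}]
    backpointers = []

    for symbol in observations[1:]:
        previous = scores[-1]
        column, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in s at this step.
            best = max(states,
                       key=lambda r: previous[r] + math.log(transition[r][s]))
            column[s] = (previous[best] + math.log(transition[best][s])
                         + math.log(emission[s][symbol]))
            pointers[s] = best
        scores.append(column)
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


if __name__ == "__main__":
    # Hypothetical model: F = fair coin, B = coin biased toward heads.
    states = ["F", "B"]
    start = {"F": 0.5, "B": 0.5}
    transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
    emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}
    print(viterbi("TTTTTTHHHHHHHH", states, start, transition, emission))
    # FFFFFFBBBBBBBB
```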

Week 5: Profile HMMs for Sequence Alignment

Welcome to week 5 of class!

Last week, we introduced hidden Markov models. This week, we will see how hidden Markov models can be applied to sequence alignment with a profile HMM. We will then consider some advanced topics in this area, which are related to the advanced clustering methods that we considered in a previous course.
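To give a feel for how a profile HMM is seeded from a multiple alignment, the sketch below computes only the match-state emission probabilities. It is a partial illustration under stated assumptions: a column is treated as an insertion column and skipped when its fraction of gap symbols is at least a threshold `theta`, gaps are written as `-`, and no pseudocounts are added. Transition probabilities and insert or delete states are left out entirely, and the function name is hypothetical.

```python
from collections import Counter


def match_emissions(alignment, theta):
    """Return one emission distribution per match column of the alignment.

    alignment: list of equal-length strings, with '-' marking a gap.
    theta: columns whose gap fraction is >= theta are skipped (assumption).
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must be in the interval (0, 1]")
    n_rows = len(alignment)
    profile = []
    for column in zip(*alignment):
        gap_fraction = column.count("-") / n_rows
        if gap_fraction >= theta:
            continue  # insertion column: contributes no match state
        counts = Counter(symbol for symbol in column if symbol != "-")
        total = sum(counts.values())
        profile.append({symbol: n / total for symbol, n in counts.items()})
    return profile


if __name__ == "__main__":
    toy_alignment = ["ACG-",
                     "A-GT",
                     "ACC-"]
    for position, distribution in enumerate(match_emissions(toy_alignment, 0.5)):
        print(position, distribution)
```

In this toy alignment the last column is two-thirds gaps, so with `theta = 0.5` it is skipped and the profile has three match states.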

Week 6: Bioinformatics Application Challenge

Welcome to the sixth and final week of class!

This week brings our Application Challenge, in which we apply the HMM sequence alignment algorithms that we have developed.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Strengthens an existing foundation for intermediate learners in bioinformatics
Develops advanced knowledge, specifically in mutation detection using hidden Markov models and pattern matching
Led by Pavel Pevzner and Phillip Compeau, respected researchers and instructors in bioinformatics
Provides hands-on instruction through bioinformatics software tools, such as those focused on hidden Markov models
Teaches combinatorial pattern matching algorithms for mapping DNA fragments to a reference genome

Reviews summary

Compelling course on mutations in DNA and proteins

Learners say this course is engaging, well-structured, and a favorite in its specialization. They especially enjoyed learning about suffix trees, BWT, pattern matching, and HMMs. Overall, students describe this course as both great and helpful.
Engaging topics include suffix trees, BWT, pattern matching, and HMMs.
"Really loved learning about suffix trees/arrays, BWT, Pattern matching, HMMs!"
Content is well-organized and helpful.
"The contents were so well organized and helpful to develop a proper insight"
Course is a favorite among students.
"It was probably my favorite in this specialization (at least, out of first six)."
"It was the greatest course I've ever attended."
Course may be less engaging than others in the specialization.
"I did this course after the course "Algorithms for DNA Sequencing" of Ben Langmead. Since Ben Langmead was excellent in his explanations, my expectations were the same for this course but I felt that it is not as good as Ben Langmead course maybe because there are no practical videos about programming ideas."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Finding Mutations in DNA and Proteins (Bioinformatics VI) with these activities:
Organize Course Materials
Enhance comprehension through efficient organization of course materials.
  • Create a dedicated folder or notebook for course materials.
  • Categorize and organize lecture slides, notes, readings, and other resources.
  • Use a consistent naming convention to easily locate files.
  • Review organized materials regularly to reinforce learning.
Review Bioinformatics Databases
Refresh knowledge of bioinformatics databases to enhance information retrieval.
  • Visit popular bioinformatics databases such as NCBI GenBank and UniProt.
  • Explore different search options and filters to retrieve relevant data.
  • Practice downloading and analyzing sequence data.
Review: Designing Algorithms - A Computational Perspective
Get an overview of useful data structures and algorithms typically used for combinatorial pattern matching.
  • Read the table of contents and article summaries.
  • Read the first two chapters of each part.
  • Choose one particularly relevant chapter and read it in its entirety.
Participate in a Study Group on HMMs
Reinforce understanding of HMMs through collaborative engagement.
  • Join or form a study group with fellow learners.
  • Establish regular meeting times.
  • Choose a specific aspect of HMMs to focus on each session.
  • Work together to solve problems and deepen comprehension.
Follow Tutorials on Advanced Read Mapping Algorithms
Expand knowledge of read mapping techniques for more accurate sequence alignment.
  • Identify online tutorials or courses on advanced read mapping algorithms.
  • Select a tutorial that aligns with your learning goals.
  • Follow the tutorial, completing any exercises or assignments.
Practice Map-Reduce Problems
Study combinatorial pattern matching in the context of map-reduce algorithms.
  • Review the basic concepts of MapReduce.
  • Study how to use Hadoop or a similar framework to implement MapReduce programs.
  • Solve a series of practice problems involving combinatorial pattern matching using MapReduce.
Develop a Profile HMM for Protein Sequence Alignment
Apply advanced concepts of HMMs and sequence alignment to protein analysis.
  • Review the principles of profile HMMs and protein sequence alignment.
  • Choose a dataset of protein sequences and align them using a multiple sequence alignment tool.
  • Construct a profile HMM from the aligned sequences.
  • Use the profile HMM to align new protein sequences.
Create a Tutorial on Burrows-Wheeler Transform
Enhance understanding through deeper engagement with the BWT.
  • Understand theoretical foundations of the Burrows-Wheeler Transform.
  • Choose and study an implementation of BWT in a programming language.
  • Create a step-by-step tutorial explaining the implementation and its applications.

Career center

Learners who complete Finding Mutations in DNA and Proteins (Bioinformatics VI) will develop knowledge and skills that may be useful to these careers:
Professor
A Professor teaches and conducts research in a specific field. This course will provide you with the skills to develop and teach courses in bioinformatics, a key skill for any Professor. The course will cover topics such as bioinformatics algorithms and applications, which are essential topics for any Professor teaching in the field of bioinformatics.
Research Scientist
A Research Scientist conducts research in a specific field. This course will provide you with the skills to develop and conduct research in bioinformatics, a key skill for any Research Scientist. The course will cover topics such as bioinformatics algorithms and applications, which are essential topics for any Research Scientist working in the field of bioinformatics.
Systems Biologist
A Systems Biologist studies the complex interactions between molecules in living organisms. This course will provide you with the skills to understand the systems-level properties of biological systems, a key skill for any Systems Biologist. The course will cover topics such as network analysis and systems biology modeling, which are essential topics for any Systems Biologist working in the field of bioinformatics.
Computational Biologist
A Computational Biologist uses mathematical and computational techniques to solve biological problems. This course will provide you with the skills to develop and apply computational methods to solve biological problems, a key skill for any Computational Biologist. The course will cover topics such as sequence alignment and protein structure prediction, which are essential topics for any Computational Biologist working in the field of bioinformatics.
Technical Writer
A Technical Writer writes technical documentation. This course will provide you with the skills to write technical documentation for bioinformatics software and applications, a key skill for any Technical Writer. The course will cover topics such as technical writing and documentation, which are essential topics for any Technical Writer working in the field of bioinformatics.
Biostatistician
A Biostatistician analyzes and interprets large biological datasets to answer complex biological questions. This course will provide you with the skills to analyze and interpret data, a key skill for any Biostatistician. The course will cover topics such as statistical methods for analyzing DNA and protein sequences, which are essential topics for any Biostatistician working in the field of bioinformatics.
Teacher
A Teacher teaches students in a specific subject. This course will provide you with the skills to develop and teach courses in bioinformatics, a key skill for any Teacher. The course will cover topics such as bioinformatics algorithms and applications, which are essential topics for any Teacher teaching in the field of bioinformatics.
Geneticist
A Geneticist studies the genes and heredity of organisms. This course will provide you with the skills to understand the genetic basis of diseases, a key skill for any Geneticist. The course will cover topics such as DNA sequencing and genome analysis, which are essential topics for any Geneticist working in the field of bioinformatics.
Molecular Biologist
A Molecular Biologist studies the structure and function of molecules in living organisms. This course will provide you with the skills to understand the molecular basis of diseases, a key skill for any Molecular Biologist. The course will cover topics such as protein structure and function, which are essential topics for any Molecular Biologist working in the field of bioinformatics.
Statistician
A Statistician collects, analyzes, and interprets data. This course will provide you with the skills to analyze and interpret data, a key skill for any Statistician. The course will cover topics such as statistical methods for analyzing DNA and protein sequences, which are essential topics for any Statistician working in the field of bioinformatics.
Software Engineer
A Software Engineer designs, develops, and maintains software systems. This course will provide you with the skills to develop software tools for bioinformatics, a key skill for any Software Engineer. The course will cover topics such as software design and development, which are essential topics for any Software Engineer working in the field of bioinformatics.
Web Developer
A Web Developer designs and develops websites. This course will provide you with the skills to develop web applications for bioinformatics, a key skill for any Web Developer. The course will cover topics such as web design and development, which are essential topics for any Web Developer working in the field of bioinformatics.
Pharmacist
A Pharmacist dispenses and advises on the use of medications. This course will provide you with the skills to understand the molecular basis of drugs, a key skill for any Pharmacist. The course will cover topics such as drug metabolism and pharmacokinetics, which are essential topics for any Pharmacist working in the field of bioinformatics.
Physician
A Physician diagnoses and treats diseases. This course will provide you with the skills to understand the molecular basis of diseases, a key skill for any Physician. The course will cover topics such as disease genomics and personalized medicine, which are essential topics for any Physician working in the field of bioinformatics.
Data Scientist
A Data Scientist uses data to solve business problems. This course will provide you with the skills to collect, clean, and analyze data, a key skill for any Data Scientist. The course will cover topics such as machine learning and statistical modeling, which are essential topics for any Data Scientist working in the field of bioinformatics.

Reading list

We've selected 16 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Finding Mutations in DNA and Proteins (Bioinformatics VI).
Provides a comprehensive overview of algorithms on strings, trees, and sequences. It is a valuable reference for more advanced topics in bioinformatics, and it is also a good resource for understanding the algorithms used in this course.
Provides a comprehensive overview of probabilistic models for biological sequences, including hidden Markov models. It is a valuable resource for understanding the algorithms used in this course.
Provides a comprehensive overview of hidden Markov models. It is a valuable resource for understanding the theory and applications of hidden Markov models, and it is also a good reference for the algorithms used in this course.
Provides a comprehensive overview of bioinformatics algorithms, including those covered in this course, such as pattern matching, hidden Markov models, and multiple sequence alignment.
Provides a comprehensive overview of bioinformatics algorithms and their applications. It would be a valuable resource for students interested in learning more about bioinformatics algorithms.
Provides a comprehensive overview of machine learning techniques in bioinformatics. It would be a valuable resource for students interested in learning more about machine learning techniques.
Provides a good introduction to the algorithms used in bioinformatics, and it is a useful resource for understanding the basics of this field.
Provides a comprehensive overview of bioinformatics, including both a theoretical and practical perspective. It would be a valuable resource for students taking this course, as it provides a solid foundation in the underlying algorithms and techniques used in bioinformatics.
Provides a comprehensive overview of data mining concepts and techniques. It would be a useful resource for students interested in learning more about data mining techniques.
Provides a comprehensive overview of statistical methods in bioinformatics. It would be a valuable resource for students interested in learning more about statistical methods.
Provides a practical guide to bioinformatics programming using Python. It would be a useful resource for students interested in applying bioinformatics algorithms to real-world problems.
Provides a comprehensive overview of machine learning, including hidden Markov models. It is a good resource for students who want to learn more about the theoretical foundations of these models.
Provides a broad overview of bioinformatics, including topics such as sequence analysis, protein structure prediction, and phylogenetics. It is a good resource for students who want to learn more about the field.
Provides a comprehensive overview of sequence alignment methods. It is a good resource for students who want to learn more about the different approaches to aligning sequences.
Provides a practical guide to bioinformatics analysis. It includes chapters on sequence alignment, hidden Markov models, and other topics related to this course.
Provides a concise overview of bioinformatics. It is a good resource for students who want to learn about the basics of the field.

Similar courses

Here are nine courses similar to Finding Mutations in DNA and Proteins (Bioinformatics VI).
  • Comparing Genes, Proteins, and Genomes (Bioinformatics...
  • Genome Sequencing (Bioinformatics II)
  • Finding Hidden Messages in DNA (Bioinformatics I)
  • Bioinformatics Mastery: Immunoinformatics
  • Bacterial Genomes: Accessing and Analysing Microbial...
  • Graph Algorithms in Genome Sequencing
  • Plant Bioinformatics Capstone
  • Molecular Biology - Part 1: DNA Replication and Repair
  • Epigenetic Control of Gene Expression

© 2016 - 2024 OpenCourser