We may earn an affiliate commission when you visit our partners.
Alexander S. Kulikov, Michael Levin, Pavel Pevzner, and Neil Rhodes

In Spring 2011, thousands of people in Germany were hospitalized with a deadly disease that started as food poisoning with bloody diarrhea and often led to kidney failure. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Soon, German officials linked the outbreak to a restaurant in Lübeck, where nearly 20% of the patrons had developed bloody diarrhea in a single week. At this point, biologists knew that they were facing a previously unknown pathogen and that traditional methods would not suffice – computational biologists would be needed to assemble and analyze the genome of the newly emerged pathogen.


To investigate the evolutionary origin and pathogenic potential of the outbreak strain, researchers started a crowdsourced research program. They released bacterial DNA sequencing data from one of the patients, which elicited a burst of analyses carried out by computational biologists on four continents. They even used GitHub for the project: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki

The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this online course you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of E. coli X from millions of overlapping substrings of that genome.

What's inside

Syllabus

The 2011 European E. coli Outbreak
In April 2011, hundreds of people in Germany were hospitalized with a deadly disease that often started as food poisoning with bloody diarrhea. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Within a few months, the outbreak had infected thousands and killed 53 people. To prevent the further spread of the outbreak, computational biologists all over the world had to answer the question “What is the genome sequence of E. coli X?” in order to figure out what new genes it acquired to become pathogenic. The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this Genome Assembly Programming Challenge, you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of the deadly E. coli X strain. However, before you embark on building a program for assembling the E. coli X strain, we have to explain some genomic concepts and warm you up by having you solve a simpler problem of assembling a small virus.
Assembling Genomes Using de Bruijn Graphs
The DNA sequencing approach that led to the assembly of a small virus in 1977 went through a series of transformations that contributed to the emergence of personalized medicine a few years ago. By the late 1980s, biologists were routinely sequencing viral genomes containing hundreds of thousands of nucleotides, but the idea of sequencing a bacterial (let alone the human) genome containing millions (or even billions) of nucleotides remained preposterous: it would have cost billions of dollars. In 1988, three biologists (independently and simultaneously!) came up with an idea to reduce sequencing cost and proposed the futuristic and at the time completely implausible method of DNA arrays. None of these three biologists could have possibly imagined that the implications of his own experimental research would eventually bring him face-to-face with challenging algorithmic problems. In this module you will learn about the algorithmic challenge of DNA sequencing using information about short k-mers provided by DNA arrays. You will also travel to the 18th century to learn about the Bridges of Königsberg and solve a related problem of assembling a jigsaw puzzle!
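To make the link between k-mers and the Bridges of Königsberg concrete, here is a minimal Python sketch, not the course's reference solution. It assumes error-free k-mers that cover the sequence exactly once per occurrence and that an Eulerian path exists in the resulting graph. The function names (de_bruijn_graph, eulerian_path, assemble) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(kmers):
    """Build a de Bruijn graph: each k-mer is an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a node sequence that uses every edge exactly once
    (Hierholzer's algorithm). Assumes such a path exists."""
    # Work on a copy so the caller's graph is left untouched.
    adj = {node: list(targets) for node, targets in graph.items()}

    # Count incoming edges to find the start of the path.
    indegree = defaultdict(int)
    for targets in adj.values():
        for target in targets:
            indegree[target] += 1

    # Start at the node with one more outgoing than incoming edge;
    # if the graph is a cycle, any node with outgoing edges will do.
    start = next(iter(adj))
    for node, targets in adj.items():
        if len(targets) - indegree[node] == 1:
            start = node
            break

    stack, path = [start], []
    while stack:
        node = stack[-1]
        if adj.get(node):
            # Follow an unused edge out of the current node.
            stack.append(adj[node].pop())
        else:
            # Dead end: this node is finished, add it to the path.
            path.append(stack.pop())
    return path[::-1]


def assemble(kmers):
    """Spell the sequence along an Eulerian path of the k-mer graph."""
    path = eulerian_path(de_bruijn_graph(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    print(assemble(["ACG", "CGT"]))
```

When a (k-1)-mer repeats, the graph can have several Eulerian paths, so the sketch returns one sequence consistent with the k-mers rather than a guaranteed unique genome.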
Genome Assembly Faces Real Sequencing Data
Our discussion of genome assembly has thus far relied upon various assumptions. In this module, we will face practical challenges introduced by quirks in modern sequencing technologies and discuss some algorithmic techniques that have been devised to address these challenges. Afterwards, you will assemble the smallest bacterial genome, which belongs to a bacterium that lives symbiotically inside leafhoppers. Its sheltered life has allowed it to reduce its genome to only about 112,091 nucleotides and 137 genes. After that, you will be ready to assemble the E. coli X genome!

Good to know

Know what's good, what to watch for, and possible dealbreakers
Features a hands-on assembly programming challenge to assemble the genome of E. coli X, providing learners with practical experience in genome assembly
Taught by highly accomplished instructors Pavel Pevzner, Alexander S. Kulikov, Neil Rhodes, and Michael Levin, who are recognized for their work in the field of bioinformatics
Explores the 2011 European E. coli Outbreak, a real-world example of how epidemiologists and computational biologists collaborated to stop an outbreak
Develops core skills in genome assembly and computational biology, which are in high demand in the fields of bioinformatics, medicine, and research
Suitable for learners with a background in biology, computer science, or a related field
Requires learners to have some familiarity with bioinformatics concepts and tools, but provides resources for those who need a refresher

Reviews summary

Engaging capstone in bioinformatics

Learners say this challenging capstone is great and very rewarding. The course pulls together everything learned in bioinformatics and makes it come together. Learners note that the assignments are difficult but feel that it is worth the effort. Some learners also wish there was more information available for the assignment.
Learners find the capstone very rewarding.
"Excellent."
"Great course, hard but it is worth."
"It was great! Really tough, but rewarding."
Learners wish there was more information available for the assignments.
"The last assignment was extremely frustrating."
"hi, this is a hard course and the videos are not sufficient."
"You're on your own here about how you approach the problems, although there is some nice supplemental material available in the form of introductory videos and a booklet walking you through the project."
Learners say this course is very challenging.
"Too hard."
"Why we are learn this is biology part"
"It is a challenging course, which makes it great."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Genome Assembly Programming Challenge with these activities:
Review Sequencing Technology Concepts
Refresh your memory on the core concepts of DNA sequencing. This will make learning about genome assembly in this course easier.
  • Read a short summary of the Sanger sequencing method
  • Watch a video on next-generation sequencing methods
  • Review your notes or textbook on sequencing technologies
Learn about de Bruijn graphs
A de Bruijn graph is a data structure that is central to genome assembly. Working through a tutorial on it will strengthen your understanding of this concept.
  • Find a tutorial on de Bruijn graphs
  • Work through the tutorial, taking notes
  • Try to apply the concepts you learned to a simple example
Participate in the course discussion forums
Helping others understand genome assembly will reinforce your own learning and identify areas you need to improve.
  • Visit the course discussion forums
  • Answer questions asked by other students
  • Ask for help if you have questions
  • Provide feedback on other students' responses
Practice assembling small sequences using de Bruijn graphs
Getting hands-on experience with assembling small sequences will prepare you for developing your own assembler for E. coli X.
  • Find a practice problem or dataset with small sequences
  • Assemble the sequences using a de Bruijn graph
  • Check your results
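One possible way to check your results in this activity is to compare the k-mer composition of the assembled sequence with the k-mers you started from. The following Python sketch assumes error-free k-mers; the helper names (kmer_composition, same_composition) are hypothetical.

```python
from collections import Counter


def kmer_composition(sequence, k):
    """Count every k-mer (substring of length k) in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def same_composition(candidate, kmers, k):
    """True if the candidate assembly contains exactly the given k-mers,
    with the same multiplicities."""
    return kmer_composition(candidate, k) == Counter(kmers)


if __name__ == "__main__":
    kmers = ["ATG", "TGG", "GGC", "GCG", "CGT", "GTG", "TGC", "GCA"]
    print(same_composition("ATGGCGTGCA", kmers, 3))
```

A matching composition shows that the candidate is consistent with the input, not that it is the only possible answer: two different sequences can share the same k-mer composition.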
Attend a workshop on genome assembly
Attending a workshop will provide you with insights from experts in the field and allow you to network with other professionals.
  • Find a workshop on genome assembly that fits your schedule and interests
  • Register for the workshop
  • Attend the workshop and participate actively
  • Follow up with any contacts you make at the workshop
Build a genome assembler in your favorite programming language
Applying your knowledge to build something concrete will help you deeply understand genome assembly. You'll also develop valuable coding skills.
  • Choose a programming language
  • Design the architecture of your assembler
  • Implement the assembler
  • Test your assembler on small sequences
  • Optimize your assembler for performance
Volunteer at a local biotech lab
Working in a lab will provide hands-on experience in the techniques and technologies used in genome assembly.
  • Find a local biotech lab that is willing to take volunteers
  • Contact the lab and ask about volunteer opportunities
  • Complete any required training or orientation
  • Perform tasks assigned by the lab staff

Career center

Learners who complete Genome Assembly Programming Challenge will develop knowledge and skills that may be useful to these careers:
Genome Assembler
Genome assemblers are responsible for assembling genomes from raw sequencing data. This is a challenging task, as genomes are often large and complex. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in genome assembly.
Computational Biologist
Computational Biology is the study of biology using mathematical and computational techniques. It combines computer science, applied mathematics, biology, and software engineering to understand biological systems. The Genome Assembly Programming Challenge is a course designed to teach students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Computational Biology.
Bioinformatics Scientist
Bioinformatics is the use of computer science and information technology to analyze and interpret biological data. Bioinformatics is used in a variety of fields, including genetics, genomics, and drug discovery. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Bioinformatics.
Biomedical Engineer
Biomedical engineers use engineering principles to design and build medical devices and equipment. They also work on developing new treatments and therapies for diseases. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Biomedical Engineering.
Biostatistician
Biostatisticians use statistical methods to analyze biological data. They work on a variety of projects, including drug development, clinical trials, and public health research. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Biostatistics.
Molecular Biologist
Molecular biologists study the structure and function of molecules, such as DNA, RNA, and proteins. They work on a variety of projects, including gene expression, genetic engineering, and drug development. Genome assembly is a central technique in molecular biology and is essential for understanding the genetic makeup of living organisms.
Researcher
Researchers conduct scientific research to advance knowledge and understanding. They work on a variety of projects, including basic research, applied research, and development. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Research.
Data Scientist
Data scientists use data to solve problems. They work on a variety of projects, including fraud detection, customer segmentation, and product development. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Data Science.
Microbiologist
Microbiologists study microorganisms, such as bacteria, viruses, and fungi. They work on a variety of projects, including infectious disease research, drug development, and environmental microbiology. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Microbiology.
Geneticist
Geneticists study genes and heredity. They work on a variety of projects, including genetic testing, gene therapy, and the development of new drugs. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Genetics.
Pharmaceutical Scientist
Pharmaceutical scientists develop and test new drugs. They work on a variety of projects, including drug discovery, drug development, and clinical trials. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Pharmaceutical Science.
Physician
Physicians diagnose and treat diseases. They work on a variety of projects, including patient care, research, and teaching. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Medicine.
Medical Physicist
Medical physicists use physics to develop and improve medical treatments. They work on a variety of projects, including cancer treatment, imaging, and radiation therapy. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Medical Physics.
Software Engineer
Software engineers design, develop, and maintain software applications. They work on a variety of projects, including web development, mobile development, and data science. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Software Engineering.
Epidemiologist
Epidemiologists study the distribution and determinants of health-related states or events (including disease), and the application of this study to the control of diseases and other health problems. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Epidemiology.

Reading list

We've selected 11 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Genome Assembly Programming Challenge.
Provides a comprehensive overview of bioinformatics and functional genomics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a comprehensive overview of probabilistic models for biological sequences. It is a valuable resource for students and researchers who are interested in learning more about this topic.
Provides a comprehensive overview of bioinformatics algorithms. It covers a wide range of topics, including sequence alignment, genome assembly, and gene expression analysis.
Provides a comprehensive overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a problem-solving approach to learning about computational biology. It is a good resource for students who want to learn more about the applications of computational methods in this field.
Provides a comprehensive overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a practical overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Covers the theoretical foundations of many of the algorithms that are used in genome assembly and analysis. It is a good resource for students who want to learn more about the algorithms that are used in this field.
Provides a good introduction to Python for bioinformatics. It covers the basics of Python programming, as well as more advanced topics such as data structures, algorithms, and machine learning.
Provides a practical guide to bioinformatics for beginners. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.

Similar courses

Here are nine courses similar to Genome Assembly Programming Challenge.
Bacterial Genomes: Accessing and Analysing Microbial...
Hacking COVID-19 — Course 1: Identifying a Deadly Pathogen
Hacking COVID-19 — Course 2: Decoding SARS-CoV-2's Secrets
Bacterial Genomes: Comparative Genomics using Artemis...
Hacking COVID-19 — Course 5: Tracing SARS-CoV-2's...
Whole genome sequencing of bacterial genomes - tools and...
Bacterial Genomes: Disease Outbreaks and Antimicrobial...
Genome Sequencing (Bioinformatics II)
Bacterial Bioinformatics

© 2016 - 2024 OpenCourser