We may earn an affiliate commission when you visit our partners.

Genome Assembly Programming Challenge

Alexander S. Kulikov, Michael Levin, Pavel Pevzner, and Neil Rhodes

In Spring 2011, thousands of people in Germany were hospitalized with a deadly disease that started as food poisoning with bloody diarrhea and often led to kidney failure. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Soon, German officials linked the outbreak to a restaurant in Lübeck, where nearly 20% of the patrons had developed bloody diarrhea in a single week. At this point, biologists knew that they were facing a previously unknown pathogen and that traditional methods would not suffice – computational biologists would be needed to assemble and analyze the genome of the newly emerged pathogen.

To investigate the evolutionary origin and pathogenic potential of the outbreak strain, researchers started a crowdsourced research program. They released bacterial DNA sequencing data from one of the patients, which elicited a burst of analyses carried out by computational biologists on four continents. They even used GitHub for the project: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki

The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this online course you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of E. coli X from millions of overlapping substrings of the E. coli X genome.
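
To give a feel for what "assembling from overlapping substrings" means, here is a toy sketch in Python. It is not the course's method: it uses simple greedy overlap merging, and the function names `overlap` and `greedy_assemble` are made up for illustration. Real read sets are far too large and error-prone for this approach.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str]) -> str:
    """Repeatedly merge the pair of reads with the largest overlap (toy heuristic)."""
    reads = list(reads)
    if not reads:
        return ""
    while len(reads) > 1:
        best_len, best_i, best_j = -1, 0, 1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        # Glue the two reads together, writing the shared part only once.
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads[0]


if __name__ == "__main__":
    print(greedy_assemble(["ATGCG", "GCGTA", "GTAAC"]))
```

Greedy merging can pick the wrong pair when a genome contains repeats, which is one reason graph-based formulations are used instead.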

What's inside

Syllabus

The 2011 European E. coli Outbreak
In April 2011, hundreds of people in Germany were hospitalized with a deadly disease that often started as food poisoning with bloody diarrhea. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Within a few months, the outbreak had infected thousands and killed 53 people. To prevent the further spread of the outbreak, computational biologists all over the world had to answer the question “What is the genome sequence of E. coli X?” in order to figure out what new genes it acquired to become pathogenic. The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this Genome Assembly Programming Challenge, you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of the deadly E. coli X strain. However, before you embark on building a program for assembling the E. coli X strain, we have to explain some genomic concepts and warm you up by having you solve a simpler problem of assembling a small virus.
Assembling Genomes Using de Bruijn Graphs
The DNA sequencing approach that led to the assembly of a small virus in 1977 went through a series of transformations that contributed to the emergence of personalized medicine a few years ago. By the late 1980s, biologists were routinely sequencing viral genomes containing hundreds of thousands of nucleotides, but the idea of sequencing a bacterial (let alone the human) genome containing millions (or even billions) of nucleotides remained preposterous, as it would have cost billions of dollars. In 1988, three biologists (independently and simultaneously!) came up with an idea to reduce sequencing cost and proposed the futuristic and at the time completely implausible method of DNA arrays. None of these three biologists could have possibly imagined that the implications of his own experimental research would eventually bring him face-to-face with challenging algorithmic problems. In this module you will learn about the algorithmic challenge of DNA sequencing using information about short k-mers provided by DNA arrays. You will also travel to the 18th century to learn about the Bridges of Königsberg and solve a related problem of assembling a jigsaw puzzle!
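
The connection between k-mers and the Bridges of Königsberg can be sketched in a few lines of Python. The example below is a minimal illustration, not the course's reference solution: it assumes error-free k-mers that cover the genome exactly once each, and the function names (`de_bruijn`, `eulerian_path`, `assemble`) are hypothetical. Each k-mer becomes an edge from its prefix to its suffix, and a walk that uses every edge exactly once (an Eulerian path) spells a candidate genome.

```python
from collections import defaultdict


def de_bruijn(kmers):
    """Build a de Bruijn graph: one edge prefix -> suffix per k-mer."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a walk using every edge once (assumes such a walk exists)."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start where out-degree exceeds in-degree by one, else anywhere (a cycle).
    start = next(
        (u for u in graph if len(graph[u]) - in_deg.get(u, 0) == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    return path[::-1]


def assemble(kmers):
    """Spell the string given by an Eulerian path through the k-mer graph."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    genome = "ATGGCGTGCA"
    k = 3
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    print(assemble(kmers))
```

When the genome contains repeated (k-1)-mers, several Eulerian paths may exist, so the output has the same k-mer composition as the input but need not equal the original string.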
Genome Assembly Faces Real Sequencing Data
Our discussion of genome assembly has thus far relied upon various assumptions. In this module, we will face practical challenges introduced by quirks in modern sequencing technologies and discuss some algorithmic techniques that have been devised to address these challenges. Afterwards, you will assemble the smallest bacterial genome, which belongs to a bacterium that lives symbiotically inside leafhoppers. Its sheltered life has allowed it to reduce its genome to only about 112,091 nucleotides and 137 genes. After that, you will be ready to assemble the E. coli X genome!
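
The module description does not say which techniques it covers. As one illustration of the kind of problem real data raises, sequencing errors create k-mers that appear in very few reads. A common idea, sketched here under that assumption, is to count k-mers across all reads and keep only the frequent ones before building a graph. The names `kmer_counts` and `solid_kmers` and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(reads, k, min_count=2):
    """Keep k-mers seen at least `min_count` times; rarer ones are likely errors."""
    return {kmer for kmer, c in kmer_counts(reads, k).items() if c >= min_count}


if __name__ == "__main__":
    # The third read carries a single-base error (C -> G at position 4).
    reads = ["ATGCA", "ATGCA", "ATGGA"]
    print(sorted(solid_kmers(reads, k=3)))
```

The threshold is a trade-off: set too high, it discards genuine k-mers from regions with low coverage.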

Good to know

Know what's good, what to watch for, and possible dealbreakers
Features a hands-on assembly programming challenge to assemble the genome of E. coli X, providing learners with practical experience in genome assembly
Taught by highly accomplished instructors Pavel Pevzner, Alexander S. Kulikov, Neil Rhodes, and Michael Levin, who are recognized for their work in the field of bioinformatics
Explores the 2011 European E. coli Outbreak, a real-world example of how epidemiologists and computational biologists collaborated to stop an outbreak
Develops core skills in genome assembly and computational biology, which are in high demand in the fields of bioinformatics, medicine, and research
Suitable for learners with a background in biology, computer science, or a related field
Requires learners to have some familiarity with bioinformatics concepts and tools, but provides resources for those who need a refresher

Reviews summary

Engaging capstone in bioinformatics

Learners say this challenging capstone is great and very rewarding. The course pulls together everything learned in bioinformatics and makes it come together. Learners note that the assignments are difficult but feel that it is worth the effort. Some learners also wish there was more information available for the assignment.
Learners find the capstone very rewarding.
"Excellent."
"Great course, hard but it is worth."
"It was great! Really tough, but rewarding."
Learners wish there was more information available for the assignments.
"The last assignment was extremely frustrating."
"hi, this is a hard course and the videos are not sufficient."
"You're on your own here about how you approach the problems, although there is some nice supplemental material available in the form of introductory videos and a booklet walking you through the project."
Learners say this course is very challenging.
"Too hard."
"Why we are learn this is biology part"
"It is a challenging course, which makes it great."

Career center

Learners who complete Genome Assembly Programming Challenge will develop knowledge and skills that may be useful to these careers:
Genome Assembler
Genome assemblers are responsible for assembling genomes from raw sequencing data. This is a complex and challenging task, as genomes are often large and complex. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Genome Assembly.
Computational Biologist
Computational Biology is the study of biology using mathematical and computational techniques. It combines computer science, applied mathematics, biology, and software engineering to understand biological systems. The Genome Assembly Programming Challenge is a course designed to teach students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Computational Biology.
Bioinformatics Scientist
Bioinformatics is the use of computer science and information technology to analyze and interpret biological data. Bioinformatics is used in a variety of fields, including genetics, genomics, and drug discovery. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Bioinformatics.
Biomedical Engineer
Biomedical engineers use engineering principles to design and build medical devices and equipment. They also work on developing new treatments and therapies for diseases. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Biomedical Engineering.
Biostatistician
Biostatisticians use statistical methods to analyze biological data. They work on a variety of projects, including drug development, clinical trials, and public health research. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Biostatistics.
Molecular Biologist
Molecular biologists study the structure and function of molecules, such as DNA, RNA, and proteins. They work on a variety of projects, including gene expression, genetic engineering, and drug development. Genome assembly is a central technique in molecular biology and is essential for understanding the genetic makeup of living organisms.
Researcher
Researchers conduct scientific research to advance knowledge and understanding. They work on a variety of projects, including basic research, applied research, and development. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Research.
Data Scientist
Data scientists use data to solve problems. They work on a variety of projects, including fraud detection, customer segmentation, and product development. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Data Science.
Microbiologist
Microbiologists study microorganisms, such as bacteria, viruses, and fungi. They work on a variety of projects, including infectious disease research, drug development, and environmental microbiology. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Microbiology.
Geneticist
Geneticists study genes and heredity. They work on a variety of projects, including genetic testing, gene therapy, and the development of new drugs. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Genetics.
Pharmaceutical Scientist
Pharmaceutical scientists develop and test new drugs. They work on a variety of projects, including drug discovery, drug development, and clinical trials. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Pharmaceutical Science.
Physician
Physicians diagnose and treat diseases. They work on a variety of projects, including patient care, research, and teaching. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Medicine.
Medical Physicist
Medical physicists use physics to develop and improve medical treatments. They work on a variety of projects, including cancer treatment, imaging, and radiation therapy. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Medical Physics.
Software Engineer
Software engineers design, develop, and maintain software applications. They work on a variety of projects, including web development, mobile development, and data science. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Software Engineering.
Epidemiologist
Epidemiologists study the distribution and determinants of health-related states or events (including disease), and the application of this study to the control of diseases and other health problems. The Genome Assembly Programming Challenge is a course that teaches students how to use computational techniques to assemble genomes. This course is a valuable resource for anyone interested in a career in Epidemiology.

Reading list

We've selected 11 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Genome Assembly Programming Challenge.
Provides a comprehensive overview of bioinformatics and functional genomics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a comprehensive overview of probabilistic models for biological sequences. It is a valuable resource for students and researchers who are interested in learning more about this topic.
Provides a comprehensive overview of bioinformatics algorithms. It covers a wide range of topics, including sequence alignment, genome assembly, and gene expression analysis.
Provides a comprehensive overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a problem-solving approach to learning about computational biology. It is a good resource for students who want to learn more about the applications of computational methods in this field.
Provides a comprehensive overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Provides a practical overview of bioinformatics. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.
Covers the theoretical foundations of many of the algorithms that are used in genome assembly and analysis. It is a good resource for students who want to learn more about the algorithms that are used in this field.
Provides a good introduction to Python for bioinformatics. It covers the basics of Python programming, as well as more advanced topics such as data structures, algorithms, and machine learning.
Provides a practical guide to bioinformatics for beginners. It covers a wide range of topics, including sequence analysis, genome annotation, and gene expression analysis.

Similar courses

Here are nine courses similar to Genome Assembly Programming Challenge.
Bacterial Genomes: Accessing and Analysing Microbial...
Most relevant
Hacking COVID-19 — Course 1: Identifying a Deadly Pathogen
Most relevant
Hacking COVID-19 — Course 2: Decoding SARS-CoV-2's Secrets
Most relevant
Bacterial Genomes: Comparative Genomics using Artemis...
Hacking COVID-19 — Course 5: Tracing SARS-CoV-2's...
Bacterial Genomes: Disease Outbreaks and Antimicrobial...
Whole genome sequencing of bacterial genomes - tools and...
Genome Sequencing (Bioinformatics II)
Bacterial Bioinformatics

© 2016 - 2024 OpenCourser