We may earn an affiliate commission when you visit our partners.
Course image
Alexander S. Kulikov, Michael Levin, Pavel Pevzner, and Neil Rhodes

World and internet is full of textual information. We search for information using textual queries, we read websites, books, e-mails. All those are strings from the point of view of computer science. To make sense of all that information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome. In this online course you will learn key pattern matching concepts: tries, suffix trees, suffix arrays and even the Burrows-Wheeler transform.

Enroll now

What's inside

Syllabus

Suffix Trees
How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will - through a series of trials and errors - bring us to suffix trees.
Read more
Burrows-Wheeler Transform and Suffix Arrays
Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm will become the workhorse of biologists searching for genomic mutations. But what text compression has to do with pattern matching??? In this lesson you will learn that the fate of an algorithm is often hard to predict – its applications may appear in a field that has nothing to do with the original plan of its inventors.
Knuth–Morris–Pratt Algorithm
Congratulations, you have now learned the key pattern matching concepts: tries, suffix trees, suffix arrays and even the Burrows-Wheeler transform! However, some of the results Pavel mentioned remain mysterious: e.g., how can we perform exact pattern matching in O(|Text|) time rather than in O(|Text|*|Pattern|) time as in the naïve brute force algorithm? How can it be that matching a 1000-nucleotide pattern against the human genome is nearly as fast as matching a 3-nucleotide pattern??? Also, even though Pavel showed how to quickly construct the suffix array given the suffix tree, he has not revealed the magic behind the fast algorithms for the suffix tree construction!In this module, Miсhael will address some algorithmic challenges that Pavel tried to hide from you :) such as the Knuth-Morris-Pratt algorithm for exact pattern matching and more efficient algorithms for suffix tree and suffix array construction.
Constructing Suffix Arrays and Suffix Trees
In this module we continue studying algorithmic challenges of the string algorithms. You will learn an O(n log n) algorithm for suffix array construction and a linear time algorithm for construction of suffix tree from a suffix array. You will also implement these algorithms and the Knuth-Morris-Pratt algorithm in the last Programming Assignment in this course.

Good to know

Know what's good
, what to watch for
, and possible dealbreakers
Taught by Pavel Pevzner and Neil Rhodes, who are recognized for their work in algorithm design in genomics
Offers hands-on practice through programming assignments
Provides a foundational understanding of key pattern matching concepts used in text search and genomic research
Covers advanced topics such as the Burrows-Wheeler Transform and suffix arrays
May require prerequisites in data structures, algorithms, and potentially biology, depending on the student's background

Save this course

Save Algorithms on Strings to your list so you can find it easily later:
Save

Reviews summary

Student-approved algorithms course

Learners say Algorithms on Strings is a well-received course that teaches string algorithms. The lectures are helpful and the assignments are engaging. Some students have said difficult exams may be expected. Students also say Michael Levin and Pavel are great instructors and that the exercises really helped them understand the material.
This course is difficult, but worthwhile
"Very nice course, the toughest one up until now in the specialization. But a rewarding one"
"Course is really tough, but I'm glad by the information I have got from this course!"
"The course is the best until now. The explanations area clear, with lots of examples and great challenges."
Michael Levin is very knowledgeable
"The instruction team is amazing. Thank you!"
"The lectures by Michael on suffix arrays, suffix trees, and KMP, were simply awesome! Pedagocical masterpieces, really!"
Be prepared for difficult exams
"A bit more rushed than the other ones. Luckily with the notes, the assignments can be passed."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Algorithms on Strings with these activities:
Organize and Review Course Materials
Organize your lecture notes, assignments, and other course materials to enhance your ability to locate and review important information.
Show steps
  • Gather all relevant course materials
  • Create an organized system for storage and retrieval
Review Foundations of Data Structures and Algorithms
Refresh your knowledge of basic data structures and algorithms to ensure a solid foundation for understanding advanced pattern matching techniques.
Browse courses on Data Structures
Show steps
  • Review lecture notes or textbooks
  • Solve practice problems
Read Algorithms on Strings, Trees, and Sequences
Review this foundational text on algorithms for strings, trees, and sequences to enhance your understanding of key concepts covered in the course.
Show steps
  • Read and understand the relevant chapters
  • Solve exercises and practice problems
Four other activities
Expand to see all activities and additional details
Show all seven activities
Explore Burrows-Wheeler Transform Applications
Seek out and follow tutorials to gain insights into the practical applications of the Burrows-Wheeler Transform in various fields.
Browse courses on Burrows-Wheeler Transform
Show steps
  • Identify relevant tutorials
  • Watch and follow the tutorials
  • Apply the learned concepts to practice examples
Discuss Implementation Strategies for Suffix Arrays
Engage with peers to exchange ideas and strategies for implementing suffix arrays, deepening your understanding and broadening your knowledge.
Browse courses on Suffix Arrays
Show steps
  • Form a study group with peers
  • Research and gather different implementation approaches
  • Share and discuss your findings
Visualize Suffix Trees
Create a visual representation of suffix trees to enhance your comprehension and strengthen your mental model of their structure and functionality.
Browse courses on Suffix Trees
Show steps
  • Choose a visualization tool or technique
  • Gather data on suffix tree structures
  • Design and create your visualization
Knuth-Morris-Pratt Algorithm Practice
Practice using the Knuth-Morris-Pratt Algorithm for exact pattern matching to solidify your understanding and improve your proficiency.
Show steps
  • Review the Knuth-Morris-Pratt Algorithm
  • Implement the algorithm in your preferred programming language
  • Solve practice problems and test cases

Career center

Learners who complete Algorithms on Strings will develop knowledge and skills that may be useful to these careers:
Computational Biologist
A Computational Biologist develops and applies computational tools and techniques to analyze biological data. A course in Algorithms on Strings will provide you with the skills necessary to develop and apply these tools. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in bioinformatics. This course will help you build a foundation for a career in computational biology.
Biostatistician
A Biostatistician designs and analyzes experiments, collects and analyzes data, and interprets the results to answer biological or medical questions. A course in Algorithms on Strings will provide you with the skills necessary to perform data analysis and interpretation. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in bioinformatics. This course will help you build a foundation for a career in biostatistics.
Bioinformatician
A Bioinformatics develops and applies computational tools and techniques to analyze biological data. A course in Algorithms on Strings will provide you with the skills necessary to develop and apply these tools. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in bioinformatics. This course will help you build a foundation for a career in bioinformatics.
Data Scientist
A Data Scientist collects, analyzes, and interprets data to answer business questions. A course in Algorithms on Strings will provide you with the skills necessary to analyze and interpret data. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in data science. This course will help you build a foundation for a career in data science.
Statistician
A Statistician collects, analyzes, and interprets data to answer questions. A course in Algorithms on Strings will provide you with the skills necessary to analyze and interpret data. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in statistics. This course will help you build a foundation for a career in statistics.
Network Analyst
A Network Analyst designs, implements, and maintains computer networks. A course in Algorithms on Strings will provide you with the skills necessary to design, implement, and maintain computer networks. The course covering topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in network analysis. This course will help you build a foundation for a career in network analysis.
Information Security Analyst
An Information Security Analyst protects computer systems and data from unauthorized access, use, disclosure, disruption, modification, or destruction. A course in Algorithms on Strings will provide you with the skills necessary to protect data from unauthorized access. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in information security. This course will help you build a foundation for a career in information security.
Database Administrator
A Database Administrator designs, implements, and maintains databases. A course in Algorithms on Strings will provide you with the skills necessary to design, implement, and maintain databases. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in database management. This course will help you build a foundation for a career in database administration.
Software Engineer
A Software Engineer designs, develops, and maintains software systems. A course in Algorithms on Strings will provide you with the skills necessary to design, develop, and maintain software systems. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in software engineering. This course will help you build a foundation for a career in software engineering.
Systems Analyst
A Systems Analyst studies how a business works to identify opportunities for improvement. A course in Algorithms on Strings will provide you with the skills necessary to analyze business processes and identify opportunities for improvement. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in systems analysis. This course will help you build a foundation for a career in systems analysis.
Security Analyst
A Security Analyst identifies and protects computer systems and data from unauthorized access, use, disclosure, disruption, modification, or destruction. A course in Algorithms on Strings will provide you with the skills necessary to protect computer systems and data. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in security analysis. This course will help you build a foundation for a career in security analysis.
Quality Assurance Analyst
A Quality Assurance Analyst tests software to ensure that it meets the requirements of the user. A course in Algorithms on Strings will provide you with the skills necessary to test software. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in software testing. This course will help you build a foundation for a career in quality assurance.
Machine Learning Engineer
A Machine Learning Engineer develops and deploys machine learning models to solve business problems. A course in Algorithms on Strings will provide you with the skills necessary to develop and deploy machine learning models. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in machine learning. This course will help you build a foundation for a career in machine learning engineering.
Information Architect
An Information Architect designs and develops the structure and organization of websites. A course in Algorithms on Strings will provide you with the skills necessary to design and develop the structure and organization of websites. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in information architecture. This course will help you build a foundation for a career in information architecture.
Web Developer
A Web Developer designs and develops websites. A course in Algorithms on Strings will provide you with the skills necessary to design and develop websites. The course covers topics such as suffix trees, suffix arrays, and the Burrows-Wheeler transform, which are all important topics in web development. This course will help you build a foundation for a career in web development.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Algorithms on Strings.
Covers many of the key topics in the course, including suffix trees, the Burrows-Wheeler transform, and approximate pattern matching. It can be seen as a more in-depth theoretical treatment of many of the concepts covered in the course.
Focuses specifically on string algorithms in bioinformatics. It covers many of the same topics as the course, but with a more applied focus. It would be a valuable resource for anyone interested in learning more about the practical applications of string algorithms in bioinformatics.
It provides a comprehensive overview of the field of bioinformatics, including sections on algorithms, databases, and tools. It would be a valuable resource for anyone interested in learning more about the broader field of bioinformatics.
Provides a comprehensive overview of pattern recognition and machine learning. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of these fields.
Provides a comprehensive overview of information theory, inference, and learning algorithms. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of these fields.
Provides a comprehensive overview of statistical learning. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.
Provides a comprehensive overview of deep learning. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.
Provides a comprehensive overview of reinforcement learning. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.
Provides a comprehensive overview of natural language processing. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.
Provides a comprehensive overview of algorithms for computational biology. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.
Provides a comprehensive overview of genomics. It would be a valuable resource for anyone interested in learning more about the theoretical foundations of this field.

Share

Help others find this course page by sharing it with your friends and followers:

Similar courses

Here are nine courses similar to Algorithms on Strings.
String Processing and Pattern Matching Algorithms
Most relevant
Algorithms Data Structures in Java #2 (+INTERVIEW...
Data Structures and Performance
Building Knowledge Graphs with Python
Classical Search
Data Structures and Algorithms: In-Depth using Python
Searching on the Internet
Data Structures & Algorithms II: Binary Trees, Heaps,...
Exploratory Data Analysis with Textual Data in R /...
Our mission

OpenCourser helps millions of learners each year. People visit us to learn workspace skills, ace their exams, and nurture their curiosity.

Our extensive catalog contains over 50,000 courses and twice as many books. Browse by search, by topic, or even by career interests. We'll match you to the right resources quickly.

Find this site helpful? Tell a friend about us.

Affiliate disclosure

We're supported by our community of learners. When you purchase or subscribe to courses and programs or purchase books, we may earn a commission from our partners.

Your purchases help us maintain our catalog and keep our servers humming without ads.

Thank you for supporting OpenCourser.

© 2016 - 2024 OpenCourser