We may earn an affiliate commission when you visit our partners.
Avi Ma’ayan, PhD

The Library of Integrated Network-based Cellular Signatures (LINCS) was an NIH Common Fund program that ran for 10 years, from 2012 to 2021. The idea behind the program was to perturb different types of human cells with many kinds of perturbations: drugs and other small molecules; genetic manipulations such as single-gene knockdown, knockout, or overexpression; changes to the extracellular microenvironment, for example, growing cells on different surfaces; and more. These perturbations were applied to various types of human cells, including cancer cell lines and induced pluripotent stem cells (iPSCs) from patients, differentiated into lineages such as neurons or cardiomyocytes. Then, to better understand the molecular networks affected by these perturbations, changes in the levels of many different molecules within the cells were measured, including mRNAs, proteins, and metabolites, as well as cellular phenotypic changes such as cell morphology.

The BD2K-LINCS Data Coordination and Integration Center (DCIC) was commissioned to organize, analyze, visualize, and integrate this data with other relevant publicly available resources. In this course, we introduce the LINCS DCIC and the various Data and Signature Generation Centers (DSGCs) that collected data for LINCS. We then cover the LINCS metadata and how it is linked to ontologies and dictionaries, followed by the data processing and normalization methods used to clean and harmonize the LINCS data. Next, we discuss how the LINCS data is served through RESTful APIs. Most importantly, the course covers computational bioinformatics methods that can be applied to other multi-omics datasets and projects, including dimensionality reduction, clustering, gene-set enrichment analysis, interactive data visualization, and supervised learning.

Finally, we introduce crowdsourcing/citizen-science projects in which students work together in teams to extract gene expression signatures from public databases and then query these collections of signatures against the LINCS data to predict small molecules as potential therapeutics for a collection of complex human diseases.

What's inside

Syllabus

The Library of Integrated Network-based Cellular Signatures (LINCS) Program Overview
This module provides an overview of the concept behind the LINCS program and tutorials on how to get started with the LINCS L1000 dataset.
Metadata and Ontologies
This module gives a broad, high-level description of the concepts behind metadata and ontologies and how they are applied to LINCS datasets.
Serving Data with APIs
In this module we explain the concept of accessing data through an application programming interface (API).
Bioinformatics Pipelines
This module describes the important concept of a bioinformatics pipeline.
The Harmonizome
This module describes a project that integrates many resources that contain knowledge about genes and proteins. The project is called the Harmonizome, and it is implemented as a web-server application available at: http://amp.pharm.mssm.edu/Harmonizome/
Data Normalization
This module describes the mathematical concepts behind data normalization.
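As a minimal illustration of one such concept, the sketch below z-scores each gene (row) of a small expression matrix so that every gene ends up with mean 0 and unit standard deviation. The listing does not say which normalization methods the module teaches, so the choice of z-scoring, the function name `zscore_rows`, and the toy values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each row: subtract the row mean, divide by the row standard deviation."""
    normalized = []
    for row in matrix:
        mu = mean(row)
        sigma = pstdev(row)  # population standard deviation of the row
        if sigma == 0:
            # A constant row carries no variation; map it to zeros.
            normalized.append([0.0] * len(row))
        else:
            normalized.append([(x - mu) / sigma for x in row])
    return normalized


# Toy expression values (hypothetical): rows are genes, columns are samples.
expression = [
    [2.0, 4.0, 6.0],
    [5.0, 5.0, 5.0],
]
normalized = zscore_rows(expression)
```

After this transformation, genes measured on very different scales can be compared on a common footing, which is usually the point of normalizing before clustering or classification.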
Data Clustering
This module describes the mathematical concepts behind data clustering, a form of unsupervised learning: the identification of patterns within data without considering the labels associated with the data.
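To make the idea of finding patterns without labels concrete, here is a small sketch of k-means, one common clustering algorithm. The listing does not state which algorithms the module covers, so k-means is an assumption, as are the function names and the toy points. For reproducibility the sketch seeds the centroids with the first k points rather than choosing them at random.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    # Assumption for reproducibility: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy two-dimensional points (hypothetical) forming two obvious groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9)]
labels, centroids = kmeans(points, k=2)
```

No labels are supplied at any point; the grouping comes only from distances between the points, which is what distinguishes clustering from the supervised methods covered later in the syllabus.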
Midterm Exam
The Midterm Exam consists of 45 multiple-choice questions covering modules 1-7. Some questions may require you to analyze new datasets with the methods you learned throughout the course.
Enrichment Analysis
This module introduces the important concept of performing gene set enrichment analyses. Enrichment analysis is the process of querying gene sets from genomics and proteomics studies against annotated gene sets collected from prior biological knowledge.
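One simple way to score such a query is a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) on the overlap between the query gene set and an annotated gene set. The sketch below is an illustration under that assumption; the listing does not say which statistics the module uses, and the function name, gene identifiers, and background size are hypothetical.

```python
from math import comb


def enrichment_p_value(query_genes, annotated_genes, background_size):
    """Return the overlap size and the one-sided hypergeometric p-value for it."""
    query = set(query_genes)
    annotated = set(annotated_genes)
    overlap = len(query & annotated)
    n_query, n_annotated = len(query), len(annotated)
    # Number of ways to draw n_query genes from the whole background.
    total = comb(background_size, n_query)
    # P(X >= overlap): sum the probabilities of every overlap at least this large.
    largest_possible = min(n_query, n_annotated)
    tail = sum(
        comb(n_annotated, i) * comb(background_size - n_annotated, n_query - i)
        for i in range(overlap, largest_possible + 1)
    )
    return overlap, tail / total


# Hypothetical gene identifiers; the background is assumed to contain 20 genes.
query = ["G1", "G2", "G3", "G9"]
annotated = ["G1", "G2", "G3", "G4", "G5"]
overlap, p_value = enrichment_p_value(query, annotated, background_size=20)
```

Here three of the four query genes fall in the annotated set, and the p-value is the chance of seeing an overlap of three or more if the four query genes had been drawn at random from the background. In practice the test is repeated against many annotated sets, so the resulting p-values need a multiple-testing correction.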
Machine Learning
This module describes the mathematical concepts of supervised machine learning, the process of making predictions from examples that associate observations/features/attributes with one or more properties that we wish to learn/predict.
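A k-nearest-neighbors classifier is one of the simplest instances of this idea: it predicts the label of a new observation from the labels of the most similar training examples. The sketch below is illustrative only; the listing does not name the algorithms taught, and the function name, feature values, and labels are made up.

```python
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training examples."""
    # Pair each training example's squared Euclidean distance to the query with its label.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest examples and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: two features per example, one label each.
train_features = [(0.1, 0.2), (0.0, 0.3), (0.9, 0.8), (1.0, 1.0)]
train_labels = ["inactive", "inactive", "active", "active"]
prediction = knn_predict(train_features, train_labels, query=(0.95, 0.9), k=3)
```

The training examples play the role of the "examples that associate features with properties" in the description above; unlike clustering, the labels are used directly to make the prediction.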
Benchmarking
This module discusses how bioinformatics pipelines can be compared and evaluated.
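One common way to compare two pipelines is to score how well each ranks known positives above known negatives, for example with the area under the ROC curve (AUC). The listing does not say which metrics the module uses, so AUC is an assumption here, and the pipeline names, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical benchmark: 1 marks a known true association, 0 a known false one.
truth = [1, 1, 0, 0]
pipeline_a_scores = [0.9, 0.8, 0.3, 0.1]
pipeline_b_scores = [0.9, 0.2, 0.8, 0.1]
auc_a = roc_auc(pipeline_a_scores, truth)
auc_b = roc_auc(pipeline_b_scores, truth)
```

On this toy benchmark the first pipeline ranks both positives above both negatives, while the second misorders one pair, so the first would be preferred. A real benchmark needs a trusted gold standard and far more examples than this.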
Interactive Data Visualization
This module provides programming examples on how to get started with creating interactive web-based data visualization elements/figures.
Crowdsourcing Projects
This final module describes opportunities to work on LINCS-related projects that go beyond the course.
Final Exam
The Final Exam consists of 60 multiple-choice questions covering all of the modules of the course. Some questions may require you to analyze new datasets with the methods you learned throughout the course.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Develops knowledge of the LINCS program, including its goals, methods, and impact on biomedical research
Enhances understanding of the LINCS DCIC and its role in organizing, analyzing, and integrating multi-omics data
Provides practical skills in accessing and using the LINCS data through RESTful APIs
Covers computational bioinformatics methods for analyzing multi-omics data, including dimensionality reduction, clustering, and enrichment analysis
Involves students in crowdsourcing projects to extract gene expression signatures and predict potential therapeutics for complex human diseases

Reviews summary

Well-received big data course

Learners say this introductory big data course effectively covers the basics of computational biology and is taught by knowledgeable instructors. Reviewers found the course easy to follow despite having little prior knowledge of the topic. However, some reviewers said that certain topics were covered too briefly and that the practical assessment tested a range of languages rather than focusing solely on the course material.
Course content found to be directly useful in the field.
"Some of these concepts are directly relevant to my job."
Course instructors are praised for their expertise.
"Excellent course! Thoroughly enjoyed learning from these excellent instructors."
"The lecturer was great."
Some topics felt rushed; practical test covered too many languages.
"certain topics were brief"
"practical assessment test was more focused on different languages like R, JSON, Python and many more"

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Big Data Science with the BD2K-LINCS Data Coordination and Integration Center with these activities:
Explore the LINCS website
Get hands-on experience with the LINCS data and tools.
  • Visit the LINCS website at https://www.lincscloud.org/.
  • Create a free account.
  • Explore the different data sets and tools available.
  • Follow the tutorials to learn how to use the LINCS data and tools.
Form a study group with other students in the course
Learn from and collaborate with other students in the course.
  • Identify other students in the course who are interested in forming a study group.
  • Meet regularly to discuss the course material, work on assignments together, and prepare for exams.
Practice using the LINCS API
Become proficient in accessing and using the LINCS data through the API.
  • Get the Swagger documentation for the LINCS API at https://lincs.hms.harvard.edu/api-docs/.
  • Write a Python script to query the LINCS API for a specific data set.
  • Parse the JSON response from the API and store the data in a Pandas DataFrame.
  • Visualize the data using a library such as matplotlib or Seaborn.
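The query-and-parse steps above can be sketched with the Python standard library alone. The base URL, query parameters, and field names below are hypothetical placeholders, not the actual LINCS API; consult the API documentation for the real endpoints and response schema. The sketch parses a made-up response body offline so that it runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical base URL and parameter names, used only for illustration.
BASE_URL = "https://example.org/api/datasets"


def build_query_url(base_url, **params):
    """Append URL-encoded query parameters to a base endpoint."""
    return f"{base_url}?{urlencode(params)}"


def parse_records(json_text, fields):
    """Turn a JSON array of objects into uniform rows, filling missing fields with None."""
    return [
        {field: item.get(field) for field in fields}
        for item in json.loads(json_text)
    ]


def fetch_records(url, fields):
    """Download a JSON array from `url` and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_records(response.read().decode("utf-8"), fields)


# Offline example with a made-up response body.
url = build_query_url(BASE_URL, assay="L1000", limit=2)
sample_response = '[{"id": "ds1", "assay": "L1000"}, {"id": "ds2"}]'
rows = parse_records(sample_response, ["id", "assay"])
```

A list of uniform dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` for the DataFrame step, after which any plotting library can take over.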
Attend a workshop on using the LINCS data
Learn from experts in the field and get hands-on experience with the LINCS data.
  • Find a workshop on using the LINCS data.
  • Register for the workshop.
  • Attend the workshop and take notes.
  • Apply what you learned in the workshop to your own research.
Find a mentor who can provide guidance on using the LINCS data
Connect with someone who can help you learn more about the LINCS data and how to use it in your research.
  • Identify potential mentors at your institution or through professional organizations.
  • Contact potential mentors and ask for a meeting.
  • Prepare for your meeting by learning about the mentor's research and interests.
  • Come to your meeting with specific questions about the LINCS data and how to use it.
Create a tutorial or blog post about the LINCS data and tools
Share your knowledge of the LINCS data and tools with others.
  • Choose a topic for your tutorial or blog post.
  • Write your tutorial or blog post in a clear and concise style.
  • Promote your tutorial or blog post on social media or other online platforms.
Contribute to the LINCS open-source project
Gain experience in open-source development and contribute to the LINCS community.
  • Find an issue or feature request on the LINCS GitHub repository.
  • Fork the LINCS GitHub repository.
  • Make changes to the code to address the issue or feature request.
  • Create a pull request to merge your changes back into the LINCS GitHub repository.

Career center

Learners who complete Big Data Science with the BD2K-LINCS Data Coordination and Integration Center will develop knowledge and skills that may be useful to these careers:
Computational Biologist
Computational Biologists are scientists who use computer science and mathematics to solve problems in biology. They develop and apply computational techniques to analyze large datasets, such as those produced by high-throughput sequencing and other experimental methods. This course provides a strong foundation in the computational methods that are used in Bioinformatics, making it a valuable resource for Computational Biologists.
Biostatistician
Biostatisticians apply statistical methods to solve problems in biology and medicine. They design and analyze experiments, develop statistical models, and interpret data. This course provides a strong foundation in the statistical methods that are used in Bioinformatics, making it a valuable resource for Biostatisticians.
Data Scientist
Data Scientists use their knowledge of mathematics, statistics, and computer science to extract insights from data. They develop and apply machine learning algorithms to solve problems in a variety of fields, including healthcare, finance, and marketing. This course provides a strong foundation in the data science methods that are used in Bioinformatics, making it a valuable resource for Data Scientists.
Bioinformatician
Bioinformaticians are scientists who use computer science and mathematics to solve problems in biology. They develop and apply computational techniques to analyze large datasets, such as those produced by high-throughput sequencing and other experimental methods. This course provides a strong foundation in the computational methods that are used in Bioinformatics, making it a valuable resource for Bioinformaticians.
Machine Learning Engineer
Machine Learning Engineers design and implement machine learning algorithms to solve problems in a variety of fields. They work with Data Scientists and other engineers to develop and deploy machine learning models. This course provides a strong foundation in the machine learning methods that are used in Bioinformatics, making it a valuable resource for Machine Learning Engineers.
Database Administrator
Database Administrators are responsible for managing and maintaining databases. They ensure that databases are running smoothly and efficiently, and that data is secure. This course provides a strong foundation in the database management techniques that are used in Bioinformatics, making it a valuable resource for Database Administrators.
Software Engineer
Software Engineers design, develop, and maintain software applications. They work with other engineers and scientists to develop and deploy software solutions for a variety of problems. This course provides a strong foundation in the software engineering methods that are used in Bioinformatics, making it a valuable resource for Software Engineers.
Systems Analyst
Systems Analysts design and implement computer systems. They work with users to understand their needs and develop systems that meet those needs. This course provides a strong foundation in the systems analysis methods that are used in Bioinformatics, making it a valuable resource for Systems Analysts.
Project Manager
Project Managers plan and execute projects. They work with stakeholders to define project goals, develop project plans, and track project progress. This course provides a strong foundation in the project management methods that are used in Bioinformatics, making it a valuable resource for Project Managers.
Science Writer
Science Writers communicate scientific research to a general audience. They write articles, blog posts, and other content that explains complex scientific concepts in a clear and engaging way. This course provides a strong foundation in the science writing methods that are used in Bioinformatics, making it a valuable resource for Science Writers.
Technical Writer
Technical Writers create documentation for software and other technical products. They work with engineers and scientists to understand the products and develop documentation that is clear and concise. This course provides a strong foundation in the technical writing methods that are used in Bioinformatics, making it a valuable resource for Technical Writers.
Healthcare Data Analyst
Healthcare Data Analysts use their knowledge of mathematics, statistics, and computer science to analyze healthcare data. They develop and apply data analysis techniques to solve problems in healthcare, such as improving patient care and reducing costs. This course provides a strong foundation in the data analysis methods that are used in Bioinformatics, making it a valuable resource for Healthcare Data Analysts.
Health Informatics Specialist
Health Informatics Specialists use their knowledge of computer science and healthcare to improve the delivery of healthcare. They work with healthcare providers and other stakeholders to develop and implement health information systems. This course provides a strong foundation in the health informatics methods that are used in Bioinformatics, making it a valuable resource for Health Informatics Specialists.
Biomedical Engineer
Biomedical Engineers use their knowledge of engineering and biology to develop and implement medical devices and technologies. They work with doctors and other healthcare providers to design and build devices that can improve patient care. This course provides a strong foundation in the biomedical engineering methods that are used in Bioinformatics, making it a valuable resource for Biomedical Engineers.
Clinical Research Coordinator
Clinical Research Coordinators work with doctors and other healthcare providers to conduct clinical trials. They manage the day-to-day operations of clinical trials, including recruiting patients, collecting data, and ensuring that the trial is conducted according to protocol. This course provides a strong foundation in the clinical research methods that are used in Bioinformatics, making it a valuable resource for Clinical Research Coordinators.

Reading list

We've selected 16 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Big Data Science with the BD2K-LINCS Data Coordination and Integration Center.
This textbook provides a comprehensive overview of the field of bioinformatics, covering topics such as molecular biology, genomics, proteomics, and bioinformatics algorithms. It would be useful as a reference for learners who are new to the field or who want to refresh their knowledge.
Provides a strong foundation in cell biology, covering molecular biology, biochemistry, genetics, genomics, and cell physiology. Useful for building background knowledge of the LINCS program.
Is commonly used as a textbook in academic institutions. It provides a broad and detailed coverage of bioinformatics.
A comprehensive guide to data analysis using R, covering data exploration, statistical modeling, and visualization. Useful for understanding the bioinformatics pipelines and machine learning algorithms used in LINCS.
Covers machine learning in bioinformatics, another topic of increasing importance in the field.
Provides a comprehensive overview of bioinformatics data skills, including data management, analysis, and visualization. Useful for building a foundation in data handling and analysis.
Covers advanced machine learning techniques for bioinformatics, including supervised learning, unsupervised learning, and deep learning. Provides a more specialized understanding of the machine learning algorithms used in LINCS.


© 2016 - 2024 OpenCourser