We may earn an affiliate commission when you visit our partners.
Roger D. Peng, PhD, Jeff Leek, PhD, and Brian Caffo, PhD

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

What's inside

Syllabus

Week 1
This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.
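As a rough illustration of what the base plotting system looks like in practice, here is a minimal sketch. It assumes the built-in airquality dataset, which is a stand-in chosen for this example and not necessarily the data used in the course. Base graphics start with a high-level call such as hist() or plot() and are then annotated step by step.

```r
# Minimal sketch of the base plotting system, using the built-in
# airquality dataset (an assumption; the course may use other data).
data(airquality)

# Base graphics draw on the active device. A null PDF device keeps this
# example from writing a file; omit this line in an interactive session.
pdf(NULL)

# Two panels side by side, with slightly tighter margins.
par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))

# One-dimensional summary: histogram of ozone readings (NAs are dropped).
hist(airquality$Ozone, breaks = 20, col = "lightblue",
     main = "Ozone", xlab = "Ozone (ppb)")

# Two-dimensional summary: scatterplot, then annotate it piece by piece.
with(airquality, plot(Wind, Ozone, pch = 19, col = "grey40",
                      main = "Ozone vs. wind",
                      xlab = "Wind (mph)", ylab = "Ozone (ppb)"))
fit <- lm(Ozone ~ Wind, data = airquality)
abline(fit, lwd = 2, col = "red")   # add a least-squares line
legend("topright", legend = "Linear fit", col = "red", lwd = 2, bty = "n")

invisible(dev.off())   # close the device
```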
Week 2
Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify the layout of plots, making it a much less tedious process.
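To give a sense of how these two systems handle layout, the sketch below draws the same multi-panel display with each. It assumes the built-in airquality dataset as example data. The ggplot2 part is guarded because that package is a separate install.

```r
library(lattice)   # ships with standard R distributions
data(airquality)   # example data, an assumption for this sketch

# Lattice: a single call lays out one panel per month from a formula.
p_lattice <- xyplot(Ozone ~ Wind | factor(Month), data = airquality,
                    layout = c(5, 1),
                    xlab = "Wind (mph)", ylab = "Ozone (ppb)")

# ggplot2: the same display built from layers; facet_wrap() handles layout.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_gg <- ggplot(airquality, aes(x = Wind, y = Ozone)) +
    geom_point(na.rm = TRUE) +
    facet_wrap(~ Month, nrow = 1) +
    labs(x = "Wind (mph)", y = "Ozone (ppb)")
}

# Neither object draws until printed, e.g. print(p_lattice) or print(p_gg).
```

In both cases the panel arrangement comes from the plot specification itself, so there is no need to set up a grid of panels by hand as in base graphics.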
Week 3
Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.
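For readers unfamiliar with these methods, the following is a minimal sketch of the kinds of calls involved, using functions from base R. The data are simulated for illustration only, and the specific choices (two clusters, Euclidean distance, a three-color palette) are assumptions of this example, not the course's own analysis.

```r
set.seed(1)  # simulated data, purely illustrative

# 40 observations x 10 variables: rows 1-20 and 21-40 differ in mean
# on the first five variables.
x <- matrix(rnorm(40 * 10), nrow = 40)
x[21:40, 1:5] <- x[21:40, 1:5] + 3

# Hierarchical clustering on Euclidean distances, cut into two groups.
hc <- hclust(dist(x))
groups_hc <- cutree(hc, k = 2)

# K-means with two centers; several random starts reduce bad local optima.
km <- kmeans(x, centers = 2, nstart = 10)

# Dimension reduction: SVD of the column-centered matrix. The scores
# u %*% diag(d) match the principal components from prcomp() up to sign.
sv <- svd(scale(x, center = TRUE, scale = FALSE))
pc <- prcomp(x)
var_explained <- sv$d^2 / sum(sv$d^2)

# Color: interpolate a palette and use it to draw the data as an image.
pal <- colorRampPalette(c("navy", "white", "firebrick"))(25)
pdf(NULL)   # null device; omit in an interactive session
image(t(x)[, nrow(x):1], col = pal, axes = FALSE,
      main = "Simulated data matrix")
invisible(dev.off())
```

Plotting var_explained, or calling plot(hc) to draw the dendrogram, are typical next steps for judging how much structure the first few components or clusters capture.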
Week 4
This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Provides foundational knowledge of exploratory data analysis techniques
Covers specific multivariate statistical techniques for visualizing high-dimensional data
Instructors are recognized for their expertise in exploratory data analysis
Taught using R, which has strong support for exploratory data analysis
Provides case studies to demonstrate real-world applications of exploratory data analysis

Reviews summary

Thorough exploratory data analysis basics

According to students, Exploratory Data Analysis covers EDA basics, including an overview of plotting, using the three most common packages in R: base R, lattice, and ggplot2.
Helpful case studies in week 4.
"Week 4 consists of 2 case studies where the professor shows you how to perform an exploratory analysis on a couple different data sets."
Thorough overview of EDA basics.
"This is a good starting point for any data analysis work"
"The course covers the basics, and a bit more, rather well."
"The first two weeks of the course provide a thorough overview of plotting in R"
Abrupt detour into data clustering in week 3.
"Week 3 takes a sudden detour into data clustering"
"The clustering section seems a little out of place"
"What's worse the SVD and PCA sections require a fairly high level of linear algebra knowledge to understand, which are not prerequisites for this course."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Exploratory Data Analysis with these activities:
Read 'Exploratory Data Analysis with R' by Roger D. Peng
Gain a deeper understanding of the course concepts by reading the foundational book on exploratory data analysis with R.
  • Read chapters covering basic exploratory techniques, data visualization, and multivariate statistical methods.
  • Work through the practice exercises provided in the book.
Review probability theory
Brush up on probability theory to strengthen your foundation for this course's statistical concepts.
  • Review probability concepts such as conditional probability, Bayes' theorem, and random variables.
  • Solve practice problems involving probability distributions and statistical inference.
Practice plotting techniques in R
Improve your ability to create clear and informative data visualizations for exploratory data analysis.
  • Follow the tutorials in Week 1 and Week 2 of the course
  • Experiment with the different plotting systems in R
  • Try creating your own data visualizations from scratch
Visualize Data with R
Practice visualizing data with R to enhance your understanding of data distributions and relationships.
  • Import data into R
  • Create basic plots (e.g., histograms, scatterplots)
  • Customize plot appearance (e.g., colors, labels)
  • Use R packages for advanced visualization (e.g., ggplot2)
Complete practice exercises on data visualization
Reinforce your understanding of data visualization techniques by completing practice exercises.
  • Use the ggplot2 library to create various types of plots, such as scatterplots, histograms, and box plots.
  • Practice customizing plots by changing colors, adding titles, and adjusting axes.
Attend study groups with classmates
Enhance your understanding of course concepts by discussing them with peers.
  • Form study groups with classmates who have similar interests or learning styles.
  • Discuss course material, work on practice problems together, and provide mutual support.
Analyze real-world datasets with R
Develop your skills in using R to explore and analyze real-world datasets, enhancing your understanding of data analysis techniques.
  • Find a dataset of interest from a reputable source
  • Load the dataset into R and explore its structure
  • Perform exploratory data analysis using the techniques covered in the course
  • Write a report or presentation summarizing your findings
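The "load and explore its structure" step above can be sketched with a handful of base R functions. This is a minimal example that uses the built-in airquality data as a stand-in for a dataset of interest; the file path in the comment is hypothetical.

```r
# Built-in data stands in for "a dataset of interest" (an assumption).
# For a file on disk, something like read.csv("path/to/file.csv")
# (hypothetical path) would replace this line.
df <- airquality

dim(df)        # number of rows and columns
str(df)        # type of each column and a preview of its values
head(df)       # first six rows
summary(df)    # per-column quartiles, mean, and NA counts

# Missing values per column: worth checking before plotting or modeling.
na_counts <- colSums(is.na(df))
na_counts

# A simple grouped summary: median ozone by month, ignoring NAs.
ozone_by_month <- tapply(df$Ozone, df$Month, median, na.rm = TRUE)
ozone_by_month
```

Checks like these tend to surface missing values, unexpected types, or odd ranges early, before any time is spent on plots or models.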
Follow tutorials on advanced R programming
Expand your R programming skills by following tutorials on advanced topics.
  • Learn how to work with large datasets using the dplyr and tidyr libraries.
  • Practice creating interactive data visualizations using the Shiny package.
Attend a workshop on data wrangling
Develop practical skills in data wrangling by attending a workshop dedicated to the topic.
  • Participate in hands-on exercises to learn techniques for data cleaning, transformation, and preparation.
  • Interact with experts and ask questions to enhance your understanding.
Create a data visualization project
Apply your data visualization skills to a real-world dataset by creating a project.
  • Choose a dataset and explore it using exploratory data analysis techniques.
  • Create a set of visualizations that effectively communicate the insights gained from the data analysis.
  • Present your project to a peer or mentor for feedback.
Contribute to an open-source data visualization project
Apply your skills and contribute to the open-source community by participating in a data visualization project.
  • Identify an open-source data visualization project that aligns with your interests.
  • Join the project community, learn about their codebase, and identify areas where you can contribute.
  • Make code contributions, participate in discussions, and collaborate with other developers.

Career center

Learners who complete Exploratory Data Analysis will develop knowledge and skills that may be useful to these careers:
Data Analyst
A Data Analyst collects, processes, and analyzes data to extract meaningful insights. The Exploratory Data Analysis course from Johns Hopkins University can help aspiring Data Analysts develop the skills needed to summarize data, identify patterns, and visualize results. This course covers essential techniques for exploratory data analysis, including plotting, clustering, and dimension reduction, which are crucial for understanding data and making informed decisions.
Data Visualization Specialist
A Data Visualization Specialist designs and creates visual representations of data to communicate insights and trends. The Exploratory Data Analysis course from Johns Hopkins University provides a strong foundation for aspiring Data Visualization Specialists by covering essential techniques for summarizing data, identifying patterns, and visualizing results. This course helps individuals develop the skills needed to create effective and engaging data visualizations that clearly communicate insights and support decision-making.
Statistician
A Statistician uses mathematical and statistical methods to analyze data, draw conclusions, and make predictions. The Exploratory Data Analysis course from Johns Hopkins University provides a solid foundation for aspiring Statisticians by covering essential exploratory techniques for summarizing data, identifying patterns, and visualizing results. This course helps build a strong understanding of data analysis principles and prepares individuals for more advanced statistical modeling and inference.
Data Scientist
A Data Scientist combines expertise in data analysis, programming, and machine learning to extract insights from data. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Data Scientists as it provides a strong foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for data exploration and analysis.
Information Architect
An Information Architect designs and organizes information systems to ensure they are easy to find and use. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Information Architects as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding user needs, organizing information effectively, and designing user-friendly information systems.
Machine Learning Engineer
A Machine Learning Engineer designs, develops, and deploys machine learning models to solve real-world problems. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Machine Learning Engineers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and manipulating data, which is crucial for building effective machine learning models.
User Experience Researcher
A User Experience Researcher studies how users interact with products and services to improve their usability and experience. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring User Experience Researchers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding user behavior, identifying pain points, and making data-driven decisions to improve user experience.
Business Analyst
A Business Analyst uses data to identify opportunities and solve problems for businesses. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Business Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing business data, identifying trends, and making data-driven recommendations.
Data Engineer
A Data Engineer designs, builds, and maintains data infrastructure and systems. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Data Engineers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and manipulating data, which is crucial for building and maintaining effective data infrastructure.
Quality Assurance Analyst
A Quality Assurance Analyst ensures that products and services meet quality standards. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Quality Assurance Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing data, identifying defects, and ensuring that products and services meet customer requirements.
Quantitative Analyst
A Quantitative Analyst uses mathematical and statistical models to analyze financial data and make investment decisions. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Quantitative Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing financial data, identifying trends, and making data-driven investment decisions.
Market Researcher
A Market Researcher gathers and analyzes data to understand consumer behavior and market trends. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Market Researchers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing market data, identifying trends, and making data-driven marketing decisions.
Actuary
An Actuary uses mathematical and statistical models to assess risk and uncertainty. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Actuaries as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing risk data, making data-driven decisions, and developing insurance and financial products.
Epidemiologist
An Epidemiologist investigates the distribution and determinants of health-related states or events in specified populations. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Epidemiologists as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing health data, identifying trends, and making data-driven public health decisions.
Environmental Scientist
An Environmental Scientist studies the environment and its components and processes. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Environmental Scientists as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing environmental data, identifying trends, and making data-driven environmental decisions.

Reading list

We've selected 35 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Exploratory Data Analysis.
Provides a comprehensive overview of exploratory data analysis techniques in R, covering both the theoretical foundations and practical applications. It is an excellent resource for students and practitioners who want to learn more about EDA.
Provides a comprehensive overview of data visualization principles and best practices. It covers a wide range of topics, from the basics of data visualization to advanced techniques for creating interactive and engaging visualizations.
Provides a comprehensive overview of data visualization techniques, which is essential for understanding the graphical displays covered in this course.
Provides a comprehensive introduction to statistical methods for the analysis of biomedical data. It covers a wide range of topics, from basic statistical concepts to advanced techniques for analyzing complex data.
Provides a detailed treatment of multivariate analysis of variance, a statistical technique used to compare multiple groups of data.
Provides a more comprehensive overview of data analysis using Python, covering topics such as data manipulation, visualization, and statistical modeling.
Provides a comprehensive overview of cluster analysis, a statistical technique used to identify groups of similar data points.
Provides a comprehensive overview of statistical methods for high-dimensional data, which is essential for understanding the methods used in this course for visualizing and analyzing high-dimensional data.
Provides a comprehensive introduction to statistical learning methods. It covers a wide range of topics, from supervised learning to unsupervised learning.
Provides a detailed treatment of dimension reduction techniques, which are statistical techniques used to reduce the number of variables in a dataset.
Provides a practical introduction to the R programming language, which is essential for understanding the code used in this course.
Provides a comprehensive overview of R for data science, which is essential for understanding the code used in this course.
Provides a comprehensive introduction to multivariate statistical analysis. It covers a wide range of topics, from basic concepts to advanced techniques for analyzing complex data.
Provides a more comprehensive overview of deep learning using Python, covering topics such as convolutional neural networks, recurrent neural networks, and generative adversarial networks.
Provides a detailed treatment of the Lattice graphics package, a powerful graphics package for R.
Provides a comprehensive introduction to data analysis using R. It covers a wide range of topics, from data cleaning and wrangling to statistical modeling and visualization.
Provides a comprehensive overview of data mining techniques in R. It is an excellent resource for students and practitioners who want to learn more about how to use R for data mining.
Provides a hands-on introduction to machine learning. It covers a wide range of topics, from supervised learning to unsupervised learning.
Provides a comprehensive overview of statistical methods for psychology. It is an excellent resource for students and practitioners who want to learn more about how to use statistical methods to analyze psychological data.
Provides a comprehensive overview of data science for business. It covers a wide range of topics, from data collection and wrangling to data analysis and visualization.
Provides a comprehensive overview of statistical inference, covering topics such as hypothesis testing, confidence intervals, and Bayesian inference.
Provides a comprehensive overview of data analysis techniques using SAS. It is an excellent resource for students and practitioners who want to learn more about how to use SAS for data analysis.
Provides a comprehensive overview of statistical methods for business and economics. It is an excellent resource for students and practitioners who want to learn more about how to use statistical methods to analyze business and economic data.
Provides a comprehensive introduction to causal inference. It covers a wide range of topics, from the basics of causal inference to advanced techniques for identifying and estimating causal effects.
Provides a comprehensive introduction to deep learning. It covers a wide range of topics, from the basics of deep learning to advanced techniques for building and training deep learning models.
Provides a more mathematical introduction to probability and statistics, covering topics such as probability theory, random variables, and statistical distributions.
Provides a comprehensive introduction to Bayesian data analysis. It covers a wide range of topics, from the basics of Bayesian data analysis to advanced techniques for building and fitting Bayesian models.
Provides a comprehensive introduction to time series analysis. It covers a wide range of topics, from the basics of time series analysis to advanced techniques for forecasting and modeling time series data.

© 2016 - 2024 OpenCourser