We may earn an affiliate commission when you visit our partners.
Roger D. Peng, PhD, Jeff Leek, PhD, and Brian Caffo, PhD

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

What's inside

Syllabus

Week 1
This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers.
Provides foundational knowledge of exploratory data analysis techniques
Covers specific multivariate statistical techniques for visualizing high-dimensional data
Instructors are recognized for their expertise in exploratory data analysis
Taught using R, which has strong support for exploratory data analysis
Provides case studies to demonstrate real-world applications of exploratory data analysis

Reviews summary

Practical exploratory data analysis in R

According to learners, this course provides a solid introduction to Exploratory Data Analysis concepts and techniques using R. Many found the coverage of the ggplot2 plotting system particularly valuable for creating effective data visualizations. Students appreciate the hands-on approach with practical assignments that help solidify understanding. While the course is well-structured across four weeks, some students felt certain topics could benefit from greater depth or that the pace was uneven. Setup issues with R or packages were mentioned by a few. Overall, it's considered a strong foundational course for those new to EDA in R, though prior R experience is beneficial.
Pace felt uneven for some learners.
"Week 1 and 2 were great, but Week 3 felt rushed with complex topics."
"The structure is logical, but the difficulty seemed to jump in the later weeks."
"Found the pace challenging at times, especially integrating the different concepts."
Some topics could benefit from more detail.
"While good for an intro, I felt some advanced topics were covered a bit too quickly."
"Could use more in-depth coverage on certain techniques, particularly for higher dimensions."
"It's a good overview, but not sufficient if you need deep theoretical understanding."
Easier for those already familiar with R.
"Highly recommend having some basic R knowledge before starting this course."
"As someone new to R, I found myself struggling more than others with the code."
"The course assumes a minimal level of R familiarity which might be tough for complete beginners."
Hands-on exercises aid learning and application.
"The assignments were practical and reinforced the concepts taught in the lectures."
"I really benefited from the hands-on coding exercises and projects."
"The practical nature of the homework helped me apply what I learned."
Provides a good introduction to EDA principles.
"This course gave me a great introduction to the core concepts of Exploratory Data Analysis."
"I feel like I have a solid foundation now after taking this course."
"It covers the essential techniques needed to get started with EDA."
Detailed coverage of key R plotting systems.
"The sections on ggplot2 were incredibly helpful and immediately applicable."
"Learning the different R plotting systems (base, lattice, ggplot2) was a major plus."
"The course does a great job explaining how to create various plots in R."
Potential difficulties with software installation.
"Had some trouble getting R and the necessary packages installed initially."
"The setup instructions weren't always clear for different operating systems."
"Encountered errors related to package dependencies during the exercises."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Exploratory Data Analysis with these activities:
Read 'Exploratory Data Analysis with R' by Roger D. Peng
Gain a deeper understanding of the course concepts by reading the foundational book on exploratory data analysis with R.
  • Read chapters covering basic exploratory techniques, data visualization, and multivariate statistical methods.
  • Work through the practice exercises provided in the book.
Review probability theory
Brush up on probability theory to strengthen your foundation for this course's statistical concepts.
  • Review probability concepts such as conditional probability, Bayes' theorem, and random variables.
  • Solve practice problems involving probability distributions and statistical inference.
Practice plotting techniques in R
Improve your ability to create clear and informative data visualizations for exploratory data analysis.
  • Follow the tutorials in Week 1 and Week 2 of the course
  • Experiment with the different plotting systems in R
  • Try creating your own data visualizations from scratch
Visualize Data with R
Practice visualizing data with R to enhance your understanding of data distributions and relationships.
  • Import data into R
  • Create basic plots (e.g., histograms, scatterplots)
  • Customize plot appearance (e.g., colors, labels)
  • Use R packages for advanced visualization (e.g., ggplot2)
Complete practice exercises on data visualization
Reinforce your understanding of data visualization techniques by completing practice exercises.
  • Use the ggplot2 library to create various types of plots, such as scatterplots, histograms, and box plots.
  • Practice customizing plots by changing colors, adding titles, and adjusting axes.
Attend study groups with classmates
Enhance your understanding of course concepts by discussing them with peers.
  • Form study groups with classmates who have similar interests or learning styles.
  • Discuss course material, work on practice problems together, and provide mutual support.
Analyze real-world datasets with R
Develop your skills in using R to explore and analyze real-world datasets, enhancing your understanding of data analysis techniques.
  • Find a dataset of interest from a reputable source
  • Load the dataset into R and explore its structure
  • Perform exploratory data analysis using the techniques covered in the course
  • Write a report or presentation summarizing your findings
Follow tutorials on advanced R programming
Expand your R programming skills by following tutorials on advanced topics.
  • Learn how to work with large datasets using the dplyr and tidyr libraries.
  • Practice creating interactive data visualizations using the Shiny package.
Attend a workshop on data wrangling
Develop practical skills in data wrangling by attending a workshop dedicated to the topic.
  • Participate in hands-on exercises to learn techniques for data cleaning, transformation, and preparation.
  • Interact with experts and ask questions to enhance your understanding.
Create a data visualization project
Apply your data visualization skills to a real-world dataset by creating a project.
  • Choose a dataset and explore it using exploratory data analysis techniques.
  • Create a set of visualizations that effectively communicate the insights gained from the data analysis.
  • Present your project to a peer or mentor for feedback.
Contribute to an open-source data visualization project
Apply your skills and contribute to the open-source community by participating in a data visualization project.
  • Identify an open-source data visualization project that aligns with your interests.
  • Join the project community, learn about their codebase, and identify areas where you can contribute.
  • Make code contributions, participate in discussions, and collaborate with other developers.

Career center

Learners who complete Exploratory Data Analysis will develop knowledge and skills that may be useful to these careers:
Data Analyst
A Data Analyst collects, processes, and analyzes data to extract meaningful insights. The Exploratory Data Analysis course from Johns Hopkins University can help aspiring Data Analysts develop the skills needed to summarize data, identify patterns, and visualize results. This course covers essential techniques for exploratory data analysis, including plotting, clustering, and dimension reduction, which are crucial for understanding data and making informed decisions.
Data Visualization Specialist
A Data Visualization Specialist designs and creates visual representations of data to communicate insights and trends. The Exploratory Data Analysis course from Johns Hopkins University provides a strong foundation for aspiring Data Visualization Specialists by covering essential techniques for summarizing data, identifying patterns, and visualizing results. This course helps individuals develop the skills needed to create effective and engaging data visualizations that clearly communicate insights and support decision-making.
Statistician
A Statistician uses mathematical and statistical methods to analyze data, draw conclusions, and make predictions. The Exploratory Data Analysis course from Johns Hopkins University provides a solid foundation for aspiring Statisticians by covering essential exploratory techniques for summarizing data, identifying patterns, and visualizing results. This course helps build a strong understanding of data analysis principles and prepares individuals for more advanced statistical modeling and inference.
Data Scientist
A Data Scientist combines expertise in data analysis, programming, and machine learning to extract insights from data. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Data Scientists as it provides a strong foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for data exploration and analysis.
Information Architect
An Information Architect designs and organizes information systems to ensure they are easy to find and use. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Information Architects as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding user needs, organizing information effectively, and designing user-friendly information systems.
Machine Learning Engineer
A Machine Learning Engineer designs, develops, and deploys machine learning models to solve real-world problems. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Machine Learning Engineers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and manipulating data, which is crucial for building effective machine learning models.
User Experience Researcher
A User Experience Researcher studies how users interact with products and services to improve their usability and experience. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring User Experience Researchers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding user behavior, identifying pain points, and making data-driven decisions to improve user experience.
Business Analyst
A Business Analyst uses data to identify opportunities and solve problems for businesses. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Business Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing business data, identifying trends, and making data-driven recommendations.
Quality Assurance Analyst
A Quality Assurance Analyst ensures that products and services meet quality standards. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Quality Assurance Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing data, identifying defects, and ensuring that products and services meet customer requirements.
Data Engineer
A Data Engineer designs, builds, and maintains data infrastructure and systems. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Data Engineers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and manipulating data, which is crucial for building and maintaining effective data infrastructure.
Quantitative Analyst
A Quantitative Analyst uses mathematical and statistical models to analyze financial data and make investment decisions. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Quantitative Analysts as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for analyzing financial data, identifying trends, and making data-driven investment decisions.
Market Researcher
A Market Researcher gathers and analyzes data to understand consumer behavior and market trends. The Exploratory Data Analysis course from Johns Hopkins University can be beneficial for aspiring Market Researchers as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing market data, identifying trends, and making data-driven marketing decisions.
Actuary
An Actuary uses mathematical and statistical models to assess risk and uncertainty. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Actuaries as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing risk data, making data-driven decisions, and developing insurance and financial products.
Epidemiologist
An Epidemiologist investigates the distribution and determinants of health-related states or events in specified populations. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Epidemiologists as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing health data, identifying trends, and making data-driven public health decisions.
Environmental Scientist
An Environmental Scientist studies the environment and its components and processes. The Exploratory Data Analysis course from Johns Hopkins University may be helpful for aspiring Environmental Scientists as it provides a foundation in data exploration and visualization. By learning essential techniques for summarizing data, identifying patterns, and visualizing results, individuals can gain valuable skills for understanding and analyzing environmental data, identifying trends, and making data-driven environmental decisions.

Reading list

We've selected 35 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Exploratory Data Analysis.
Provides a comprehensive overview of exploratory data analysis techniques in R, covering both the theoretical foundations and practical applications. It is an excellent resource for students and practitioners who want to learn more about EDA.
Provides a comprehensive overview of data visualization principles and best practices. It covers a wide range of topics, from the basics of data visualization to advanced techniques for creating interactive and engaging visualizations.
Provides a comprehensive overview of data visualization techniques, which is essential for understanding the graphical displays covered in this course.
Provides a comprehensive introduction to statistical methods for the analysis of biomedical data. It covers a wide range of topics, from basic statistical concepts to advanced techniques for analyzing complex data.
Provides a detailed treatment of multivariate analysis of variance, a statistical technique used to compare multiple groups of data.
Provides a more comprehensive overview of data analysis using Python, covering topics such as data manipulation, visualization, and statistical modeling.
Provides a comprehensive overview of cluster analysis, a statistical technique used to identify groups of similar data points.
Provides a comprehensive overview of statistical methods for high-dimensional data, which is essential for understanding the methods used in this course for visualizing and analyzing high-dimensional data.
Provides a comprehensive introduction to statistical learning methods. It covers a wide range of topics, from supervised learning to unsupervised learning.
Provides a detailed treatment of dimension reduction techniques, which are statistical techniques used to reduce the number of variables in a dataset.
Provides a practical introduction to the R programming language, which is essential for understanding the code used in this course.
Provides a comprehensive overview of R for data science, which is essential for understanding the code used in this course.
Provides a comprehensive introduction to multivariate statistical analysis. It covers a wide range of topics, from basic concepts to advanced techniques for analyzing complex data.
Provides a more comprehensive overview of deep learning using Python, covering topics such as convolutional neural networks, recurrent neural networks, and generative adversarial networks.
Provides a detailed treatment of the Lattice graphics package, a powerful graphics package for R.
Provides a comprehensive introduction to data analysis using R. It covers a wide range of topics, from data cleaning and wrangling to statistical modeling and visualization.
Provides a comprehensive overview of data mining techniques in R. It is an excellent resource for students and practitioners who want to learn more about how to use R for data mining.
Provides a hands-on introduction to machine learning. It covers a wide range of topics, from supervised learning to unsupervised learning.
Provides a comprehensive overview of statistical methods for psychology. It is an excellent resource for students and practitioners who want to learn more about how to use statistical methods to analyze psychological data.
Provides a comprehensive overview of data science for business. It covers a wide range of topics, from data collection and wrangling to data analysis and visualization.
Provides a comprehensive overview of statistical inference, covering topics such as hypothesis testing, confidence intervals, and Bayesian inference.
Provides a comprehensive overview of data analysis techniques using SAS. It is an excellent resource for students and practitioners who want to learn more about how to use SAS for data analysis.
Provides a comprehensive overview of statistical methods for business and economics. It is an excellent resource for students and practitioners who want to learn more about how to use statistical methods to analyze business and economic data.
Provides a comprehensive introduction to causal inference. It covers a wide range of topics, from the basics of causal inference to advanced techniques for identifying and estimating causal effects.
Provides a comprehensive introduction to deep learning. It covers a wide range of topics, from the basics of deep learning to advanced techniques for building and training deep learning models.
Provides a more mathematical introduction to probability and statistics, covering topics such as probability theory, random variables, and statistical distributions.
Provides a comprehensive introduction to Bayesian data analysis. It covers a wide range of topics, from the basics of Bayesian data analysis to advanced techniques for building and fitting Bayesian models.
Provides a comprehensive introduction to time series analysis. It covers a wide range of topics, from the basics of time series analysis to advanced techniques for forecasting and modeling time series data.


© 2016 - 2025 OpenCourser