Mark J Grover and Ray Lopez, Ph.D.

This is the second course in the IBM AI Enterprise Workflow Certification specialization.  You are STRONGLY encouraged to complete these courses in order as they are not individual independent courses, but part of a workflow where each course builds on the previous ones.  

In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing will be introduced as part of your work. You will learn techniques of estimation with probability distributions and extend these estimates to apply null hypothesis significance tests. You will apply what you learn through two hands-on case studies: data visualization and multiple testing using a simple pipeline.

 

By the end of this course you should be able to:

1.  List several best practices concerning EDA and data visualization

2.  Create a simple dashboard in Watson Studio

3.  Describe strategies for dealing with missing data

4.  Explain the difference between imputation and multiple imputation

5.  Employ common distributions to answer questions about event probabilities

6.  Explain the investigative role of hypothesis testing in EDA

7.  Apply several methods for dealing with multiple testing
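
As an illustration of objective 7, the sketch below applies two widely used multiple-testing corrections: the Bonferroni correction and the Benjamini-Hochberg procedure. The listing does not say which methods the course covers, so this choice is an assumption, and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for p-values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values from five separate tests (illustrative only).
p_values = [0.001, 0.012, 0.025, 0.045, 0.300]
print(bonferroni(p_values))          # [True, False, False, False, False]
print(benjamini_hochberg(p_values))  # [True, True, True, False, False]
```

Bonferroni controls the chance of any false positive and is the more conservative of the two; Benjamini-Hochberg controls the expected share of false positives among the rejections, so it typically rejects more hypotheses.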

 

Who should take this course?

This course targets existing data science practitioners who have expertise building machine learning models and who want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses.

What skills should you have?

It is assumed that you have completed Course 1 of the IBM AI Enterprise Workflow specialization and have a solid grounding in the following before starting this course:
  • Fundamental understanding of linear algebra
  • Understanding of sampling, probability theory, and probability distributions
  • Knowledge of descriptive and inferential statistical concepts
  • General understanding of machine learning techniques and best practices
  • Practiced understanding of Python and the packages commonly used in data science: NumPy, Pandas, matplotlib, scikit-learn
  • Familiarity with IBM Watson Studio
  • Familiarity with the design thinking process

What's inside

Syllabus

Data Analysis
Exploratory data analysis is mostly about gaining insight through visualization and hypothesis testing. This unit looks at EDA, data visualization, and missing values. One missing value strategy may be better for some models, but for others another strategy may show better predictive performance.
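
To make the point about competing missing-value strategies concrete, here is a minimal sketch that fills the same column with the mean and with the median. The `impute` helper and the viewing-hours data are hypothetical and use only the standard library; in practice a tool such as pandas `fillna` or scikit-learn's `SimpleImputer` would likely be used. The sketch shows only how the filled values differ, not which strategy gives better predictive performance.

```python
import statistics


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical weekly viewing hours: one heavy user and two missing entries.
hours = [2.0, 3.0, None, 4.0, 40.0, None, 3.0]

mean_filled = impute(hours, "mean")
median_filled = impute(hours, "median")

# The outlier (40.0) pulls the mean fill well above the median fill.
print("mean fill:  ", mean_filled[2])
print("median fill:", median_filled[2])
```

Because the single heavy user inflates the mean, the two strategies give a downstream model noticeably different inputs, which is one reason the better choice can depend on the model.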
Data Investigation
Data scientists employ a broad range of statistical tools to analyze data and reach conclusions from data. This unit focuses on the foundational techniques of estimation with probability distributions and extending these estimates to apply null hypothesis significance tests.
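
The two ideas in this unit can be sketched briefly: answering an event-probability question with a distribution, then running a null hypothesis significance test. The example below assumes a Poisson model for event counts and uses a permutation test for a difference in means. Both choices, the function names, and the data are illustrative assumptions, not taken from the course materials.

```python
import math
import random


def poisson_pmf(k, rate):
    """P(X = k) for a Poisson distribution with the given mean rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def poisson_tail(k, rate):
    """P(X >= k), computed as one minus the probability of fewer than k events."""
    return 1.0 - sum(poisson_pmf(i, rate) for i in range(k))


def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so shuffle and re-split.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical: cancellations average 2 per hour; how likely are 5 or more?
print(round(poisson_tail(5, 2.0), 4))

# Hypothetical minutes watched under two page layouts.
layout_a = [12, 15, 14, 16, 13, 17]
layout_b = [22, 25, 24, 26, 23, 27]
p_value = permutation_test(layout_a, layout_b)
print(p_value)
```

A permutation test is used here because it needs no distributional assumptions beyond exchangeability under the null hypothesis; a parametric test such as a t-test would be a common alternative.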

Good to know

Know what's good, what to watch for, and possible dealbreakers
Specifically targets experienced data science practitioners who want to build and deploy AI in large enterprises
Not recommended for aspiring Data Scientists, who need real-world experience to benefit from the content
Assumes students have completed Course 1 of the IBM AI Enterprise Workflow specialization
Requires learners to have Python and data science visualization skills
Familiarizes learners with visualizing data and dealing with missing data in IBM Watson Studio
Offers practical experience through assignments involving data visualization and hypothesis testing

Reviews summary

Mixed feedback for AI course

Learners say that this AI course has a few strengths and several areas of opportunity for improvement. Key strengths include: insightful data manipulation content, short but interesting hypothesis testing section, high-quality material overall. Opportunities for improvement include: additional code examples for non-experts, clarification of Watson Studio information, better explanation of some exercises, and more comprehensive coverage of hypothesis testing. Additionally, students report that some instructors are absent, questions are ignored, vital course materials are missing, and there are many typos. These issues contribute to a less than ideal learning experience.
Learners find hypothesis testing content short but interesting.
"The hypothesis testing was very short so I didn't expect much. Still interesting to know about multiple hypothesis testing."
Learners find data manipulation content insightful.
"The part on the EDA is very insightful, I learned a lot about data manipulation in Pandas."
There are many typos in the course materials.
"There are SO MANY misspellings in the texts by the way..."
Some vital course materials, like answer keys, are missing.
"Quizzes mark you as correct even if you're not, the answer keys are missing from notebooks"
Some instructors are absent and ignore student questions.
"Instructors are completely absent and ignore questions from students"

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in AI Workflow: Data Analysis and Hypothesis Testing with these activities:
Review Hypothesis Testing
Refreshes the fundamental skill of hypothesis testing for statistical analysis.
  • Study the provided resources on hypothesis testing
  • Take a practice quiz on hypothesis testing concepts
  • Apply hypothesis testing to a simple dataset
Practice data cleaning and imputation using provided datasets
Provides practical experience in handling missing values, leading to better understanding.
  • Gather sample datasets with different types of missing data patterns
  • Apply various imputation techniques to clean the data
  • Evaluate the quality of the imputed data
Explore Watson Studio's dashboard creation capabilities
Enhances understanding of dashboard creation tools for effective data presentation.
  • Review tutorials on building dashboards in Watson Studio
  • Create a simple dashboard using a given dataset
Work through provided data visualization examples
Hands-on activity helps transition knowledge to working memory.
  • Review examples in provided materials
  • Replicate one of the examples using a sample dataset
  • Present your replicated example
Collaborate on a data visualization project using a shared dataset
Promotes deeper understanding of data visualization principles through collaborative analysis of real-world data.
  • Form groups of 3-4 students
  • Select a dataset
  • Brainstorm data visualization ideas
  • Create data visualizations and present to the group
  • Provide feedback and refine visualizations
Contribute to the pandas documentation by improving examples or fixing typos
Provides hands-on experience with the pandas library and contributes to the open source community.
  • Identify areas for improvement in the documentation
  • Make changes to the documentation
  • Submit a pull request
  • Review feedback from maintainers
  • Incorporate feedback and update the documentation
Participate in a data visualization competition
Tests data visualization abilities and exposes students to industry practices.
  • Identify a suitable competition
  • Prepare data and create visualizations
  • Submit your entry
Build a data visualization dashboard for a real-world dataset
Integrates knowledge of data visualization and dashboard creation with real-world data.
  • Choose a dataset
  • Clean and prepare the data
  • Design the dashboard
  • Implement the dashboard
  • Deploy and share the dashboard

Career center

Learners who complete AI Workflow: Data Analysis and Hypothesis Testing will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists leverage advanced statistical techniques and machine learning algorithms to solve complex business problems. The AI Workflow: Data Analysis and Hypothesis Testing course can be a valuable addition to their skillset by providing a deeper understanding of exploratory data analysis, hypothesis testing, and probability distributions. This knowledge will enable Data Scientists to make more informed decisions and develop more accurate models.
Statistician
Statisticians apply statistical principles to collect, analyze, and interpret data. The AI Workflow: Data Analysis and Hypothesis Testing course can enhance their expertise by introducing them to advanced techniques for data visualization, handling missing values, and hypothesis testing. By gaining a deeper understanding of these concepts, Statisticians will be able to provide more insightful analysis and draw more accurate conclusions from data.
Market Researcher
Market Researchers gather and analyze data to understand consumer behavior and market trends. The AI Workflow: Data Analysis and Hypothesis Testing course can enhance their skills by providing them with advanced techniques for data visualization, hypothesis testing, and probability distributions. This knowledge will enable Market Researchers to conduct more effective research and provide more valuable insights to their clients.
Data Analyst
The Data Analyst role involves collecting, cleaning, and analyzing data to extract meaningful insights. The AI Workflow: Data Analysis and Hypothesis Testing course can provide a solid foundation for this role by teaching best practices for data visualization, handling missing data, and hypothesis testing. By completing this course, learners will gain skills in data exploration and statistical analysis, which are essential for success as a Data Analyst.
Business Analyst
Business Analysts use data to identify and solve business problems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify trends, and make recommendations that can improve business outcomes.
Financial Analyst
Financial Analysts use data to evaluate and make recommendations on investments. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze financial data, identify trends, and make recommendations that can improve investment decisions.
Risk Analyst
Risk Analysts use data to identify and assess risks to businesses. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify risks, and develop mitigation strategies.
Operations Research Analyst
Operations Research Analysts use mathematical and statistical techniques to improve the efficiency of business operations. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify inefficiencies, and develop solutions that can improve operational performance.
Data Engineer
Data Engineers design, build, and maintain data pipelines and systems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify data quality issues, and develop data pipelines that can support data-driven decision-making.
Data Visualization Specialist
Data Visualization Specialists create visual representations of data to communicate insights. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify trends, and create visual representations that can effectively communicate insights to stakeholders.
Quantitative Analyst
Quantitative Analysts use mathematical and statistical techniques to solve financial problems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop mathematical models that can be used to solve financial problems.
Machine Learning Engineer
Machine Learning Engineers design, build, and deploy machine learning models. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop machine learning models that can solve business problems.
Actuary
Actuaries use mathematical and statistical techniques to assess and manage financial risks. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify risks, and develop models that can be used to assess and manage financial risks.
Epidemiologist
Epidemiologists investigate the causes and effects of diseases. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop hypotheses that can be used to investigate the causes and effects of diseases.

Reading list

We've selected 14 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in AI Workflow: Data Analysis and Hypothesis Testing.
Provides a comprehensive overview of exploratory data analysis (EDA) techniques using the R programming language. It covers a wide range of EDA topics, including data visualization, data cleaning, and hypothesis testing. This book is a valuable resource for data scientists and analysts who want to learn more about EDA.
Covers a related area of statistics not covered in this particular course, but provides knowledge useful to all data scientists.
Provides a comprehensive introduction to statistical inference, covering topics such as sampling, estimation, hypothesis testing, and regression analysis. It is a valuable resource for students and practitioners who want to learn more about statistical inference.
Good general companion to this course. It will give learners additional information on data visualization and hypothesis testing in Python.
Provides a practical introduction to data science for business professionals. It covers a wide range of data science topics, including data collection, data analysis, and data visualization. This book is a valuable resource for business professionals who want to learn more about data science.
Provides a comprehensive overview of deep learning, covering topics such as neural networks, convolutional neural networks, and recurrent neural networks. It is a valuable resource for students and practitioners who want to learn more about deep learning.
Provides a comprehensive overview of reinforcement learning, covering topics such as Markov decision processes, value functions, and reinforcement learning algorithms. It is a valuable resource for students and practitioners who want to learn more about reinforcement learning.
Provides a comprehensive overview of natural language processing (NLP), covering topics such as text preprocessing, machine translation, and natural language generation. It is a valuable resource for students and practitioners who want to learn more about NLP.
Provides a comprehensive overview of data mining, covering topics such as data preprocessing, data clustering, and data classification. It is a valuable resource for students and practitioners who want to learn more about data mining.
Provides a comprehensive overview of statistics for machine learning, covering topics such as probability, linear algebra, and optimization. It is a valuable resource for students and practitioners who want to learn more about statistics for machine learning.
Provides a comprehensive overview of deep learning with Python, covering topics such as neural networks, convolutional neural networks, and recurrent neural networks. It is a valuable resource for students and practitioners who want to learn more about deep learning with Python.
Provides a comprehensive overview of data visualization, covering topics such as data visualization techniques, data visualization tools, and data visualization best practices. It is a valuable resource for students and practitioners who want to learn more about data visualization.

Similar courses

Here are nine courses similar to AI Workflow: Data Analysis and Hypothesis Testing.
  • Using probability distributions for real world problems...
  • Statistics Fundamentals for Business Analytics
  • The Power of Statistics
  • Foundations of Statistics and Probability for Machine...
  • Introduction to Probability and Statistics
  • Statistics Masterclass for Data Science and Data Analytics
  • Probability and Statistics IV: Confidence Intervals and...
  • Probability & Statistics for Machine Learning & Data...
  • Essential Statistics for Data Analysis


© 2016 - 2024 OpenCourser