AI Workflow: Data Analysis and Hypothesis Testing from Coursera

This is the second course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order: they are not independent courses, but parts of a workflow in which each course builds on the previous ones.

In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing will be introduced as part of your work. You will learn techniques of estimation with probability distributions and extend these estimates to apply null hypothesis significance tests. You will apply what you learn through two hands-on case studies: data visualization and multiple testing using a simple pipeline.
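
As a rough illustration of how an estimate from a probability distribution can be extended into a significance test, the sketch below computes a Poisson upper-tail probability with the Python standard library. The scenario and numbers (a baseline of 2 cancellations per hour and 6 observed in one hour) are hypothetical and are not taken from the course materials.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


# Hypothetical numbers, for illustration only.
baseline_rate = 2.0   # assumed long-run cancellations per hour
observed = 6          # cancellations seen in one particular hour

# One-sided p-value under the null hypothesis that the rate is still 2.0.
p_value = poisson_upper_tail(observed, baseline_rate)
print(f"P(X >= {observed} | rate = {baseline_rate}) = {p_value:.4f}")
```

With these assumed numbers the tail probability is roughly 0.017, so an hour with six cancellations would be unusual if the baseline rate still held. Whether that counts as evidence depends on the significance level chosen and on how many hours were inspected.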

By the end of this course you should be able to:

1. List several best practices concerning EDA and data visualization

2. Create a simple dashboard in Watson Studio

3. Describe strategies for dealing with missing data

4. Explain the difference between imputation and multiple imputation

5. Employ common distributions to answer questions about event probabilities

6. Explain the investigative role of hypothesis testing in EDA

7. Apply several methods for dealing with multiple testing
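
To make the last objective more concrete, here is a minimal sketch of two widely used multiple-testing corrections, Bonferroni and Benjamini-Hochberg, written in plain Python. The p-values are made up for illustration, and the course itself may cover different methods or use library implementations.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0_i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis whose rank is at most max_rank.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values from five separate tests.
p_values = [0.039, 0.001, 0.20, 0.021, 0.012]
print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

On these made-up values Bonferroni rejects only the test with p = 0.001, while Benjamini-Hochberg rejects four of the five. The difference reflects what each method controls: Bonferroni limits the chance of any false positive, whereas Benjamini-Hochberg limits the expected share of false positives among the rejections.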

Who should take this course?

This course targets existing data science practitioners who have expertise in building machine learning models and who want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses.

What skills should you have?

It is assumed that you have completed Course 1 of the IBM AI Enterprise Workflow specialization and meet the following prerequisites before starting this course: a fundamental understanding of linear algebra; an understanding of sampling, probability theory, and probability distributions; knowledge of descriptive and inferential statistical concepts; a general understanding of machine learning techniques and best practices; a practiced understanding of Python and the packages commonly used in data science (NumPy, Pandas, matplotlib, scikit-learn); familiarity with IBM Watson Studio; and familiarity with the design thinking process.

What's inside

Syllabus

Data Analysis

Exploratory data analysis is mostly about gaining insight through visualization and hypothesis testing. This unit looks at EDA, data visualization, and missing values. No single missing-value strategy is best for every model: a strategy that works well for one model may show worse predictive performance for another.

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers

Specifically targets experienced data science practitioners who want to build and deploy AI in large enterprises

Not recommended for aspiring Data Scientists, who need real-world experience to benefit from the content

Assumes students have completed Course 1 of the IBM AI Enterprise Workflow specialization

Requires learners to have Python and data science visualization skills

Familiarizes learners with visualizing data and dealing with missing data in IBM Watson Studio

Offers practical experience through assignments involving data visualization and hypothesis testing

Reviews summary

Enterprise AI workflow: data analysis & hypothesis testing

According to learners, this course, the second in the IBM AI Enterprise Workflow specialization, provides valuable insights into exploratory data analysis (EDA) and hypothesis testing within an enterprise context. Many found the practical case studies and assignments to be particularly helpful for applying concepts. While the content on core statistical techniques and their application in AI workflows is generally well-received, some students noted that the prerequisites should be taken seriously, as the course assumes a solid existing foundation. The integration with IBM Watson Studio is a key feature, though a few reviewers mentioned encountering platform-related issues. Overall, students report gaining applicable skills for their work.

Platform is central to the course.

"Learning to create a simple dashboard in Watson Studio was a direct application of skills."

"The course heavily relies on using IBM Watson Studio."

"I had some minor technical issues working with the Watson Studio environment during assignments."

"Familiarity with Watson Studio is definitely beneficial before starting this course."

Requires strong background as described.

"This course is definitely NOT for beginners. You need a solid data science background."

"Make sure you complete the first course and have the stated prerequisites down before starting."

"I struggled with some parts because I didn't have the necessary prior statistical knowledge."

"As an experienced practitioner, I found the pacing and content depth appropriate, assuming you meet the prereqs."

Logically follows the previous course.

"This course seamlessly builds upon the concepts introduced in the first course of the specialization."

"Taking the courses in order is essential, and this one extends the workflow nicely."

"The continuation of the hypothetical streaming media company project provides good continuity."

"It feels like a natural progression from the previous module."

Concepts are explained effectively.

"The explanations of EDA best practices and dealing with missing data were very clear."

"I understood the difference between imputation and multiple imputation well after the lectures."

"The course did a good job explaining the role of hypothesis testing in EDA."

"I felt the instructors presented the material on probability distributions effectively."

Hands-on exercises reinforce learning.

"The hands-on case studies were incredibly useful for solidifying the theoretical concepts."

"I really appreciated the practical application of hypothesis testing demonstrated through the assignments."

"The simple pipeline case study helped me see how these steps fit into a larger workflow."

"Applying what I learned in Watson Studio made the concepts feel more concrete."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in AI Workflow: Data Analysis and Hypothesis Testing with these activities:

Review Hypothesis Testing

Refreshes the fundamental skill of hypothesis testing for statistical analysis.

Study the provided resources on hypothesis testing
Take a practice quiz on hypothesis testing concepts
Apply hypothesis testing to a simple dataset
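
One way to carry out the last step is a two-sample permutation test, sketched below with the Python standard library. The two groups (minutes watched per session under two hypothetical page layouts) are invented for illustration, and the permutation test is only one of several tests that could be applied here.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an approximate p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The +1 terms count the observed labeling itself, so p is never 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical data: minutes watched per session for two page layouts.
layout_a = [42, 51, 38, 47, 55, 49, 44, 53]
layout_b = [35, 40, 33, 45, 37, 41, 36, 39]

p_value = permutation_test(layout_a, layout_b)
print(f"Observed difference in means: {mean(layout_a) - mean(layout_b):.3f}")
print(f"Permutation p-value: {p_value:.4f}")
```

The permutation approach makes few distributional assumptions, which suits an exploratory setting. A small p-value here would suggest that the difference is worth investigating further, not that the cause is established.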

Practice data cleaning and imputation using provided datasets

Provides practical experience in handling missing values, leading to better understanding.

Gather sample datasets with different types of missing data patterns
Apply various imputation techniques to clean the data
Evaluate the quality of the imputed data
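
A simple way to approach the evaluation step is to hide values that are actually known, impute them, and measure the error against the truth. The sketch below does this for mean and median imputation using only the standard library. The data are hypothetical, and hiding values at random assumes they are missing completely at random, which real datasets often violate.

```python
import math
import random
from statistics import mean, median


def rmse(truth, estimates):
    """Root mean squared error between true and imputed values."""
    return math.sqrt(mean((t - e) ** 2 for t, e in zip(truth, estimates)))


def evaluate_imputers(values, missing_fraction=0.2, seed=0):
    """Hide a random subset of known values and score simple imputers on it."""
    rng = random.Random(seed)
    n_missing = max(1, int(len(values) * missing_fraction))
    hidden = set(rng.sample(range(len(values)), n_missing))
    observed = [v for i, v in enumerate(values) if i not in hidden]
    truth = [values[i] for i in sorted(hidden)]
    scores = {}
    for name, statistic in (("mean", mean), ("median", median)):
        fill = statistic(observed)  # a single fill value per strategy
        scores[name] = rmse(truth, [fill] * len(truth))
    return scores


# Hypothetical skewed data with one large outlier (90).
values = [12, 15, 14, 13, 16, 15, 14, 13, 15, 14,
          90, 16, 13, 14, 15, 12, 14, 16, 13, 15]

scores = evaluate_imputers(values)
for name, score in scores.items():
    print(f"{name} imputation RMSE: {score:.2f}")
```

A single random mask gives a noisy estimate, so in practice the masking would be repeated with several seeds and the errors averaged. This check covers single imputation only; it does not capture the uncertainty that multiple imputation is designed to represent.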

Explore Watson Studio's dashboard creation capabilities

Enhances understanding of dashboard creation tools for effective data presentation.

Review tutorials on building dashboards in Watson Studio
Create a simple dashboard using a given dataset

Work through provided data visualization examples

Hands-on activity helps transition knowledge to working memory.

Review examples in provided materials
Replicate one of the examples using a sample dataset
Present your replicated example

Collaborate on a data visualization project using a shared dataset

Promotes deeper understanding of data visualization principles through collaborative analysis of real-world data.

Form groups of 3-4 students
Select a dataset
Brainstorm data visualization ideas
Create data visualizations and present to the group
Provide feedback and refine visualizations

Contribute to the pandas documentation by improving examples or fixing typos

Provides hands-on experience with the pandas library and contributes to the open source community.

Identify areas for improvement in the documentation
Make changes to the documentation
Submit a pull request
Review feedback from maintainers
Incorporate feedback and update the documentation

Participate in a data visualization competition

Tests data visualization abilities and exposes students to industry practices.

Identify a suitable competition
Prepare data and create visualizations
Submit your entry

Build a data visualization dashboard for a real-world dataset

Integrates knowledge of data visualization and dashboard creation with real-world data.

Choose a dataset
Clean and prepare the data
Design the dashboard
Implement the dashboard
Deploy and share the dashboard

Career center

Learners who complete AI Workflow: Data Analysis and Hypothesis Testing will develop knowledge and skills that may be useful to these careers:

Data Scientist

Data Scientists leverage advanced statistical techniques and machine learning algorithms to solve complex business problems. The AI Workflow: Data Analysis and Hypothesis Testing course can be a valuable addition to their skillset by providing a deeper understanding of exploratory data analysis, hypothesis testing, and probability distributions. This knowledge will enable Data Scientists to make more informed decisions and develop more accurate models.

Statistician

Statisticians apply statistical principles to collect, analyze, and interpret data. The AI Workflow: Data Analysis and Hypothesis Testing course can enhance their expertise by introducing them to advanced techniques for data visualization, handling missing values, and hypothesis testing. By gaining a deeper understanding of these concepts, Statisticians will be able to provide more insightful analysis and draw more accurate conclusions from data.

Market Researcher

Market Researchers gather and analyze data to understand consumer behavior and market trends. The AI Workflow: Data Analysis and Hypothesis Testing course can enhance their skills by providing them with advanced techniques for data visualization, hypothesis testing, and probability distributions. This knowledge will enable Market Researchers to conduct more effective research and provide more valuable insights to their clients.

Data Analyst

The Data Analyst role involves collecting, cleaning, and analyzing data to extract meaningful insights. The AI Workflow: Data Analysis and Hypothesis Testing course can provide a solid foundation for this role by teaching best practices for data visualization, handling missing data, and hypothesis testing. By completing this course, learners will gain skills in data exploration and statistical analysis, which are essential for success as a Data Analyst.

Risk Analyst

Risk Analysts use data to identify and assess risks to businesses. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify risks, and develop mitigation strategies.

Business Analyst

Business Analysts use data to identify and solve business problems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify trends, and make recommendations that can improve business outcomes.

Financial Analyst

Financial Analysts use data to evaluate and make recommendations on investments. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze financial data, identify trends, and make recommendations that can improve investment decisions.

Operations Research Analyst

Operations Research Analysts use mathematical and statistical techniques to improve the efficiency of business operations. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify inefficiencies, and develop solutions that can improve operational performance.

Data Engineer

Data Engineers design, build, and maintain data pipelines and systems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify data quality issues, and develop data pipelines that can support data-driven decision-making.

Machine Learning Engineer

Machine Learning Engineers design, build, and deploy machine learning models. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop machine learning models that can solve business problems.

Data Visualization Specialist

Data Visualization Specialists create visual representations of data to communicate insights. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify trends, and create visual representations that can effectively communicate insights to stakeholders.

Quantitative Analyst

Quantitative Analysts use mathematical and statistical techniques to solve financial problems. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop mathematical models that can be used to solve financial problems.

Actuary

Actuaries use mathematical and statistical techniques to assess and manage financial risks. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify risks, and develop models that can be used to assess and manage financial risks.

Epidemiologist

Epidemiologists investigate the causes and effects of diseases. The AI Workflow: Data Analysis and Hypothesis Testing course can provide them with a strong foundation in data analysis and hypothesis testing, which are essential skills for success in this role. By completing this course, learners will gain the ability to analyze data, identify patterns, and develop hypotheses that can be used to investigate the causes and effects of diseases.
