Machine Learning, Data Science and Generative AI with Python from Udemy

What's inside

Learning objectives

Build artificial neural networks with TensorFlow and Keras
Implement machine learning at massive scale with Apache Spark's MLlib
Classify images, data, and sentiments using deep learning
Make predictions using linear regression, polynomial regression, and multivariate regression
Data visualization with matplotlib and seaborn
Understand reinforcement learning - and how to build a Pac-Man bot
Classify data using k-means clustering, support vector machines (SVM), KNN, decision trees, naive Bayes, and PCA
Use train/test and k-fold cross validation to choose and tune your models
Build a movie recommender system using item-based and user-based collaborative filtering
Clean your input data to remove outliers
Design and evaluate A/B tests using t-tests and p-values

Syllabus

Get a working scientific Python environment set up, and understand how to use this course.

What to expect in this course, who it's for, and the general format we'll follow.

In a crash course on Python and what's different about it, we'll cover the importance of whitespace in Python scripts, and how to import Python modules.

In part 2 of our Python crash course, we'll cover Python data structures including lists, tuples, and dictionaries.

In this lesson, we'll see how functions work in Python.

We'll wrap up our Python crash course covering Boolean expressions and looping constructs.

Pandas is a library we'll use throughout the course for loading, examining, and manipulating data. Let's see how it works with some examples, and you'll have an exercise at the end too.

We cover the differences between continuous and discrete numerical data, categorical data, and ordinal data.

A refresher on mean, median, and mode - and when it's appropriate to use each.

We'll use mean, median, and mode in some real Python code, and set you loose to write some code of your own.
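
As a rough illustration of why the choice between mean, median, and mode matters, here is a minimal sketch using Python's standard `statistics` module. The income figures are made up for this example and are not course data.

```python
import statistics

# Hypothetical household incomes; one extreme value skews the mean.
incomes = [30_000, 32_000, 32_000, 35_000, 41_000, 1_000_000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle of the sorted values
mode_income = statistics.mode(incomes)      # most frequent value

print(mean_income, median_income, mode_income)
```

With a single very large value in the list, the median stays close to the typical income while the mean does not, which is one reason the median is often preferred for skewed data.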

We'll cover how to compute the variance and standard deviation of a data distribution, and how to do it using some examples in Python.
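
One way to see how these two quantities are computed is the sketch below, which works out the population variance by hand and compares it with the standard library. The sample values are arbitrary and chosen only for illustration.

```python
import math
import statistics

data = [1, 4, 5, 4, 8]  # arbitrary illustrative values

mean = sum(data) / len(data)

# Population variance: the average squared distance from the mean.
pop_variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation is the square root of the variance.
pop_std_dev = math.sqrt(pop_variance)

# Sample variance divides by (n - 1) instead of n.
sample_variance = statistics.variance(data)

print(pop_variance, pop_std_dev, sample_variance)
```

Whether to divide by n or n - 1 depends on whether the data is the whole population or a sample drawn from it.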

Introducing the concepts of probability density functions (PDFs) and probability mass functions (PMFs).

We'll show examples of continuous, normal, exponential, binomial, and Poisson distributions using IPython.

We'll look at some examples of percentiles and quartiles in data distributions, and then move on to the concept of the first four moments of data sets.

An overview of different tricks in matplotlib for creating graphs of your data, using different graph types and styles.

The concepts of covariance and correlation used to look for relationships between different sets of attributes, and some examples in Python.
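
A minimal sketch of both measures, written with the standard library only. The function names `covariance` and `correlation` are defined here for illustration and are not taken from the course materials.

```python
import statistics

def covariance(xs, ys):
    """Sample covariance: average product of paired deviations from the means."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

xs = [1, 2, 3, 4]
rising = [2, 4, 6, 8]
falling = [8, 6, 4, 2]

print(correlation(xs, rising), correlation(xs, falling))
```

Correlation is bounded between -1 and 1, which makes it easier to interpret than raw covariance, whose size depends on the units of the data.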

We cover the concepts and equations behind conditional probability, and use it to try and find a relationship between age and purchases in some fabricated data using Python.

Here we'll go over my solution to the exercise I challenged you with in the previous lecture - changing our fabricated data to have no real correlation between ages and purchases, and seeing if you can detect that using conditional probability.

An overview of Bayes' Theorem, and an example of using it to uncover misleading statistics surrounding the accuracy of drug testing.
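
The drug-testing effect can be sketched numerically. All three rates below are assumptions picked for this example, not figures from the course.

```python
# Assumed, illustrative rates (not real test statistics).
p_user = 0.003               # P(user): share of people who use the drug
p_pos_given_user = 0.99      # P(positive | user)
p_pos_given_non_user = 0.01  # P(positive | non-user), the false positive rate

# Total probability of a positive test across both groups.
p_positive = p_pos_given_user * p_user + p_pos_given_non_user * (1 - p_user)

# Bayes' Theorem: P(user | positive) = P(positive | user) * P(user) / P(positive)
p_user_given_pos = p_pos_given_user * p_user / p_positive

print(round(p_user_given_pos, 4))
```

Under these assumptions, fewer than one in four positive results belongs to an actual user, even though the test looks 99% accurate, because users are so rare in the population.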

We introduce the concept of linear regression and how it works, and use it to fit a line to some sample data using Python.
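
As a sketch of what fitting a line means, the snippet below uses the closed-form least-squares solution on made-up points. The helper `fit_line` is a name invented for this example; the course itself may use a library routine instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```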

We cover the concept of polynomial regression, and use it to fit a more complex relationship between page speed and purchases in Python.

Multivariate models let us predict some value given more than one attribute. We cover the concept, then use it to build a model in Python to predict car prices based on their number of doors, mileage, and number of cylinders. We'll also get our first look at the statsmodels library in Python.

We'll just cover the concept of multi-level modeling, as it is a very advanced topic. But you'll get the ideas and challenges behind it.

The concepts of supervised and unsupervised machine learning, and how to evaluate the ability of a machine learning model to predict new values using the train/test technique.

We'll apply train/test to a real example using Python.

We'll introduce the concept of Naive Bayes and how we might apply it to the problem of building a spam classifier.

We'll actually write a working spam classifier, using real email training data and a surprisingly small amount of code!

K-Means is a way to identify things that are similar to each other. It's a case of unsupervised learning, which could result in clusters you never expected!

We'll apply K-Means clustering to find interesting groupings of people based on their age and income.

Entropy is a measure of the disorder in a data set - we'll learn what that means, and how to compute it mathematically.
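
A minimal sketch of the computation, assuming the Shannon entropy definition measured in bits. The function name and the labels are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over each class proportion p."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

uniform = entropy(["hired", "hired", "hired", "hired"])          # one class only
mixed = entropy(["hired", "hired", "rejected", "rejected"])      # even split

print(uniform, mixed)
```

A set containing a single class has zero entropy, and an even split between two classes has the maximum of one bit.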

Decision trees can automatically create a flow chart for making some decision, based on machine learning! Let's learn how they work.

We'll create a decision tree and an entire "random forest" to predict hiring decisions for job candidates.

Random forests were an example of ensemble learning; we'll cover other techniques for combining the results of many models to create a better result than any one could produce on its own.

XGBoost is perhaps the most powerful machine learning algorithm today, and it's really easy to use. We'll cover how it works, how to tune it, and run an example on the Iris data set showing how powerful XGBoost is.

Support Vector Machines are an advanced technique for classifying data that has multiple features. They treat those features as dimensions, and partition this higher-dimensional space using "support vectors."

We'll use scikit-learn to easily classify people using a C-Support Vector Classifier.

One way to recommend items is to look for other people similar to you based on their behavior, and recommend stuff they liked that you haven't seen yet.

The shortcomings of user-based collaborative filtering can be solved by flipping it on its head, and looking at relationships between items instead of relationships between people.

We'll use the real-world MovieLens data set of movie ratings to take a first crack at finding movies that are similar to each other, which is the first step in item-based collaborative filtering.

Our initial results for movies similar to Star Wars weren't very good. Let's figure out why, and fix it.

We'll implement a complete item-based collaborative filtering system that uses real-world movie ratings data to recommend movies to any user.

As a student exercise, try some of my ideas - or some ideas of your own - to make the results of our item-based collaborative filter even better.

KNN is a very simple supervised machine learning technique; we'll quickly cover the concept here.

We'll use the simple KNN technique and apply it to a more complicated problem: finding the most similar movies to a given movie just given its genre and rating information, and then using those "nearest neighbors" to predict the movie's rating.

Data that includes many features or many different vectors can be thought of as having many dimensions. Often it's useful to reduce those dimensions, whether to make the data easier to visualize, to compress it, or just to distill the most important information from a data set (that is, the information that contributes the most to the data's variance). Principal Component Analysis and Singular Value Decomposition do that.

We'll use scikit-learn's built-in PCA system to reduce the four-dimensional Iris data set down to 2 dimensions, while still preserving most of its variance.

Cloud-based data storage and analysis systems like Hadoop, Hive, Spark, and MapReduce are turning the field of data warehousing on its head. Instead of extracting, transforming, and then loading data into a data warehouse, the transformation step is now more efficiently done using a cluster after it's already been loaded. With computing and storage resources so cheap, this new approach now makes sense.

We'll describe the concept of reinforcement learning - including Markov Decision Processes, Q-Learning, and Dynamic Programming - all using a simple example of developing an intelligent Pac-Man.

What's a confusion matrix, and how do I read it?
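
For a binary classifier, a confusion matrix is just a count of each (actual, predicted) pair. The sketch below assumes labels of 1 for positive and 0 for negative; the function name and the label lists are made up for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs for a binary classifier."""
    pairs = Counter(zip(actual, predicted))
    return {
        "true_positive": pairs[(1, 1)],
        "false_negative": pairs[(1, 0)],
        "false_positive": pairs[(0, 1)],
        "true_negative": pairs[(0, 0)],
    }

actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 0]
matrix = confusion_matrix(actual, predicted)

# Precision: of the predicted positives, how many were right?
precision = matrix["true_positive"] / (matrix["true_positive"] + matrix["false_positive"])
# Recall: of the actual positives, how many were found?
recall = matrix["true_positive"] / (matrix["true_positive"] + matrix["false_negative"])

print(matrix, precision, recall)
```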

Bias and Variance both contribute to overall error; understand these components of error and how they relate to each other.

We'll introduce the concept of K-Fold Cross-Validation to make train/test even more robust, and apply it to a real model.
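
The splitting step behind K-fold cross-validation can be sketched as below. The generator `k_fold_indices` is a hypothetical helper written for this example; it does not shuffle, whereas in practice you would usually shuffle first or rely on a library utility such as scikit-learn's `KFold`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

A model would be trained and scored once per fold, and the scores averaged to give a more stable estimate than a single train/test split.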

Cleaning your raw input data is often the most important, and time-consuming, part of your job as a data scientist!

In this example, we'll try to find the top-viewed web pages on a web site - and see how much data pollution makes that into a very difficult task!

A brief reminder: some models require input data to be normalized, or scaled so that all features fall within the same range. Always read the documentation on the techniques you are using.
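
One common form of normalization is min-max scaling, sketched here as an assumption about what "the same range" could mean; other techniques, such as standardizing to zero mean and unit variance, are equally valid. The function name is invented for this example.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 30])
print(scaled)
```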

A review of how outliers can affect your results, and how to identify and deal with them in a principled manner.
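
As a sketch of one principled approach, the snippet below drops values that lie more than a chosen number of standard deviations from the median. The threshold of two standard deviations and the sample values are assumptions for illustration; the right rule depends on your data.

```python
import statistics

def remove_outliers(values, num_std=2.0):
    """Keep values within num_std population standard deviations of the median."""
    center = statistics.median(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - center) <= num_std * spread]

# Made-up values with one extreme entry.
values = [25, 27, 30, 31, 29, 28, 1000]
filtered = remove_outliers(values)
print(filtered)
```

Before discarding anything, it is worth checking whether the outlier is a data error or a legitimate observation that the model should account for.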

We'll present an overview of the steps needed to install Apache Spark on your desktop in standalone mode, and get started by getting a Java Development Kit installed on your system.

A high-level overview of Apache Spark, what it is, and how it works.

We'll go in more depth on the core of Spark - the RDD object, and what you can do with it.

A quick overview of MLlib's capabilities, and the new data types it introduces to Spark.

We'll walk through an example of coding up and running a decision tree using Apache Spark's MLlib! In this exercise, we try to predict if a job candidate will be hired based on their work and educational history, using a decision tree that can be distributed across an entire cluster with Spark.

We'll take the same example of clustering people by age and income from our earlier K-Means lecture - but solve it in Spark!

We'll introduce the concept of TF-IDF (Term Frequency / Inverse Document Frequency) and how it applies to search problems, in preparation for using it with MLlib.
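
The idea can be sketched in plain Python. This uses one common TF-IDF variant (relative term frequency multiplied by the log of inverse document frequency); it is an illustration only and is not necessarily the exact formula Spark applies. The function name and documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    "spark runs on clusters",
    "spark mllib trains models",
    "clusters store data",
]
scores = tf_idf(docs)
print(scores[0])
```

Terms that appear in few documents score higher than terms that appear in many, which is what makes TF-IDF useful for ranking search results.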

Let's use TF-IDF, Spark, and MLlib to create a rudimentary search engine for real Wikipedia pages!

Spark 2.0 introduced a new API for MLlib based on DataFrame objects; we'll look at an example of using this to create and use a linear regression model.

High-level thoughts on various ways to deploy your trained models to production systems including apps and websites.

Running controlled experiments on your website usually involves a technique called the A/B test. We'll learn how they work.

How to determine the significance of an A/B test's results, and measure the probability of the results being just from random chance, using t-tests, the t-statistic, and the p-value.
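
As a rough sketch, the t-statistic for two groups can be computed by hand. This uses Welch's form, which does not assume equal variances; that choice, the function name, and the sample values are assumptions for this example. Turning the statistic into a p-value needs the t-distribution, for which a library such as SciPy (`scipy.stats.ttest_ind`) is the usual route.

```python
import math
import statistics

def welch_t_statistic(group_a, group_b):
    """Difference in means divided by its estimated standard error."""
    var_a = statistics.variance(group_a)  # sample variance
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Made-up purchase amounts for a control group (A) and a treatment group (B).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat = welch_t_statistic(group_a, group_b)
print(t_stat)
```

A t-statistic far from zero suggests a real difference between the groups, while a value near zero suggests the difference could easily be noise.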

We'll fabricate A/B test data from several scenarios, and measure the t-statistic and p-value for each using Python.

Some A/B tests just don't affect customer behavior one way or another. How do you know how long to let an experiment run for before giving up?

There are many limitations associated with running short-term A/B tests - novelty effects, seasonal effects, and more can lead you to the wrong decisions. We'll discuss the forces that may result in misleading A/B test results so you can watch out for them.

If you skipped ahead, I'll show you where to get the course materials for just this section. And we'll cover some prerequisite concepts for understanding how neural networks operate: gradient descent, autodiff, and softmax.
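
Two of these concepts fit in a few lines of plain Python. The sketch below is illustrative only: the function names, learning rate, and the toy function being minimized are all assumptions, and the gradient is supplied by hand rather than by autodiff.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

probabilities = softmax([1.0, 2.0, 3.0])

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)

print(probabilities, minimum)
```

Neural network training applies the same downhill-stepping idea to millions of weights at once, with autodiff computing the gradients automatically.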

We'll cover the evolution of artificial neural networks from 1943 to modern-day architectures, which is a great way to understand how they work.

Google's TensorFlow Playground lets you experiment with deep neural networks and understand them - without writing a line of code!

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers

Covers the fundamentals of linear regression, polynomial regression, and multivariate regression, essential for learning more advanced topics

Includes exercises and examples in Python, offering hands-on experience in machine learning and data analysis

Provides a strong foundation in machine learning concepts, suitable for both beginners and those looking to expand their knowledge

Teaches practical data science applications, bridging the gap between theory and real-world use cases

Includes a section on Apache Spark, enabling students to tackle big data sets on computing clusters

Offers a comprehensive overview of machine learning models, including linear regression, SVM, decision trees, and more

Reviews summary

AI for beginners

Learners say that this course is good for absolute beginners. It provides foundational concepts for Data Science and Generative AI using Python. The instructor is knowledgeable and presents the material in a clear and structured manner. Some learners found some terminology to be unfamiliar, as the course covers a broad range of topics. Overall, learners recommend this course to those new to these fields.

Covers a wide range of topics

"Overall the course is good."

Suitable for learners with no prior experience

"The course was really for a beginner level as mentioned."

May include unfamiliar terms for absolute beginners

"There were some terms which made me uncomfortable(like hadoop, clusters, etc) as i heard them for the first time."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Machine Learning, Data Science and Generative AI with Python with these activities:

Review Python Basics

This course covers machine learning and AI using Python. Completing this activity prior to the course can help refresh your Python programming knowledge, making it easier to follow the course material.

Review the Python documentation on data structures and functions.
Practice writing simple Python scripts and run them using a Python interpreter.
Complete a few Python tutorials or exercises to reinforce your understanding.

Explore ML and AI concepts using online tutorials

This course covers a wide range of ML and AI concepts. Spend some time exploring these topics with online tutorials. This can provide a helpful introduction and make it easier to understand the course material.

Find a few reputable online tutorials or courses on ML and AI.
Go through the tutorials, taking notes and trying out the examples.
Discuss your findings with other learners or experts in the field.

Read 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow'

This book provides a comprehensive overview of ML and AI concepts, with a focus on practical implementation using popular Python libraries. Reading this book can supplement the course material and reinforce your understanding.

Read the book thoroughly and take notes on key concepts and techniques.
Work through the code examples and exercises provided in the book.
Discuss your understanding of the book with other learners or experts.

Work through practice problems and exercises

This course covers a variety of ML algorithms and techniques. To reinforce your understanding, it's helpful to practice solving problems and completing exercises related to these topics.

Find a collection of practice problems or exercises related to ML and AI.
Work through the problems, using Python to implement your solutions.
Review your solutions and identify areas for improvement.

Create a blog post or presentation on an ML or AI topic

Creating content can help you solidify your understanding of ML and AI concepts. Choose a topic that interests you and write a blog post or create a presentation that explains it in detail.

Choose a topic that you are interested in and that you have a good understanding of.
Research the topic thoroughly and gather information from credible sources.
Organize your thoughts and create an outline for your blog post or presentation.
Write your blog post or create your presentation, making sure to explain the topic clearly and concisely.
Proofread your work and get feedback from others before sharing it.

Build a small ML or AI project

Applying your knowledge to a practical project can significantly enhance your understanding of ML and AI. Choose a project that aligns with your interests and work through the process of developing it.

Identify a problem or opportunity that you can address using ML or AI.
Gather data and prepare it for use in your project.
Choose appropriate ML or AI algorithms and implement them in Python.
Evaluate the performance of your project and make improvements as needed.
Deploy your project and share it with others.

Career center

Learners who complete Machine Learning, Data Science and Generative AI with Python will develop knowledge and skills that may be useful to these careers:

Data Scientist

A Data Scientist applies scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. This course is a good fit if you wish to be a Data Scientist because it covers a wide range of topics in machine learning, data science, and generative AI, including neural networks, TensorFlow, Keras, Apache Spark, and more. It also provides hands-on experience with Python, a programming language commonly used in data science.

Machine Learning Engineer

A Machine Learning Engineer designs, develops, and maintains machine learning models and systems. This course is a good fit if you wish to be a Machine Learning Engineer because it covers a wide range of topics in machine learning, data science, and generative AI, including neural networks, TensorFlow, Keras, Apache Spark, and more. It also provides hands-on experience with Python, a programming language commonly used in machine learning.

AI Engineer

An AI Engineer builds and maintains artificial intelligence systems. This course is a good fit if you wish to be an AI Engineer because it covers a wide range of topics in machine learning, data science, and generative AI, including neural networks, TensorFlow, Keras, Apache Spark, and more. It also provides hands-on experience with Python, a programming language commonly used in AI.

Data Architect

A Data Architect designs and manages data systems and infrastructure. This course is a good fit if you wish to be a Data Architect because it covers a wide range of topics in data science, including data architecture, data engineering, and big data.

Data Analyst

A Data Analyst collects, cleans, and analyzes data to help organizations make informed decisions. This course is a good fit if you wish to be a Data Analyst because it covers a wide range of topics in data science, including data cleaning, data analysis, and data visualization.

Data Engineer

A Data Engineer builds and maintains data pipelines and infrastructure. This course is a good fit if you wish to be a Data Engineer because it covers a wide range of topics in data science, including data engineering, data warehousing, and big data.

Software Engineer

A Software Engineer builds and maintains software applications. This course may be useful if you wish to be a Software Engineer because it covers topics such as Python, data science, and machine learning, which are becoming increasingly important in the software industry.

Business Analyst

A Business Analyst uses data and analysis to help organizations improve their operations. This course may be useful if you wish to be a Business Analyst because it covers topics such as data analysis, data visualization, and machine learning, which are becoming increasingly important in the business world.

Product Manager

A Product Manager plans and manages the development and launch of new products. This course may be useful if you wish to be a Product Manager because it covers topics such as data analysis, data visualization, and machine learning, which are becoming increasingly important in the product development process.

Quantitative Trader

A Quantitative Trader uses mathematical and statistical models to trade financial assets. This course may be useful if you wish to be a Quantitative Trader because it covers topics such as data analysis, data visualization, and machine learning, which are becoming increasingly important in the financial industry.

Statistician

A Statistician collects, analyzes, and interprets data to help organizations make informed decisions. This course may be useful if you wish to be a Statistician because it covers topics such as data analysis, data visualization, and machine learning, which are becoming increasingly important in the field of statistics.

Market Researcher

A Market Researcher conducts research to help organizations understand their customers and markets. This course may be useful if you wish to be a Market Researcher because it covers topics such as data analysis, data visualization, and machine learning, which are becoming increasingly important in the field of market research.

DevOps Engineer

A DevOps Engineer works to bridge the gap between development and operations teams. This course may be useful if you wish to be a DevOps Engineer because it covers topics such as data science, machine learning, and cloud computing, which are becoming increasingly important in the field of DevOps.

Research Scientist

A Research Scientist conducts research to advance scientific knowledge. This course may be useful if you wish to be a Research Scientist because it covers topics such as data science, machine learning, and artificial intelligence, which are becoming increasingly important in the field of scientific research.

Cloud Architect

A Cloud Architect designs and manages cloud computing systems. This course may be useful if you wish to be a Cloud Architect because it covers topics such as data science, machine learning, and cloud computing, which are becoming increasingly important in this field.
