This is a hands-on, project-based course designed to help you master the foundations of unsupervised learning in Python.
We’ll start by reviewing the data science workflow, discussing the techniques & applications of unsupervised learning, and walking through the data prep steps required for modeling. You’ll learn how to set the correct row granularity for modeling, apply feature engineering techniques, select relevant features, and scale your data using normalization and standardization.
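As a rough illustration of the scaling step, the sketch below applies normalization (rescaling each feature to the 0 to 1 range) and standardization (rescaling each feature to mean 0 and standard deviation 1) with scikit-learn. The employee features and values are made up for the example and do not come from the course data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical employee features: [tenure in years, monthly hours worked]
X = np.array([
    [1.0, 120.0],
    [3.0, 160.0],
    [5.0, 200.0],
    [10.0, 240.0],
])

# Normalization: each column is rescaled so its min is 0 and its max is 1
X_norm = MinMaxScaler().fit_transform(X)

# Standardization: each column is rescaled to mean 0 and standard deviation 1
X_std = StandardScaler().fit_transform(X)

print(X_norm.round(2))
print(X_std.round(2))
```

Distance-based models such as K-Means are sensitive to feature scale, which is why a step like this usually comes before clustering.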
From there we'll fit, tune, and interpret 3 popular clustering models using scikit-learn. We’ll start with K-Means Clustering, learn to interpret the output’s cluster centers, and use inertia plots to select the right number of clusters. Next, we’ll cover Hierarchical Clustering, where we’ll use dendrograms to identify clusters and cluster maps to interpret them. Finally, we’ll use DBSCAN to detect clusters and noise points and evaluate the models using their silhouette score.
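To make the K-Means workflow concrete, here is a minimal sketch that fits the model for several cluster counts, records the inertia for each (the values an inertia plot would show), and computes a silhouette score for the chosen model. The synthetic data, the range of cluster counts, and the choice of three clusters are assumptions for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with three well-separated groups (illustrative only)
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.8,
    random_state=0,
)

# Fit K-Means for k = 1..6 and record the inertia of each fit;
# plotting these values against k gives the inertia ("elbow") plot
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# Refit with the chosen number of clusters and inspect the results
best = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
score = silhouette_score(X, best.labels_)

print(inertias)
print(best.cluster_centers_.round(2))
print(round(score, 3))
```

Inertia always falls as clusters are added, so the usual heuristic is to look for the point where the drop levels off rather than for the minimum.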
We’ll also use DBSCAN and Isolation Forests for anomaly detection, a common application of unsupervised learning models for identifying outliers and anomalous patterns. You’ll learn to tune and interpret the results of each model and visualize the anomalies using pair plots.
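The following sketch shows one way an Isolation Forest can flag an outlier with scikit-learn. The data is synthetic, and the `contamination` value (the assumed share of anomalies) is an arbitrary choice for this example, not a recommended setting.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# 100 ordinary points around the origin plus one obvious outlier (illustrative data)
rng = np.random.default_rng(0)
normal_points = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
outlier = np.array([[10.0, 10.0]])
X = np.vstack([normal_points, outlier])

# contamination is the assumed fraction of anomalies in the data
forest = IsolationForest(contamination=0.02, random_state=0)
labels = forest.fit_predict(X)  # 1 = inlier, -1 = anomaly

# Lower scores mean a point was easier to isolate, so more anomalous
scores = forest.score_samples(X)

print(np.where(labels == -1)[0])
print(scores[-1].round(3))
```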
Next, we’ll introduce the concept of dimensionality reduction, discuss its benefits for data science, and explore the stages in the data science workflow in which it can be applied. We’ll then cover two popular techniques: Principal Component Analysis, which is great for both feature extraction and data visualization, and t-SNE, which is ideal for data visualization.
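As a small sketch of Principal Component Analysis, the example below reduces four features to two components and reports how much of the variance each component retains. The data is synthetic and deliberately built with redundant features, which is an assumption made so that two components capture most of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Four features, where two are noisy near-copies of the other two (illustrative data)
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=200),
    base[:, 1],
    base[:, 1] + 0.1 * rng.normal(size=200),
])

# PCA is scale-sensitive, so standardize first
X_scaled = StandardScaler().fit_transform(X)

# Project the data onto the two directions of greatest variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_

print(X_2d.shape)
print(explained.round(3), explained.sum().round(3))
```

The explained variance ratio sums to less than 1 here, which shows that a reduction keeps most of the information rather than all of it.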
Last but not least, we’ll introduce recommendation engines, and you'll practice creating both content-based and collaborative filtering recommenders using techniques such as Cosine Similarity and Singular Value Decomposition.
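The sketch below illustrates both ideas on toy data: cosine similarity between item feature vectors for a content-based recommender, and a truncated Singular Value Decomposition of a user-item ratings matrix for collaborative filtering. The item features, ratings, and number of components are all hypothetical values chosen for the example.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Content-based: each row describes one item with hypothetical feature values
item_features = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.9, 0.1],
    [0.0, 0.0, 1.0],
])
similarity = cosine_similarity(item_features)

# Find the item most similar to item 0, ignoring item 0 itself
candidates = similarity[0].copy()
candidates[0] = -1.0
most_similar = int(np.argmax(candidates))

# Collaborative filtering: rows are users, columns are items, 0 = not rated
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

# Compress the ratings into two latent factors, then rebuild approximate scores
svd = TruncatedSVD(n_components=2, random_state=0)
user_factors = svd.fit_transform(ratings)
predicted = user_factors @ svd.components_

print(most_similar)
print(predicted.round(2))
```

The reconstructed matrix gives a score for every user-item pair, including pairs the user never rated, and those scores can be ranked to suggest items.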
Throughout the course you'll play the role of an Associate Data Scientist for the HR Analytics team at a software company trying to increase employee retention. Using the skills you learn throughout the course, you'll use Python to segment the employees, visualize the clusters, and recommend next steps to increase retention.
COURSE OUTLINE:
Intro to Data Science
Introduce the fields of data science and machine learning, review essential skills, and introduce each phase of the data science workflow
Unsupervised Learning 101
Review the basics of unsupervised learning, including key concepts, types of techniques and applications, and its place in the data science workflow
Pre-Modeling Data Prep
Recap the data prep steps required to apply unsupervised learning models, including restructuring data, engineering & scaling features, and more
Clustering
Apply three different clustering techniques in Python and learn to interpret their results using metrics, visualizations, and domain expertise
Anomaly Detection
Understand where anomaly detection fits in the data science workflow, and apply techniques like Isolation Forests and DBSCAN in Python
Dimensionality Reduction
Use techniques like Principal Component Analysis (PCA) and t-SNE in Python to reduce the number of features in a data set while retaining as much information as possible
Recommenders
Recognize the variety of approaches for creating recommenders, then apply unsupervised learning techniques in Python, including Cosine Similarity and Singular Value Decomposition (SVD)
Ready to dive in? Join today and get immediate access to:
5 hours of high-quality video
22 homework assignments
7 quizzes
3 projects
Data Science in Python: Unsupervised Learning ebook (350+ pages)
Downloadable project files & solutions
Expert support and Q&A forum
30-day Udemy satisfaction guarantee
If you're an aspiring or seasoned data scientist looking for a practical overview of unsupervised learning techniques in Python with a focus on interpretation, this is the course for you.
Happy learning.
-Alice Zhao (Python Expert & Data Science Instructor, Maven Analytics)