This course is designed to provide a complete introduction to Deep Learning. It is aimed at beginners and intermediate programmers and data scientists who are familiar with Python and want to understand and apply Deep Learning techniques to a variety of problems.
We start with a review of Deep Learning applications and a recap of Machine Learning tools and techniques. Then we introduce Artificial Neural Networks and explain how they are trained to solve Regression and Classification problems.
Over the rest of the course we introduce and explain several architectures including Fully Connected, Convolutional and Recurrent Neural Networks, and for each of these we explain both the theory and give plenty of example applications.
This course strikes a good balance between theory and practice. We don't shy away from explaining mathematical details, and at the same time we provide exercises and sample code to apply what you've just learned.
The goal is to provide students with a strong foundation: not just theory, not just scripting, but both. At the end of the course you'll be able to recognize which problems can be solved with Deep Learning, design and train a variety of Neural Network models, and use cloud computing to speed up training and improve your model's performance.
Welcome to the course!
This is a hands-on course where you learn to train deep learning models. Deep learning models are used in real world applications to power technologies such as language translation and object recognition.
Let's get our development environment ready. We'll install Anaconda Python and the additional Python packages you will need in order to follow the course.
Let's get the source code that we will use during the course.
Running your first model will help us check that you have installed all the material correctly.
First of all, let's establish a common vocabulary and introduce some terms that will be used throughout the course.
Descriptive statistics and a few simple checks can be very useful to formulate an initial intuition about the data.
Plotting is a powerful way to explore the data and different kinds of plots are useful in different situations.
Let's show an example of plotting with Matplotlib!
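As a minimal sketch (the file name and column names below are hypothetical), a scatter plot of one column against another can be produced like this:

    # A minimal plotting sketch; "housing.csv", "sqft" and "price" are made-up names.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("housing.csv")                 # load a tabular dataset
    df.plot(kind="scatter", x="sqft", y="price")    # one variable against another
    plt.title("Price vs. size")
    plt.show()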
More often than not, data is not just tabular. Deep Learning can handle text documents, images, sound, and even binary data.
Deep Learning often uses image or audio data; let's see how we can work with it in the Jupyter environment!
Feature engineering is the process through which we transform an unstructured datapoint into a structured, tabular record.
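As one small, hedged example of the idea, scikit-learn's CountVectorizer turns raw text into a bag-of-words table (the two sentences below are made up):

    # Turn unstructured text into a structured, tabular bag-of-words representation.
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["deep learning is fun", "learning python is fun too"]
    vec = CountVectorizer()
    X = vec.fit_transform(docs)           # one row per document, one column per word
    print(vec.get_feature_names_out())    # the engineered feature names
    print(X.toarray())                    # the structured, numeric records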
In this exercise you will load and plot a dataset, exploring it visually to gather some insights and to familiarize yourself with Python's plotting library, Matplotlib.
Let's continue working through and explaining the solutions!
There are several types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. This course focuses primarily on Supervised Learning.
Supervised learning allows computers to learn patterns from examples. It is used in many domains and applications, and here you will learn to identify problems that can be solved with it.
The simplest example of supervised learning is Linear Regression, which looks for a functional relationship between input and output variables.
In order to find the best possible linear model to describe our data, we need to define a criterion to evaluate the "goodness" of a particular model. This is the role of the cost function.
Let's begin to work through the notebook example for the cost function!
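As a tiny sketch of the idea (the data points are made up), the mean squared error of a linear hypothesis can be computed like this:

    # Mean squared error of a linear model y_hat = w * x + b.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([1.5, 3.9, 6.1, 8.2])    # observed targets (made up)

    def mse_cost(w, b):
        y_hat = w * x + b                 # the model's predictions
        return np.mean((y - y_hat) ** 2)  # average squared residual

    print(mse_cost(2.0, 0.0))   # a good candidate model has a low cost
    print(mse_cost(0.5, 1.0))   # a poor candidate model has a much higher cost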
Now that we have both a hypothesis (linear model) and a cost function (mean squared error), we need to find the combination of parameters that minimizes that cost.
Let's play with Keras to create a Linear Regression Model!
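A minimal sketch of what such a model can look like, assuming a single input feature and synthetic data (the slope, intercept and training settings are illustrative only):

    # Linear regression as a one-neuron network in Keras.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    x = np.linspace(0, 10, 100).reshape(-1, 1)
    y = 2.0 * x + 1.0 + np.random.normal(0, 0.5, size=(100, 1))   # noisy line

    model = Sequential([Dense(1, input_shape=(1,))])   # one weight and one bias
    model.compile(optimizer="sgd", loss="mean_squared_error")
    model.fit(x, y, epochs=50, verbose=0)

    print(model.get_weights())   # should end up close to slope 2 and intercept 1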
How can we know if the model we just trained is good? Since the purpose of our model is to learn to generalize from examples let's test how the model performs on a new set of data not used for training.
Let's code through an example of evaluating model performance!
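One common way to do this, sketched here with scikit-learn on a synthetic dataset, is a train/test split:

    # Hold out part of the data, train on the rest, then score on the held-out part.
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from sklearn.datasets import make_regression

    X, y = make_regression(n_samples=200, n_features=3, noise=10.0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    model = LinearRegression().fit(X_train, y_train)
    print("R^2 on unseen data:", model.score(X_test, y_test))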
Classification is the technique to use when the target variable is discrete instead of continuous. Here we discuss its similarities to and differences from regression.
Let's code through a classification example!
In some cases our model may seem to be performing really well on the training data, but poorly on the test data. This is called overfitting.
A more accurate way to assess the ability of our model to generalize to unseen datapoints is to repeat the train/test split procedure multiple times and then average the results. This is called cross-validation.
Let's code through some cross validation!
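A small, hedged sketch with scikit-learn (the dataset and the model are placeholders):

    # Repeat the train/test split several times (folds) and average the scores.
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=300, n_features=5)
    scores = cross_val_score(LogisticRegression(), X, y, cv=5)   # 5-fold cross-validation
    print(scores.mean(), scores.std())   # average accuracy and its spread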
In binary classification we can define several types of error and choose which one to reduce.
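For instance (with made-up labels), a confusion matrix separates false positives from false negatives:

    # False positives and false negatives are two different kinds of error.
    from sklearn.metrics import confusion_matrix

    y_true = [0, 0, 1, 1, 1, 0, 1, 0]
    y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print("false positives:", fp, "false negatives:", fn)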
Sometimes we need to preprocess the features, for example when we have categorical data or when the feature scales are too big or too small.
Let's code through an example solution of the pre-processing problems!
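Two common steps, sketched here with pandas and scikit-learn (the column names and values are hypothetical):

    # Rescale numeric features and one-hot encode a categorical one.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler

    df = pd.DataFrame({"age": [22, 35, 58],
                       "income": [20000, 65000, 120000],
                       "city": ["Rome", "Paris", "Rome"]})

    df[["age", "income"]] = StandardScaler().fit_transform(df[["age", "income"]])
    df = pd.get_dummies(df, columns=["city"])   # categorical -> indicator columns
    print(df)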
Deep learning is successfully applied to many different domains. Here we review a few of them.
The perceptron is the simplest neural network, and here we learn all about Nodes, Edges, Biases, and Weights, as well as the need for an Activation Function.
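A minimal NumPy sketch of a single perceptron (the inputs, weights and bias are made up):

    # One perceptron: a weighted sum of the inputs plus a bias, passed through an activation.
    import numpy as np

    x = np.array([0.5, -1.2, 3.0])     # inputs arriving along the edges
    w = np.array([0.4, 0.6, -0.1])     # one weight per edge
    b = 0.2                            # bias

    z = np.dot(w, x) + b               # weighted sum computed at the node
    output = 1.0 / (1.0 + np.exp(-z))  # sigmoid activation
    print(output)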
We can connect the output of a perceptron to the input of another one, stacking them into layers. A fully connected architecture is just a series of such layers. Forward propagation still applies.
Let's code through a NN example!
Let's learn how to work with multiple outputs!
Let's code through an example of multi-class classification!
The activation function is what makes neural networks so powerful. In this lecture we review several types of activation functions and understand why they are necessary.
A neural network formulates a prediction using "forward propagation". Here you will learn what it is.
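A small sketch of forward propagation through two fully connected layers (the layer sizes and random weights are arbitrary):

    # Forward propagation: each layer is a matrix multiplication plus a bias,
    # followed by an activation function.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.2, 0.7])                        # input with 2 features
    W1, b1 = np.random.randn(3, 2), np.zeros(3)     # hidden layer with 3 nodes
    W2, b2 = np.random.randn(1, 3), np.zeros(1)     # output layer with 1 node

    h = sigmoid(W1 @ x + b1)       # hidden activations
    y_hat = sigmoid(W2 @ h + b2)   # the network's prediction
    print(y_hat)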
Let's work through our Deep Learning Introduction exercises!
The TensorFlow Playground is a nice web app that allows you to play around with simple neural network parameters to get a feel for what they do.
What is the gradient and why is it important? In this lecture we introduce the gradient in 1 dimension and then extend it to many dimensions.
The gradient is important because it allows us to know how to adjust the parameters of our model in order to find the best model. Here I will give you some intuition about it.
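To make this concrete, here is a tiny sketch of a 1-dimensional gradient approximated with finite differences (the cost function is just an example):

    # Approximate the derivative (the 1-D gradient) of a cost function at a point.
    def f(w):
        return (w - 3) ** 2          # a simple cost with its minimum at w = 3

    def gradient(w, eps=1e-6):
        return (f(w + eps) - f(w - eps)) / (2 * eps)

    print(gradient(5.0))   # about  4: positive, so decrease w to lower the cost
    print(gradient(2.0))   # about -2: negative, so increase w to lower the cost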
Let's quickly cover the Chain Rule that you'll need to understand!
How does backpropagation work when we have a more complex neural network? The chain rule of differentiation is the answer. As we shall see, this reduces to a lot of matrix multiplications.
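As a hedged sketch of the idea, here is the chain rule applied to a single sigmoid neuron with a squared-error loss (the numbers are made up):

    # Chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    import numpy as np

    x = np.array([0.5, -1.0])
    w = np.array([0.3, 0.8])
    b, y = 0.1, 1.0                          # bias and true target

    z = w @ x + b
    y_hat = 1.0 / (1.0 + np.exp(-z))         # forward pass

    dL_dyhat = 2 * (y_hat - y)               # derivative of the loss (y_hat - y)^2
    dyhat_dz = y_hat * (1 - y_hat)           # derivative of the sigmoid
    dz_dw = x                                # derivative of the weighted sum
    grad_w = dL_dyhat * dyhat_dz * dz_dw     # multiply the factors together
    print(grad_w)

In a deeper network the same factors become matrices, which is where the matrix multiplications come from.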
The learning rate is an external parameter that we control to decide the size of our updates to the weights.
How do we feed the data to our model in order to adjust the weights by gradient descent? The answer is in batches. In this lecture you will learn all about epochs, batches and mini-batches.
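A small sketch of how the loop over epochs and mini-batches is usually organized (the data, batch size and update step are placeholders):

    # One epoch = one full pass over the data, split into mini-batches.
    import numpy as np

    X = np.random.randn(1000, 4)     # made-up dataset
    batch_size, epochs = 32, 3

    for epoch in range(epochs):
        indices = np.random.permutation(len(X))            # reshuffle each epoch
        for start in range(0, len(X), batch_size):
            batch = X[indices[start:start + batch_size]]
            # ... compute the gradient on this mini-batch and update the weights ...
    print("ran", epochs, "epochs of", int(np.ceil(len(X) / batch_size)), "mini-batches each")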
Let's briefly go over working with NumPy arrays!
The learning rate is an important parameter of your model; let's go over it!
Let's see how models can be affected by the learning rate!
Gradient descent is a first-order iterative optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.
Let's code through an example of Gradient Descent!
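As a minimal sketch (the cost function and the learning rate are illustrative), gradient descent in one dimension looks like this:

    # Gradient descent: repeatedly step in the direction of the negative gradient.
    def f(w):
        return (w - 3) ** 2        # cost, minimum at w = 3

    def grad(w):
        return 2 * (w - 3)         # its exact derivative

    w, learning_rate = 10.0, 0.1
    for step in range(50):
        w = w - learning_rate * grad(w)   # the update rule
    print(w)                              # very close to 3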
The Exponentially Weighted Moving Average (EWMA) is one of the most common algorithms used for smoothing!
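A small sketch of the filter itself (the smoothing factor 0.9 is just a typical choice):

    # EWMA: each new average is a blend of the previous average and the current value.
    import numpy as np

    signal = np.random.randn(200) + 5.0   # noisy measurements around 5
    beta = 0.9                            # how much of the past to keep

    avg, smoothed = 0.0, []
    for s in signal:
        avg = beta * avg + (1 - beta) * s
        smoothed.append(avg)
    print(smoothed[-1])   # close to 5 once the filter has warmed up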
Many improved optimization algorithms use the EWMA filter. Here we review a few improvements to the naive backpropagation algorithm.
Let's code through some optimization algorithms that use EWMA.
Let's code through weight initialization, assigning initial values to the weights of our model.
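A tiny sketch of one common scheme (the layer sizes are arbitrary, and the 1/sqrt(n) scaling is one popular choice among several):

    # Initialize a layer's weights with small random values scaled by the number
    # of inputs, and its biases with zeros.
    import numpy as np

    n_in, n_out = 64, 32
    W = np.random.randn(n_out, n_in) / np.sqrt(n_in)
    b = np.zeros(n_out)
    print(W.std())   # roughly 1 / sqrt(64) = 0.125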
Let's visualize the inner layers of our network!
Let's work through the solutions for exercise 1!
Let's work through the solutions for exercise 2!
Let's work through the solutions for exercise 3!
Let's work through the solutions for exercise 4!
TensorFlow comes equipped with TensorBoard, a small visualization server that allows us to display training metrics and other useful information.
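In Keras, for example, logging for this server is switched on with a callback (the log directory name below is just an example):

    # Write training metrics to disk so TensorBoard can display them.
    from tensorflow.keras.callbacks import TensorBoard

    tb = TensorBoard(log_dir="logs")
    # model.fit(X_train, y_train, epochs=10, callbacks=[tb])
    # then, from a terminal:  tensorboard --logdir logs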
Images can be viewed as sequences of pixels, or we can extract ad hoc features from them. Both approaches offer advantages and limitations.