Introduction to Deep Learning: A Comprehensive Course
You've probably landed on this page because you've had one too many conversations about Deep Learning, and you want to learn more. You're at the right place. Read on for a digest of the fundamentals to master and our best-in-class resources to upskill.
We live in the big data era, where nearly every industry is finding new ways to build massive amounts of data into its business goals and extract useful outputs. More readily available data and computational power have allowed us to produce more accurate outputs and build models more efficiently.
Before diving into Deep Learning, let’s quickly state some important definitions:
Machine Learning (ML) is a form of AI that allows models to learn and improve using past experience by exploring the data and identifying patterns with little human intervention.
For example:
Financial fraud detection - money laundering, security checks on PayPal
Image Recognition - identifying objects, persons, places, etc.
Artificial Intelligence (AI) is the ability of a computer or a computer-controlled robot to perform tasks that are usually done by humans because they require human intelligence.
For example:
Self-driving cars - built-in AI-powered safety functions
Covid-19 - thermal imaging in airports.
What is Deep Learning?
Artificial Intelligence, Machine Learning and Deep Learning
Deep Learning (DL) is a subset of Machine Learning in which computers are taught to imitate what humans naturally do: learn from examples. This is done by training an algorithm to predict outputs, given a set of inputs, using both supervised and unsupervised learning.
If we look at the traditional method of computer programming, it involves input and a set of rules, which are then combined to get the desired output. However, in Machine Learning and Deep Learning, the input and information we already have help us to create a set of rules. We can then use this set of rules and apply it to data that we don’t have information about.
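The contrast between hand-written rules and learned rules can be sketched in a few lines. This is only an illustration: the temperature data, the labels, and the function names are made up, and the "model" is a single learned threshold rather than a neural network.

```python
# Traditional programming: a human writes the rule by hand.
def is_hot_handwritten(temp_c):
    return temp_c >= 30.0  # threshold chosen by the programmer


# Machine learning: the rule (here a single threshold) is derived from
# inputs plus the answers we already know.
def learn_threshold(temps, labels):
    """Return the candidate threshold that misclassifies the fewest examples."""
    best_threshold, best_errors = None, len(temps) + 1
    for candidate in sorted(temps):
        errors = sum((t >= candidate) != label for t, label in zip(temps, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Hypothetical labeled examples: temperature in Celsius, and whether it was "hot".
temps = [12.0, 18.0, 24.0, 29.0, 31.0, 35.0]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(temps, labels)


def is_hot_learned(temp_c):
    # Apply the learned rule to data we have no label for.
    return temp_c >= threshold


print(threshold)             # 29.0
print(is_hot_learned(27.0))  # False
```

The learned threshold depends entirely on the examples: different labeled data would produce a different rule without any change to the code.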
Supervised Learning is when the algorithm learns on a labeled dataset and analyzes the training data. These labeled data sets have inputs and expected outputs.
Unsupervised Learning is when the algorithm works with unlabeled data, inferring hidden structures to produce accurate and reliable outputs.
When the dataset is labeled, we know the correct answers. Therefore the algorithm can iteratively make predictions on the training data. One way to remember Supervised Learning is to imagine a teacher supervising the learning process.
The teacher knows the answers, and the student (the algorithm) is learning the process (by processing the labeled data). The teacher will repeatedly ask the student to correct the output produced, and the learning process doesn’t stop until the student (the algorithm) achieves a specific level of accuracy.
The difference with Unsupervised Learning is that the data has no labels, so the student (the algorithm) has to work out a logical grouping of the data on its own. This process of training the student (the algorithm) does not require any help from the teacher (labeled data).
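To make the distinction concrete, here is a small sketch on made-up one-dimensional data. The supervised part uses the labels to compute one centroid per class; the unsupervised part is a bare-bones 2-means clustering that never sees the labels. Both algorithms are stand-ins chosen for brevity, not methods named in this article.

```python
# Toy data: two loose groups of values (made up for illustration).
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

# Supervised: labels are given, so we can compute one centroid per class.
labels = ["low", "low", "low", "high", "high", "high"]


def fit_centroids(xs, ys):
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict(centroids, x):
    # Assign x to the class whose centroid is closest.
    return min(centroids, key=lambda label: abs(x - centroids[label]))


centroids = fit_centroids(points, labels)
print(predict(centroids, 7.0))  # high


# Unsupervised: no labels; 2-means groups the points by proximity alone.
def two_means(xs, iterations=10):
    c0, c1 = min(xs), max(xs)  # simple initial guesses for the two centers
    for _ in range(iterations):
        group0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        group1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0 = sum(group0) / len(group0)
        c1 = sum(group1) / len(group1)
    return c0, c1


print(two_means(points))  # two cluster centers found without any labels
```

Note that the unsupervised version finds the two groups but cannot name them "low" and "high"; that meaning only exists in the labels.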
History of Deep Learning
Frank Rosenblatt was an American psychologist who was well known in the Artificial Intelligence field and is sometimes referred to as the father of deep learning. He was responsible for developing the basic elements of the deep learning systems used today.
The first working learning algorithm for supervised, deep, feedforward, multilayer perceptrons was published by Alexey Ivakhnenko, a Soviet and Ukrainian mathematician, in 1967.
Later on, in 1989, Yann LeCun, a French computer scientist, managed to apply the standard backpropagation algorithm to a deep neural network. His aim was to be able to recognize handwritten ZIP codes on mail. LeCun was eventually successful. However, the training took 3 days to complete.
From this moment, more and more people started coming forward to put their skills to the test. They aimed to improve the foundations of deep learning laid by Rosenblatt and develop them to bring innovative methods to the world.
Deep Learning vs. Machine Learning: Learn the Difference
Knowing the difference between Deep Learning and Machine Learning is important to your learning process.
We know that Machine Learning and Deep Learning are subsets of Artificial Intelligence. The simplest way to remember the difference is that Deep Learning can do essentially everything that Machine Learning does, but not the other way around.
Let’s go through the actual differences…
Machine Learning:
Data Size: Can train on smaller datasets
Learning Process: Uses both supervised and unsupervised learning to aid the learning and correction process
Training Duration: Shorter training time
Accuracy: Lower accuracy
Correlations: Simple, linear
Computational power requirements: Can train on a regular CPU
Deep Learning:
Data Size: Requires a large amount of data
Learning Process: Learns from its environment and uses past mistakes to improve
Training Duration: Longer training time
Accuracy: Higher accuracy
Correlations: Complex, non-linear
Computational power requirements: Typically requires a GPU to train
The reason there has been so much hype around Deep Learning is that Deep Learning models have been able to achieve higher levels of recognition accuracy than ever before, and some have exceeded human-level performance in tasks such as image recognition.
Artificial Deep Neural Networks
Now that we understand the basics of Deep Learning, how does it actually work? Deep Learning models use artificial neural networks to extract information. Such a network is sometimes referred to as a Deep Neural Network, where the term ‘Deep’ relates to the number of hidden layers in the neural network.
What is a Neural Network?
A Neural Network is a network of biological neurons. Deep learning is the process of teaching computers to process data in the same way that neural networks in the human brain do.
To mimic our brains, Artificial Neural Networks contain artificial neurons (or nodes).
Artificial Neural Networks (ANNs) are made up of neurons arranged in three types of layers: an input layer, one or more hidden layers, and an output layer. Each neuron is connected to other neurons, and it is in these neurons that the computation happens.
Input Layer - receives the input data.
Hidden Layer(s) - perform mathematical computations on the input data.
Output Layer - returns the output data.
In order for an Artificial Neural Network to be considered a Deep Learning network, it has to have more than one hidden layer.
Deep Neural Network
Weight and Bias
To understand the process of Deep Neural Networks, we need to cover Weight and Bias.
As you can see from the image representation of a Deep Neural Network above, there is a lot of movement in a lot of directions. Weight and Bias are responsible for the movement within an Artificial Neural Network.
Weight controls the strength of the connection between two neurons. It decides how much influence the input has on the output.
Bias is an additional learnable value that is added to the weighted sum of the inputs. It is often drawn as an extra input that always has a value of 1, with its own weight, feeding into the next layer. The aim of bias is to allow a neuron to activate even if all of its inputs are 0.
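A minimal sketch of how weights and bias combine, assuming the common convention that the bias is a single number added to the weighted sum (equivalent to a constant input of 1 multiplied by its own weight). The function name and the example values are illustrative only.

```python
def weighted_sum(inputs, weights, bias):
    # Each input is scaled by the strength of its connection, then the bias is added.
    return sum(x * w for x, w in zip(inputs, weights)) + bias


weights = [0.8, -0.4]  # hypothetical connection strengths

# With all-zero inputs, the weights contribute nothing...
print(weighted_sum([0.0, 0.0], weights, bias=0.0))  # 0.0
# ...so only the bias can move the result away from zero.
print(weighted_sum([0.0, 0.0], weights, bias=0.5))  # 0.5

# A larger weight gives its input more influence on the result.
print(weighted_sum([2.0, 1.0], weights, bias=0.5))
```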
Hidden Layer
The hidden layers are what bring your Deep Neural Network to life. These layers receive input from either the input layer or another hidden layer. Hidden layers also provide the output to other layers, either hidden or output layers.
The reason these layers are called “hidden” is that we don’t know the true values of their nodes in the training dataset.
Activation Function
An Activation Function decides whether a neuron should be activated or not. It uses mathematical concepts to decide whether a neuron’s input will make a difference to the network or not.
It can also be referred to as a ‘Transfer Function’, as it transforms a neuron’s summed input into the output that is passed on. Activation Functions are very beneficial to neural networks as they help them to learn complex patterns in data and decide which information is important enough to be passed to the next neuron. They influence how well the neural network model learns on the training dataset.
It is important to note that there are different types of Activation Functions, so choosing an Activation Function has a large impact on the performance of the neural network. You can use different Activation Functions for different parts of the model to help achieve better performance levels.
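For reference, here are a few widely used Activation Functions written out in plain Python. The article does not single out any of these, so treat the selection as an illustrative assumption rather than a recommendation.

```python
import math


def step(z, threshold=0.0):
    # Fires (1.0) only when the input reaches the threshold.
    return 1.0 if z >= threshold else 0.0


def sigmoid(z):
    # Squashes any input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    # Squashes any input into the range (-1, 1).
    return math.tanh(z)


def relu(z):
    # Passes positive inputs through unchanged and replaces negatives with 0.
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```

The same input produces quite different outputs depending on the function, which is why the choice affects how a network learns.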
This is what an Artificial Neuron looks like:
Artificial Neuron
How does Deep Learning work?
Now that we have covered the elements of an Artificial Neural Network, we can dive into how Deep Learning works.
The input layer is the first layer that receives data. Weights are assigned to each of the variables of the data.
These variables are multiplied by their respective weights and then are all summed up.
Bias will then be added to the summed value.
The summed value is then passed through an activation function, which will determine the output.
If the output meets or exceeds the specific threshold, the neuron will be activated, and the data will be passed from one connected layer to another in the network.
If the neuron is activated, the output becomes the input for the next neuron, or if it is at the end of the process, it is the overall output.
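The steps above can be sketched as a small forward pass. This is a minimal illustration, not a full implementation: it assumes a sigmoid activation (which gives a graded output between 0 and 1 instead of a hard threshold), and the layer sizes, weights, and biases are arbitrary made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    # Multiply each input by its weight, sum the results, then add the bias.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Pass the summed value through the activation function.
    return sigmoid(z)


def layer_output(inputs, layer):
    # A layer is a list of (weights, bias) pairs, one pair per neuron.
    return [neuron_output(inputs, weights, bias) for weights, bias in layer]


def forward(inputs, layers):
    # The output of each layer becomes the input of the next one.
    activations = inputs
    for layer in layers:
        activations = layer_output(activations, layer)
    return activations


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_layer = [([0.5, -0.6], 0.1), ([0.3, 0.8], -0.2)]
output_layer = [([1.2, -0.7], 0.05)]

prediction = forward([1.0, 0.5], [hidden_layer, output_layer])
print(prediction)  # a single value between 0 and 1
```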
This process is known as a feedforward pass, as the information always moves in one direction (forward). When information about the error is sent backward through the network, this is called Backpropagation, sometimes abbreviated as "backprop". Backpropagation acts as a messenger, informing the Neural Network whether it made a mistake when making a prediction.
Backpropagation goes through the following steps:
The Neural Network makes a guess about the data.
The guess is measured with a loss function, also referred to as a cost function.
If an error occurs, it will be backpropagated to be adjusted and corrected.
If the network makes a wrong guess about the data, Backpropagation will take this error and work out how each of the neural network’s parameters should change to reduce it. Gradient Descent then makes the adjustment.
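As a rough sketch of the idea, the example below applies Backpropagation to the smallest possible case: one neuron with one weight and one bias. It assumes a sigmoid activation and a squared-error loss, neither of which is specified in the text, and it checks the chain-rule gradient against a numerical estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    # Squared error between the guess and the expected output.
    prediction = sigmoid(w * x + b)
    return 0.5 * (prediction - target) ** 2


def gradients(w, b, x, target):
    """Chain rule: dL/dw = dL/dp * dp/dz * dz/dw, and likewise for b."""
    p = sigmoid(w * x + b)
    dL_dp = p - target     # how the loss changes with the prediction
    dp_dz = p * (1.0 - p)  # derivative of the sigmoid
    return dL_dp * dp_dz * x, dL_dp * dp_dz  # (dL/dw, dL/db)


w, b, x, target = 0.4, 0.1, 1.5, 1.0  # arbitrary starting values
grad_w, grad_b = gradients(w, b, x, target)

# Sanity check: compare with a finite-difference estimate of dL/dw.
eps = 1e-6
numeric_w = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
print(grad_w, numeric_w)

# Adjust the parameters in the direction of less error.
learning_rate = 0.5
new_w = w - learning_rate * grad_w
new_b = b - learning_rate * grad_b
print(loss(w, b, x, target), loss(new_w, new_b, x, target))
```

In a multi-layer network the same chain rule is applied layer by layer, from the output back toward the input.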
Gradient Descent
Gradient Descent is an optimization algorithm: a method that is applied iteratively to compare different solutions and outputs in order to find the optimum solution. It is used to train models and neural networks to help them learn and reduce the Cost Function.
The Cost Function measures how well a neural network performs in relation to the given training sample and the expected output. Ideally, we would want a Cost Function of 0, which tells us that the outputs produced by the network are the same as the training data set outputs.
A gradient is a slope that represents a relationship between two variables: “y over x”. In Neural Networks, the value ‘y’ represents the error produced, and the value ‘x’ represents the parameter of the Neural Network.
There is a relationship between the parameters and the error, so adjusting the parameters can either increase or decrease the error. To reduce the error, you have to change the weights in small increments after each pass over the data set.
The weights are updated automatically, step by step, in whichever direction lowers the error, which can be visualized as moving downhill toward the lowest point of the error curve.
Gradient Descent
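A minimal sketch of Gradient Descent on a model with a single weight. The data, the learning rate, and the mean-squared-error Cost Function are assumptions chosen to keep the example short; real networks have many weights but follow the same update rule.

```python
# Made-up training data generated by y = 2x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def cost(w):
    # Mean squared error of the model y = w * x.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    # Slope of the cost with respect to w (the "y over x" described above).
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess
learning_rate = 0.05  # size of each small increment
history = []

for step in range(50):
    history.append(cost(w))
    w -= learning_rate * gradient(w)  # move against the slope, toward lower cost

print(round(w, 4))               # close to 2.0
print(history[0], history[-1])   # the cost shrinks toward 0
```

If the learning rate is too large, the updates can overshoot the minimum and the cost may grow instead of shrink, which is why the increments are kept small.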
Types of Neural Networks
There are different types of neural networks you can use in Deep Learning. They differ depending on their structure, data flow, density, layers, activation filters, etc.
Feedforward Artificial Neural Network
We have already mentioned this type of neural network above. As the name suggests, a Feedforward artificial neural network is when data moves in a forward direction from the input to the output nodes.
The data moves through the different layers in the forward direction and will never move backward. Due to its one-way movement, this type of neural network model is relatively simple. Common applications of this type of neural network include data compression, pattern recognition, and computer vision tasks.
Perceptron and Multilayer Perceptron
The Perceptron model was proposed by Frank Rosenblatt, later analyzed in depth by Minsky and Papert, and is known for being one of the oldest and simplest models of neural networks. A Perceptron model is a binary classifier, separating data into two different classifications. Due to this, it is sometimes referred to as a TLU (threshold logic unit). Common applications of this type of neural network include data compression, streaming encoding, social media, music streaming, and online video platforms.
Multilayer Perceptron neural networks increase the density and complexity of simpler perceptron neural networks thanks to their many hidden layers. They are fully connected networks as each node on a specific layer is connected to every node on the next layer. They are used for complex tasks such as voice recognition, machine translation, and complex classification.
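To show a single Perceptron acting as a binary classifier, here is a small sketch that trains one on the logical AND function using the classic perceptron learning rule. The training data, learning rate, and number of epochs are illustrative choices.

```python
def predict(weights, bias, inputs):
    # Threshold logic unit: output 1 if the weighted sum reaches 0, else 0.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0


def train_perceptron(samples, learning_rate=1.0, epochs=20):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # error is +1, 0, or -1; weights only change on a wrong prediction.
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table for logical AND: the output is 1 only when both inputs are 1.
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, inputs) for inputs, _ in and_gate])  # [0, 0, 0, 1]
```

A single Perceptron can only separate classes with a straight line, so it cannot learn a function such as XOR; that limitation is one motivation for stacking layers into a Multilayer Perceptron.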
Convolutional Neural Network
A Convolutional Neural Network (CNN), sometimes called a ConvNet, is a feed-forward Neural Network. It contains a three-dimensional arrangement of neurons instead of the standard two-dimensional array. CNNs are typically used in image recognition by processing pixelated data to detect and classify objects in an image. A CNN consists of the following layers:
Convolution Layer - This layer extracts features from the input image by transforming it.
Rectified Linear Unit (ReLU) - This non-linear function is applied to the output of the convolution layer, replacing negative values with zero so that the network can represent non-linear relationships in the data.
Pooling Layer - This layer reduces the dimensions of the feature maps, which in turn reduces the number of parameters the model learns and the computational power used in the network.
Fully Connected Layer - This comprises a series of connected layers that link every neuron in one layer to every neuron in another layer. The flattened matrix from the pooling layer is fed in as input and used to classify and identify the images.
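The layers listed above can be sketched on a tiny image in plain Python. This is a simplified illustration: the 5x5 image and the 2x2 edge-detecting kernel are made up, the kernel is fixed rather than learned, and the final fully connected classifier is left out, so only the convolution, ReLU, pooling, and flattening steps are shown.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding (what CNNs call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    # Replace negative responses with zero.
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2):
    # Keep only the largest value in each non-overlapping size x size window.
    return [
        [
            max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    # Turn the 2-D map into the 1-D list a fully connected layer expects.
    return [value for row in feature_map for value in row]


# 5x5 image with a vertical edge: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

features = relu(convolve2d(image, kernel))  # 4x4 feature map
pooled = max_pool(features)                 # 2x2 after pooling
print(flatten(pooled))                      # [2, 0, 2, 0]
```

The non-zero values mark where the edge was found, and pooling shrinks the 4x4 feature map to 2x2 while keeping that information.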
Generative Adversarial Networks (GANs)
Generative Adversarial Networks use two neural networks. The word ‘adversarial’ in the name is because these two networks compete with one another. One network is called ‘the generator’ and learns to generate fake data. The other network is called ‘the discriminator’ and learns to tell that fake data apart from real data.
During the initial training phase, the Generator starts by learning to create the fake data. The Discriminator aims to learn the difference between the real sample data and the fake data which the Generator has generated. The results are sent back to both the Generator and the Discriminator so that the model keeps updating.
Generative Adversarial Networks
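The competition between the two networks shows up in their loss functions. The sketch below computes both losses for a toy one-dimensional case. It is heavily simplified: the "networks" are single-parameter functions with hand-picked values, no training is performed, and the generator uses the commonly used non-saturating loss, which is an assumption rather than something stated in this article.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminator(x, w, b):
    """Estimated probability that the sample x is real."""
    return sigmoid(w * x + b)


def generator(noise, scale, shift):
    """Turn a random noise value into a fake sample."""
    return scale * noise + shift


def discriminator_loss(real, fake, w, b):
    # Low when real samples score near 1 and fake samples score near 0.
    real_term = -sum(math.log(discriminator(x, w, b)) for x in real) / len(real)
    fake_term = -sum(math.log(1.0 - discriminator(x, w, b)) for x in fake) / len(fake)
    return real_term + fake_term


def generator_loss(fake, w, b):
    # Low when the discriminator is fooled into scoring fakes as real.
    return -sum(math.log(discriminator(x, w, b)) for x in fake) / len(fake)


real = [4.8, 5.0, 5.2]    # made-up real data, centered around 5
noise = [-0.2, 0.0, 0.2]  # fixed noise values for repeatability
w, b = 1.0, -2.5          # hypothetical discriminator: "real" means x > 2.5

poor_fakes = [generator(z, 1.0, 0.0) for z in noise]  # centered at 0, easy to spot
good_fakes = [generator(z, 1.0, 5.0) for z in noise]  # centered at 5, look real

print(generator_loss(poor_fakes, w, b), generator_loss(good_fakes, w, b))
print(discriminator_loss(real, poor_fakes, w, b), discriminator_loss(real, good_fakes, w, b))
```

Better fakes lower the generator loss and raise the discriminator loss, which is the sense in which the two networks are adversaries.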
How to create and train Deep Learning Models?
Below are simple steps on how to create and train a Deep Learning model.
Choose the Correct Architecture
As mentioned above, there are different types of neural networks for different tasks. You could be working on a simple classification task on data, which can use the Perceptron neural network, or you may want to create an image recognition model, for which you can use a CNN.
Quantity and Quality of Data
Deep Learning models require a larger volume of data compared to Machine Learning models. You must ensure that your model has enough data to learn from and produce accurate predictions. Although the quantity of your data is important, do not forget the quality. Correctly labeled data will help your Deep Learning model learn effectively and make the process much smoother.
Feeding the Data into the Neural Network
Once you have gathered a dataset that is both large enough and of high quality, your next step will be to feed this data into the model. The model will learn from the dataset and make predictions based on what it has learned. To improve your model, you will need to find out what your prediction error is, for example, by calculating the cost function.
Change the Weights and Repeat
Once you have a better understanding of the relationship between the parameters and the error, you can adjust the weights toward values that minimize the loss function. You will need to repeatedly compute the cost function and update the weights, for example with gradient descent, in order to reach the weights with the lowest cost.
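Putting the steps together, here is a compact sketch of a full training loop for a very small network written in plain Python. Everything specific in it is an assumption made for illustration: the 2-3-1 architecture, the XOR dataset, the sigmoid activations, the squared-error cost, the learning rate, and the number of epochs. In practice you would normally use a Deep Learning framework instead of hand-written loops.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Step 1 - architecture: 2 inputs -> 3 hidden neurons -> 1 output neuron.
random.seed(0)
n_in, n_hidden = 2, 3
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

# Step 2 - data: the XOR truth table as (inputs, expected output) pairs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]


def forward(x):
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(w_hidden, b_hidden)
    ]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def mean_cost():
    return sum(0.5 * (forward(x)[1] - t) ** 2 for x, t in data) / len(data)


learning_rate = 0.5
initial_cost = mean_cost()

# Steps 3 and 4 - feed the data in, measure the error, change the weights, repeat.
for epoch in range(5000):
    for x, target in data:
        hidden, output = forward(x)
        # Backpropagate the error from the output neuron to the hidden neurons.
        delta_out = (output - target) * output * (1.0 - output)
        delta_hidden = [
            delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
            for j in range(n_hidden)
        ]
        # Gradient descent: nudge every weight and bias against its gradient.
        for j in range(n_hidden):
            w_out[j] -= learning_rate * delta_out * hidden[j]
            for i in range(n_in):
                w_hidden[j][i] -= learning_rate * delta_hidden[j] * x[i]
            b_hidden[j] -= learning_rate * delta_hidden[j]
        b_out -= learning_rate * delta_out

final_cost = mean_cost()
print(initial_cost, final_cost)  # the cost after training should be lower
```

How low the final cost gets depends on the random starting weights, the learning rate, and the number of epochs, which is part of the fine-tuning effort that Deep Learning models require.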
Deep Learning: Applications and Limitations
Applications
Fraud Detection - Deep Learning can use attributes from various data sources, such as device location, purchasing pattern, and more, to detect anomalies in user transactions.
Computer Vision - Image recognition can be used to detect objects such as airplanes, faces, and guns and help reduce potential risks.
Agriculture - The combination of machine learning, computer vision, and robotics has made the agriculture sector more efficient. For example, powering self-driving machinery, detecting intrusive wild animals, and forecasting crop yields.
Natural Language Processing - Deep Learning has been used to power chatbots with the ability to respond to large volumes of messages and provide more accurate responses.
Limitations
Data - Deep Learning requires large volumes of data in order to help the models' learning process. If the data is of bad quality or insufficient, the model will fail to learn and will generalize poorly.
Complex Models - Building a Deep Learning model can be a complex and tedious task. It will require a lot of fine-tuning to obtain the best-performing Deep Learning model.
Incapable of performing different tasks - Deep Learning models only have the ability to perform tasks for which they have been trained.
Computation/Hardware - Deep Learning models are computationally expensive, as they often require high-performing graphics processing units (GPUs) or tensor processing units (TPUs) in order to train in a reasonable amount of time.
Tools and Resources
To successfully master anything in life, you need the right tools and resources to help you get there. Deep Learning is very popular, but it can be nerve-wracking to find the best resources to achieve success in it.
Tools
Here is a list of 10 ML frameworks that will help you be successful in Deep Learning:
Other Resources
Other resources that can help you master the art of deep learning are books and online courses. We would recommend Deep Learning, an MIT Press book that helps students swiftly enter the field of Machine Learning with a focus on Deep Learning. The online version of the book is available for free.
You may have also heard of Andrew Ng - Founder & CEO of Landing AI, Founder of deeplearning.ai, Co-Chairman and Co-Founder of Coursera. He has put together a Deep Learning Specialization course. It is made up of 5 sub-courses:
Deep Learning and Machine Learning (free) courses and classes: our recommendations
I will start off with the free courses.
Free Courses on Machine Learning and Deep Learning
Understanding Machine Learning: From Theory to Algorithms, a free e-book that is split into 4 sections: Part I: Foundations, Part II: From Theory to Algorithms, Part III: Additional Learning Models, and Part IV: Advanced Theory. This ebook provides you with the foundations required to move into Deep Learning.
Udacity offers an Introduction to Machine Learning Course, which will take you approximately 10 weeks to complete. You can then move on to their Nanodegree programs.
If you are not a bookworm and prefer online YouTube courses, Simplilearn offers a free Artificial Intelligence And Deep Learning Full Course on YouTube. The course covers the basics of AI, the Future of AI, AI of Detail, What is Deep Learning, Neural Networks, Tensorflow Object Detection, Recurrent Neural Networks, What are GANs, Keras Tutorial, OpenCV, and Deep Learning Interview Questions.
freeCodeCamp offers a free YouTube course on PyTorch for Deep Learning & Machine Learning. This course will go through the foundations of Machine Learning and Deep Learning with PyTorch. You can find the course materials here.
IBM offers a wide range of free courses, but one in particular that will help you excel in Machine Learning and apply it to Deep Learning is their Machine Learning with Python: A Practical Introduction course. You will cover an intro to Machine Learning, Regression, Classification, Unsupervised Learning, and Recommender Systems.
Dive into Deep Learning provides you with an extensive overview of deep learning. The plus with this ebook is that it starts with the foundations of machine learning to improve your learning process.
Paid Courses
DeepLearning.AI, by Andrew Ng, offers a Supervised Machine Learning: Regression and Classification course on Coursera. This is the first course in this Machine Learning specialization. You will then move on to Advanced Learning Algorithms and then Unsupervised Learning, Recommenders, Reinforcement Learning.
Udemy offers a great course called Machine Learning A-Z™: Python & R in Data Science, with 46 sections and 382 lectures to help you master Machine Learning in Python and R. If you want to dive deeper into Deep Learning, they offer a Deep Learning A-Z™: Hands-On Artificial Neural Networks course.
If you want a course that covers Data Science, Machine Learning and Deep Learning, Udemy offers a Machine Learning, Data Science and Deep Learning with Python course.
Deep Learning Experts to Follow
Another way to learn is by keeping up with influencers and experts in the industry.
Here is a list of YouTubers I would recommend to learn more about Machine Learning and Deep Learning:
If you are also looking at updating your Instagram feed, here is a list of pages to follow so that you always have Machine Learning and Deep Learning knowledge running through your mind:
@machinelearning
@python.learning
@madewithcode
@learn.machinelearning
@neuralnet.ai
Deep Learning: In a Nutshell (Key Takeaways)
Deep Learning can do essentially everything that Machine Learning does, but not the other way around.
Deep learning models have been able to achieve higher levels of recognition accuracy, where some have exceeded human-level performance.
Artificial Neural Networks are made up of neurons arranged in three different layer types: an input layer, one or more hidden layers, and an output layer.
The reason why hidden layers are called “hidden” is that we don’t know the true values of their nodes in the training dataset.
The Activation Function uses mathematical concepts to decide whether a neuron’s input will make a difference and decides whether a neuron should be activated or not.
Gradient Descent is an optimization algorithm used to train models and neural networks to help them learn and reduce the Cost Function.